gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

DS development > dstexcomp: Texture compression tool

#167838 - kvance - Mon Mar 30, 2009 6:21 pm

There are a few forum posts here and elsewhere about texture compression, but I haven't found any links to a working compression tool. So I took a crack at writing one: dstexcomp. It requires Python 2.5, and you may have to install some extra Python libraries if you don't have them already.

The README should explain how to use the script. The project page also has example code showing how to load and display the textures in libnds, and I discuss the texture format and my compressor on my LJ. Let me know if you need any help.

Patches to improve image quality are welcome ;)
_________________
Corner Office: dev LJ
HexxagonDS, dsmzx, more...

#167855 - DiscoStew - Tue Mar 31, 2009 7:28 pm

Nice tool. I tried it out myself, and while I can't say much about the processing speed (I was kinda worried about actually using all 4 cores on my processor for such a long length of time....for the first time without thinking about possible heat problems), I liked the output I got. Placed the output into your NDS example, and I don't think I could be happier.

I have a question about the compressed textures being used on the NDS though. They're not limited to the beginning of the usable VRAM slots, right? I mean, if I understand the format correctly, as long as I place the data in the correct spots so they align with each other, everything should work fine?
_________________
DS - It's all about DiscoStew

#167857 - kvance - Tue Mar 31, 2009 8:00 pm

DiscoStew wrote:
I tried it out myself, and while I can't say much about the processing speed (I was kinda worried about actually using all 4 cores on my processor for such a long length of time....for the first time without thinking about possible heat problems), I liked the output I got.


Heh, I wrote about its slowness everywhere else but I forgot to mention it here. Hopefully it won't cause anyone's CPU to overheat!

DiscoStew wrote:
I have a question about the compressed textures being used on the NDS though. They're not limited to the beginning of the usable VRAM slots, right? I mean, if I understand the format correctly, as long as I place the data in the correct spots so they align with each other, everything should work fine?


Correct. libnds isn't really set up to work with compressed textures, but it won't get in the way if you're careful. You should be able to use glGetTexturePointer() to find a texture's offset, and calculate the index offset from that. You'll also need to make sure to temporarily disable texture slot 1 before calling glTexImage2D() so it doesn't overwrite the index. And of course, you have to choose an appropriate palette size for all your compressed textures so there's enough palette VRAM for all of them.

I don't think you'll be able to share texture slot 1 between index data and other textures without replacing libnds' texture management code though.

Edit to add: Maybe at the beginning of your program, you could enable only texture slot 1, use glTexImage2D() with a phony texture of the correct size, and overwrite it with real index data later. Then libnds should leave the index alone.
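
The index-offset calculation mentioned above can be sketched as follows. This assumes the commonly documented layout for compressed textures, where texel data lives in texture slot 0 or 2 and the index data lives in slot 1 at half the texel offset. `indexOffsetFor` and the constants are hypothetical names, not part of libnds, and the offsets are relative to the start of texture VRAM rather than CPU addresses.

```cpp
#include <cassert>
#include <cstdint>

// Offsets are relative to the start of texture VRAM (slot 0 = 0x00000).
// Assumed layout: 128 KB slots; these names are illustrative only.
constexpr std::uint32_t kSlotSize  = 0x20000;
constexpr std::uint32_t kSlot1Base = 0x20000;

// Returns the slot-1 offset of the index data that belongs to texel data at
// texelOffset, or UINT32_MAX if texelOffset is not inside slot 0 or slot 2.
std::uint32_t indexOffsetFor(std::uint32_t texelOffset) {
    const std::uint32_t slot   = texelOffset / kSlotSize;
    const std::uint32_t within = texelOffset % kSlotSize;
    if (slot == 0) return kSlot1Base + within / 2;            // first half of slot 1
    if (slot == 2) return kSlot1Base + 0x10000 + within / 2;  // second half of slot 1
    return UINT32_MAX;                                        // not a valid texel slot
}
```

The texel offset would come from whatever glGetTexturePointer() reports for the texture, converted to an offset from the start of texture VRAM.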

#167864 - DiscoStew - Wed Apr 01, 2009 8:05 am

At first glance, not knowing any Python, I really couldn't understand it, but I gave myself some more time to look it over, starting with the entry function and going line by line, function by function, to figure out what was going on. While I can't code Python, at least I now have an idea how it works.

I was thinking. Unless you were doing it yourself, I could try to port your script over to C++. Granted, I don't have knowledge of or access to some of the stuff imported into it (like using multiple cores), but the "meat" of the whole process is in the script, and the rest can be dealt with later.

I can give it a try, but I wouldn't expect it to be done very quickly. The RL can be quite the time consumer at times. :P

#167870 - kvance - Wed Apr 01, 2009 3:35 pm

DiscoStew wrote:
I was thinking. Unless you were doing it yourself, I could try to port your script over to C++. Granted, I don't have knowledge of or access to some of the stuff imported into it (like using multiple cores), but the "meat" of the whole process is in the script, and the rest can be dealt with later.


You're welcome to it. At this point, I'm more interested in better image quality than speed (though I did just realize that I should put a border around the image when dithering, so I don't have to do bounds checks) so I'm not planning on doing it.
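
The border idea can be sketched roughly like this. It is not taken from dstexcomp; `padWithBorder` is a hypothetical helper, and a one-pixel zero border on a single-channel image is assumed, so that an error-diffusion kernel can write to its neighbours without bounds checks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a w*h single-channel image into a buffer with a
// one-pixel zero border on every side. A dithering pass can then touch the
// (x-1, x+1, y+1) neighbours of any real pixel without range checks; error
// that lands in the border is simply discarded.
std::vector<int> padWithBorder(const std::vector<int>& src, int w, int h) {
    const int pw = w + 2;  // padded width
    std::vector<int> dst(static_cast<std::size_t>(pw) * (h + 2), 0);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            dst[(y + 1) * pw + (x + 1)] = src[y * w + x];
    return dst;
}
```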

Some of the meat is definitely in the libraries. Specifically, I use PIL's color quantizer to pick the best colors for each block. Also, the palette reducing stage just calls SciPy's k-means function.

The final "egregiously bad block" stage is hurt more by its algorithm than by Python's slowness. It's running the color quantizer on each palette (~8000 of them!) until it finds an acceptable one for each bad block. It would be a lot faster if it only searched similar palettes.

Quote:
I can give it a try, but I wouldn't expect it to be done very quickly. The RL can be quite the time consumer at times. :P


This is why I wrote it in Python :)

#167879 - kusma - Wed Apr 01, 2009 10:15 pm

I've also made a compressor, you can find it here. I'd like to clean it up and release it properly one day, but for now this is it.

#167887 - kvance - Thu Apr 02, 2009 1:07 am

kusma wrote:
I've also made a compressor, you can find it here. I'd like to clean it up and release it properly one day, but for now this is it.


Wow, that's great! I wish I knew about that 2 weeks ago!

#167893 - DiscoStew - Thu Apr 02, 2009 7:22 am

kusma wrote:
I've also made a compressor, you can find it here. I'd like to clean it up and release it properly one day, but for now this is it.


Just tried your app kusma. It's nice and fast with pretty good quality, but for some reason in some of my test images, I'm getting odd "blank spaces", in which the block's colors are not anywhere near what they should be, or even close to those of the surrounding blocks. I mostly get them as solid black blocks.

Other than that, I like it. Should the resulting images be flipped though?

#167932 - kusma - Fri Apr 03, 2009 6:11 pm

DiscoStew wrote:
Just tried your app kusma. It's nice and fast with pretty good quality, but for some reason in some of my test images, I'm getting odd "blank spaces", in which the block's colors are not anywhere near what they should be, or even close to those of the surrounding blocks.

Thanks for the feedback. Would you care to send me an example image so I can debug it? :)

DiscoStew wrote:
Other than that, I like it. Should the resulting images be flipped though?

Thanks for pointing this out! Might be, the code is pretty much proof-of-concept so far. I'll look into it! By the way, is there any kind of definition of the origin of textures on the NDS?

#167936 - DiscoStew - Fri Apr 03, 2009 8:50 pm

Actually, I was kinda wrong in my explanation. It's not black, but more a dominance of one specific color: the blocks are filled with that single color reference. Most of my tests involved a lot of dark, which is why I assumed it was filling in "black/blank" blocks, but I tried a nice bright image from the internet (which, coincidentally, was used in color quantization testing, lol), and I get even more oddball blocks. You'll notice them right off the bat in this test result.

http://www.mediafire.com/?te3jzl32kzq

#167985 - kvance - Sun Apr 05, 2009 7:57 pm

DiscoStew wrote:
Actually, I was kinda wrong in my explanation. It's not black, but more a dominance of one specific color: the blocks are filled with that single color reference.


That sounds a lot like the reason I had to add the "egregiously bad block" pass in my compressor. If you run it with -e 9999 you'll probably see some similar problems.

#167986 - DiscoStew - Sun Apr 05, 2009 10:30 pm

Just a quick off-topic question kusma. With the function FreeImage_ColorQuantizeEx(), is there a specific order designed for the resulting palette? Just wondering because with your code, you check the blend possibilities using only the first and last colors, as if they stood on the ends of the palette, and the remaining were between them.

#167987 - kusma - Sun Apr 05, 2009 10:35 pm

DiscoStew wrote:
Just a quick off-topic question kusma. With the function FreeImage_ColorQuantizeEx(), is there a specific order designed for the resulting palette? Just wondering because with your code, you check the blend possibilities using only the first and last colors, as if they stood on the ends of the palette, and the remaining were between them.

I didn't find a guarantee, but I seem to remember observing that the palette is somewhat sorted in reality. This might have been a wrong assumption, though.

#167988 - kusma - Sun Apr 05, 2009 10:41 pm

By the way, I've moved the code to a separate git repo to make maintaining this a bit easier. The old repo wasn't "supposed" to be used by anyone other than me :P

#167994 - DiscoStew - Mon Apr 06, 2009 7:35 am

Hey kusma, you'll be happy to know that I discovered the problem, but I'm quite surprised by what it was, and even more confused that the compiler felt it was acceptable syntax (to my knowledge).

Starting in the "colorCount" tests, when it equals 3...
Code:
if (colorEqual(pal[1], colorLerp(pal[0], pal[2], 0.5), 31), 2*colorErr)
...should be...
Code:
if (colorEqual(pal[1], colorLerp(pal[0], pal[2], 0.5), 2*colorErr))

Separated out, it would look something like...
Code:
if( statement1, statement2 )

I don't think you meant the code to be like that, but it compiled nonetheless. Would this be similar to ANDing or ORing them?


Anyways, making the change will fix the problem. The "palette" file will grow a little with that same test image I used, but those bad blocks won't exist anymore. The resulting images actually seem to be somewhat improved too from the fix.

#167995 - kusma - Mon Apr 06, 2009 9:47 am

DiscoStew wrote:
Hey kusma, you'll be happy to know that I discovered the problem

Ah, thanks a lot for debugging the issue! This is indeed the real issue, and it works nicely for me as well. I guess this shows how few tiles really fall into this code-path, which is a bit scary.

Quote:
I'm quite surprised by what it was, and even more confused that the compiler felt it was acceptable syntax (to my knowledge).

The syntax is perfectly legal to my knowledge. I'm not sure how useful it is, though. And I'm not sure how it behaves.

#167996 - kusma - Mon Apr 06, 2009 9:59 am

OK, I checked out how it behaves. It evaluates both expressions and returns the result of the second one. This is the comma operator.

Anyway, I'll push your change upstream when I get home from work.
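
A minimal sketch of that behaviour, for anyone who wants to see it in isolation. `neverEqual` is a hypothetical stand-in for colorEqual() and is not the real function from the tool; it always reports "not equal" so the effect of the misplaced parenthesis is obvious.

```cpp
#include <cassert>

// Hypothetical stand-in for colorEqual(): always reports "not equal".
bool neverEqual(int /*a*/, int /*b*/, int /*tolerance*/) { return false; }

// Same shape as the buggy line: the closing parenthesis is in the wrong place,
// so the condition is "call, then 2 * colorErr" joined by the comma operator.
bool buggyCheck(int colorErr) {
    // The call is evaluated and its result discarded; the condition is the
    // value of 2 * colorErr, which is true whenever colorErr is nonzero.
    if (neverEqual(1, 2, 31), 2 * colorErr) return true;
    return false;
}

// Same shape as the fixed line: the tolerance is passed to the function and
// the function's result is the condition.
bool fixedCheck(int colorErr) {
    if (neverEqual(1, 2, 2 * colorErr)) return true;
    return false;
}
```

With any nonzero colorErr the buggy form takes the branch regardless of the colors, which would explain blocks being treated as blendable when they are not.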

#168016 - DiscoStew - Mon Apr 06, 2009 10:07 pm

I had made some additions to your code. I won't say "improvements", because atm, it improves one aspect of the resulting quality, but then makes another aspect worse.

Instead of checking only the first and last colors of a palette for blending, I check every possible combination of endpoint colors for the block and take the one with the least error, provided it is within the color error threshold. (For 4-color blocks, each combination is scored by the larger of its two interpolation errors, and the combination with the smallest score wins.) The resulting palette size shrank a little at the current color error, so I reduced that from 8 to 6 so my tests would have the prior palette size. The end result was an image with much less blockiness. That is the good news. The bad news is that the resulting image is now somewhat more grainy and edges don't appear as smooth. It could be the method I used to choose the palettes for that section, so I'll go over it again.

#168022 - kusma - Tue Apr 07, 2009 12:42 am

DiscoStew wrote:

Instead of checking only the first and last colors of a palette for blending, I check every possible combination of endpoint colors for the block and take the one with the least error, provided it is within the color error threshold. (For 4-color blocks, each combination is scored by the larger of its two interpolation errors, and the combination with the smallest score wins.) The resulting palette size shrank a little at the current color error, so I reduced that from 8 to 6 so my tests would have the prior palette size. The end result was an image with much less blockiness. That is the good news. The bad news is that the resulting image is now somewhat more grainy and edges don't appear as smooth. It could be the method I used to choose the palettes for that section, so I'll go over it again.

I'd love to see the patch, just for the sake of it. It's good to see improvements to stuff you do :) I'm sorry I wasn't able to push stuff upstream etc today, work turned into some kind of drunktard-celebration.

#168039 - DiscoStew - Tue Apr 07, 2009 11:27 pm

Ok, here's the change I made (I commented out the prior implementation)

At the top, I have...

Code:
int pal4Count = 12;
int pal4A[] = { 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2 };
int pal4B[] = { 1, 1, 2, 2, 3, 3, 2, 2, 3, 3, 3, 3 };
int pal4C[] = { 2, 3, 1, 3, 1, 2, 0, 3, 0, 2, 1, 0 };
int pal4D[] = { 3, 2, 3, 1, 2, 1, 3, 0, 2, 0, 0, 1 };

int pal3Count = 3;
int pal3A[] = { 0, 0, 1 };
int pal3B[] = { 1, 2, 2 };
int pal3C[] = { 2, 1, 0 };


...and in the code itself, I have...

Code:
         int blockMode = 2; // if all else fails
         if (colorCount == 4)
         {
            int curErrVal = INT_MAX;
            int curErrIndex = -1;
            for( int i = 0; i < pal4Count; i++ )
            {
               int pal1Err = colorDiff( pal[ pal4C[ i ]], colorLerp( pal[ pal4A[ i ]], pal[ pal4B[ i ]], 0.375 ));
               int pal2Err = colorDiff( pal[ pal4D[ i ]], colorLerp( pal[ pal4A[ i ]], pal[ pal4B[ i ]], 0.625 ));
               if( pal1Err < pal2Err ) pal1Err = pal2Err;
               if( pal1Err < curErrVal )
               {
                  curErrVal = pal1Err;
                  curErrIndex = i;
               }               
            }
            if(( curErrIndex >= 0 ) && ( curErrVal <= ( 2 * colorErr )))
            {
               lerpable++;

               lerpColors[0] = pal[ pal4A[ curErrIndex ]];
               lerpColors[1] = pal[ pal4B[ curErrIndex ]];
               if( pal4A[ curErrIndex ] != 0 )
               {
                  BYTE a = 0;
                  BYTE b = pal4A[ curErrIndex ];
                  FreeImage_SwapColors(tileQuant.getFIBitmap(), &pal[a], &pal[b], true);
                  FreeImage_SwapPaletteIndices(tileQuant.getFIBitmap(), &a, &b);
               }
               if( pal4B[ curErrIndex ] != 1 )
               {
                  BYTE a = 1;
                  BYTE b = pal4B[ curErrIndex ];
                  FreeImage_SwapColors(tileQuant.getFIBitmap(), &pal[a], &pal[b], true);
                  FreeImage_SwapPaletteIndices(tileQuant.getFIBitmap(), &a, &b);
               }
               pal = lerpColors;
               colorCount = 2;
               blockMode = 3;
            }
            else
               nonlerpable++;

            /*
            if (colorEqual(pal[1], colorLerp(pal[0], pal[3], 0.375), 2*colorErr) &&
               colorEqual(pal[2], colorLerp(pal[0], pal[3], 0.625), 2*colorErr))
            {
               lerpable++;
               lerpColors[0] = pal[0];
               lerpColors[1] = pal[3];
               
               BYTE a = 1;
               BYTE b = 3;
               FreeImage_SwapColors(tileQuant.getFIBitmap(), &pal[a], &pal[b], true);
               FreeImage_SwapPaletteIndices(tileQuant.getFIBitmap(), &a, &b);
               
               pal = lerpColors;
               colorCount = 2;
               blockMode = 3;
               
            } else nonlerpable++;
            */
         }
         else if (colorCount == 3)
         {
            int curErrVal = INT_MAX;
            int curErrIndex = -1;
            for( int i = 0; i < pal3Count; i++ )
            {
               int palErr = colorDiff( pal[ pal3C[ i ]], colorLerp( pal[ pal3A[ i ]], pal[ pal3B[ i ]], 0.5 ));
               if( palErr < curErrVal )
               {
                  curErrVal = palErr;
                  curErrIndex = i;
               }               
            }
            if(( curErrIndex >= 0 ) && ( curErrVal <= ( 2 * colorErr )))
            {
               lerpable++;
               lerpColors[0] = pal[ pal3A[ curErrIndex ]];
               lerpColors[1] = pal[ pal3B[ curErrIndex ]];
               if( pal3A[ curErrIndex ] != 0 )
               {
                  BYTE a = 0;
                  BYTE b = pal3A[ curErrIndex ];
                  FreeImage_SwapColors(tileQuant.getFIBitmap(), &pal[a], &pal[b], true);
                  FreeImage_SwapPaletteIndices(tileQuant.getFIBitmap(), &a, &b);
               }
               if( pal3B[ curErrIndex ] != 1 )
               {
                  BYTE a = 1;
                  BYTE b = pal3B[ curErrIndex ];
                  FreeImage_SwapColors(tileQuant.getFIBitmap(), &pal[a], &pal[b], true);
                  FreeImage_SwapPaletteIndices(tileQuant.getFIBitmap(), &a, &b);
               }
               pal = lerpColors;
               colorCount = colorPadCount = 2;
               blockMode = 1;
            }
            else
            {
               colorPadCount = 4; // round up, colors are allocated two by two
               nonlerpable++;
            }
            /*
            if (colorEqual(pal[1], colorLerp(pal[0], pal[2], 0.5), 2*colorErr))
            {
               lerpable++;
               lerpColors[0] = pal[0];
               lerpColors[1] = pal[2];
               
               BYTE a = 1;
               BYTE b = 2;
               FreeImage_SwapColors(tileQuant.getFIBitmap(), &pal[a], &pal[b], true);
               FreeImage_SwapPaletteIndices(tileQuant.getFIBitmap(), &a, &b);
               
               pal = lerpColors;
               colorCount = colorPadCount = 2;
               blockMode = 1;
            }
            else
            {
               colorPadCount = 4; // round up, colors are allocated two by two
               nonlerpable++;
            }
            */
         }
         else if (colorCount == 1)
         {
            colorPadCount = 2; // round up, colors are allocated two by two
         }


Like I said before, it helps with block reduction, but the result appears grainy. Maybe someone can look at it, and see if any improvements can be made.
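
One thing that may be worth checking, offered as an untested guess rather than a diagnosis: the two swaps above depend only on pal4A and pal4B, so combinations such as { 0, 3, 1, 2 } and { 0, 3, 2, 1 } leave the two in-between colors at the same indices even though they call for opposite assignments. In blend mode 3 the hardware, as commonly documented, derives index 2 as (5*c0 + 3*c1)/8 and index 3 as (3*c0 + 5*c1)/8, so for one combination in each such pair the pixels using indices 2 and 3 would seem to come out exchanged. The sketch below builds a full index remap instead; `buildRemap` and its parameter names are hypothetical, not part of the tool or of FreeImage.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: given the old palette indices of the two endpoints
// (a, b), of the in-between color nearer to a (nearA) and of the one nearer
// to b (nearB), return remap[oldIndex] = newIndex such that the endpoints
// land at 0 and 1 and the in-between colors at 2 and 3, in that order.
// The four arguments must be a permutation of 0..3.
std::array<int, 4> buildRemap(int a, int b, int nearA, int nearB) {
    std::array<int, 4> remap{};
    remap[a] = 0;
    remap[b] = 1;
    remap[nearA] = 2;  // assumed: index 2 is the blend weighted toward color 0
    remap[nearB] = 3;  // assumed: index 3 is the blend weighted toward color 1
    return remap;
}
```

The caller would decide which of pal4C and pal4D is the blend nearer the first endpoint, which depends on how colorLerp weights its arguments, and then rewrite each pixel index through the table.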

#168252 - nanou - Fri Apr 17, 2009 1:43 am

This kind of compression must be intended for sky textures. If you think about it, they're mostly made up of gradients, and most gradients are going to fit nicely within 4 colors for a 4x4 area (no matter which angle they're at), and many of the 4x4 blocks will work well with the blended modes.
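
For reference, here is a rough sketch of how the four colors of a block come out in each mode, following the commonly documented description of the format. `Rgb5`, `blend` and `blockColors` are made-up names, channels are 5-bit values, and plain integer truncation is an assumption; real hardware may round differently.

```cpp
#include <array>
#include <cassert>

// One 5-bit-per-channel color, plus a flag for the transparent entry.
struct Rgb5 { int r, g, b; bool transparent; };

// Weighted per-channel blend: (x*wx + y*wy) / div, truncated.
static Rgb5 blend(const Rgb5& x, int wx, const Rgb5& y, int wy, int div) {
    return { (x.r * wx + y.r * wy) / div,
             (x.g * wx + y.g * wy) / div,
             (x.b * wx + y.b * wy) / div,
             false };
}

// Returns the four colors a block can use, given its mode (0..3) and the
// palette entries it points at. Modes 1 and 3 only read pal[0] and pal[1].
std::array<Rgb5, 4> blockColors(int mode, const std::array<Rgb5, 4>& pal) {
    const Rgb5 clear{0, 0, 0, true};
    switch (mode) {
        case 0:  // three palette colors, fourth is transparent
            return {{ pal[0], pal[1], pal[2], clear }};
        case 1:  // two palette colors, their midpoint, transparent
            return {{ pal[0], pal[1], blend(pal[0], 1, pal[1], 1, 2), clear }};
        case 2:  // four palette colors
            return {{ pal[0], pal[1], pal[2], pal[3] }};
        default: // mode 3: two palette colors plus 5:3 and 3:5 blends
            return {{ pal[0], pal[1],
                      blend(pal[0], 5, pal[1], 3, 8),
                      blend(pal[0], 3, pal[1], 5, 8) }};
    }
}
```

A smooth gradient across a 4x4 block tends to need only two endpoints plus in-between shades, which is what modes 1 and 3 provide while storing just two palette entries.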
_________________
- nanou

#168262 - kusma - Fri Apr 17, 2009 1:02 pm

Actually, they work pretty well for most kinds of textures.

#168275 - nanou - Sat Apr 18, 2009 2:23 am

kusma wrote:
Actually, they work pretty well for most kinds of textures.

Oh I'm sure you will get good results with other textures, but I'm thinking that most sky textures are going to turn out much closer to the original since they're likely to have few if any edges that aren't already smoothly graded.