#177836 - WriteASM - Sun Mar 31, 2013 12:58 pm
I'm trying to compress a plain-text "helpfile" using VRAM-compatible LZ77 compression, but am only getting a 32% compression rate. If I try VRAM-compatible LZ77 on a 189KB ASM source file, I get a 62% compression rate. (Windows ZIP gives 69% on that, and 47% on the former.)
I am using a few non-ASCII characters for indexes and such, but the helpfile is generally plain ASCII (apart from CRLF = $80). Plain text usually gives around 50% compression rate, but do a few non-ASCII characters throw it off that far?
Does anyone have some pointers as to what makes easy-to-compress text data?
Or should I be using apLib or PuCrunch?
#177837 - Bregalad - Sun Mar 31, 2013 10:10 pm
I think I have to point you to my CompressTools.
#177838 - WriteASM - Mon Apr 01, 2013 12:31 pm
Thanks; I'll give it a try. It does look like I need a Java console to use the tools.
Also, my bASMic IDE project for the GBA is nearing completion, and I would like to post it here in the near future especially for beta testing and/or debugging purposes (before I forget my way around 650KB of source code!) Just how/where do I post a ~45KB multiboot binary?
#177839 - Dwedit - Mon Apr 01, 2013 1:41 pm
If you're looking to post files, there are the public sites like mediafire and dropbox, but also several message boards allow attachments. For example, the NESDev board allows attachments, and my message board also does. Nesdev has an "other retro dev" board, so your posts wouldn't be offtopic there, and I'll just let anyone post files on my board.
As for compression, I was getting pretty good results with aplib. I also wrote my own APlib decompressor in ASM, and it ran 4 times as fast as the C version. Aplib is not "vram safe" in that it doesn't do 16-bit writes, but that's not really much of a problem.
Aplib isn't optimized for text though, since it's just for 8-bit data, and ASCII text tends to be 7-bit or 6-bit. I guess you can compare Deflate (Huffman + LZ, used in zip files) to Aplib, and see what wins.
APlib compresses almost as well as Zip, though not quite, and it decompresses very quickly.
I don't think anyone has ported LZMA (7-zip) to the GBA yet, but that seems to be a holy grail of decompression.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
#177840 - WriteASM - Tue Apr 02, 2013 12:48 pm
I don't think anyone has ported 7-Zip to the GBA because it's quite memory hungry. According to the Windows 7-zip encoder, at lowest settings, it would take 3MB of memory to decompress. (Default settings require 18MB for decompression.) You'd need pretty good compression to fit that in 256K of EXTRAM :-)
Thanks for the compression tips. I'll find out which one works best.
By the way, Dwedit, I was quite fascinated by the GBC ROM dump link on your message board. Interesting.
Of course, the GBA's BIOS has already been hacked via (what else?) BIOS calls :-)
#177848 - WriteASM - Sun Apr 07, 2013 12:50 pm
I will probably end up using ApLib. It gives a 44% compression ratio (where 0%=no compression, 100%=no file out), versus LZ77 at (now) 33%. 7-Zip gives just over 50%.
From what I read, ApLib is a "pure LZ77 compressor". Can the BIOS LZ77 decompress function handle these files? If not, and your ASM decompressor is for the ARM7TDMI, I'd appreciate a copy.
#177849 - Dwedit - Sun Apr 07, 2013 1:32 pm
My aplib decompression code is found in the PocketNES source code as "apack.s". If you are integrating this into your code, you might need to take out a few macros (like b_long; replace it with ldr pc,=xxxx), but the Aplib decompression code should stay in iwram for speed. The code is formatted for the GNU Assembler.
A corresponding compressor can be found in NesPack under the name "apack.exe"
Aplib is not the standard LZ77 format that the BIOS handles, but it's called "Pure LZ77" because it does not use any additional entropy compression or bit packing other than ways of encoding the distances and lengths. The neatest part of APLIB is how it processes whole bytes: it leaves them whole. Whole bytes are read from the data stream, and bits are read from a variable; when it runs out of bits, it fetches a new byte from the data stream. Using whole bytes is much faster than reading 8 unaligned bits from the data stream, like Pucrunch does.
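To illustrate the "bits from a variable, whole bytes from the stream" idea described above, here is a minimal Python sketch. It uses a made-up toy format, not the real aPLib stream layout, and the names BitByteReader and decode_toy are hypothetical. In the toy format a control bit of 1 means "a literal byte follows" and 0 means "repeat the previous output byte".

```python
class BitByteReader:
    """Reads control bits from a buffered tag byte and data as whole bytes."""

    def __init__(self, data):
        self.data = data
        self.pos = 0        # index of the next unread byte in the stream
        self.tag = 0        # current tag byte, shifted left as bits are used
        self.bits_left = 0  # unread bits remaining in self.tag

    def read_byte(self):
        # Data bytes are always fetched whole, so they stay byte-aligned.
        value = self.data[self.pos]
        self.pos += 1
        return value

    def read_bit(self):
        # Refill the bit variable from the stream only when it runs dry.
        if self.bits_left == 0:
            self.tag = self.read_byte()
            self.bits_left = 8
        bit = (self.tag >> 7) & 1          # take the most significant bit
        self.tag = (self.tag << 1) & 0xFF
        self.bits_left -= 1
        return bit


def decode_toy(stream, out_len):
    """Decode the toy format (assumes the first control bit is 1)."""
    reader = BitByteReader(stream)
    out = bytearray()
    while len(out) < out_len:
        if reader.read_bit():
            out.append(reader.read_byte())  # literal: copy one whole byte
        else:
            out.append(out[-1])             # repeat the previous output byte
    return bytes(out)


if __name__ == "__main__":
    # Control bits 1,1,0,0,1 packed MSB-first into the tag byte 0xC8.
    print(decode_toy(bytes([0xC8]) + b"ABC", 5))
```

Because literals never straddle a byte boundary, the inner loop needs no shifting or masking for data bytes, only for the occasional control bit.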
Aplib is not random access; the entire data must be decompressed into memory. If you want random access, split the file into blocks first.
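A rough sketch of the block-splitting idea, in Python. Python's zlib stands in for the compressor here purely as an assumption for illustration (it is not Aplib), and BLOCK_SIZE, compress_blocks and read_at are hypothetical names.

```python
import zlib

BLOCK_SIZE = 1024  # hypothetical block size; pick one that fits your RAM


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size block independently of the others."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def read_at(blocks, offset, length, block_size=BLOCK_SIZE):
    """Return `length` bytes starting at `offset`, decompressing only
    the blocks that overlap the requested range."""
    out = bytearray()
    index = offset // block_size   # first block that holds the offset
    skip = offset % block_size     # bytes to skip inside that block
    while length > 0 and index < len(blocks):
        block = zlib.decompress(blocks[index])
        piece = block[skip:skip + length]
        out += piece
        length -= len(piece)
        index += 1
        skip = 0                   # later blocks are read from their start
    return bytes(out)


if __name__ == "__main__":
    data = bytes(range(256)) * 20
    blocks = compress_blocks(data)
    print(read_at(blocks, 1000, 100) == data[1000:1100])
```

The trade-off is that each block is compressed without seeing the others, so smaller blocks mean cheaper random access but a worse overall ratio.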
#177852 - Bregalad - Sun Apr 07, 2013 2:10 pm
For English text I usually get >50% savings with recursive BPE.
#177853 - WriteASM - Mon Apr 08, 2013 3:07 pm
Bregalad: Can you tell me how to use the ".class" files in your CompressTools? I've installed Java Runtime Environment 7.0.110.21, and get a baker's dozen of errors when I point it to the CompressTools.
7-Zip actually only gives 49% (15,852 bytes -> 8,108 bytes), so I'm not expecting more than that. This is MOSTLY plain text, but has binary characters sprinkled throughout.
Dwedit: Thanks for the link. Looks like a nice and very simple routine.
_________________
"Finally, brethren, whatever is true, whatever is honorable, whatever is right, whatever is pure, whatever is lovely, whatever is of good repute, if there is any excellence and if anything worthy of praise, dwell on these things." (Philippians 4:8)
#177858 - Bregalad - Tue Apr 09, 2013 2:31 pm
Yeah, I think most of the problems arise because the class is not in the default package, and if you don't call java exactly the way it expects you to, it won't find the class (I know it's dumb).
You are supposed to enter the folder bin, but not the package folder (I don't remember its name, but it's something like compressTools).
Then you should type
java compressTools.Main
You will get some instructions on how to use the compressor if the Main class is found.
I think I'll remove the package in the next release so that it will be easier to run the program in command line (hopefully).
#177864 - WriteASM - Thu Apr 11, 2013 12:25 pm
Quote: |
c:\bin>"C:\Program Files (x86)\Java\jre7\bin\java" compresstools.main
Exception in thread "main" java.lang.NoClassDefFoundError: compresstools/main (wrong name: compressTools/Main)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$100(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source) |
That's what happens. If JRE can't find CompressTools, I get only one error.
Quote: |
For English text I usually get >50% savings with recursive BPE. |
I'm assuming that's "recursive byte pair", and not a combination of several compression methods. (I'm looking for a small decompression routine: Aplib certainly is that, but I'm trying to evaluate all the available options.)
P.S. Hoping to release bASMic IDE for the GBA pretty soon, after I get code interrupts working...or not :-/
#177865 - Bregalad - Fri Apr 12, 2013 9:10 pm
Quote: |
That's what happens. If JRE can't find CompressTools, I get only one error. |
Java is case sensitive.
If you tell it to find a package which is named compresstools instead of compressTools it won't find it.
Similarly if you tell it to find a class which is named main when it's called Main, it won't find it.
Quote: |
I'm assuming that's "recursive byte pair", and not a combination of several compression methods. |
Yeah. It is extremely powerful, and the decompression routine is damn small.
#177866 - WriteASM - Fri Apr 12, 2013 11:15 pm
OK, that works. Interestingly, if I misspell the package name as "compressTools.Man" (or similar), there's only one error: can't find the specified package. I can only wonder what programming in Java must be like...
Here's how the compression routine contenders stack up so far:
-Uncompressed- (16,537 bytes)
Recursive BytePair (11,843 bytes)
VRAM-safe LZ77 (10,904 bytes)
Aplib (9,372 bytes)
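For reference, these sizes can be turned into the savings percentage used earlier in the thread (0% = no compression, 100% = no file out). This is just a small Python sketch of the arithmetic; the function and variable names are made up for illustration.

```python
# Sizes in bytes, copied from the list above.
UNCOMPRESSED = 16_537
RESULTS = {
    "Recursive BytePair": 11_843,
    "VRAM-safe LZ77": 10_904,
    "Aplib": 9_372,
}


def savings_percent(original, compressed):
    """Space saved in percent: 0 = no compression, 100 = empty output."""
    return 100.0 * (1.0 - compressed / original)


def report():
    """Map each method name to its savings, rounded to one decimal."""
    return {name: round(savings_percent(UNCOMPRESSED, size), 1)
            for name, size in RESULTS.items()}


if __name__ == "__main__":
    for name, pct in report().items():
        print(f"{name:20s} {pct:5.1f}%")
```

By this measure Aplib saves roughly 43% on this file, LZ77 roughly 34%, and recursive BytePair roughly 28%.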
Let's say that I have some compression algorithm choices...
I'll note that although the input file is MOSTLY plain text, it has a fair number of binary characters.
And I did get code interrupts working. Permitting nested interrupts on the GBA requires that much more context be saved.
#177867 - Bregalad - Sat Apr 13, 2013 9:54 am
Quote: |
I can only wonder what programming in Java must be like... |
The main reason I chose Java rather than, say, C, is that it is much easier to debug, and Java throws an exception every time you try to access an array out of bounds, instead of producing weird bugs or a segfault.
Not to mention it is easier to read/write to/from files on the hard disk, and there is a very nice library for things like maintaining lists, sorting, etc., which I use a couple of times in CompressTools. In C you'd have to download extra libraries for that.
Back to the original topic, you should probably use APlib since it's the one that compresses best. If there is enough RAM on the GBA for decompression, that is.
I'd be interested in how it compresses data, by the way ^^
If you can't spare any RAM for decompression, (Recursive) BytePair is for you; only two small lookup tables are required.
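A minimal Python sketch of how recursive byte-pair coding with two lookup tables can work. This is a generic illustration under my own assumptions, not the actual CompressTools format: the names bpe_compress and bpe_decompress, and the convention that left[c] == c marks a literal byte, are hypothetical.

```python
from collections import Counter


def bpe_compress(data):
    """Greedy byte-pair encoding. Returns (packed, left, right)."""
    left = list(range(256))   # left[c] == c marks c as a literal byte
    right = [0] * 256
    buf = list(data)
    used = set(buf)
    free = [c for c in range(256) if c not in used]  # unused byte values
    while free:
        pairs = Counter(zip(buf, buf[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break                      # no pair repeats; nothing to gain
        code = free.pop()
        left[code], right[code] = a, b
        out, i = [], 0
        while i < len(buf):            # replace every (a, b) with code
            if i + 1 < len(buf) and buf[i] == a and buf[i + 1] == b:
                out.append(code)
                i += 2
            else:
                out.append(buf[i])
                i += 1
        buf = out
    return bytes(buf), left, right


def bpe_decompress(packed, left, right):
    """Expand codes through the two tables, using a small stack."""
    out = bytearray()
    stack = []
    for byte in packed:
        stack.append(byte)
        while stack:
            c = stack.pop()
            if left[c] == c:
                out.append(c)           # literal byte
            else:
                stack.append(right[c])  # pushed first, expanded second
                stack.append(left[c])
    return bytes(out)


if __name__ == "__main__":
    text = b"the cat sat on the mat, the rat sat on the cat"
    packed, left, right = bpe_compress(text)
    print(len(text), "->", len(packed), bpe_decompress(packed, left, right) == text)
```

Note that the sketch ignores the space taken by the tables themselves, and the decoder still needs a few bytes of stack for nested pairs, so "no RAM" here really means no output history buffer.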
#177868 - WriteASM - Sat Apr 13, 2013 1:16 pm
Quote: |
If there is enough RAM on the GBA for decompression. |
Aplib only requires 8 CPU registers (including pointers to the input data and output location) and about 100 lines of ASM code. Truly amazing!
I did modify Aplib's decompression library slightly to permit decompressing to VRAM (for lack of anywhere else), as well as running the main loop in EXTRAM.
If you're interested in adding Aplib to your CompressTools (whoops, I mean "compressTools" :-)), ask Dwedit.
#177870 - Dwedit - Sat Apr 13, 2013 5:38 pm
By the way, I wrote the decompression code; it was ported from the original C code. The compression algorithm and code aren't mine; I'm just an enthusiastic user who loves this simple algorithm and wanted it to decompress faster. I think I might have slightly changed the example compression program to not use the "safe" version of apack (which excludes address and size bytes from the output).
Compression code is from http://ibsensoftware.com/products_aPLib.html
#177872 - WriteASM - Mon Apr 15, 2013 2:00 pm
OK, that makes sense, as the command-line compression tool acknowledges Joergen Ibsen as the programmer. At any rate, you did a very nice job with the ASM decompression routine.
BTW, I have posted my bASMic IDE on the "Announcements And Comments" forum here, and you are all welcome to check it out.