gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies.

ASM > Can anyone help me porting x86 asm?

#46663 - Locke - Tue Jun 28, 2005 9:59 pm

I'm trying to port a JPEG decoder to the Nintendo DS, but I found some asm functions in the original source and I don't know how to handle them.

I don't want to start learning x86 asm just to translate 20 lines, so if any of you can write these 3 functions in plain C I will be very grateful (of course I will give credit for the work in the readme).

Code:
#ifdef _MSC_VER
WORD lookKbits(BYTE k)
{
 _asm {
    mov dl, k
    mov cl, 16
    sub cl, dl
    mov eax, [wordval]
    shr eax, cl
 }
}

WORD WORD_hi_lo(BYTE byte_high,BYTE byte_low)
{
   _asm {
        mov ah,byte_high
        mov al,byte_low
       }
}
SWORD get_svalue(BYTE k)
// k>0 always
// Takes k bits out of the BIT stream (wordval), and makes them a signed value
{
   _asm {
         xor ecx, ecx
         mov cl,k
         mov eax,[wordval]
         shl eax,cl
         shr eax, 16
         dec cl
         bt eax,ecx
         jc end_macro
   signed_value:inc cl
         mov ebx,[start_neg_pow2]
         add ax,word ptr [ebx+ecx*2]
       end_macro:
   }
}
#endif


Where BYTE = unsigned char; WORD = unsigned short int; SWORD = signed short int; DWORD = unsigned int; wordval = DWORD; start_neg_pow2 = DWORD;


Thanks for your time.

#46665 - sajiimori - Tue Jun 28, 2005 10:14 pm

It's hard to say what the C equivalent would be because the preconditions and postconditions aren't clear. It's possible that there is no C code that does exactly what is needed of these routines because in C you can't control the exact contents of the registers.

For example, in WORD_hi_lo, I don't know if the upper 16 bits of eax are presumed to be clear on entry, or if they are meant to be retained. In the latter case, a C version cannot be written.

x86 is a pain. :/

#46666 - gladius - Tue Jun 28, 2005 10:15 pm

No guarantees, as my x86 asm is a bit rusty, but I think this is correct.
Code:

WORD lookKbits(BYTE k)
{
 return wordval >> (16 - k);
}

WORD WORD_hi_lo(BYTE byte_high,BYTE byte_low)
{
 return ((WORD)byte_high << 8) | byte_low;
}

// k>0 always
// Takes k bits out of the BIT stream (wordval), and makes them a signed value
SWORD get_svalue(BYTE k)
{
 DWORD tmp = ((DWORD)wordval << k) >> 16;
 if (tmp & (1 << (k - 1))) tmp += start_neg_pow2[k];
 return (SWORD)tmp;
}


sajiimori: I think the intent is clear at least. I can't even remember if the top 16 bits are guaranteed to be preserved when there is a return value in ax though.
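
A quick way to sanity-check the first two translations is to compile them on their own and feed them known values. The sketch below is only that: it assumes wordval keeps the current 16-bit window of the bit stream in its low half (which the shift by 16 - k suggests, but the original post does not state). Because the return type is WORD, the result is truncated to 16 bits, so whatever sits in the upper half of the 32-bit intermediate never reaches the caller of the C version.

```c
typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned int   DWORD;

/* Assumption: the low 16 bits of wordval are the current bit-stream window. */
DWORD wordval;

/* Returns the top k bits of the 16-bit window without consuming them. */
WORD lookKbits(BYTE k)
{
    return (WORD)(wordval >> (16 - k));
}

/* Builds a 16-bit word from a high byte and a low byte (ah:al in the asm). */
WORD WORD_hi_lo(BYTE byte_high, BYTE byte_low)
{
    return (WORD)(((WORD)byte_high << 8) | byte_low);
}
```

With wordval set to 0xA5C3, lookKbits(4) gives 0xA and lookKbits(8) gives 0xA5; WORD_hi_lo(0x12, 0x34) gives 0x1234.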


Last edited by gladius on Tue Jun 28, 2005 10:20 pm; edited 1 time in total

#46667 - headspin - Tue Jun 28, 2005 10:19 pm

Why re-invent the wheel? There is a working jpeg decoder at http://headkaze.webpal.info/ for the DS that should be more than sufficient for any jpeg decoding needs (It's very fast running from IWRAM).
_________________
Warhawk DS | Manic Miner: The Lost Levels | The Detective Game

#46675 - Locke - Tue Jun 28, 2005 11:23 pm

Thanks, I will try both decoders and use whichever is most suitable for my purposes.

#46716 - Locke - Wed Jun 29, 2005 11:57 am

headspin, your jpeg decoder doesn't work when loading an image wider than 256 pixels. I get black lines interleaved with the image data... I'm decoding the full jpeg into main memory and then copying the focused region to VRAM (using framebuffer). Did you know about this issue?


gladius, there seems to be a problem with your code. I get an error on this line:

Code:
if (tmp & (1 << (k - 1))) tmp += start_neg_pow2[k];


which I fix like this:

Code:
if (tmp & (1 << (k - 1))) tmp += ((SWORD *)start_neg_pow2)[k];


But I'm getting weird results...


I tried to understand the asm code by myself, but there are some lines I can't figure out:

Why increment and decrement cl (k) if it is not going to be used again?
Does bt eax,ecx test eax's bit 0?
add ax,word ptr [ebx+ecx*2] <- ecx is set to 0 at the beginning...
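
One way to work through these three questions is to transcribe the asm into C one instruction at a time. The sketch below does that and is not a tested drop-in replacement. It assumes start_neg_pow2 holds the address of a table of 16-bit signed values (the asm loads it into ebx and indexes it with a scale of 2), so it is modeled here as a real pointer rather than a DWORD. Reading it this way: ecx is not 0 at the add, because mov cl,k writes its low byte; bt eax,ecx copies bit number ecx (k - 1 after the dec) of eax into the carry flag, not bit 0; and cl is incremented back to k so that it indexes table entry k. Note that jc skips the add when the bit is set, so the table value is added only when bit k - 1 is clear, which is the opposite of the condition in the C line quoted above.

```c
typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef signed short   SWORD;
typedef unsigned int   DWORD;

DWORD wordval;
/* Assumption: a pointer to the table, instead of a DWORD holding its address. */
const SWORD *start_neg_pow2;

SWORD get_svalue(BYTE k)                      /* k > 0 always */
{
    DWORD ecx = k;                            /* xor ecx,ecx / mov cl,k        */
    DWORD eax = (wordval << ecx) >> 16;       /* mov eax,[wordval] / shl / shr */
    ecx -= 1;                                 /* dec cl                        */
    if (((eax >> ecx) & 1u) == 0) {           /* bt eax,ecx / jc end_macro     */
        ecx += 1;                             /* inc cl: back to k             */
        eax += (WORD)start_neg_pow2[ecx];     /* add ax,word ptr [ebx+ecx*2]   */
    }
    /* Only ax is returned; the narrowing cast relies on two's complement. */
    return (SWORD)(WORD)eax;
}
```

With a hypothetical table whose entry 4 is -15, a 4-bit field of 1010 stays 10, while 0101 becomes 5 - 15 = -10.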

#46719 - headspin - Wed Jun 29, 2005 2:24 pm

Locke wrote:
headspin, your jpeg decoder doesn't work when loading an image wider than 256 pixels. I get black lines interleaved with the image data... I'm decoding the full jpeg into main memory and then copying the focused region to VRAM (using framebuffer). Did you know about this issue?


It's not my decoder, it's by Burton Radons. But anyway, check out the following lines in gba-jpeg-decode.cpp..

Code:
JPEG_Convert (row [256], YBlock [2 * JPEG_DCTSIZE + 0], Cb, Cr); // 240
JPEG_Convert (row [257], YBlock [2 * JPEG_DCTSIZE + 1], Cb, Cr); // 241


Change those values to suit the width of your image. Let me know how you go..

#46728 - Lupin - Wed Jun 29, 2005 6:18 pm

Code:
SWORD get_svalue(BYTE k)
// k>0 always
// Takes k bits out of the BIT stream (wordval), and makes them a signed value
{
   _asm {
         xor ecx, ecx                 // ecx = 0
         mov cl,k                     // lower byte of ecx (cl) = k
         mov eax,[wordval]            // load wordval into eax
         shl eax,cl                   // shift eax<<cl
         shr eax, 16                  // shift eax>>16
         dec cl                       // decrement lower byte of ecx (cl)
         bt eax,ecx                   // what is this instruction doing? o_o
         jc end_macro                 // jump to end_macro if carry is set
   signed_value:inc cl                // increment lower byte of ecx (cl)
         mov ebx,[start_neg_pow2]     // load start_neg_pow2 into ebx
         add ax,word ptr [ebx+ecx*2]  // ax = ax + 16-bit word at address ebx + ecx*2
       end_macro:
   }
}

You should at least understand the basic register layout of an x86 CPU... ecx is a 32-bit register, cx is the lower 16 bits of ecx, ch is the high byte of cx and cl is the low byte of cx. I am not sure about the bt instruction, but I am too lazy to look it up now :)
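
The register layout described above can be modeled in C with masks on a 32-bit value. This is only an illustration: the helper names are hypothetical and appear nowhere in the decoder. It shows why xor ecx,ecx followed by mov cl,k leaves ecx equal to k, and why writing ah and al (as WORD_hi_lo does) leaves the upper 16 bits of eax untouched.

```c
typedef unsigned char  BYTE;
typedef unsigned short WORD;
typedef unsigned int   DWORD;

/* Hypothetical helper: models "mov cl, v" or "mov al, v" (bits 0-7 only). */
DWORD set_low_byte(DWORD reg, BYTE v)
{
    return (reg & 0xFFFFFF00u) | v;
}

/* Hypothetical helper: models "mov ch, v" or "mov ah, v" (bits 8-15 only). */
DWORD set_high_byte(DWORD reg, BYTE v)
{
    return (reg & 0xFFFF00FFu) | ((DWORD)v << 8);
}

/* Hypothetical helper: models reading cx or ax (bits 0-15). */
WORD low_word(DWORD reg)
{
    return (WORD)(reg & 0xFFFFu);
}
```

Starting from 0xDEAD0000, setting the high byte to 0x12 and the low byte to 0x34 yields 0xDEAD1234, and low_word returns 0x1234.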
_________________
Team Pokeme
My blog and PM ASM tutorials

#46771 - Locke - Thu Jun 30, 2005 11:36 am

headspin wrote:

It's not my decoder, it's by Burton Radons. But anyway, check out the following lines in gba-jpeg-decode.cpp..

Code:
JPEG_Convert (row [256], YBlock [2 * JPEG_DCTSIZE + 0], Cb, Cr); // 240
JPEG_Convert (row [257], YBlock [2 * JPEG_DCTSIZE + 1], Cb, Cr); // 241


Change those values to suit the width of your image. Let me know how you go..


Yes, it did the trick. I've modified my file so it uses the width from JPEG_DecompressImage and now works fine with any image size.


Lupin, thanks for the info. Now I really understand the purpose of that function. (I think bt is bit test; it checks the sign bit.)
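
The bit test and the table add can also be written without the table. The sketch below assumes, without confirmation from the posted source, that entry k of the table behind start_neg_pow2 is -(2^k) + 1, as in the EXTEND procedure of the JPEG standard. The function name is hypothetical. In this encoding a set top bit marks a non-negative value, and a clear top bit marks a negative one.

```c
typedef unsigned char  BYTE;
typedef signed short   SWORD;
typedef unsigned int   DWORD;

/* Hypothetical table-free variant. `bits` carries a k-bit field in its low
   bits, with 1 <= k <= 15. Assumes table entry k would be -(2^k) + 1. */
SWORD extend_kbits(DWORD bits, BYTE k)
{
    int value = (int)(bits & ((1u << k) - 1u));   /* keep only the k-bit field */
    if ((value & (1 << (k - 1))) == 0) {          /* top bit clear: negative   */
        value += 1 - (1 << k);
    }
    return (SWORD)value;
}
```

For k = 4 this maps 1010 to 10 and 0101 to -10; for k = 1 it maps 0 to -1 and 1 to 1.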