gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies.

OffTopic > Any x86 assembly gurus around?

#38303 - sajiimori - Fri Mar 25, 2005 5:44 pm

Here's some functionality I implemented for ARM. I'd like to do the same thing on x86 but I don't really know that architecture very well.
Code:

struct SavedRegs
{
    int r4, r5, r6, r7, r8, r9, r10, r11, r13, r14, r15;
};

asm int exchange(
    int result,                    // r0
    SavedRegs* oldState,           // r1
    SavedRegs* newState)           // r2
{
    stmea r1, {r4-r11, r13-r14}    // Store regs into oldState, except pc.
    add r3, pc, #4                 // Store pc separately, because it must
                                   // be modified to point at bx lr.
    str r3, [r1, #40]              // pc is 40 bytes from start of struct.
    ldmfd r2, {r4-r11, r13-r15}    // Restore from newState
    bx lr                          // Later, exchanging for oldState puts
                                   // us back here.
}

The routine above swaps out sets of registers, excluding the ones that don't need to be saved between function calls (r0-r3, r12).
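The "pc is 40 bytes from start of struct" comment can be checked mechanically. The following is a minimal sketch, assuming 4-byte registers and no padding between fields; `int32_t` is used in place of `int` so the layout is the same on a non-ARM host, and the helper names `saved_pc_offset` and `saved_regs_size` are made up for illustration.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* int32_t */

/* Same field order as the ARM struct in the post above. int32_t stands in
   for the target's 32-bit int (an assumption made for host portability). */
typedef struct SavedRegs {
    int32_t r4, r5, r6, r7, r8, r9, r10, r11, r13, r14, r15;
} SavedRegs;

/* Byte offset of the saved pc (r15): ten 4-byte slots precede it. */
size_t saved_pc_offset(void) {
    return offsetof(SavedRegs, r15);
}

/* Total size of the save area: eleven 4-byte slots. */
size_t saved_regs_size(void) {
    return sizeof(SavedRegs);
}
```

If a register were ever added to or removed from the struct, the `#40` in the `str` instruction would have to change with it, which is the kind of drift a check like this is meant to catch.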

Swapping again with the old set should resume execution just after the first exchange was performed.

Just before the new registers are loaded, the register used for return values (r0 or eax) must contain "result".
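For readers who want to see the intended behaviour without any assembly, here is a rough sketch using the POSIX `ucontext` calls (`getcontext`, `makecontext`, `swapcontext`). This is a swapped-in technique, not the routine from this post: it only mirrors the semantics described above, namely that exchanging back resumes just after the earlier exchange, and that `result` becomes the return value of the opposing call. The names `worker`, `run_demo` and `handoff` are hypothetical, and the code assumes a POSIX system (it will not build with VC6).

```c
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* separate stack for the worker */
static volatile int handoff;           /* stands in for r0/eax during the swap */

/* Save the current context in old_state, resume new_state, and hand
   `result` to whoever is resumed. Returns the value the other side passed
   when it later exchanges back. */
static int exchange(int result, ucontext_t *old_state, ucontext_t *new_state) {
    handoff = result;
    swapcontext(old_state, new_state);
    return handoff;                    /* execution resumes here after swapping back */
}

/* Worker: doubles each value it receives and sends it back. */
static void worker(void) {
    int got = handoff;                 /* value passed by the exchange that started us */
    for (;;) {
        got = exchange(got * 2, &worker_ctx, &main_ctx);
    }
}

/* Sends a, then b, to the worker; stores the two replies in out[0], out[1]. */
void run_demo(int a, int b, int out[2]) {
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;    /* unused here: worker never returns */
    makecontext(&worker_ctx, worker, 0);

    out[0] = exchange(a, &main_ctx, &worker_ctx);  /* starts the worker */
    out[1] = exchange(b, &main_ctx, &worker_ctx);  /* resumes it inside its loop */
}
```

The second call to `exchange` does not restart `worker`; it continues from the point where the worker last swapped out, which is the property the ARM routine relies on.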

#38336 - notb4dinner - Sat Mar 26, 2005 12:00 pm

I can't really help with the x86 ASM but I'm just chiming in to say it's interesting to see someone else doing multithreaded stuff on the GBA - I wrote some code that was basically the start of a GBA RTOS for a class I took a while back. If you'd like I can probably get some sample code from a classmate who did the same project on a PC...

#39089 - keldon - Mon Apr 04, 2005 1:29 pm

The best x86 assembler/IDE + community known to man
I recommend the RosAsm assembler for anyone doing x86 coding.

if you're doing this on the x86, wouldn't you also want to keep the local stack variables too?

Anyway, here is some code for loading/saving a state the way you wanted. There are also separate routines for loading and saving a state. "RSLT" was used instead because the register can only store 4 bytes.

Code:

; To help preventing from stack sizes' mistakes (upper "SizeOf#x" fall here):

[SizeOf1 4     SizeOf2 8     SizeOf3 12    SizeOf4 16    SizeOf5 20
 SizeOf6 24    SizeOf7 28    SizeOf8 32    SizeOf9 36    SizeOf10 40]


; your two states
[State0: D$ ? #8 ]
[State1: D$ ? #8 ]

Main:
    ; state 0
    mov ebx 2
    mov ecx 3
    mov edx 4
    mov esi 5
    mov edi 6
    mov ebp 7
   
    push State0
    call SaveRegs
   
    ; state 1
    mov ebx 7
    mov ecx 6
    mov edx 5
    mov esi 4
    mov edi 3
    mov ebp 2
   
   
    push State0
    push State1
    call ExchangeState
   
    push &NULL
    call 'KERNEL32.ExitProcess'

; *************************************************
; param 1 - old state Address
; param 2 - new state
ExchangeState:

    ; **** SaveRegs code
    mov eax D$esp   ; eax = PC
    pushad
   
    mov ebp esp |   add ebp 32  ; ebp = esp at time of call
   
    mov edi D$esp+4             ; dst = state Address
    add edi 32
   
    ; load stack into StateAddress
    mov ecx 8
    L0:
        sub edi 4
        pop D$edi
    loop L0<                    ; sub ecx and jump to L0 if nonzero
   
   
    ; ***** loadRegs code

    ; load stateAddress into stack
    mov esi D$esp+8
    mov ecx 8
    L0:
        push D$esi
        add esi 4
    loop L0<
   
    popad
   
    mov D$esp eax       ; restore return address
    mov eax 'RSLT'
   
    ret SizeOf2
   
   
   
; *************************************************
; param 1 - StateAddress
SaveRegs:
    mov eax D$esp   ; eax = PC
    pushad
   
    mov ebp esp |   add ebp 32  ; ebp = esp at time of call
   
    mov edi D$ebp+4             ; dst = state Address
    add edi 32
   
    ; load stack into StateAddress
    mov ecx 8
    L0:
        sub edi 4
        pop D$edi
    loop L0<                    ; sub ecx and jump to L0 if nonzero
   
    ret SizeOf1                 ; return and pop 1 parameter
   

; *************************************************
; param 1 - state Address
LoadRegs:

    ; load stateAddress into stack
    mov esi D$esp+4
    mov ecx 8
    L0:
        push D$esi
        add esi 4
    loop L0<
   
    popad
   
    mov D$esp eax       ; restore return address
    mov eax 'RSLT'
   
    ret SizeOf1


Now, the save method uses the PC of the method that calls it, so if you are implementing this from an interrupt, the following code is necessary:
Code:

InterruptMethod:
   mov eax D$esp   ; eax = PC of interrupted method
   push newState
   push oldState
   push eax        ; PC goes where ExchangeState expects its return address
   mov eax 'RSLT'
   jmp ExchangeState


P.S. How are you implementing this on x86? For DOS? Also, I used the name ExchangeState because there was already a macro called Exchange.

#39128 - sajiimori - Mon Apr 04, 2005 10:15 pm

Wow, that's pretty intimidating stuff, especially since I'm not familiar with that particular x86 assembler dialect.

ExchangeState is the pertinent part, right? As far as I know, I shouldn't need additional save and load routines, because I'm never doing a save without a load or vice versa.

I actually don't know what RSLT is... I thought eax was used for return values. o_o

It's ok if return values are restricted to 4 bytes -- the important thing is to provide the same functionality as the ARM version, which requires that ExchangeState takes an argument that will become the return value of the opposing call to exchange.
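One plausible reading of `mov eax 'RSLT'` is that 'RSLT' is just a four-character placeholder for the result value, chosen because four ASCII characters fill a 32-bit register exactly. That reading is an assumption about the posted code, not something the assembler's documentation was checked for. A small sketch of the idea, with hypothetical helper names:

```c
#include <stdint.h>
#include <string.h>

/* Pack four characters into one 32-bit value, in memory order.
   The numeric value depends on the host's byte order. */
uint32_t pack4(const char *tag) {
    uint32_t v;
    memcpy(&v, tag, 4);
    return v;
}

/* Unpack the 32-bit value back into a NUL-terminated 4-character string. */
void unpack4(uint32_t v, char out[5]) {
    memcpy(out, &v, 4);
    out[4] = '\0';
}

/* Returns 1 if a 4-character tag survives the trip through a 32-bit value. */
int round_trips(const char *tag) {
    char buf[5];
    unpack4(pack4(tag), buf);
    return strcmp(buf, tag) == 0;
}
```

Under that reading, a real implementation would load the caller's result argument into eax at that point rather than the literal.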

Ok, so...
pushad = push eax
D$ = double-word dereference

Why is PC immediately duplicated on the stack? After it's duplicated, isn't D$esp+4 still just PC, and not the state address?
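A note on the stack layout in question: on x86, `pushad` pushes all eight general-purpose registers (EAX, ECX, EDX, EBX, the pre-push ESP, EBP, ESI, EDI, in that order), not just eax. The toy model below is only a sketch with made-up marker values and hypothetical names; it replays "push state address, call, pushad" on a small array and reports which dword ends up at a given offset from esp.

```c
#include <stdint.h>

enum { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, NREGS };

/* Arbitrary marker values so each stack slot can be recognised. */
enum { MARK_ESI = 0x0E51, MARK_EDI = 0x0ED1, RET_ADDR = 0x1000, STATE_ADDR = 0x2000 };

typedef struct {
    uint32_t reg[NREGS];
    uint32_t mem[64];      /* toy stack; reg[ESP] is a byte offset into it */
} Machine;

static void push32(Machine *m, uint32_t v) {
    m->reg[ESP] -= 4;                   /* stack grows downward */
    m->mem[m->reg[ESP] / 4] = v;
}

/* Models x86 pushad: EAX, ECX, EDX, EBX, original ESP, EBP, ESI, EDI. */
static void pushad(Machine *m) {
    uint32_t esp_before = m->reg[ESP];
    push32(m, m->reg[EAX]);
    push32(m, m->reg[ECX]);
    push32(m, m->reg[EDX]);
    push32(m, m->reg[EBX]);
    push32(m, esp_before);
    push32(m, m->reg[EBP]);
    push32(m, m->reg[ESI]);
    push32(m, m->reg[EDI]);
}

/* Replays: push state address; call (pushes return address); pushad.
   Returns the dword found at [esp + offset] afterwards. */
uint32_t dword_after_pushad(uint32_t offset) {
    Machine m = {0};
    m.reg[ESP] = sizeof m.mem;          /* start with an empty stack */
    m.reg[ESI] = MARK_ESI;
    m.reg[EDI] = MARK_EDI;
    push32(&m, STATE_ADDR);             /* caller: push State0 */
    push32(&m, RET_ADDR);               /* call: return address */
    pushad(&m);
    return m.mem[(m.reg[ESP] + offset) / 4];
}
```

In this model `[esp+4]` after `pushad` holds the saved ESI, the return address sits at `[esp+32]`, and the state address at `[esp+36]`. That matches SaveRegs, which sets ebp to esp+32 and reads `D$ebp+4`, whereas ExchangeState reads `D$esp+4` at the same point.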
Quote:
if you're doing this on the x86, wouldn't you also want to keep the local stack variables too?
If the stack pointer is being moved to point at a completely different stack, then the locals should still be there when I exchange back.

Thanks for the reply -- I only hope you'll humor me as I stumble about this relatively mystical architecture. ^_^

Oh, and this will probably be used in GCC or VC6 or both.