gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

ASM > Atomic Compare-and-Swap operation on the ARM9?

#174964 - shotgunninja - Tue Aug 10, 2010 6:06 pm

Hello, my name is Nick, and I've been working on a game engine for the DS in (gasp) C/C++. I'm trying to add interrupt-safe (i.e. thread-safe, but for interrupt handlers) data structures to my engine, and to make them truly safe 100% of the time I need an atomic Compare-and-Swap instruction. Is there one in the instruction set of the DS's ARM9 processor?

Essentially, I need to be able to do this without being interrupted and WITHOUT DISABLING REG_IME:

Code:

// As a C++ function (the whole body would have to run atomically):
bool CompareAndSwap(uint32& dest, uint32 oldVal, uint32 newVal) {
   if (dest != oldVal) {
      return false;
   } else {
      dest = newVal;
      return true;
   }
}
// As register pseudocode:
CAS(Rd,Ro,Rn):
   SREG[EQ] := (Rd EQ Ro);
   Rd := Rn if (Rd EQ Ro);

#174968 - elhobbs - Tue Aug 10, 2010 7:58 pm

what about swp(b)?

#174971 - Pate - Wed Aug 11, 2010 5:41 am

The ARM ARM section 9.5 shows a code snippet for creating a semaphore. There is no atomic compare-and-conditional-write instruction in ARM, so achieving a similar result needs some additional code.

The example does not produce lock-free data access, but as you mention in the other thread, perhaps you don't actually need to lock the access fully as you know that only the interrupt routine can interrupt the main "thread", not the other way around.
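
As a rough illustration of that idea (this is not the ARM ARM listing), compare-and-swap can be emulated with a guard word that is claimed by an atomic exchange. In the sketch below, std::atomic's exchange() stands in for SWP so the code can be compiled off-target; GuardedWord, CasResult and TryCompareAndSwap are hypothetical names, not part of any DS library.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical names throughout; exchange() plays the role of SWP.
enum class CasResult { Swapped, Mismatch, Busy };

struct GuardedWord {
    std::atomic<uint32_t> guard{0};  // 0 = free, 1 = held
    uint32_t value = 0;              // only touched while the guard is held
};

// Attempts a compare-and-swap on w.value. Never spins: returns Busy if the
// guard is already held, e.g. when an interrupt handler runs while the
// interrupted main code is inside the guarded section.
inline CasResult TryCompareAndSwap(GuardedWord& w, uint32_t oldVal, uint32_t newVal) {
    // Atomic exchange: write 1 and get the previous guard state back.
    if (w.guard.exchange(1, std::memory_order_acquire) != 0) {
        return CasResult::Busy;      // guard was already 1 and stays 1
    }
    CasResult result = CasResult::Mismatch;
    if (w.value == oldVal) {
        w.value = newVal;
        result = CasResult::Swapped;
    }
    w.guard.store(0, std::memory_order_release);  // release the guard
    return result;
}
```

Because the function never waits, an interrupt handler that gets Busy has to skip or defer its update. Spinning inside the handler would hang, since the interrupted main code could never run to release the guard.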

Pate

#175022 - LOst? - Tue Aug 17, 2010 12:10 am

I would also like to know this!

So far my guesses haven't produced anything that works.

I basically want all of InterlockedExchange, InterlockedCompareExchange, InterlockedIncrement, etc. in ARM assembler, callable from C.

I tried this for InterlockedExchange:
Code:

   .arch   armv5te
   .cpu   arm946e-s
   
   .text
   .arm
   
   .global   InterlockedExchange
InterlockedExchange:
   swp r1, r1, [r0]
   mov r0,r1

   .end


Code:

extern "C" s32 InterlockedExchange (s32 volatile* Target, s32 Value);


The code does exchange the two long values, as Target becomes Value, but the returned value is not correct (I think).
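
For reference, the Win32 InterlockedExchange stores Value in *Target and returns the value that was there before. One way to check a candidate routine against that behaviour is sketched below. ExchangeFn, ExchangeModel and ReturnsPreviousValue are hypothetical names, s32 is assumed to be a 32-bit signed integer as in libnds, and the model relies on the GCC/Clang __atomic_exchange_n builtin.

```cpp
#include <cstdint>

typedef int32_t s32;  // assumed to match libnds's s32; defined here to be self-contained
typedef s32 (*ExchangeFn)(s32 volatile* target, s32 value);

// Desktop stand-in (hypothetical) for an atomic exchange routine.
inline s32 ExchangeModel(s32 volatile* target, s32 value) {
    // Stores value into *target and returns the previous contents.
    return __atomic_exchange_n(target, value, __ATOMIC_SEQ_CST);
}

// Hypothetical check: true if fn stores the new value AND returns the old one.
inline bool ReturnsPreviousValue(ExchangeFn fn) {
    s32 volatile target = 1234;
    s32 previous = fn(&target, 5678);
    return previous == 1234 && target == 5678;
}
```

On the DS the same check could be handed the assembly routine itself, e.g. ReturnsPreviousValue(InterlockedExchange).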

Feel free to laugh, as I know about 1% of ARM assembler. I'm mostly trying to patch together code from different sites:
http://src.gnu-darwin.org/ports/lang/fpc/work/fpc-2.2.0/rtl/arm/arm.inc.html
http://www.netmite.com/android/mydroid/1.0/external/androidmono/mono/libgc/include/private/gc_locks.h
http://msdn.microsoft.com/en-us/library/ms683590(VS.85).aspx

In the end, I want to implement a semaphore class for the NDS.
_________________
Exceptions are fun

#175023 - Pate - Tue Aug 17, 2010 4:55 am

The code looks correct in principle, but adding a return at the end might help. Like this:

Code:

InterlockedExchange:
    swp     r1, r1, [r0]
    mov     r0,r1
    bx      lr


I haven't actually used .end in my own ASM code, so I am not sure whether it is smart enough to add the return code if it is missing, but spelling the return out is fine in either case.

You can also reduce the code to plain

Code:

    swp    r0, r0, [r1]
    bx     lr


if you swap the parameters, but if you want to keep the current parameter order (for compatibility with other software) your current code is fine.

Hope this helps!

Pate

#175024 - LOst? - Tue Aug 17, 2010 6:37 pm

The bx lr worked! Thanks!

Because the volatile pointer to the long value is passed in R0, and not in R1 as in your example, this code is what finally did it:
Code:

   .arch   armv5te
   .cpu   arm946e-s
   
   .text
   .arm
   
   .global   InterlockedExchange
InterlockedExchange:
   swp      r1, r1, [r0]   
   mov      r0, r1
   bx      lr

   .end
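
With an exchange routine of this shape, a binary semaphore needs very little code. The following is a minimal sketch, not a tested DS implementation: BinarySemaphore is a hypothetical class, and the inline InterlockedExchange is only a desktop stand-in (GCC/Clang builtin) for the assembly routine above, so that the example compiles on its own.

```cpp
#include <cstdint>

typedef int32_t s32;  // assumed to match libnds's s32

// Stand-in with the thread's signature so this compiles off-target.
// On the DS, declare the assembly routine with extern "C" instead.
inline s32 InterlockedExchange(s32 volatile* Target, s32 Value) {
    return __atomic_exchange_n(Target, Value, __ATOMIC_SEQ_CST);
}

// Hypothetical binary semaphore: 0 = available, 1 = taken.
class BinarySemaphore {
public:
    // Non-blocking: returns true if the caller took the semaphore.
    bool TryAcquire() { return InterlockedExchange(&state_, 1) == 0; }

    // Makes the semaphore available again; only the holder should call this.
    void Release() { InterlockedExchange(&state_, 0); }

private:
    s32 volatile state_ = 0;
};
```

TryAcquire() returns immediately instead of waiting, which is what an interrupt handler needs; main code can simply retry in a loop.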
