gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

ASM > Comparing two arrays

#49316 - Shadow Wedge - Thu Jul 28, 2005 5:04 pm

Hi, I've just started ARM assembly and I've made a test algorithm that compares two arrays of bytes and wanted to see if it is decent or if it could be improved further.

The C code is:
Code:
int16 CompareArrays(int8* Array1, int8* Array2, int16 SizeOfArray) {
   SizeOfArray *= SizeOfArray; //input is the root of what it should be

   int8* End = Array + SizeOfArray;
   while (Array1 != End)
      if (*Puzzle++ != *CurrentState++)
         return 0; //failure

   return -1; //success
}


And the assembly that I made was:
Code:
   mul     r3, r2, r2      @r2 contains the square root of the number of items in the array
   and     r5, r3, #3      @r5 stores the remainder of the division
   mov     r4, r3, asr #2  @r4 stores the number of fours in r3
beginloop4s:
   ldr   r2, [r0], #4
   ldr   r3, [r1], #4
   cmp   r2, r3
   movne   r0, #0          @if not equal return 0
   movne   pc, lr
   sub     r4, r4, #1
   bne   beginloop4s

   cmp   r5, #0          @make sure that there's a point in comparing more bytes
   beq   endfunction
beginloop1s:
   ldrsb   r2, [r0], #1
   ldrsb   r3, [r1], #1
   cmp   r2, r3
   movne   r0, #0          @if not equal return 0
   movne   pc, lr
   sub   r5, r5, #1
   bne   beginloop1s
endfunction:
   mvn   r0, #0          @everything's a-ok, can return -1
   mov   pc, lr


The compiler compared it one byte at a time, but I'm comparing the arrays in sets of four bytes. Would this be faster than one byte at a time?

Also, I don't know which registers I can mess with without pushing them.

Any help would be appreciated.

Edit: forgot to rename one of the variables in the C code.
_________________
...


Last edited by Shadow Wedge on Fri Jul 29, 2005 10:14 pm; edited 2 times in total

#49321 - tepples - Thu Jul 28, 2005 5:32 pm

On a typical RISC architecture such as ARM, you can compare arrays four bytes at a time only if the arrays are aligned. Any general purpose memcmp() implementation that uses one machine word at a time will also have to have one byte at a time for comparing non-aligned arrays, so your function will have to have special case code for each possibility. Some implementations just drop the word-at-a-time mode for space optimization. If you have your own reimplementation of memcmp() (as I see here), GCC definitely isn't smart enough to create a word-at-a-time mode by itself.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#49404 - Lupin - Fri Jul 29, 2005 2:08 pm

Why are you passing Array1 and Array2 if they are not used within the function?

I guess "Puzzle" and "End" need to be int32 pointers and not int8
_________________
Team Pokeme
My blog and PM ASM tutorials

#49410 - gladius - Fri Jul 29, 2005 3:30 pm

A few small things:
Code:

   sub     r4, r4, #1
   bne   beginloop4s

That will not work, because the flags are not set by every instruction. You have to specify the 's' flag to set flags. So, it would be like this:
Code:

   subs     r4, r4, #1
   bne   beginloop4s

Secondly, instead of doing the two movne's, it would probably be better to do something like this:
Code:

movne failReturn
...
failReturn:
mov r0,#0
mov pc, lr
goodReturn:
mvn r0, #0
mov pc, lr

Finally, you need to store r4-r12 and lr if you modify them. Everything else is fair game (but if you don't restore the sp to what it was on entering, there will be trouble!)

Add the following to handle that:
Code:

stmfd sp!, {r4-r5}
...
ldmfd sp!, {r4-r5} @ Execute this before you return

#49438 - Miked0801 - Fri Jul 29, 2005 7:06 pm

Also, your initial shift down of num bytes
Code:

mov     r4, r3, asr #2

Works as intended only of the number of bytes in the array is a multiple of 4. Prentend there are only 7 bytes to compare. Your routine will abort after processing only 4 of those bytes. BTW, if the size is less than 4 bytes, you routine is reading off the end of both arrays - another no-no.

#49456 - caitsith2 - Fri Jul 29, 2005 9:44 pm

Also, instead of doing
Code:

mov pc, lr


you may wish to do this

Code:

bx lr


What this allows for, is the code to be called from both thumb and arm mode, and switch the processor mode back properly, when done.

If you use
Code:

mov pc, lr


Then if the code that executed it was arm code, it will return properly, but if it was called from thumb code, then it will return in arm mode, which is not good.

#49459 - Shadow Wedge - Fri Jul 29, 2005 10:26 pm

tepples:
I don't know that much about alignment... Is it when the location of the beginning of the array is divisible by 4? And if so, how do I specifically set an array to be aligned? Otherwise I don't know how to check if an array is aligned...

Lupin: "Why are you passing Array1 and Array2 if they are not used within the function?"
Erm, sorry... From the calling function Array1 is stored in r0, and Array2 is stored in r1.

gladius: Thanks, I'll add those suggestions.

Miked0801: Well, I store the remainder of the division in r5 (the AND statement) and then the second loop is to compare the extra bytes (if there are any). Also, thanks for telling me about the less-than-4-bytes thing, but this function is assuming that the array is at least 4 bytes long.

caitsith2: Thanks, I'll change that.

My updated code is:
Code:
   stmfd sp!, {r4-r5}
   mul     r3, r2, r2      @r2 contains the square root of the number of items in the array
   and     r5, r3, #3      @r5 stores the remainder of the division
   mov     r4, r3, asr #2  @r4 stores the number of fours in r3
beginloop4s:
   ldr   r2, [r0], #4
   ldr   r3, [r1], #4
   cmp   r2, r3
   bne   failreturn        @if not equal return 0
   subs     r4, r4, #1
   bne   beginloop4s

   cmp   r5, #0          @make sure that there's a point in comparing more bytes
   beq   goodreturn
beginloop1s:
   ldrsb   r2, [r0], #1
   ldrsb   r3, [r1], #1
   cmp   r2, r3
   bne   failreturn        @if not equal return 0
   subs   r5, r5, #1
   bne   beginloop1s
goodreturn:
   mvn   r0, #0          @everything's a-ok, can return -1
   ldmfd sp!, {r4-r5}
   bx lr
failreturn:
   mov   r0, #0          @did not work, return 0
   ldmfd sp!, {r4-r5}
   bx lr


Edit: I'm failing at posting correctly today >_>
_________________
...

#49478 - tepples - Sat Jul 30, 2005 1:11 am

Shadow Wedge wrote:
I don't know that much about alignment... Is it when the location of the beginning of the array is divisible by 4?

Yes.

Quote:
And if so, how do I specifically set an array to be aligned?

There's an attribute for that.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.