#44627 - ymalik - Fri Jun 03, 2005 7:40 pm
Hello,
I am new to ARM 32 assembly. For practice, I am implementing many common functions in assembly. I am trying to implement strcat(). Here is the code. When I call the function, the screen just blanks out:
Code: |
.thumb
.thumb_func
.align
.global strcat2
.type strcat2,function
/*
* r0: first string
r1: string to append to first string
r2: where to store two appended strings
C-code:
strcat2(const char *one, const char *two, const char *result)
{
char *temp_one, *temp_two, *temp_result;
temp_one = one;
temp_two = two;
temp_result = result;
while(*temp_one)
{
*temp_result = *temp_one;
temp_one++;
temp_result++;
}
while(*temp_two)
{
*temp_result = *temp_two;
temp_result++;
temp_two++;
}
*temp_result = 0;
}
*/
strcat2:
push {r0-r3} @ r3 used to store character for comparison
@ r0-r2 store addresses of constant pointers, but they
@ will be changed within the function, so push them as well
loop1:
ldr r3, [r0]
cmp r3, #0
beq loop2
str r3, [r2]
add r0, r0, #1
add r2, r2, #1
b loop1
loop2:
ldr r3, [r1]
cmp r3, #0
beq done
str r3, [r2]
add r1, r1, #1
add r2, r2, #1
b loop2
done:
str r3, [r2]
pop {r0-r3}
mov pc, r14
|
And the prototype:
Code: |
void strcat2(const char *, const char *, const char *);
|
Thanks,
Yasir
#44629 - DekuTree64 - Fri Jun 03, 2005 7:55 pm
You should be using ldrsb/strb instead of ldr/str. ldr loads 4 bytes, so you'll most likely miss your single-byte null terminator.
You could use ldrb instead of ldrsb, since all the normal ascii characters are below 128, but just for exactness to the original C code, the chars are signed.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku
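To make the width issue concrete, here is a minimal C sketch (not from the thread) that models what each kind of load sees. The helper names `load_word`, `load_byte`, `load_signed_byte` and `check_loads` are made up for illustration. The sketch models only how many bytes are read, not how a real ARM7TDMI treats an unaligned ldr.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models a word-sized ldr: reads four bytes starting at p.
   memcpy keeps the host happy about alignment. */
uint32_t load_word(const char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Models ldrb: reads one byte and zero-extends it. */
uint32_t load_byte(const char *p)
{
    return (unsigned char)*p;
}

/* Models ldrsb: reads one byte and sign-extends it. */
int32_t load_signed_byte(const char *p)
{
    uint32_t b = (unsigned char)*p;
    return (int32_t)(b ^ 0x80u) - 0x80;   /* portable 8-bit sign extension */
}

/* Small self-check showing why the word load misses the terminator. */
void check_loads(void)
{
    /* "ab" followed by its terminator, then unrelated bytes. */
    const char buf[8] = { 'a', 'b', '\0', 'x', 'y', 'z', 'w', '\0' };

    assert(load_byte(buf + 2) == 0);   /* byte load sees the terminator */
    assert(load_word(buf + 2) != 0);   /* word load also picks up 'x', 'y', 'z' */

    /* For plain ASCII the signed and unsigned byte loads agree. */
    assert((int32_t)load_byte(buf) == load_signed_byte(buf));
}
```

The comparison against zero only succeeds with the word load if all four bytes happen to be zero, which is why the original loop ran past the end of the string.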
#44633 - strager - Fri Jun 03, 2005 8:07 pm
Your declaration is incorrect: remove the last 'const'. Also, you need to use LDRB or LDRSB, like Deku stated in his post.
#44634 - ymalik - Fri Jun 03, 2005 8:08 pm
Thank you. I will try that soon. Question though: I followed the tutorial at http://www.ee.ic.ac.uk/pcheung/teaching/ee2_computing/Lecture_7.pdf, and they use stmda r13!, {r0-r3} to push onto the stack. However, the assembler did not accept that, and I had to use push. Does stmda not work under GCC?
#44636 - ymalik - Fri Jun 03, 2005 8:10 pm
strager wrote: |
Your declaration is incorrect: remove the last 'const'. |
I don't understand. The pointer is not changing; just the value the pointer is pointing at is changing. Is this correct?
#44637 - strager - Fri Jun 03, 2005 8:11 pm
stmda is an ARM instruction. You must replace .thumb with .arm, and have the option -mthumb-interwork in your command line (you should already). Clear enough?
#44640 - sajiimori - Fri Jun 03, 2005 8:29 pm
Put the const before the * to say that the data the pointer points to is const. Put the const after the * to say that the pointer itself is const. The latter form is rarely used.
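A small C sketch of the two placements, for illustration only (the function names `length_of`, `fill` and `check_const` are made up). For strcat2 this means the third parameter should be a plain `char *`, since the function writes through it.

```c
#include <assert.h>
#include <stddef.h>

/* const before the *: the data is read-only, the pointer may move. */
size_t length_of(const char *p)
{
    const char *start = p;
    while (*p)
        p++;              /* fine: the pointer itself is not const */
    /* *p = 'x';             would not compile: the data is const */
    return (size_t)(p - start);
}

/* const after the *: the pointer is fixed, the data may be written. */
void fill(char *const p, size_t n, char c)
{
    for (size_t i = 0; i < n; i++)
        p[i] = c;         /* fine: the data is writable */
    /* p++;                  would not compile: the pointer is const */
}

/* Small self-check using both helpers. */
void check_const(void)
{
    char buf[4] = { 0 };
    fill(buf, 3, 'z');
    assert(length_of(buf) == 3);
}
```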
#44647 - DekuTree64 - Fri Jun 03, 2005 9:53 pm
You don't even need to push/pop r0-r3, because the calling function expects them to be trashed.
Another little optimization is that you can increment the pointers right in the load/store instructions:
Code: |
ldrsb r3, [r0], #1 @ Load signed byte, then increment r0
strb r3, [r2], #1 @ Store byte (signed or unsigned), inc r2 |
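In C terms, the post-indexed forms above correspond to the familiar `*p++` idiom. A rough sketch of the whole routine written that way follows. The name `strcat2_c` is made up here, the third parameter is declared non-const as suggested earlier in the thread, and the caller is assumed to supply a buffer large enough for both strings plus the terminator.

```c
#include <assert.h>
#include <string.h>

/* Copies one, then two, into result and terminates it. */
void strcat2_c(const char *one, const char *two, char *result)
{
    char c;

    while ((c = *one++) != '\0')   /* load byte, then advance the source */
        *result++ = c;             /* store byte, then advance the destination */

    while ((c = *two++) != '\0')
        *result++ = c;

    *result = '\0';                /* terminate the combined string */
}

/* Small self-check. */
void check_strcat2(void)
{
    char out[16];
    strcat2_c("foo", "bar", out);
    assert(strcmp(out, "foobar") == 0);
}
```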
#44648 - strager - Fri Jun 03, 2005 10:10 pm
Hmm. I thought that shifted the register instead. *goes to optimize some more code*
#44664 - ymalik - Sat Jun 04, 2005 12:25 am
I have some more questions about what you guys said, but let me get this over with first. Here is the new code. It still does not work:
Code: |
strcat2:
push {r0-r3} @ r3 used to store character for comparison
@ r0-r2 store addresses of constant pointers, but they
@ will be changed within the function, so push them as well
loop1:
ldrb r3, [r0]
cmp r3, #0
beq loop2
strb r3, [r2]
add r0, r0, #1
add r2, r2, #1
b loop1
loop2:
ldrb r3, [r1]
cmp r3, #0
beq done
strb r3, [r2]
add r1, r1, #1
add r2, r2, #1
b loop2
done:
strb r3, [r2]
pop {r0-r3}
mov pc, r14
|
Doing ldrsb gives me the following assembler error:
Error: syntax: ldrs[b] Rd, [Rb, Ro] -- `ldrsb r3,[r0]'
#44675 - strager - Sat Jun 04, 2005 1:25 am
ymalik wrote: |
Doing ldrsb gives me the following assembler error:
Error: syntax: ldrs[b] Rd, [Rb, Ro] -- `ldrsb r3,[r0]' |
Try using "ldrsb r3, [r0, #0]". That might solve the little problem.
#44678 - ymalik - Sat Jun 04, 2005 2:15 am
strager wrote: |
ymalik wrote: | Doing ldrsb gives me the following assembler error:
Error: syntax: ldrs[b] Rd, [Rb, Ro] -- `ldrsb r3,[r0]' |
Try using "ldrsb r3, [r0, #0]". That might solve the little problem. |
That doesn't work either. I think the assembler is giving us a hint: it expects a register to be the third parameter to ldrsb.
#44679 - strager - Sat Jun 04, 2005 2:19 am
Try an ldrb instead and see if that works.
#44701 - ymalik - Sat Jun 04, 2005 1:23 pm
Actually, all I had to do was replace mov pc, r14 with bx lr. I don't understand why mov pc, r14 did not work. Doesn't r14 contain the return address?
#44708 - strager - Sat Jun 04, 2005 2:29 pm
Ah: You can only access registers r0-r7 with most assembler statements (to save space). mov is one of those statements.
#44727 - poslundc - Sat Jun 04, 2005 5:56 pm
ymalik wrote: |
Actually, all I had to do was replace mov pc, r14 with bx lr. I don't understand why mov pc, r14 did not work. Doesn't r14 contain the return address? |
BX allows you to exchange instruction sets between ARM and Thumb depending on bit zero of the return address. This is the necessary instruction to use when returning from globally-accessible functions if you are interworking ARM and Thumb code.
strager wrote: |
Ah: You can only access registers r0-r7 with most assembler statements (to save space). mov is one of those statements. |
The reduced register set only applies in Thumb mode, not ARM mode, and the MOV instruction can still access all registers.
Dan.
#44745 - ymalik - Sat Jun 04, 2005 10:43 pm
I actually got the bx lr idea from your posprintf function. We, or at least I, used your function heavily in our senior design project. It is very nice.
I have some questions:
1. How are parameters passed to a function?
2. DekuTree64 said that I don't need the push and pop calls. Since registers represent a global state on a machine, why was it that when I removed the push and pop calls, the strcat function still worked fine? Since I am incrementing the pointers in the two loops, shouldn't the pointers return as pointing to different values without push and pop calls?
3. At what memory location does the stack begin?
4. Since code is originally compiled into the ROM, how is it that you can have some functions placed in IWRAM? Is the function copied into IWRAM when executed?
5. What is "interworking ARM and Thumb code?"
Thanks,
Yasir
#44746 - Quirky - Sat Jun 04, 2005 11:01 pm
ymalik wrote: |
1. How are parameters passed to a function?
|
The first 4 arguments are in registers r0-r3, the rest on the stack in the order that makes most sense... (in other words I don't remember without looking at an example of something that works!)
ymalik wrote: |
2. DekuTree64 said that I don't need the push and pop calls. Since registers represent a global state on a machine, why was it that when I removed the push and pop calls, the strcat function still worked fine? Since I am incrementing the pointers in the two loops, shouldn't the pointers return as pointing to different values without push and pop calls? |
GCC will save certain registers (check the docs, but I think it's r4-r7?) when you call from one C function to some other function. You have to promise to keep the rest of the registers untouched. Or if you do touch them, you need to push them first and pop them after. Would that explain it? Perhaps you don't mess up more registers - or possibly worse, you use them, but then after returning the caller doesn't REUSE them. That's a tricky bug to spot.
ymalik wrote: |
3. At what memory location does the stack begin? |
The end of IWRAM, or a few hundred bytes from the end IIRC. And it counts back down to the start of IWRAM.
ymalik wrote: |
4. Since code is originally compiled into the ROM, how is it that you can have some functions placed in IWRAM? Is the function copied into IWRAM when executed? |
The code is compiled to a section. This section is usually .text, but you can specify the .iwram section (or .ewram). crt0 copies your code and data to the correct place in IWRAM or EWRAM when the program starts; the linker and link script have made sure the addresses are all correct in the actual code (IYSWIM).
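For a C function, this kind of placement is usually requested with a GCC section attribute. A hedged sketch: the macro name `IWRAM_CODE`, the function `sum_bytes`, and the section name `.iwram` are assumptions here, and the section name must match whatever your link script and crt0 actually use. On a non-ARM host the macro expands to nothing, so the file still builds and runs as ordinary C.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && defined(__arm__)
/* Assumed section name. long_call is requested because IWRAM is too far
   from ROM to be reached by a plain bl. */
#define IWRAM_CODE __attribute__((section(".iwram"), long_call))
#else
#define IWRAM_CODE
#endif

/* The attribute goes on the declaration. It only names the section;
   the startup code is what copies the section into RAM. */
IWRAM_CODE unsigned sum_bytes(const unsigned char *p, size_t n);

unsigned sum_bytes(const unsigned char *p, size_t n)
{
    unsigned total = 0;
    while (n--)
        total += *p++;
    return total;
}

/* Small self-check. */
void check_sum_bytes(void)
{
    const unsigned char data[3] = { 1, 2, 3 };
    assert(sum_bytes(data, 3) == 6);
}
```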
ymalik wrote: |
5. What is "interworking ARM and Thumb code?" |
Calling arm from thumb, or vice versa, and then returning to the arm or thumb routine. The processor needs to know if it should change modes. This is basically summed up by the "bx lr" op, (anything else required?) which if you only go arm-arm or thumb-thumb can be substituted by any pc altering operation (in arm you could do ldmia r0!,{r9-r12,pc} for example instead of ldmia and bx lr).
#44747 - DekuTree64 - Sat Jun 04, 2005 11:39 pm
The compiler will preserve r0-r3, r12 and lr (r14) before calling a function. The APCS (ARM Procedure Call Standard) says so.
That also means that if you want to call a C function from your assembly code, you have to preserve those too (or just make sure your code doesn't expect them to be the same afterward), because the compiled C function expects you to follow the APCS like it does.
I'd recommend looking at the Crt0 to get a better feel for how everything starts up. Basically when you turn the system on, the CPU starts executing from address 0x8000000, in ARM mode. All the initialization of the stack pointer and memory is done manually through code, using lots of constants that are defined by the linker according to the linkscript.
#44749 - ymalik - Sun Jun 05, 2005 12:10 am
Much thanks for your clear responses.
poslundc wrote: |
BX allows you to exchange instruction sets between ARM and Thumb depending on bit zero of the return address. This is the necessary instruction to use when returning from globally-accessible functions if you are interworking ARM and Thumb code. |
Since the last bit determines whether returning to ARM or Thumb code, ARM (or Thumb) functions have only even (or odd) return addresses, correct? Why is this so? Isn't ARM and Thumb code aligned on even addresses?
#44790 - poslundc - Sun Jun 05, 2005 9:42 am
ymalik wrote: |
Much thanks for your clear responses.
poslundc wrote: | BX allows you to exchange instruction sets between ARM and Thumb depending on bit zero of the return address. This is the necessary instruction to use when returning from globally-accessible functions if you are interworking ARM and Thumb code. |
Since the last bit determines whether returning to ARM or Thumb code, ARM (or Thumb) functions have only even (or odd) return addresses, correct? Why is this so? Isn't ARM and Thumb code aligned on even addresses? |
Yes, all code is aligned on even addresses. Thumb code is aligned on 2 bytes, and ARM code on 4 bytes. This means that the least-significant bit of the address doesn't serve any actual purpose, since you can't have code located at an odd-number address anyway. So instead it's made to serve a purpose: that extra bit of information is used to encode whether the code at the destination address should be run in ARM or Thumb mode.
So in an odd-number address, the least-significant bit is zeroed when determining the actual address to branch to, but its presence is used to determine that the code at the new memory location is Thumb code and not ARM code.
Dan.
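The rule described above can be modelled in a few lines of C. This is only an illustrative sketch of how bx interprets its register operand; the type `bx_target` and the functions `decode_bx` and `check_bx` are made-up names, and the addresses are arbitrary examples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t address;  /* where execution continues */
    bool     thumb;    /* true: Thumb state, false: ARM state */
} bx_target;

/* Splits a bx operand into the branch address and the instruction set. */
bx_target decode_bx(uint32_t reg)
{
    bx_target t;
    t.thumb   = (reg & 1u) != 0;   /* bit zero selects Thumb */
    t.address = reg & ~1u;         /* bit zero is cleared for the real address */
    return t;
}

/* Small self-check with one odd and one even value. */
void check_bx(void)
{
    bx_target a = decode_bx(0x08000121u);
    assert(a.thumb && a.address == 0x08000120u);

    bx_target b = decode_bx(0x03000040u);
    assert(!b.thumb && b.address == 0x03000040u);
}
```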
#45055 - FluBBa - Tue Jun 07, 2005 11:12 am
Can this be more like it?
Code: |
strcat2:
@ r3 used to store character for comparison
@ r0-r2 store addresses of constant pointers
loop1:
ldrb r3, [r0],#1
cmp r3, #0
strneb r3, [r2],#1
bne loop1
loop2:
ldrb r3, [r1],#1
strb r3, [r2],#1
cmp r3, #0
bne loop2
bx lr
|
_________________
I probably suck, my not is a programmer.
#45061 - strager - Tue Jun 07, 2005 1:24 pm
FluBBa wrote: |
Can this be more like it? |
From what I see, yes. Good job, it is both shorter and readable. :-)
#45318 - ymalik - Thu Jun 09, 2005 4:09 pm
The above only works in ARM mode. Is something similar possible in Thumb mode?
#45324 - poslundc - Thu Jun 09, 2005 5:10 pm
ymalik wrote: |
The above only works in ARM mode. Is something similar possible in Thumb mode? |
Have a look at posprintf, which, among other things, is a string parser/copier, and includes Thumb source when you download it.
Dan.
#45338 - Miked0801 - Thu Jun 09, 2005 7:37 pm
Quote: |
The above only works in ARM mode. Is something similar possible in Thumb mode?
|
Just need to replace the one conditionally executed statement (strneb) with a cmp and branch and yes, it'll be Thumb compliant.
#45364 - ymalik - Fri Jun 10, 2005 12:13 am
And you also can't have ldrb r3, [r0],#1. But then the code degenerates to my old code.
#45371 - Miked0801 - Fri Jun 10, 2005 1:00 am
You sure? Oh wait, putting the immediate on the inside doesn't update the pointer does it. Hmmm.
#45494 - tepples - Sat Jun 11, 2005 5:48 am
Miked0801 wrote: |
Just need to replace the one conditionally executed statement (strneb) with a cmp and branch and yes, it'll be Thumb compliant. |
Shouldn't a Thumb assembler see 'strneb' and know to generate a 'beq' followed by 'strb'? MIPS assemblers like to break up complicated instructions using a predefined set of macros; in fact, the MIPS architecture specifies a register '$at' that's reserved for use by macros.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#45679 - Miked0801 - Mon Jun 13, 2005 7:37 pm
$at - now that brings back memories of programming for the PS1. Lol.
And yes, a great assembler might do this. I doubt ours would, though.