gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

C/C++ > GCC - unnecessary add/sub sp,sp,#4?

#129374 - Dwedit - Tue May 22, 2007 5:19 am

I'm using the compiler switches "-Os -mthumb -mcpu=arm7tdmi -mtune=arm7tdmi"

GCC is generating this code for one function:
Code:

print_2_func:
   push   {lr}
   sub   sp, sp, #4
   bl   strmerge_str
   bl   text2_str
   add   sp, sp, #4
   @ sp needed for prologue
   pop   {r1}
   bx   r1

Why is it unnecessarily adding and subtracting 4 from the stack pointer?

Also, the adding and subtracting of 4 to and from the stack pointer looks like it's interfering with the compiler's ability to do tail calls.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
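
For readers following the tail-call point, here is a minimal C sketch of a function that could plausibly compile to the listing above. The original source is not shown in the thread, so the signatures of strmerge_str and text2_str, the argument names, and the stub bodies are all assumptions made for illustration. What matters is the shape: the second call is the last thing the function does, which is what makes it a tail-call candidate.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins: the real strmerge_str and text2_str are not
   shown in the thread, so these bodies and signatures are assumptions. */
static char merged[64];
static const char *last_printed;

static const char *strmerge_str(const char *a, const char *b)
{
    int n = snprintf(merged, sizeof merged, "%s%s", a, b);
    assert(n >= 0 && (size_t)n < sizeof merged); /* result must fit */
    return merged;
}

static void text2_str(const char *s)
{
    last_printed = s; /* remember what was printed, for checking */
    puts(s);
}

/* Assumed shape of the function in question: the result of the first
   call stays in r0 and is passed straight to the second call, which
   sits in tail position. */
void print_2_func(const char *a, const char *b)
{
    text2_str(strmerge_str(a, b));
}
```

Calling print_2_func("foo", "bar") with these stubs prints the merged string. If the compiler did turn the final call into a plain branch, the function would have to restore lr and sp before branching rather than after the call returns.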

#129375 - tepples - Tue May 22, 2007 5:26 am

Are you making sure to -fomit-frame-pointer?
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#129378 - Dwedit - Tue May 22, 2007 5:50 am

Same code regardless of that flag. -Os supposedly turns on -fomit-frame-pointer.

#129391 - keldon - Tue May 22, 2007 9:55 am

Oh; I always thought that push decremented SP, so it should not stop the compiler from performing tail calls, right?

#129427 - Miked0801 - Tue May 22, 2007 7:31 pm

Compilers in general like to add unneeded crap to functions. The worse the compiler, the more crap gets added. We have functions where the compiler pushes 2 extra registers that are never used. Or the famous "subs r0, r0, #1" followed by a redundant "cmp r0, #0" type stuff.

If performance is 100% needed on a function, just hand code it and be done with it :)

#129433 - Dwedit - Tue May 22, 2007 8:36 pm

I don't care about performance here, I care that I'm using a stack under 512 bytes in size, and the compiler is just destroying it with that crap.

#129451 - sajiimori - Tue May 22, 2007 10:47 pm

The same philosophy applies, I think. If you've got a tiny stack, write the code by hand.

#129481 - tepples - Wed May 23, 2007 5:25 am

But then how is one supposed to rewrite the whole of newlib and libfat by hand?

#129504 - keldon - Wed May 23, 2007 10:03 am

tepples wrote:
But then how is one supposed to rewrite the whole of newlib and libfat by hand?

Well, that shouldn't be a problem, since the random stack wastage only affects you when it occurs within a set of functions that make deeply nested calls to each other.

Other than that, it wouldn't pose much of a stack problem, provided you're not already near the end of the stack, right?

#129546 - sajiimori - Wed May 23, 2007 6:12 pm

Indiscriminate use of the standard library seems out of the question with less than 512 bytes of stack.

Anyway Dwedit, also try -fno-exceptions if you haven't already. Exception support is another reason GCC might force a frame pointer. -fomit-frame-pointer is more of a hint than anything; GCC will still use one if it thinks it's necessary.

#129550 - Miked0801 - Wed May 23, 2007 6:28 pm

Which raises the question: what are you doing that only allows a 512-byte stack? We came close to that on GBC with hand-coded Z80.

#129553 - Dwedit - Wed May 23, 2007 6:45 pm

It looks like it may be conformance to the "Procedure Call Standard for the ARM Architecture":
Quote:

5.2.1.2 Stack constraints at a public interface
The stack must also conform to the following constraint at a public interface:
* SP mod 8 = 0. The stack must be double-word aligned.


There really should be a way to get rid of that restriction.
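
The arithmetic behind that constraint can be sketched in a few lines of C. This is only an illustration, assuming SP is already 8-byte aligned on entry to the function; frame_padding is a made-up helper name, not anything GCC provides.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of padding needed so that a frame which has used `used` bytes
   leaves SP a multiple of `align` again. Assumes SP was aligned on
   entry and that `align` is a power of two. Hypothetical helper. */
static size_t frame_padding(size_t used, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (used & (align - 1))) & (align - 1);
}
```

With push {lr} using 4 bytes, frame_padding(4, 8) comes out as 4, which matches the sub sp, sp, #4 in the listing. A frame that already uses 8 bytes needs no padding.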

#129559 - sajiimori - Wed May 23, 2007 7:27 pm

Cool, I didn't know that.

(GCC should just push 2 registers instead of 1 to eliminate the add/sub...)

Since the calling conventions only apply to C (and languages with C interoperability), again, hand-written code can ignore that restriction, as long as it doesn't call C code.
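
The push-2-registers idea can be checked with a toy model of the stack pointer. This is a sketch only: the functions below just do the SP arithmetic, the choice of r3 as the extra register is an assumption for illustration, and the entry address used in the note afterwards is a hypothetical 8-byte-aligned value.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a full-descending stack: each step only moves a
   simulated SP, no memory is touched. */
static uint32_t push_regs(uint32_t sp, unsigned nregs)
{
    return sp - 4u * nregs; /* each pushed register takes 4 bytes */
}

static uint32_t sub_sp(uint32_t sp, uint32_t bytes)
{
    return sp - bytes;
}

/* SP at the first bl for the prologue in the listing:
   push {lr}; sub sp, sp, #4 */
static uint32_t sp_at_call_emitted(uint32_t entry_sp)
{
    return sub_sp(push_regs(entry_sp, 1), 4);
}

/* SP at the same point for the suggested prologue:
   push {r3, lr}  (r3 is an arbitrary scratch register, an assumption) */
static uint32_t sp_at_call_suggested(uint32_t entry_sp)
{
    return push_regs(entry_sp, 2);
}
```

Starting from an aligned entry SP such as 0x03007F00, both variants arrive at the same 8-byte-aligned SP and use the same 8 bytes of stack. So the suggestion saves the add/sub instructions but not the stack space Dwedit is worried about.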