gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

DS development > Thumb-related stack corruption with devkitARM/gcc

#127329 - goruka - Tue May 01, 2007 7:20 pm

Hi! I'm porting a pretty large 3D app, written in C++, to the DS. I am, of course, having some issues on the NDS.

First, a question: why are the nds examples all compiled with -mthumb? I don't see much reason to use thumb when the DS cpu is 32 bits.
If I compile with thumb, my app (which works fine on PC: windows/linux/solaris/osx) randomly crashes on NDS inside functions, with symptoms of stack corruption.

Without -mthumb, most of the places where the app crashed work fine. (By the way, I checked the stack pointer and it's always fine and dandy inside DTCM.)

Any ideas about this?

Thanks!

#127337 - simonjhall - Tue May 01, 2007 7:52 pm

Wotcha.
There are advantages and disadvantages to using thumb mode over arm mode.
The thinking (from the ARM camp) is that thumb will give you greater code density - you'll probably need more instructions to do the work you want to do, but each instruction is half the size of the equivalent arm instruction. So this means that you normally get a smaller binary that does the same thing :-)

Speed can be an issue, but as ever it depends on whatchu are doing. As thumb code is generally smaller, you need less instruction cache and less memory bandwidth to keep your processor busy. However the instruction set isn't as rich and the compiler may have to use more instructions to get the job done.
Also (I'm pretty sure) thumb mode can't use the full register set, but it does process data in 32-bit quantities (rather than 16-bit, as you might expect).
Try both ways and see which one is faster for you.

Anyway, regarding the problem you've got - the data type sizes aren't gonna change based on the mode you're in - so it sounds like it 'just happens' to mess up your stack in thumb mode. It may be doing it in arm - you just don't notice it.

We're gonna need code to diagnose this one, mate :-D
Btw: are you mixing arm with thumb code? If so you're gonna need -mthumb-interwork to keep it happy.
_________________
Big thanks to everyone who donated for Quake2

#127361 - Cydrak - Tue May 01, 2007 11:04 pm

What simon said.

Basically, thumb has all the important stuff for program logic. You lose a few arm features like the long multiplies, conditional execution, the "free" bitshifts or a full set of 15 regs to play with. But code that needs it is a minority (say, fancypants math or graphics stuff with tight loops). If this makes sense to you... you're probably writing assembler, or you already singled those one or two files out for arm compilation, anyway. Which is not hard to do.

It goes without saying that if your code is small or fast enough, don't worry about it, use whatever you want. :)

I agree on the bugs, thumb and arm do very similar things so something is prolly wrong. Pointer alignment likes to trip me, since most PCs don't care--but if you misalign on the ARM you won't write where you expect...

One last fun thing--only arm ops whose top 4 bits are 1110 (the "always" condition) get executed ALL the time--the other bit patterns are conditional, so if the ARM jumps into random data, only a fraction (depending on current CPSR) will actually execute. If it's one of the literals in a code section, like a small int or a pointer, then the high 4 bits are typically all set (never exec) or all clear (exec only if the Z flag is set). You might very well skim back into valid code. I've never seen this myself, but it could have... "interesting" implications as to the crash frequency in arm mode.
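
To make the condition-field point concrete, here is a minimal sketch in plain C (it runs on a PC, not on the DS) that decides whether a 32-bit word, interpreted as an ARM-state instruction, would pass its condition check for a given CPSR. The function name arm_cond_passes and the FLAG_* macros are made up for illustration. Condition 1111 is treated as "never", which matches the older ARMv4 behaviour described above; on the DS's ARM9 (ARMv5TE) that pattern is, as far as I know, a separate unconditional/undefined instruction space, so treat that case as a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPSR condition flag bits (N, Z, C, V live in bits 31..28). */
#define FLAG_N (1u << 31)
#define FLAG_Z (1u << 30)
#define FLAG_C (1u << 29)
#define FLAG_V (1u << 28)

/* Hypothetical helper: returns true if the condition field (top 4 bits)
 * of an ARM-state instruction word passes for the given CPSR value.
 * Assumption: 0xF is modelled as "never" (ARMv4-style). */
bool arm_cond_passes(uint32_t insn, uint32_t cpsr)
{
    bool n = (cpsr & FLAG_N) != 0;
    bool z = (cpsr & FLAG_Z) != 0;
    bool c = (cpsr & FLAG_C) != 0;
    bool v = (cpsr & FLAG_V) != 0;

    switch (insn >> 28) {
    case 0x0: return z;              /* EQ: equal / zero        */
    case 0x1: return !z;             /* NE                      */
    case 0x2: return c;              /* CS/HS                   */
    case 0x3: return !c;             /* CC/LO                   */
    case 0x4: return n;              /* MI                      */
    case 0x5: return !n;             /* PL                      */
    case 0x6: return v;              /* VS                      */
    case 0x7: return !v;             /* VC                      */
    case 0x8: return c && !z;        /* HI                      */
    case 0x9: return !c || z;        /* LS                      */
    case 0xA: return n == v;         /* GE                      */
    case 0xB: return n != v;         /* LT                      */
    case 0xC: return !z && (n == v); /* GT                      */
    case 0xD: return z || (n != v);  /* LE                      */
    case 0xE: return true;           /* AL: always              */
    default:  return false;          /* 0xF: modelled as never  */
    }
}
```

For example, a main-RAM pointer such as 0x02000100 has a condition field of 0000, so under this model it only "executes" when the Z flag happens to be set.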

#127374 - HyperHacker - Wed May 02, 2007 12:41 am

Pardon my butting in but I have some related questions.
simonjhall wrote:
Btw: are you mixing arm with thumb code? If so you're gonna need -mthumb-interwork to keep it happy.
Is that in place of, or along with -mthumb?

Cydrak wrote:
Pointer alignment likes to trip me, since most PCs don't care--but if you misalign on the ARM you won't write where you expect...
What does happen on ARM? Doesn't it ignore the low 2 bits of the pointer for 32-bit access and low 1 bit for 16-bit access? Or do reading/writing/execution do different things?
_________________
I'm a PSP hacker now, but I still <3 DS.

#127382 - tepples - Wed May 02, 2007 2:02 am

When you bx lr to return from a function, it reads the ARM/Thumb state from the low bit of the return address. If this bit gets overwritten with a buffer overflow, boom.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#127390 - Cydrak - Wed May 02, 2007 4:31 am

Per ARM spec, misalignment can be unpredictable. Otherwise it's supposed to ignore the low bits as you said, and with reads you might get a byte rotation. Even so, the defined behavior could break things in subtle ways.
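
The "byte rotation" on reads can be sketched as follows. This is a host-side simulation in C, assuming the ARMv4/ARMv5 behaviour with alignment faults disabled: the low two address bits are dropped for the memory access, and the loaded word is rotated right by 8 bits per byte of misalignment. The names ror32 and sim_ldr are hypothetical, and memory is modelled as a little-endian byte array.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by r bits (r may be 0). */
static uint32_t ror32(uint32_t x, unsigned r)
{
    r &= 31u;
    return r ? (x >> r) | (x << (32u - r)) : x;
}

/* Hypothetical model of a word load (LDR) from a possibly unaligned
 * address: fetch the aligned word containing addr, then rotate it. */
uint32_t sim_ldr(const uint8_t *mem, uint32_t addr)
{
    uint32_t base = addr & ~3u; /* low 2 bits ignored for the access */
    uint32_t word = (uint32_t)mem[base]
                  | ((uint32_t)mem[base + 1] << 8)
                  | ((uint32_t)mem[base + 2] << 16)
                  | ((uint32_t)mem[base + 3] << 24);
    return ror32(word, 8u * (addr & 3u));
}
```

With memory bytes 11 22 33 44 55 ..., a load from offset 1 yields 0x11443322 under this model, whereas a PC that tolerates unaligned access would return 0x55443322. Code that "works on PC" can therefore read different data on the DS without any fault being raised.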

Consider if you had a local u16 sitting on the stack, and a u8 array right after it--GCC could legitimately do that because you can only STRB to the u8[] without a cast. If you or anything you call (hi there, void*!), tried to store a u32 at the first or second u8, it could hit the u16 too, if it's part of the same word.

Now if the poor u16 were a loop count... well, I guess that's all up to the optimizer.
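
The u16-next-to-u8[] scenario can be simulated on a PC like this. It is only a sketch: the stack frame is modelled as a byte array with an assumed layout (u16 at offset 0, u8 array starting at offset 2), and sim_str is a made-up helper that models a word store (STR) which ignores the low two address bits, as described above.

```c
#include <stdint.h>

/* Hypothetical model of a word store (STR) to a possibly unaligned
 * address: the low 2 address bits are ignored, so the whole aligned
 * word containing addr is overwritten (little-endian byte order). */
void sim_str(uint8_t *mem, uint32_t addr, uint32_t value)
{
    uint32_t base = addr & ~3u;
    mem[base]     = (uint8_t)(value);
    mem[base + 1] = (uint8_t)(value >> 8);
    mem[base + 2] = (uint8_t)(value >> 16);
    mem[base + 3] = (uint8_t)(value >> 24);
}
```

Take a frame where bytes 0..1 hold a u16 loop count and bytes 2..7 hold the u8 array. Calling sim_str(frame, 2, 0xDEADBEEF), i.e. storing a u32 "at the first u8", writes bytes 0..3 instead of 2..5: the loop count becomes 0xBEEF and the last two bytes of the intended destination are never written.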

#127560 - OOPMan - Thu May 03, 2007 7:50 am

Indeed.

Goruka, you should probably get your hands on an ARM CPU manual and some ARM assembler tutorials to get a better idea of how best to use thumb. Perceiving thumb as being pointless, however, is naive.
_________________
"My boot, your face..." - Attributed to OOPMan, Emperor of Eroticon VI

You can find my NDS homebrew projects here...

#127634 - goruka - Thu May 03, 2007 6:54 pm

Thanks for the replies! I didn't know thumb had so many advantages. I never bothered to read much on it because I remember being told it was like comparing the 286/386 instruction sets, but now I'll have to do more reading on it.

About the stack corruption in thumb, I really can't figure out much... maybe it's because I'm using C++ and the vtable thunks or some other variable is limited to 16 bits (as it is in some architectures). Will need to do more research (and debugging sucks on you know which slow ass emulator, can't wait until my flashcart arrives).