gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Coding > problems calling ARM from Thumb

#147976 - bpoint - Mon Dec 31, 2007 12:04 pm

Happy New Year everyone! (well, almost ;) )

I'm having a bit of a problem getting ARM/Thumb interworking working properly with devkitARM r21.

I can currently compile and execute both ARM and Thumb code (my interrupt handler is compiled as ARM and is located in iwram). I can also call Thumb code from ARM, and return back properly to ARM, but I *cannot* call ARM code from Thumb. The interworking glue being generated is using an invalid address.

To show what's going on, here's a small disassembly of the actual code (compiled in debug mode, no optimizations enabled):
Code:
080345F0: F012    bl       
080345F2: FF72    bl      ___ZN7CFSound13DeviceGBABase6updateEv_from_thumb

080474D8: 4778    bx      pc
080474DA: 46C0    mov     r8, r8
080474DC: EABEE6C7    b       0x07001000


And so the CPU jumps into OAM memory, and everything goes downhill from there. The area of code where the interwork glue is located seems to be correct, as there are other functions where the offsets are perfectly valid.

Looking at the map file, the actual address of the update function which should be called is half-way correct:
Code:
 .iwram         0x03001000      0x194 c:/CF/lib/gba\libDebug-CFSound.a(gbaBase.arm.o)
                0x03001000                CFSound::DeviceGBABase::update()


Why 0x0700 is being used instead of 0x0300 is what I don't understand.

Just to make sure I've done everything right:
1) ARM code is compiled with -marm -mthumb-interwork
2) Thumb code is compiled with -mthumb -mthumb-interwork
3) The update() function has __attribute__((section(".iwram"), long_call)) specified, and is being compiled as ARM
4) The function calling the update() function is just normal Thumb code

Is there something I've missed?

#148014 - Dwedit - Mon Dec 31, 2007 6:31 pm

I've also experienced this problem. To work around it, I just created stub routines in the same memory area, then manually created long jumps in ASM (with ldr pc,=xxxx), but that's a very very hackish solution.

The impression I get is that the linker is broken. It throws out the highest bits of the address, then fails to raise a warning or link error indicating that it has done such.
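
The "thrown out" high bits can be checked by hand. An ARM-state `b` holds a signed 24-bit word offset relative to the instruction address plus 8, so it reaches only about +/-32 MB, while the iwram target sits roughly 80 MB below the ROM veneer. The sketch below is host-side C++ rather than GBA code, and the helper names are made up for illustration. It decodes the branch word from the disassembly above and shows that keeping only the low 24 bits of the required offset yields the same encoding, which lands 0x04000000 bytes too high.

```cpp
#include <cstdint>

// Sign-extend the 24-bit immediate field of an ARM B/BL instruction.
constexpr std::uint32_t sign_extend24(std::uint32_t insn) {
    return ((insn & 0x00FFFFFFu) ^ 0x00800000u) - 0x00800000u;
}

// Branch target of an ARM-state B/BL located at `addr`.
// In ARM state the PC reads as the instruction address plus 8.
constexpr std::uint32_t arm_branch_target(std::uint32_t addr, std::uint32_t insn) {
    return addr + 8u + (sign_extend24(insn) << 2);
}

// Immediate field that results when the offset needed to reach `target`
// is cut down to 24 bits with no range check (hypothetical model of the
// behaviour described in this thread).
constexpr std::uint32_t truncated_imm24(std::uint32_t addr, std::uint32_t target) {
    return ((target - (addr + 8u)) >> 2) & 0x00FFFFFFu;
}

// The veneer at 0x080474DC contains EABEE6C7, which decodes to OAM.
static_assert(arm_branch_target(0x080474DCu, 0xEABEE6C7u) == 0x07001000u,
              "branch word decodes to 0x07001000");

// The same 24-bit field falls out of truncating the offset to 0x03001000.
static_assert(truncated_imm24(0x080474DCu, 0x03001000u) == 0x00BEE6C7u,
              "truncated offset matches the emitted encoding");
```

Both checks are evaluated at compile time, so building the file with any C++11 compiler is enough to confirm the arithmetic.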
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

#148040 - bpoint - Tue Jan 01, 2008 7:15 am

This is not good. I'm even experiencing this problem calling Thumb code from ARM code as well. I'm glad I'm not the only one who's having this problem though. :)

After a bit of testing, it doesn't look like the problem is the linker -- it's actually the compiler. The compiler is ignoring the long_call attribute on a function, and generating PC relative branches to the interworking glue code. When crt0 copies this code into iwram, the addresses are no longer valid. Sadly, compiling the entire source file with -mlong-calls doesn't help either.

This is a pretty serious bug, as there's no really good workaround. I could either avoid interworking code entirely (completely use either ARM or Thumb in the entire application), or simply not relocate code to another section, but neither one of these is an acceptable solution.

Is there some way to instruct gcc that the calls to the generated interworking glue should have the long_call attribute?

#148052 - tepples - Tue Jan 01, 2008 3:05 pm

For GCC-specific problems, if you have a SourceForge.net account, you will likely get a quicker response if you ask on the devkitpro-arm-users mailing list.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#148264 - wintermute - Fri Jan 04, 2008 12:40 am

bpoint wrote:

After a bit of testing, it doesn't look like the problem is the linker -- it's actually the compiler. The compiler is ignoring the long_call attribute on a function, and generating PC relative branches to the interworking glue code. When crt0 copies this code into iwram, the addresses are no longer valid. Sadly, compiling the entire source file with -mlong-calls doesn't help either.


That would suggest to me that you're trying to mix iwram & rom/ewram code in the same source file. Unfortunately gcc assumes that all functions within the same TU can be reached with a relative branch so you need to separate the code.

If that doesn't help any then try to put together a small testcase which shows the problem & I'll have a look.

#148273 - Dwedit - Fri Jan 04, 2008 3:48 am

wintermute wrote:
Unfortunately gcc assumes that all functions within the same TU can be reached with a relative branch so you need to separate the code.


GCC's failure to replace short jumps with long jumps is a bug. How do I go about reporting this bug without getting brushed off?

#148279 - bpoint - Fri Jan 04, 2008 8:40 am

wintermute wrote:
That would suggest to me that you're trying to mix iwram & rom/ewram code in the same source file. Unfortunately gcc assumes that all functions within the same TU can be reached with a relative branch so you need to separate the code.


I'm not. What I am trying to do is call a function from a class that is located in iwram to a function in a derived class located in ROM. They are both in separate source files, and both functions have the long_call attribute specified.

I've found if I build with -mlong-calls for both ARM and Thumb, the code actually builds properly -- actually, the call from ARM no longer tries to jump to the interworking code, but just does an ldr/bx and gets the job done. However, this changes pretty much everything to a long call and I really don't like increasing the compiled code size when it doesn't need to be.
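
The ldr/bx sequence is an indirect call: the full 32-bit address is loaded into a register, so the distance to the target no longer matters. A per-call-site approximation in plain C++, offered here as an assumption and not something tested in this thread, is to call through a function pointer. The names below are hypothetical, and the exact instructions emitted depend on the compiler version and options.

```cpp
#include <cassert>

// Hypothetical stand-in for a function placed in a distant section.
static int g_calls = 0;
void far_target() { ++g_calls; }

// Calling through a volatile pointer keeps the compiler from turning the
// call back into a direct, PC-relative branch. The address is loaded at
// run time and the call is made indirectly.
void call_far_target() {
    void (*volatile fn)() = far_target;
    fn();
}

// Host-side check that the indirect call reaches the target exactly once.
int run_far_call_example() {
    g_calls = 0;
    call_far_target();
    assert(g_calls == 1);
    return g_calls;
}
```

This only affects the call sites that use the pointer, so it avoids the code size cost of building everything with -mlong-calls, at the price of being easy to forget.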

I will see if I can put together a test case, though. Shouldn't be too hard...

#148487 - bpoint - Sun Jan 06, 2008 10:24 am

Sorry for the late reply -- finally getting settled down after the New Year break.

Anyway, I've finally put together a stripped-down test case which shows the interworking glue jumping to an invalid address (OAM) when it should be jumping into iwram. Below are the files:

base.h
Code:
class Base
{
public:
   Base();
   
   void   doit();

private:
   int      x;
};

base.cpp
Code:
#include "base.h"

Base::Base() : x(0)
{
}

void __attribute__ ((long_call)) Base::doit()
{
   x = 5;
}

derived.h
Code:
#include "base.h"

class Derived : public Base
{
public:
   Derived();

   void   func();
};

derived.cpp
Code:
#include "derived.h"

Derived::Derived() : Base()
{
}

void __attribute__ ((section (".iwram"), long_call)) Derived::func()
{
   doit();
}

main.cpp
Code:
#include "derived.h"

int main()
{
   Derived d;

   d.func();
   return 0;
}

build.bat
Code:
c:\Devel\devkitARM\bin\arm-eabi-gcc -DGBA -mcpu=arm7tdmi -mtune=arm7tdmi -ffunction-sections -fdata-sections -fno-exceptions -fno-rtti -fno-threadsafe-statics -mthumb -mthumb-interwork -g -O0 -c -o base.o base.cpp
c:\Devel\devkitARM\bin\arm-eabi-gcc -DGBA -mcpu=arm7tdmi -mtune=arm7tdmi -ffunction-sections -fdata-sections -fno-exceptions -fno-rtti -fno-threadsafe-statics -marm -mthumb-interwork -g -O0 -c -o derived.o derived.cpp
c:\Devel\devkitARM\bin\arm-eabi-gcc -DGBA -mcpu=arm7tdmi -mtune=arm7tdmi -ffunction-sections -fdata-sections -fno-exceptions -fno-rtti -fno-threadsafe-statics -mthumb -mthumb-interwork -g -O0 -c -o main.o main.cpp
c:\Devel\devkitARM\bin\arm-eabi-gcc -Wl,-Map,out.map -Wl,-gc-sections -specs=gba.specs base.o derived.o main.o -o out.elf


The disassembly of main() is pretty much straightforward:
Code:
(3):    int main()
0800044C: B580    push    {r7, lr}
0800044E: B082    add     sp, #-0x008
08000450: AF00    add     r7, sp, #0x000
(4):    {
(5):        Derived d;
08000452: 1D3B    add     r3, r7, #0x4
08000454: 1C18    add     r0, r3, #0x0
08000456: F000    bl       
08000458: F811    bl      ___ZN7DerivedC1Ev_from_thumb
(6):   
(7):        d.func();
0800045A: 1D3B    add     r3, r7, #0x4
0800045C: 1C18    add     r0, r3, #0x0
0800045E: F000    bl       
08000460: F811    bl      ___ZN7Derived4funcEv_from_thumb
(8):        return 0;
08000462: 2300    mov     r3, #0x00
(9):    }


Now when main() calls d.func(), things start to get interesting:
Code:
___ZN7Derived4funcEv_from_thumb:
08000484: 4778    bx      pc

___ZN7Derived4funcEv_change_to_arm:
08000488: EABFFEDC    b       0x07000000


...and the CPU goes off into OAM memory instead of iwram. Looking at the generated map file shows that Derived::func() has been properly placed into iwram, however:
Code:
 .iwram         0x03000000       0x38 derived.o
                0x03000000                Derived::func()


So, any ideas on what I might be doing wrong here?

#148500 - Cearn - Sun Jan 06, 2008 3:55 pm

Each source file is a separate entity. As far as main.cpp is concerned, Derived looks like this:
Code:
#include "base.h"

class Derived : public Base
{
public:
   Derived();

   void   func();
};

In other words, it doesn't know it's supposed to do a long call. So it doesn't. Adding the __attribute__ stuff to the declarations in the class definitions will fix this.
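
A minimal sketch of that fix, assuming the test case above: the attribute moves onto the declarations that callers in other source files actually see. The macro names are hypothetical, and they expand to nothing on non-ARM hosts so the example can also be built and run on a PC. Everything is shown in one file for brevity; in the real project the classes stay in base.h and derived.h and the function bodies in their .cpp files.

```cpp
#include <cassert>

// Hypothetical helper macros; the ARM-specific attributes are only
// applied when building for an ARM target.
#if defined(__arm__)
#  define MY_LONG_CALL   __attribute__((long_call))
#  define MY_IWRAM_FUNC  __attribute__((section(".iwram"), long_call))
#else
#  define MY_LONG_CALL
#  define MY_IWRAM_FUNC
#endif

class Base {
public:
    Base() : x(0) {}
    void MY_LONG_CALL doit();      // callers now know a long call is needed
    int value() const { return x; }
private:
    int x;
};

class Derived : public Base {
public:
    Derived() : Base() {}
    void MY_LONG_CALL func();      // visible to main.cpp through the header
};

// Definitions: the section placement stays on the definition only.
void MY_LONG_CALL Base::doit() { x = 5; }
void MY_IWRAM_FUNC Derived::func() { doit(); }

// Host-side check of the call chain main -> func -> doit.
int run_example() {
    Derived d;
    d.func();
    assert(d.value() == 5);
    return d.value();
}
```

On a host build this only verifies the program logic. Whether the generated veneer disappears has to be confirmed in the ARM disassembly and the map file.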

#148556 - bpoint - Mon Jan 07, 2008 4:39 am

Cearn wrote:
Adding the __attribute__ stuff to the declarations in the class definitions will fix this.


I'll admit, I didn't consider this to be the problem, but it was.

When I went back to my original code and added the long_call attribute to the declarations in the header file, gcc started giving me compilation errors because I was passing a function with the long_call attribute to a parameter whose function pointer type lacked the attribute. After fixing the underlying code, everything works properly now. Also, it seems the section attribute isn't necessary in the header file -- just long_call.

<rant>It would have been nice if gcc had given me some kind of compilation error to begin with saying that the function declaration and definition didn't match in the first place.</rant>

Anyway, thanks for all the help everyone! I'm glad I finally got this problem sorted out.

#148568 - bpoint - Mon Jan 07, 2008 7:26 am

One other interesting thing I came across, in case anyone else runs into this problem. It seems gcc only allows one __attribute__ tag per function definition. If you need to specify multiple attributes, they must be separated with a comma. In other words, this doesn't work:

Code:
void __attribute__ ((section (".iwram"))) __attribute__ ((long_call)) func()
{
}

...while this does:

Code:
void __attribute__ ((section (".iwram"), long_call)) func()
{
}

I was trying to use preprocessor macros to separate iwram section declarations and long_call attributes, and separately combining them didn't work. gcc simply ignores the second attribute definition, and does not give any kind of warning or error.

Edit: Sorry, I should have tested a bit more before posting. The above is actually not true -- you can use multiple __attribute__ tags on a function definition. And there isn't any BBCode for strikeout text! :)
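
In line with the correction above, separate macros that each expand to their own __attribute__ specifier can be stacked on one function. This is a sketch with hypothetical macro names. On a non-ARM GCC or Clang host the macros expand to two unrelated stand-in attributes (noinline and used), an assumption made only so the two-specifier syntax is still exercised when built on a PC.

```cpp
#include <cassert>

#if defined(__arm__)
#  define ATTR_IWRAM      __attribute__((section(".iwram")))
#  define ATTR_LONG_CALL  __attribute__((long_call))
#  define ATTR_BOTH       __attribute__((section(".iwram"), long_call))
#elif defined(__GNUC__)
// Host stand-ins: different attributes, same syntax.
#  define ATTR_IWRAM      __attribute__((noinline))
#  define ATTR_LONG_CALL  __attribute__((used))
#  define ATTR_BOTH       __attribute__((noinline, used))
#else
#  define ATTR_IWRAM
#  define ATTR_LONG_CALL
#  define ATTR_BOTH
#endif

// Two separate specifiers, as produced by two independent macros.
int ATTR_IWRAM ATTR_LONG_CALL func_separate() { return 1; }

// One specifier with a comma-separated attribute list.
int ATTR_BOTH func_combined() { return 1; }

// Both forms compile and behave the same from the caller's side.
int run_attribute_example() {
    assert(func_separate() == func_combined());
    return func_separate() + func_combined();
}
```

Keeping the section and long_call macros separate also fits the earlier finding that only long_call is needed on the declaration in the header.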