gbadev.org forum archive

BEWARE: I'm a complete noob to assembling/assembly/GBA development

I would like to learn how to program the GBA with complete control over the hardware. No automatic boot code, no stack dictating where I store my memory, no functions I didn't write, etc. I know I need arm assembly, and I know some of the basic mnemonics (and have the resources to learn the rest - that isn't the issue).

What I'd like is a link to something that can teach me the necessary code for booting up the machine, for hardware and software interrupts, and where memory is being used and where it isn't. Does anyone know how to program the GBA completely from scratch?

Also, could anyone devise a makefile to assemble such ASM programs properly with DevKitAdv, without introducing any foreign code? (please excuse the noobish question)

Thanks in advance...

Look at this example ;)
_________________
My homepage!

Curious - why do you want complete control over said system? It's hard enough developing for it with the little gems created here and there...

Because it's fun. :) I did it myself, then threw out the result and used prefab stuff. The hardest (and most boring) part is learning GCC specifics, and in retrospect I would have been better off writing my own assembler, from an educational standpoint.

To see how devkitpro sets up the GBA, look at devkitarm/arm-eabi/lib/gba_crt0.s
It is specific to devkitpro, so to see one for a different compiler, look at boot.s which comes with PocketNES.

What gba_crt0.s does:
* Contain the required GBA header
* Disable interrupts
* Initialize the stacks (IRQ and USER)
* If program should run from RAM, copy it from the cartridge to RAM.
* Copy the data sections from ROM to RAM, both IWRAM and EWRAM.
* Clear the rest of ram
* Some stuff specific to devkitpro, like setting up the heap, and initializations for libc.
* Call "main"

And that's it. Nothing else, just your usual required startup code.
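
The copy and clear steps in the list above can be sketched in C. This is only an illustration that runs on a PC, not the code in gba_crt0.s (which is assembly); the helper names copy_section and clear_section are made up for this example, and on real hardware the source and destination would come from linker-provided symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy an initialized-data image word by word, as startup code does
   when it moves data sections from their load address in ROM to RAM.
   Hypothetical helper, not taken from gba_crt0.s. */
void copy_section(uint32_t *dst, const uint32_t *src, size_t words)
{
    assert(dst != NULL && src != NULL);
    while (words--) {
        *dst++ = *src++;
    }
}

/* Zero a region word by word, as startup code does when it clears
   the RAM that holds no initialized data. */
void clear_section(uint32_t *dst, size_t words)
{
    assert(dst != NULL);
    while (words--) {
        *dst++ = 0;
    }
}
```

A startup routine would call copy_section once per data section (IWRAM and EWRAM), then clear_section on whatever is left, before jumping to main.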

There is no way to bypass the GBA BIOS. Your GBA will always display "Game Boy", play the sound effect, allow select+start for multiboot mode, etc. You will need a valid ROM header and Nintendo logo. The BIOS also has its own interrupt handler, which always pushes r0-r3, r12 and lr onto the interrupt stack and then calls your interrupt handler.
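
The control flow of that BIOS dispatch can be modelled in C. This is a rough sketch for a PC, assuming the handler pointer is an ordinary variable instead of the word at 0x03007FFC; all names here (irq_vector, install_irq_handler, simulated_bios_irq) are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*irq_handler_t)(void);

/* Stand-in for the word at 0x03007FFC, where the BIOS looks up the
   user handler. On hardware this is a fixed IWRAM address. */
static irq_handler_t irq_vector = NULL;

static int irq_count = 0;   /* touched only by the example handler */

/* Example user handler: just counts how often it ran. */
void my_irq_handler(void) { irq_count++; }

void install_irq_handler(irq_handler_t handler)
{
    assert(handler != NULL);
    irq_vector = handler;
}

/* What the BIOS does on an IRQ, reduced to its control flow. */
void simulated_bios_irq(void)
{
    /* (scratch registers would be pushed on the IRQ stack here) */
    if (irq_vector != NULL) {   /* safety check for this sketch only */
        irq_vector();
    }
    /* (registers popped, then return to the interrupted code) */
}

int get_irq_count(void) { return irq_count; }
```

The point is that your own code never sees the raw exception; it only supplies the function that the BIOS calls through the vector.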

As for "where memory is being used", here are the "reserved" memory areas:
IN ROM:
* Cartridge Header at 0x08000000 (224 bytes long)
* Optional illegal instruction trap at 0x9ffc000

IN IWRAM:
* The interrupt vector at 0x03007FFC (this is the last word of IWRAM)
* 0x03007FA0-0x03007FFB is used by some BIOS functions.
* Usually the interrupt stack is at 0x03007F00-0x03007F9F, but you don't need to put it there.
* Usually the stack is at 0x03007F00 and grows downwards, but you don't need to put it there either.
Everything else is free for whatever you want to use it for.

All of EWRAM and VRAM is free and you can do whatever you want with it.
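
As a sketch, the IWRAM ranges listed above can be written down as a small C lookup. The enum and function names are invented for this example, and the IRQ stack range is only the usual convention mentioned in the post, not something the hardware enforces.

```c
#include <assert.h>
#include <stdint.h>

enum iwram_area {
    AREA_FREE,        /* no fixed purpose */
    AREA_IRQ_STACK,   /* conventional location, can be moved */
    AREA_BIOS,        /* used by some BIOS functions */
    AREA_IRQ_VECTOR   /* pointer to the user interrupt handler */
};

#define IWRAM_START 0x03000000u
#define IWRAM_END   0x03007FFFu   /* last byte of the 32 KB block */

/* Classify an IWRAM address using the ranges from the post above. */
enum iwram_area classify_iwram(uint32_t addr)
{
    assert(addr >= IWRAM_START && addr <= IWRAM_END);
    if (addr >= 0x03007FFCu) return AREA_IRQ_VECTOR;
    if (addr >= 0x03007FA0u) return AREA_BIOS;
    if (addr >= 0x03007F00u) return AREA_IRQ_STACK;
    return AREA_FREE;
}
```

Anything that classifies as AREA_FREE is yours; the user stack growing down from 0x03007F00 also lands there, which is why its placement is your own choice.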
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

Thanks for the help, guys. I appreciate it. I certainly like the idea of doing my own compiler, so I might try that.

I also looked through a bit of CRT0.S, and it constantly wanted to copy things to the EWRAM or the stack. I don't know the code, what it does, what it uses in memory. I don't know if I want interrupts disabled. If anyone could help me understand not only what it does, but how it does it, I'd be very grateful.

Incidentally, I'm also in the market for a programmable cartridge to put my binaries on. I had my eye on this:
SuperCard SD [state the product's name, not a retailer's URL -- MOD]
I'm wondering if it will act like a cartridge (and run from ROM), or a linker cable (and copy the code into the EWRAM). The latter could interfere with my code. Could anyone help me there?

SuperCard copies the entire program into a RAM on the adapter. This RAM fills the same function as a traditional GBA flash card.

The adapter that runs only GBA multiboot programs is the GBA Movie Player v2.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

why is there a second stack for irq?

This is a question I've wondered about myself and would like to hear a good answer for. In the past and on other platforms, one stack is fine. Why 2 on ARM?

The only reason I've come up with is to keep with the philosophy that all registers can be used as general-purpose if you want, including the stack pointer. So if you have the user mode r13 storing your loop counter or whatever, an interrupt can still safely occur.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku

DekuTree64 wrote:

So if you have the user mode r13 storing your loop counter or whatever

Then where would you store the stack pointer during the loop?

tepples wrote:

Then where would you store the stack pointer during the loop?

Store it in a global variable, do your loop, and then load it back. And if you're executing from RAM, you can save a couple more cycles by putting that global variable near the loop so you can use PC-relative load/store.

Each of the different processor modes has its own set of "banked" registers, and for interrupts the link register and stack pointer are the ones banked. There are actually 37 registers, and some of these other registers "become" standard registers based on the processor mode.

Some of the modes:

Code:

User and System
Fast Interrupt Request
Interrupt Request
Supervisor
Undefined
Abort

Some of the ARM models may not have all of these modes but the basic architecture is this way. My paraphrase may not be completely accurate but this is from the ARM System Developer's Guide ISBN 1-55860-874-5. I use this book frequently because it covers the entire ARM core set.
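
The figure of 37 can be checked by adding up the banks. The table below is a sketch of the classic ARM register model (as on the ARM7TDMI in the GBA): a base set of 16, private copies per exception mode, one CPSR and one SPSR per exception mode. The struct and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

struct mode_bank {
    const char *name;
    int private_gprs;   /* general registers owned by this mode */
    int has_spsr;       /* 1 if the mode has a saved status register */
};

static const struct mode_bank modes[] = {
    { "User/System", 16, 0 },   /* r0-r15, the base set */
    { "FIQ",          7, 1 },   /* r8_fiq-r14_fiq */
    { "IRQ",          2, 1 },   /* r13_irq (sp), r14_irq (lr) */
    { "Supervisor",   2, 1 },
    { "Abort",        2, 1 },
    { "Undefined",    2, 1 },
};

#define MODE_COUNT (sizeof modes / sizeof modes[0])

int private_gprs_of(size_t mode_index)
{
    assert(mode_index < MODE_COUNT);
    return modes[mode_index].private_gprs;
}

/* Sum every bank plus the single CPSR. */
int total_registers(void)
{
    int total = 1;   /* CPSR */
    for (size_t i = 0; i < MODE_COUNT; i++) {
        total += modes[i].private_gprs + modes[i].has_spsr;
    }
    return total;
}
```

The IRQ row is the one that matters for this thread: because r13 is private to IRQ mode, entering an interrupt switches stacks without touching whatever user mode keeps in its own r13.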

DekuTree64 wrote:

tepples wrote:

Then where would you store the stack pointer during the loop?

Store it in a global variable, do your loop, and then load it back.

And lose reentrancy, meaning you can't safely call such a routine from within an interrupt handler.

If you use two stacks then you can be sure that when an interrupt occurs there's no chance of trashing the regular user stack.
If you build with (I think it's) -fomit-frame-pointer then leaf functions don't properly follow the ABI (in order to save a few instructions) and the stack is then handled slightly differently.
If you happened to have an interrupt in one of these functions you can't be sure where the actual bottom of the stack is. By having two stacks you never have to deal with this sort of problem.

Does this make any sense? :-)

tepples wrote:

DekuTree64 wrote:

tepples wrote:

Then where would you store the stack pointer during the loop?

Store it in a global variable, do your loop, and then load it back.

And lose reentrancy

Then reserve a larger area for that global, and treat it like a stack itself.

Simon has a good point too, that it does allow you to break the rule of "anything below the stack pointer is unpredictable". Still, there must be some reason for it beyond allowing you to get away with dirty hacks.
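
The reentrancy problem and the "treat it like a stack" fix can be simulated in C. This is a sketch only: reg_sp stands in for r13, a nested call stands in for an interrupt that re-enters the routine, and every name here is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t reg_sp = 0x03007F00u;   /* simulated r13 */

/* Variant 1: a single global save slot. Not reentrant. */
static uint32_t saved_sp_slot;

/* Variant 2: a small save area used like a stack. */
#define SAVE_DEPTH 4
static uint32_t save_area[SAVE_DEPTH];
static int save_top = 0;

uint32_t get_sp(void) { return reg_sp; }
void set_sp(uint32_t value) { reg_sp = value; }

void borrow_sp_single(int nest)
{
    saved_sp_slot = reg_sp;            /* a nested entry overwrites this */
    reg_sp = 1000u + (uint32_t)nest;   /* r13 now holds a loop counter */
    if (nest > 0) {
        borrow_sp_single(nest - 1);    /* stands in for re-entry from an IRQ */
    }
    reg_sp = saved_sp_slot;            /* outer call restores the wrong value */
}

void borrow_sp_stacked(int nest)
{
    assert(save_top < SAVE_DEPTH);
    save_area[save_top++] = reg_sp;    /* each entry gets its own slot */
    reg_sp = 1000u + (uint32_t)nest;
    if (nest > 0) {
        borrow_sp_stacked(nest - 1);
    }
    reg_sp = save_area[--save_top];
}
```

With one level of nesting, the single-slot version comes back with the inner call's counter in r13 instead of the real stack pointer, while the stacked version restores it. On real hardware the push and pop of the save area would also have to be safe against an interrupt arriving in the middle.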

I guess also on other architectures the interrupts are handled in supervisor mode, which has its own memory space. So I'm sure there is more than one stack.

And this is probably overkill: interrupt handlers can often deal with 'sensitive' data and if there was only one stack then it'd have to be cleared off first before returning to the user program. Else the user program could look through the stack for something juicy!

simonjhall wrote:

Else the user program could look through the stack for something juicy!

That is not very plausible. How would the program know when it has been interrupted and whether or not it will be interrupted again in between reading the stack? Although interrupts can be disabled, how can it be ensured that an interrupt has occurred, and how would any data that is found be interpreted? How can it be determined what is random, what is from a called function, and what is from an interrupt?

Conceptually two stacks operate almost like two separate CPU states. But again what is the need? What cannot be achieved by one stack that can be achieved with more? The only reason I can think of [so far] is a failsafe for errors that occur during interrupts, meaning that an incorrect return will always return to the right line of code! Exceptions can be issued on the event, but the code can still resume from the correct place.

keldon wrote:

That is not very plausible. How would the program know when it has been interrupted and whether or not it will be interrupted again in between reading the stack?

Ooh, I dunno! If I knew that the operating system shared the same stack as the user program then this'd be one of the first places I might look for some way into a system. And you're right - you wouldn't know when an interrupt occurs. So you could continually scan until something good appears - and you'd roughly know the format of the data you were looking for.

Although it's not an issue with homebrew on the DS, if the operating system were to share its stack with a user program that'd be a big hole for people to exploit. But as you say, it's not too plausible :-)

One more suggestion - by having a second stack you can be sure of the size of that stack. If you use the user stack there may not be the minimum amount of memory available for your interrupt to function. I've definitely seen this problem before on consoles.

simonjhall wrote:

Ooh, I dunno! If I knew that the operating system shared the same stack as the user program then this'd be one of the first places I might look for some way into a system. And you're right - you wouldn't know when an interrupt occurs. So you could continually scan until something good appears - and you'd roughly know the format of the data you were looking for.

But you can never be sure that you are seeing data and not random numbers or something that is irrelevant. You would also have to know where in the stack to find your data. Was it the first parameter? Second parameter? If it was the second parameter, how many bytes were the previous parameters?

You're right: that's a good point. Although I was really just thinking about strings of text, like char arrays.
Aaaaanyway....

Ok, for OS security I can definitely see a use for 2 stacks. Let the user stack go to hell or whatever, but the supervisor stack is secure and can recover. Not an issue on GBA/DS, but ARM procs are used elsewhere as well :)

regarding the initial post, asking for 'complete control'. I'd rather have it in C.
Soo...
Is there any good tutorial on how to put data and functions in different memory areas, how to put them in specific spots in memory, how to load them there temporarily (does 'sizeof' work on functions?) and how to keep memory areas clear so that I as a programmer can mess with them as I want without corrupting data that was put there by gcc? I remember reading some information regarding __attribute__, but all this information seems really scattered about.

A few ways. The easiest is to physically create a pointer to memory and "copy" the function to that area. You can also create a memalloc system and do it that way. Another way (probably the correct way) is to use the link script to create sections and tell the compiler via the section attrib where to place the code.

well, and how would one go about that...
why is it called 'link' script, anyway? I thought it's run before calling main, and should be a 'load' script?

Ant6n wrote:

well, and how would one go about that...
why is it called 'link' script, anyway? I thought it's run before calling main, and should be a 'load' script?

The link script is used as part of the build process; it's never "run" on target. It's called the link script because it informs the linker of things it needs to know, such as what memory sections are in what physical address locations. So the link script knows that section .ewram starts at 0x02000000 and is 256Kbytes long, section .iwram starts at 0x03000000 and is 32Kbytes long, etc.

(crt0 is probably what you were thinking of; it generally contains the setup code that is run before main.)

The only advice I can give for attempting to set up custom memory sections (other than reading the GNU man pages) is to examine the link script and try to understand how it works. It's not impossible to do, although there is some serious voodoo going on in there if you are unfamiliar with the process.

Dan.
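
To make the "regions with an origin and a length" idea concrete, here is a C model of the bookkeeping the linker does with that information. It is not linker script syntax and not how GNU ld is implemented; it is a sketch using the two regions quoted above, with invented names (region_of, place).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region {
    const char *name;
    uint32_t    origin;
    uint32_t    length;
};

/* The two RAM regions described in the post above. */
static const struct region regions[] = {
    { "ewram", 0x02000000u, 256u * 1024u },
    { "iwram", 0x03000000u,  32u * 1024u },
};

#define REGION_COUNT (sizeof regions / sizeof regions[0])

/* Name of the region containing addr, or NULL if it is in neither. */
const char *region_of(uint32_t addr)
{
    for (size_t i = 0; i < REGION_COUNT; i++) {
        /* unsigned wrap makes addresses below origin fail the test */
        if (addr - regions[i].origin < regions[i].length) {
            return regions[i].name;
        }
    }
    return NULL;
}

/* Hand out the next free address in a region, the way a linker
   advances its location counter; returns 0 if the object won't fit. */
uint32_t place(size_t region_index, uint32_t *used, uint32_t size)
{
    assert(region_index < REGION_COUNT);
    const struct region *r = &regions[region_index];
    if (size > r->length - *used) {
        return 0;
    }
    uint32_t addr = r->origin + *used;
    *used += size;
    return addr;
}
```

All of this happens on the build machine; the addresses it produces are simply baked into the binary, and crt0 is what later copies things to them at run time.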

One thought though; how does ASM give you more control if you do not have much control over your ASM?

is there no simple way to reserve memory areas in gcc?
something like
'reserve 0x02000000 - 0x02000100 for user junk'
how would one move a function during runtime?

Ant6n wrote:

is there no simple way to reserve memory areas in gcc?
something like
'reserve 0x02000000 - 0x02000100 for user junk'

Part of the whole idea behind sections is to avoid the code having to know the implementation specifics of where a region is located.

That said, when it comes to EWRAM there's nothing to prevent you from "reserving" that section of memory so long as it isn't being used for some overlapping purpose. That would include: compiling in multiboot, dynamic allocation with malloc/free, and anything you declare as being in section .ewram. If you aren't using any of those features, then feel free to grab a pointer to 0x02000000 and do what you will with it.

If you want to be compatible with GCC's comprehension of the GBA's memory regions, though, you're pretty much consigned to modifying the link script to add a section.
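
A minimal sketch of the "grab a pointer" approach, written so it also compiles on a PC. GBA_TARGET is a hypothetical build flag invented for this example; on the host an ordinary array stands in for the first 0x100 bytes of EWRAM. It only makes sense under the conditions above: no multiboot, no malloc/free, nothing else placed in .ewram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCRATCH_SIZE 0x100u   /* the range asked about: 0x02000000-0x020000FF */

#if defined(__arm__) && defined(GBA_TARGET)   /* GBA_TARGET: hypothetical flag */
#define SCRATCH_BASE ((volatile uint8_t *)0x02000000)
#else
static volatile uint8_t host_scratch[SCRATCH_SIZE];   /* PC stand-in */
#define SCRATCH_BASE (host_scratch)
#endif

/* Bounds-checked accessors for the privately "reserved" bytes. */
void scratch_write(size_t offset, uint8_t value)
{
    assert(offset < SCRATCH_SIZE);
    SCRATCH_BASE[offset] = value;
}

uint8_t scratch_read(size_t offset)
{
    assert(offset < SCRATCH_SIZE);
    return SCRATCH_BASE[offset];
}
```

Nothing tells the linker about this range, so the reservation exists only by agreement with yourself; that is the trade-off against adding a real section to the link script.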

Quote:

how would one move a function during runtime?

Grab a pointer to the function, and copy from the pointer to your destination. Knowing the length of the function is a bit tricky if the function isn't in assembly, though. One technique is to declare a function immediately following your function that marks the end of it, although you have to be careful that function doesn't get optimized out. In sum: don't do it unless you absolutely have to.

Dan.

poslundc wrote:

Quote:

how would one move a function during runtime?

Grab a pointer to the function, and copy from the pointer to your destination. Knowing the length of the function is a bit tricky if the function isn't in assembly, though. One technique is to declare a function immediately following your function that marks the end of it, although you have to be careful that function doesn't get optimized out. In sum: don't do it unless you absolutely have to.

Even if the dummy function doesn't get optimised out, there's no guarantee it'll be placed immediately after the desired function. The way I do it is get gcc to output the memory map, use that to calculate the length of the function, then hardcode that into the program as a constant. Even if you get the function moved correctly, you'll still have trouble with calls to other functions and using data in the literal pool. It is probably easiest to compile it as -fPIC.
_________________
http://chishm.drunkencoders.com
http://dldi.drunkencoders.com
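
The map-file method amounts to a subtraction, with two details worth spelling out. This sketch assumes you have read the start address of the function and of the next symbol from the map; the function names are invented, and any concrete addresses you feed in are your own values, not anything from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Address where the code really starts: a pointer to a Thumb function
   has bit 0 set to mark the instruction set, so clear it before
   using the pointer as a copy source. */
uint32_t code_address(uint32_t func_ptr)
{
    return func_ptr & ~1u;
}

/* Length in bytes between two addresses, rounded up to a whole
   32-bit word so that a word-wise copy covers the entire function. */
uint32_t copy_length(uint32_t start, uint32_t next_symbol)
{
    uint32_t s = code_address(start);
    uint32_t e = code_address(next_symbol);
    assert(e > s);
    return (e - s + 3u) & ~3u;
}
```

The result is the constant you would hardcode. As noted above, a correct length does not by itself make the copy work: calls to other functions and literal pool references still have to be valid at the new address.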

Asserts will catch a change in function length if one uses constants for size - but it's sooo much easier to just declare a function to live in a certain section and let crt0.s do the copy for you on bootup.

Code:

int foo(void) __attribute__ ((section (".iwram")));
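
A slightly fuller sketch of the same idea, wrapped in a macro so the file still builds on a PC. IN_IWRAM and GBA_TARGET are names invented for this example; the long_call attribute is added on the assumption that calls from ROM into IWRAM are too far for a plain branch, and the ".iwram" section name only means something if the link script and crt0 handle it.

```c
#include <assert.h>

#if defined(__arm__) && defined(GBA_TARGET)   /* GBA_TARGET: hypothetical flag */
#define IN_IWRAM __attribute__((section(".iwram"), long_call))
#else
#define IN_IWRAM   /* on a PC the function is just an ordinary function */
#endif

/* The attribute goes on the declaration, as in the line above. */
int sum_to(int n) IN_IWRAM;

/* Sum of 1..n; a stand-in for a hot routine worth keeping in fast RAM. */
int sum_to(int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Callers use sum_to like any other function; where it lives is decided entirely at link time.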

What if you only want to have a function around in RAM while it is executing, such as if you want to load GSM or S3M or JPEG or video decoders at various times? Are there any code examples involving the support for overlays in devkitARM's GBA link script?

it seems at the end of the day, when one wants functions temporarily in ram (i.e. on the stack) one could always compile with -O0 -S and count how long the function is in assembler. then insert some padding (i.e. 2x) and that can be copied, because it doesn't matter if one copies too much (as far as i know). I do believe it could make sense at times to have functions use up precious fast memory only temporarily (i.e. for big library functions).

I, too, am interested in having "complete control" over the DS using ASM. I'd love to code everything up from scratch (not that I'd do anything serious with it; it'd be purely for educational purposes). What's the best way of going about it? If I have a .s file (or some .s files), how do I link them all and drop them on the DS, without anything else?

keldon wrote:

One thought though; how does ASM give you more control if you do not have much control over your ASM?

.s files can be assembled and linked with gcc just like .c files. So it shouldn't make much of a difference whether you use c or asm.

Point is, dkA and libnds still add things behind the scene. Can I write the whole shebang, from startup to power off, in assembler?

nornagon wrote:

Point is, dkA and libnds still add things behind the scene. Can I write the whole shebang, from startup to power off, in assembler?

If you are capable of coding this in complete assembler then it should not be too difficult to write your own assembler that can deal with this.

Sure; but if there's already one that can do it, why?

Because you can make yourself something better for you that can help you to write better, more manageable code. Take a look at rosasm for an example of what an assembler can be.

It looks like a glorified IDE with NASM-like syntax to me. Am I missing something?

Check out its macros!

Man, the website is horrible...

I found some short examples of the macro syntax, but I'm not quite sure what's going on. Enlighten me?

macros on asm?
C is a bunch of asm macros.

The macro syntax is so powerful I managed to write one that allows you to reference data like in c.

Code:

d = b . c . d . e . f

Code:

Proc mainWinProc:
Arguments @hWnd @uMsg @wParam @lParam
Local @MyScreenText

Those 3 lines are done via macros. So you can usually make use of just the default macros, but the versatility of them allows you to make macros as powerful as that.

The syntax has funny characters for good reason, once you get past that it would [obviously] make complete sense. The nessie NES emulator was written entirely using this IDE.

If you use what others have programmed ,
you must learn their methods .
This is slow way .
You must learn their secrets , even Linux
is hard to work with .

BUT ! If you start your own Op System ,
from scratch , you will be able to perfect it
quickly , because no dependencies on others.

Thats what im doin .
I have 10 NDS lites and many ARM EVB's

I will start with zero , no ARM loader , no
ANGEL debugger , no assembler , no CAPS
no C++ , not even Basic !
I tried to use ATMELs Sam-Ba to load
code to my Olimex ARM H256 .
It does NOT work , and ATMEL has 100 excuses.

So i'll boot external , and load my hand assembled
500 bytes of code , to boot to SRAM and scan
for a 16 keypad .
The 500 bytes will be able to create a dictionary .
If i test a small asm code , and it works ,
i'll save it to dictionary . Now i can enter
a hex code to execute that code fragment !

This is called software "leverage" .
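
The dictionary idea, a table of tested code fragments that a key code selects and runs, can be sketched in C with function pointers. This is an illustration on a PC, not the hand-assembled loader described here; the fragment functions and the 16-entry size (one per key) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*fragment_t)(int32_t);

#define DICT_SIZE 16   /* one entry per key on a 16-key pad */

static fragment_t dictionary[DICT_SIZE];

/* Two trivial "tested fragments" to store in the dictionary. */
int32_t frag_double(int32_t x) { return x * 2; }
int32_t frag_negate(int32_t x) { return -x; }

/* Save a working fragment under a key code. */
void dict_store(unsigned code, fragment_t frag)
{
    assert(code < DICT_SIZE && frag != NULL);
    dictionary[code] = frag;
}

/* Run the fragment stored under a key code; an empty slot leaves
   the value unchanged. */
int32_t dict_run(unsigned code, int32_t arg)
{
    assert(code < DICT_SIZE);
    return dictionary[code] ? dictionary[code](arg) : arg;
}
```

Chaining calls to dict_run is the "leverage": once a fragment is stored, later fragments can be built by combining key codes instead of retyping machine code.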

I'll hook 32 LEDs to a port .
Or maybe a BW LCD .

There is no faster nor more accurate method
to learn op codes . Studying assemblers , will take
longer and you will not understand op codes
as fast because assemblers make you wait til
it's linked and then you load the debugger
and then , and then , and then , and then ....

My method crashes and you look at the
12 op codes you tested and guess at one
and try again . There is no better way
to learn Machine language !
Naturally you will start to learn the
easiest codes , first , so you can use
them to leverage the others .

Now you create 64 bit images for the stuff
in the Dictionary , so you can assemble ML
w/o ASCII text . Your keypad is only 16 keys
it can't do ASCII text .
You can create a simple MRU , MostRecentlyUsed
stack , to display your Images on the LCD .

Here is where many make the mistake to
think they must copy C++ or Basic methods .

In only 8 KB , you have an Op System , that
can create ML in minutes , without typing text at a
QWERTY keyboard .

It is missing some OpCodes , but no problem ,
and the dictionary is expanding into the 64KB SRAM
in my Olimex AT91SAM7S256 ARM7 EVB .

When you perfect the lower part of the dictionary
you can flash it into the 256KB of NOR flash .
I will not be copying FAT32 nor NDS OpSys
everything will be new .

More on Dictionary ...
When you have 100 "primitives" that help you
to code ML , next create "mid levels" , that can
"Macro" the Primitives together , then create an
image so you can run it .
Now you have more leverage .

You dont need to study IEEE F.P. , because
math , in cpu's is separate EXP and Mantissa .
We call it scaling and normalizing .
Your numbers , will be in a "context" , and
NOT in a standard form , as C++ forces .
So each set of numbers will have its own
std , depending on where they are used .

I.E. , you are encoding MPG , and need fast
Cosines . Experiment with normalizing to LOG .
Now you can divide numbers by simply subtracting
them ! And the LOG looks a bit like an exponent !

No matter how much you pay your school , they
will NOT teach you this kind of programming!

My OpSystem will use NO text , only Images .

I think everyone who likes to program ARM
should buy the $48 Olimex "H256" from
[URL removed by MOD]
It has USB "B" connector to start talking to
PC host , using Hyperterminal . No power
supply needed , it uses USB power .
I have 3 . Im hooking them to LCD's i
got for $10 .

I will test code , then move it to NDS Lites .
Im not a gamer , so i'll completely replace
the Op System , to handle a 120 GB HDD .
But NOT FAT32 ! It will be my own protocol .

This looks like something I used to write in a journal right after I woke up from a dream with a 'great idea'. Good luck you rebel you.

ASM > Complete Control with ARM ASM

#110428 - alwbsok - Tue Nov 28, 2006 4:08 pm

#110437 - Legolas - Tue Nov 28, 2006 5:15 pm

#110485 - Miked0801 - Tue Nov 28, 2006 9:11 pm

#110490 - sajiimori - Tue Nov 28, 2006 9:36 pm

#110493 - Dwedit - Tue Nov 28, 2006 10:21 pm

#110537 - alwbsok - Wed Nov 29, 2006 1:58 am

#110557 - tepples - Wed Nov 29, 2006 5:54 am

#110559 - Ant6n - Wed Nov 29, 2006 6:29 am

#110604 - Miked0801 - Wed Nov 29, 2006 9:16 pm

#110620 - DekuTree64 - Wed Nov 29, 2006 11:28 pm

#110621 - tepples - Wed Nov 29, 2006 11:36 pm

#110622 - DekuTree64 - Wed Nov 29, 2006 11:43 pm

#110629 - gmiller - Thu Nov 30, 2006 12:19 am

#110635 - tepples - Thu Nov 30, 2006 12:33 am

#110638 - simonjhall - Thu Nov 30, 2006 12:47 am

#110642 - DekuTree64 - Thu Nov 30, 2006 1:15 am

#110645 - simonjhall - Thu Nov 30, 2006 1:23 am

#110651 - keldon - Thu Nov 30, 2006 3:21 am

#110669 - simonjhall - Thu Nov 30, 2006 8:46 am

#110674 - keldon - Thu Nov 30, 2006 9:43 am

#110678 - simonjhall - Thu Nov 30, 2006 10:17 am

#110741 - Miked0801 - Fri Dec 01, 2006 2:44 am

#110748 - Ant6n - Fri Dec 01, 2006 6:16 am

#110840 - Miked0801 - Sat Dec 02, 2006 3:01 am

#110856 - Ant6n - Sat Dec 02, 2006 7:10 am

#110858 - poslundc - Sat Dec 02, 2006 7:55 am

#111231 - keldon - Tue Dec 05, 2006 12:59 am

#111242 - Ant6n - Tue Dec 05, 2006 4:20 am

#111253 - poslundc - Tue Dec 05, 2006 8:22 am

#111262 - chishm - Tue Dec 05, 2006 10:20 am

#111353 - Miked0801 - Tue Dec 05, 2006 7:51 pm

#111392 - tepples - Wed Dec 06, 2006 4:45 am

#111398 - Ant6n - Wed Dec 06, 2006 6:09 am

#112525 - nornagon - Sun Dec 17, 2006 2:40 pm

#112529 - keldon - Sun Dec 17, 2006 4:18 pm

#112537 - Ant6n - Sun Dec 17, 2006 7:33 pm

#112550 - nornagon - Sun Dec 17, 2006 9:54 pm

#112552 - keldon - Sun Dec 17, 2006 9:59 pm

#112554 - nornagon - Sun Dec 17, 2006 10:00 pm

#112557 - keldon - Sun Dec 17, 2006 10:06 pm

#112560 - nornagon - Sun Dec 17, 2006 10:19 pm

#112570 - keldon - Sun Dec 17, 2006 11:31 pm

#112581 - nornagon - Mon Dec 18, 2006 5:03 am

#112582 - Ant6n - Mon Dec 18, 2006 5:54 am

#112591 - keldon - Mon Dec 18, 2006 11:04 am

#119555 - werty - Sat Feb 24, 2007 7:53 am

#119612 - Miked0801 - Sat Feb 24, 2007 7:07 pm