#177757 - Bregalad - Sun Feb 24, 2013 10:57 pm
I use Devkit pro,
how do I know when my program is run from ROM, and when it is run from IWRAM or EWRAM ?
Most game carts run their program (or at least a part of it) from ROM directly, as if you remove the cartridge, the game immediately crashes.
I tried this with my program and it continues to run like normal, which means it runs either from IWRAM or EWRAM, even though I never used any function to copy anything there, let alone execute it.
Is there a way to decide the ROM, IWRAM and EWRAM management with devkitpro ?
Normally I think it should be the following :
- ROM : Most of the code
- EWRAM : Parts of the code that should be faster, or code for multiplayer games where a cartridge is not required
- IWRAM : Data or parts of the code which should be the fastest
Did I get it right ?
#177758 - Dwedit - Mon Feb 25, 2013 3:17 am
By default, devkitpro generates either a cartridge version or a "multiboot" version that runs from EWRAM. If the filename ends with _mb.gba, it's the multiboot version. You can see which build it's creating by whether it says "Linking Cartridge" or "Linking Multiboot".
Where things go in memory is determined by the link scripts.
There are several sections:
.text
.data
.rodata
.bss
.sbss
.text is for most code, it will go into ROM by default. .data is for predefined global variables, this is put into IWRAM by default. .rodata is for data declared as const, and is put in ROM by default. .bss is for zero-filled variables or arrays, and is put in IWRAM by default. .sbss is for zero-filled variables that have been explicitly declared for that section, and they go into EWRAM by default.
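As a rough illustration of those defaults, here is a small sketch that marks where each kind of declaration would end up. The names (palette, lives, score, level_buffer, add_score) and the BUILD_FOR_GBA guard are made up for this example. The guard only exists so the same file also compiles on a PC, where the section attribute is simply dropped.

```c
#include <stdint.h>

/* BUILD_FOR_GBA is a hypothetical macro for this sketch: define it in the
   makefile when building with devkitARM. Without it the attribute is
   dropped so the file still compiles on a PC. */
#ifdef BUILD_FOR_GBA
#define EWRAM_BSS __attribute__((section(".sbss")))
#else
#define EWRAM_BSS
#endif

/* const data -> .rodata, stays in ROM */
const uint16_t palette[4] = { 0x0000, 0x001F, 0x03E0, 0x7C00 };

/* initialized global -> .data, lives in IWRAM (initial value kept in ROM) */
int lives = 3;

/* zero-filled global -> .bss, IWRAM */
int score;

/* explicitly tagged zero-filled buffer -> .sbss, EWRAM */
EWRAM_BSS uint8_t level_buffer[4096];

/* ordinary function -> .text, runs from ROM in a cartridge build */
int add_score(int points)
{
    score += points;
    return score;
}
```

In a multiboot build the ROM entries above would sit in EWRAM instead.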
The stack goes into IWRAM, and the heap for malloc-ed variables starts in EWRAM. It's located after all the variables declared to be in EWRAM.
The linkscript is the file that determines where those sections go when the binary is built. If you want to change the settings, you need to edit the linkscript.
There are also several more sections defined in the linkscript, such as .iwram and .ewram, these explicitly declare what kind of memory you want a variable to be in.
One thing that the default linkscript does NOT allow you to do is create sections inside video RAM, which is also very fast RAM just like IWRAM. I've made modified link scripts to get around this.
The code responsible for putting the correct data in its place at bootup is found in the crt0.s file. Code and data for the other memory sections gets copied from ROM into IWRAM or EWRAM, and zero-initialized data gets filled with zeroes.
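The real crt0.s is written in assembly, but the two jobs it does at bootup can be sketched in C. This is only an illustration of the idea: startup_copy and startup_clear are hypothetical names, and the actual file takes the source, destination and size of each section from symbols defined in the linkscript.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `words` 32-bit words from the load image (in ROM) to the place
   the section is meant to live at run time (IWRAM or EWRAM). */
void startup_copy(uint32_t *dst, const uint32_t *src, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
}

/* Zero-fill a .bss-style region of `words` 32-bit words. */
void startup_clear(uint32_t *dst, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        dst[i] = 0;
    }
}
```

Roughly speaking, .data, .iwram and .ewram would each get a startup_copy, while .bss and .sbss would each get a startup_clear, all before main() is called.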
In a multiboot build, stuff that would go into ROM goes into EWRAM instead. The initial copies of the data that get copied to the other memory sections stay in EWRAM, wasting space. So if you're making a multiboot game and need more memory, you can edit the linkscript and crt0 file to reclaim that space so the heap can use it.
When you want to declare that code or a variable should go into a particular section, you use a gcc __attribute__ command, like this:
__attribute__((section(".iwram"), long_call))
Or use the macros defined in gba_base.h:
Code: |
#define IWRAM_CODE __attribute__((section(".iwram"), long_call))
#define EWRAM_CODE __attribute__((section(".ewram"), long_call))
#define IWRAM_DATA __attribute__((section(".iwram")))
#define EWRAM_DATA __attribute__((section(".ewram")))
#define EWRAM_BSS __attribute__((section(".sbss")))
|
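A minimal usage sketch for the code macros follows. The function name fill_halfwords and the BUILD_FOR_GBA guard are invented for the example. On a real project you would include gba_base.h rather than repeat the define; it is repeated here only so the snippet stands alone and also compiles on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* BUILD_FOR_GBA is a hypothetical macro for this sketch; on a PC the
   attribute is dropped so the code can still be compiled and tried out. */
#ifdef BUILD_FOR_GBA
#define IWRAM_CODE __attribute__((section(".iwram"), long_call))
#else
#define IWRAM_CODE
#endif

/* Put the attribute on the declaration so that callers located in ROM
   know they must reach the function with a long call. */
IWRAM_CODE void fill_halfwords(uint16_t *dst, uint16_t value, size_t count);

/* Hot loop that we would like to run from IWRAM. */
void fill_halfwords(uint16_t *dst, uint16_t value, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = value;
    }
}
```

One thing to watch for: using the code and data macros on the same section name within a single source file can make gcc complain about a section type conflict, so it is probably safest to keep tagged code and tagged data in separate files.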
Also, EWRAM is not particularly fast. It might be slightly faster than ROM, but not by much. The gap is bigger on flash cartridges like the Supercard, which require the slowest ROM access timings.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
#177759 - Bregalad - Thu Feb 28, 2013 12:50 pm
Thanks, so basically if I'd like my program to run from ROM, I'd have to edit the link script ?
Also now my ROM is multiboot, but what about if I actually want to multiboot it ? I only have a single GBA but I should buy a second one + a link cable to test this.
I also wonder how actual multiboot games work. Supposedly there is an option in a menu of the game that allows sending the data over the link cable. I wonder if the data sent is the complete game or just a part of it, enough for the 2-player mode.
#177760 - Dwedit - Thu Feb 28, 2013 4:20 pm
You actually edit the makefile first. devkitpro's magic makefiles look at the target name to decide whether to make a multiboot build or not: if the target ends in _mb.gba, it makes a multiboot file, otherwise it makes a cartridge file.
A multiboot file will auto-detect if it was booted from a cartridge or not, and it will copy itself to EWRAM if it was booted from a cartridge.
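The auto-detection boils down to asking which memory region an address falls in, and the same check is a handy way to find out where any given function actually ended up. A sketch under that assumption follows. It is not the actual crt0 logic, and gba_region_of is a hypothetical helper; only the region boundaries come from the GBA memory map.

```c
#include <stdint.h>

typedef enum {
    REGION_EWRAM,   /* 0x02xxxxxx: 256 KB external work RAM */
    REGION_IWRAM,   /* 0x03xxxxxx: 32 KB internal work RAM  */
    REGION_VRAM,    /* 0x06xxxxxx: video RAM                */
    REGION_ROM,     /* 0x08xxxxxx-0x0Dxxxxxx: cartridge ROM */
    REGION_OTHER
} gba_region;

/* Classify a GBA address by its top byte. */
gba_region gba_region_of(uint32_t addr)
{
    switch (addr >> 24) {
    case 0x02:
        return REGION_EWRAM;
    case 0x03:
        return REGION_IWRAM;
    case 0x06:
        return REGION_VRAM;
    case 0x08: case 0x09:   /* wait state 0 mirror */
    case 0x0A: case 0x0B:   /* wait state 1 mirror */
    case 0x0C: case 0x0D:   /* wait state 2 mirror */
        return REGION_ROM;
    default:
        return REGION_OTHER;
    }
}
```

On the GBA you could pass it the address of one of your own functions, e.g. gba_region_of((uint32_t)&main), and show the result on screen.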
Editing the linkscript is rarely needed. I did it when I wanted to put code into fast VRAM.
To test a multiboot program, rename it to .mb, and run it in an emulator. Or run it on a GBA and pull out the cartridge. If it keeps running, it's multiboot.
You can also test it with some GBA-PC link cables, such as the MBV2 multiboot cable, or some cables that come with flash cartridges.
But to actually transfer it to another GBA, I'm not 100% sure how to do that. There is some code in PocketNES that does link transfers, but I haven't gotten it to work since making a ton of changes.
A commercial game will include a smaller multiboot version of a game just to send that, I don't think any full commercial cartridge games were multiboot builds themselves.
#177761 - Bregalad - Thu Feb 28, 2013 11:03 pm
Quote: |
A commercial game will include a smaller multiboot version of a game just to send that, I don't think any full commercial cartridge games were multiboot builds themselves. |
Very interesting. Apparently the smallest cartridges are 4 MB, but a multiboot build would be limited to 256 KB (the size of EWRAM), which is quite a lot smaller.
Quote: |
Editing the linkscript is rarely needed. I did it when I wanted to put code into fast VRAM. |
It sucks you have to do it to do that but I'm not there yet anyways.
Also, how can you see how the C code was compiled into assembly ? I tried adding -S to the makefile but it did not work :(
#177762 - elhobbs - Fri Mar 01, 2013 1:54 am
-S is the correct option. did you rebuild the whole project? the output files will still have .obj extension unless you change it too. did you try opening the .obj files?
#177763 - Dwedit - Fri Mar 01, 2013 4:44 am
I usually use arm-eabi-objdump to do disassemblies. If you do disassemble the ELF file, you'll get the benefit of the original symbol tables.
#177764 - Bregalad - Fri Mar 01, 2013 9:50 pm
What I meant is, adding -S in the makefile only gave me errors, and I can't see the source that way.
arm-none-eabi-objdump --disassemble -S works, although I get some weird things since the binary is disassembled, so the quality is poorer than seeing the actual output of the compiler, even with the original code included.
Anyways for what I want to do (have an approximate idea of how well the code is optimized at -O1 -O2 and -O3) it is enough for me.
Oh and, here is my GBA port of my former NES Vector wireframe graphics demo.
Not extremely interesting, but at least I pulled something out of the GBA.
#177765 - elhobbs - Fri Mar 01, 2013 10:15 pm
are you sure it is not the linker that is throwing all of the errors - because the obj files now contain asm text that the linker cannot handle. -S does not produce both an object file and asm output - it only produces asm text. since the template makefiles also use the -o option to name the files it will still have the .obj extension, but contain asm text. If you were expecting it to just spit out an additional .s file - that is not how it works.
#177766 - Bregalad - Fri Mar 01, 2013 10:34 pm
Mmh, yeah you are totally right ^_^
It's just that the (assembly) file was still called ".o" instead of ".s" so I didn't find it, forgive me for being this dumb.
EDIT 2 : I found a better way : -save-temps, now I can have a .s assembly file for curiosity AND get the thing to compile !
EDIT : since I'm asking basic questions, what am I supposed to do when I want to put the CPU in idle mode (= main thread sleeping, but still listening for interrupts) ?
All I found was :
Is this correct ?
#177767 - elhobbs - Sat Mar 02, 2013 12:30 am
do you need to wait for other interrupts or just vblank? I think you can do something with IntrWait if you need to catch other interrupts. I am not really sure how it works though.
#177768 - Bregalad - Sat Mar 02, 2013 6:15 pm
Quote: |
do you need to wait for other interrupts or just vblank? |
I don't really know, I just asked because I think if everything is done in the interrupts vector, then the main program can simply rest and this way energy can be saved, can it ?
This aside, since the GBA is a system where VRAM can be accessed freely at any time, and it has a von Neumann architecture where VRAM is on the same bus as RAM, it should be possible to execute code in VRAM, shouldn't it ?
Of course it would not be very interesting as it would be slow, but still this is a funny possibility.
Also, it's possible with devkit pro to simply name the file
.iwram.c
or
.ewram.c
in order to have it run in IWRAM and EWRAM respectively. That's a nice feature !
Also it apparently automatically compiles ARM code when put into IWRAM and thumb code when in EWRAM (apparently EWRAM is on a 16-bit bus like the cartridge, and thumb code should be used, else you'd need 2 fetches per op-code which is not already in the instruction cache, which is horrible).
EDIT : Executing code from VRAM is after all not such a crazy idea, considering it is almost as fast as IWRAM for 16-bit accesses according to nocash. For 32-bit accesses it's slower than IWRAM but still faster than EWRAM.
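To put numbers on that comparison, here is a small lookup sketch. The cycle counts are the ones commonly quoted from nocash's GBATEK for default settings. The ROM figures assume the default wait states and a non-sequential access (sequential fetches are cheaper), and VRAM can cost an extra cycle when the video hardware is using it at the same moment. The names mem_kind and fetch_cycles are made up for the example.

```c
#include <stdint.h>

typedef enum { MEM_IWRAM, MEM_VRAM, MEM_EWRAM, MEM_ROM } mem_kind;

/* Cycles for one 16-bit and one 32-bit access (default settings). */
static const struct { uint8_t c16, c32; } access_cycles[] = {
    [MEM_IWRAM] = { 1, 1 },   /* 32-bit bus                     */
    [MEM_VRAM]  = { 1, 2 },   /* 16-bit bus                     */
    [MEM_EWRAM] = { 3, 6 },   /* 16-bit bus, 2 wait states      */
    [MEM_ROM]   = { 5, 8 },   /* 16-bit bus, default wait state */
};

/* Cycles to fetch one opcode: ARM opcodes are 32 bits, thumb are 16. */
int fetch_cycles(mem_kind mem, int is_arm)
{
    return is_arm ? access_cycles[mem].c32 : access_cycles[mem].c16;
}
```

By these figures an ARM opcode fetch costs 1 cycle in IWRAM, 2 in VRAM and 6 in EWRAM, while a thumb fetch from EWRAM costs 3.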
#177769 - Dwedit - Sat Mar 02, 2013 7:56 pm
Executing code in VRAM is very fast, almost as fast as IWRAM. Takes 2 cycles per instruction fetch instead of 1, but it's much faster than ROM or EWRAM.
And when you wait for interrupts, you usually only need to wait for vblank. Other interrupts get handled by their handlers. The main thread shouldn't need to be affected by the other interrupts. It's not like you're waiting in your main thread for a certain scanline to appear or something.
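A main loop that sleeps until each vblank might look like the sketch below, assuming libgba's irqInit, irqEnable and VBlankIntrWait. The run_frames function and the BUILD_FOR_GBA guard are invented for the example, and the PC-side stand-ins are only there so the loop can be compiled and tried off-target.

```c
/* BUILD_FOR_GBA is a hypothetical macro for this sketch. */
#ifdef BUILD_FOR_GBA
#include <gba_interrupt.h>
#include <gba_systemcalls.h>
#else
/* Host-side stand-ins so the sketch compiles on a PC; they only count calls. */
enum { IRQ_VBLANK = 1 };
static int vblank_waits;
static void irqInit(void) {}
static void irqEnable(int mask) { (void)mask; }
static void VBlankIntrWait(void) { vblank_waits++; }
#endif

/* Run `frames` iterations of a main loop that sleeps until each vblank. */
int run_frames(int frames)
{
    int done = 0;

    irqInit();               /* install the interrupt dispatcher */
    irqEnable(IRQ_VBLANK);   /* without this the wait would never return */

    while (done < frames) {
        VBlankIntrWait();    /* CPU is halted here until the vblank interrupt */
        /* per-frame game logic would go here */
        done++;
    }
    return done;
}
```

A real game would loop forever instead of counting frames. Handlers for the other interrupts still run while the main thread is halted in the wait.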
#177770 - Bregalad - Sat Mar 02, 2013 8:37 pm
OK so I'm not the first one who had the idea to execute code in VRAM... it's just a matter of knowing how to do it, but I think I should experiment with more "standard" stuff first.
(The last thing I wrote was a HSL to RGB converter because the GBA uses RGB and I hate using RGB colours... makes no sense for the human brain.)
Since the ARM CPU is modern, it probably does have an instruction cache, does it ?
If so, then for example in a loop, the speed of the RAM/ROM only matters on the 1st execution of the loop, because once all the opcodes are fetched, they are kept in cache and are not fetched again.
So if you have a loop that is executed 50 times, it won't matter much in which kind of memory it is stored, because the loss of performance in the slow ROM or RAM is only made the first time, and ends up negligible.
Is this right ?
#177771 - Dwedit - Sat Mar 02, 2013 8:42 pm
GBA has no caches at all, that's why they needed the IWRAM. The NDS does have caches though, but that's on the ARM9 only.
#177772 - Bregalad - Sat Mar 02, 2013 9:14 pm
OK, so there would be a DRAMATIC (5x) performance difference between ROM and IWRAM. That's good to know !
When it comes to VRAM, is there a way to compile code in VRAM with devkit pro ? Unlike IWRAM, EWRAM and PRG-ROM it becomes really important where the code is linked, as it will change what can't be used for graphics.
What demos or games use code in VRAM ?
#177773 - Dwedit - Sun Mar 03, 2013 12:35 am
I used code in VRAM for PocketNES, but I've never seen anyone else do it. (download the source code)
You need to change the makefile to use a different specs file when linking.
The specs file is a tiny text file that tells the linker what linkscript to use, and which file is the entry point.
I used a customized specs file, linkscript file, and crt0 file. You can compare them against the original versions. If you look at the modified linkscripts I used in PocketNES, you'll see that I added a .vram1 section, which is 4k large, and begins at 0x06003000. Then there's code in main.c that copies the code from the binary into VRAM so you can run that code.
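The copy step can be sketched as follows. This is not the PocketNES code: copy_code_to_vram and VRAM1_SIZE are hypothetical names, and in a real build the source address and length would come from symbols exported by the modified linkscript. The copy uses 16-bit writes because VRAM does not handle 8-bit writes properly.

```c
#include <stddef.h>
#include <stdint.h>

#define VRAM1_SIZE 4096u   /* size of the 4k .vram1 section described above */

/* Copy `bytes` bytes of code as 16-bit halfwords.
   Returns 0 on success, or -1 if the code would not fit in the section
   or is not a whole number of halfwords. */
int copy_code_to_vram(volatile uint16_t *dst, const uint16_t *src, size_t bytes)
{
    if (bytes > VRAM1_SIZE || (bytes & 1u) != 0) {
        return -1;
    }
    for (size_t i = 0; i < bytes / 2; i++) {
        dst[i] = src[i];
    }
    return 0;
}
```

On the GBA the destination would be (volatile uint16_t *)0x06003000, and the copy has to happen before anything calls into the section. The graphics code must also keep its hands off that 4k of VRAM.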
#177774 - Bregalad - Sun Mar 03, 2013 8:46 pm
OK, I'm trying to do something to remove the lag in the Final Fantasy Advance games (as a possible add-on for a future version of my sound restoration hacks).
All games using the sappy engine run the software sound mixing code in EWRAM (0x030xxxxx region), but since not all of IWRAM is used, I thought about forcing them to run it from IWRAM instead.
This should normally work as all of the branch offsets are relative, and the binary of the code is exactly the same in all games using the sappy engine; only 2 common variants exist (mono and stereo).
However, when I do that, the game runs extremely slowly and there are graphical and sound glitches everywhere.
The same happens if I force the code to execute in ROM. However if I move to a different location of EWRAM, it's ok.
It had the opposite effect of what I wanted to do. What could possibly have gone wrong ?