gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

Coding > SDT 2.51 GBA Angel port ?

#14601 - lak - Fri Jan 09, 2004 3:31 am

Has anyone ported Angel (SDT 2.51) to the GBA using the UART?
Is it possible to share it?

Thanks ..lak

#14688 - torne - Sun Jan 11, 2004 2:35 pm

You can't run a debug monitor on the GBA because the BIOS traps all the exceptions. You need to be able to catch your exceptions to run Angel. Sorry.

#16789 - KevinW - Wed Feb 25, 2004 3:35 am

But nothing says you can't write your own limited debugger that communicates over the serial port.

Of course you won't have access to CPU debug functions, but it's something.

If nothing else an output terminal to print "trace" statements, etc.
And/or you could build a little debug monitor into your interrupt handler and invoke it with a special combination of buttons.
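
A minimal sketch of the two ideas above, in C: a trace channel that pushes bytes through a replaceable sink, and a check for a debug button combination. All names here (`trace_set_sink`, `trace_buffer_sink`, `DEBUG_COMBO` and so on) are hypothetical. The buffer sink is only a host-side stand-in; on hardware the sink would write each byte to the serial port instead. The key bit positions assume the usual GBA KEYINPUT layout, where a pressed button reads as 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A sink receives one trace byte at a time. On hardware it would write the
   byte to the serial port; that part is not shown here. */
typedef void (*trace_sink_fn)(uint8_t byte);

static trace_sink_fn g_trace_sink;

/* Host-side stand-in sink (hypothetical): collects output in a buffer. */
char   g_trace_buf[128];
size_t g_trace_len;

void trace_buffer_sink(uint8_t byte) {
    if (g_trace_len + 1 < sizeof g_trace_buf) {
        g_trace_buf[g_trace_len++] = (char)byte;
        g_trace_buf[g_trace_len] = '\0';
    }
}

void trace_set_sink(trace_sink_fn sink) { g_trace_sink = sink; }

/* Send a NUL-terminated string through the current sink. */
void trace_str(const char *s) {
    assert(g_trace_sink != NULL);
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++)
        g_trace_sink((uint8_t)s[i]);
}

/* Send a 32-bit value as eight uppercase hex digits. */
void trace_hex32(uint32_t value) {
    static const char digits[] = "0123456789ABCDEF";
    assert(g_trace_sink != NULL);
    for (int shift = 28; shift >= 0; shift -= 4)
        g_trace_sink((uint8_t)digits[(value >> shift) & 0xFu]);
}

/* KEYINPUT bits are active-low: a pressed button reads as 0. */
enum {
    KEY_SELECT  = 1 << 2,
    KEY_START   = 1 << 3,
    KEY_R       = 1 << 8,
    KEY_L       = 1 << 9,
    DEBUG_COMBO = KEY_SELECT | KEY_START | KEY_R | KEY_L
};

/* True when every button of the combo is held in this KEYINPUT sample. */
bool debug_combo_pressed(uint16_t keyinput) {
    return (keyinput & DEBUG_COMBO) == 0;
}
```

An interrupt handler could sample KEYINPUT once per vblank, pass the value to `debug_combo_pressed`, and drop into the monitor loop when it returns true.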

#16814 - torne - Wed Feb 25, 2004 1:41 pm

KevinW wrote:
But nothing says you can't write your own limited debugger that communicates over the serial port.

Yes, you could, but it would be far less useful because it would be incapable of regaining control of the machine if interrupts were disabled or the serial interrupt was not functioning correctly.

Quote:
Of course you won't have access to CPU debug functions, but it's something.

The ARM7TDMI has no CPU debug functions to speak of. The BKPT instruction is not implemented (it's only present on later ARM versions), so all calls to debug monitors must be made through standard software interrupts. This requires the debug monitor to be the master SWI handler, which is not possible on the GBA (the BIOS ROM is).

Quote:
If nothing else an output terminal to print "trace" statements, etc.
And/or you could build a little debug monitor into your interrupt handler and invoke it with a special combination of buttons.

You could make that but it's not a debug monitor as it does not control the machine; it's just a helpful debug interface. A debug monitor needs to be able to control the processor. Nothing like the capabilities of Angel can be developed.

#16847 - ampz - Wed Feb 25, 2004 9:17 pm

The serial port interrupt can be used to gain control of the GBA. Of course, the thing will not work if your code screws up your interrupts, big deal.. It would still be very useful.
I think it should be quite possible to implement breakpoints by simply using BL instructions. (It is a bit complicated to rewrite a flash sector every time a breakpoint is added or removed, but it can be done.)
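
To make the BL idea concrete, here is a hedged sketch of how a monitor might compute the 32-bit word to patch over an instruction. It assumes ARM-state code only (Thumb BL is a pair of halfwords and is not covered), and the function names are hypothetical. The encoding itself is the standard one: condition AL, link bit set, and a signed 24-bit word offset measured from the patched instruction's address plus 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BL reaches roughly +/-32 MB from the patched instruction. */
bool arm_bl_in_range(uint32_t insn_addr, uint32_t target_addr) {
    int64_t offset = (int64_t)target_addr - ((int64_t)insn_addr + 8);
    return offset >= -33554432 && offset <= 33554428;
}

/* Build the ARM-state "BL target" word for an instruction at insn_addr. */
uint32_t arm_encode_bl(uint32_t insn_addr, uint32_t target_addr) {
    /* ARM instructions and branch targets are word aligned. */
    assert((insn_addr & 3u) == 0 && (target_addr & 3u) == 0);
    assert(arm_bl_in_range(insn_addr, target_addr));

    int64_t offset = (int64_t)target_addr - ((int64_t)insn_addr + 8);
    /* Converting to uint32_t wraps modulo 2^32, which keeps the low bits of
       the two's complement offset intact for the 24-bit field. */
    uint32_t field = ((uint32_t)offset >> 2) & 0x00FFFFFFu;
    return 0xEB000000u | field;   /* cond = AL, opcode 101, L = 1 */
}
```

For example, `arm_encode_bl(0x08000000, 0x08000000)` yields 0xEBFFFFFE, a branch-with-link to itself. Two caveats follow from the encoding: BL overwrites LR, so the monitor stub cannot recover the original LR value, and a breakpoint in ROM (0x08000000 and up) cannot reach a stub in IWRAM (0x03000000) directly, so the stub would have to sit within range in ROM.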

As for the debug functions in the ARM7tdmi core... There are two incredibly flexible hardware breakpoints in the ARM7tdmi cores, but I think they can only be accessed by the JTAG interface. (and the GBA has no JTAG interface..)
The ARM7 hardware breakpoints are nothing short of incredible. You can set them to trigger on almost _any_ event! Your code is fucking up a register or an address in memory somewhere, and you don't know which part of the code does it? No problem! Just set up one of the hardware breakpoints to "Break on write to memory address 0x12346". It's awesome!

#16858 - torne - Wed Feb 25, 2004 11:26 pm

ampz wrote:
The serial port interrupt can be used to gain control of the GBA. Of course, the thing will not work if your code screws up your interrupts, big deal.. It would still be very useful.

Leaving interrupts disabled after an interrupt handler is a pretty common bug, but yes, it's still useful.

Quote:
I think it should be quite possible to implement breakpoints by simply using BL instructions. (It is a bit complicated to rewrite a flash sector every time a breakpoint is added or removed, but it can be done.)

Yes, this could be implemented, though supporting all the different write unlock mechanisms for different cart types (some of which may not be known) would be a pain =)

Quote:
As for the debug functions in the ARM7tdmi core... There are two incredibly flexible hardware breakpoints in the ARM7tdmi cores, but I think they can only be accessed by the JTAG interface. (and the GBA has no JTAG interface..)

Yes, the EmbeddedICE debug controller is great. Yes, it can only be accessed via JTAG (if it was accessible to the CPU, then the CPU could invoke debugger functions by accident when executing broken code, which would disturb an external debugger session). It's also not present on the GBA at all.
The GBA does not have an ARM chip in it; it has the synthesisable ARM core embedded on its ASIC along with other macroblocks like the video controller and BIOS. Most parts of the synthesisable ARM core are optional, even those which are not optional on a hard core, and the debug controller has been omitted to save space. The system control coprocessor, CP15, is also missing, rather uniquely amongst ARM-based devices; without it there is no way to find out the chip revision number. =)

#16860 - ampz - Thu Feb 26, 2004 12:04 am

Quote:
Quote:
As for the debug functions in the ARM7tdmi core... There are two incredibly flexible hardware breakpoints in the ARM7tdmi cores, but I think they can only be accessed by the JTAG interface. (and the GBA has no JTAG interface..)

Yes, the EmbeddedICE debug controller is great. Yes, it can only be accessed via JTAG (if it was accessible to the CPU, then the CPU could invoke debugger functions by accident when executing broken code, which would disturb an external debugger session). It's also not present on the GBA at all.
The GBA does not have an ARM chip in it; it has the synthesisable ARM core embedded on its ASIC along with other macroblocks like the video controller and BIOS. Most parts of the synthesisable ARM core are optional, even those which are not optional on a hard core, and the debug controller has been omitted to save space. The system control coprocessor, CP15, is also missing, rather uniquely amongst ARM-based devices; without it there is no way to find out the chip revision number. =)


Nintendo would be crazy to develop such a complex ASIC without running a JTAG chain through it. I bet JTAG is there, it is just not available on production GBAs (either the pins are not bonded, or perhaps the pins are simply tied to ground and kept very secret by N).
Regarding the difference between an "ARM chip" and a "synthesisable ARM core": there is no difference! ARM (the company) sells synthesisable ARM cores to whoever wants them. They do not produce any "ARM chips" themselves. Any company is welcome to buy the ARM core and build a chip based on it (like Nintendo have).

I know of two additional standard ARM7 features that are not present in the GBA. The GBA memory bus manager cannot handle 16- and 32-bit operations on 8-bit buses. All ARM7s I know of can. Also, there is no cache. A simple cache is generally standard on ARM7. Rather stupid decision to remove the cache: it would have allowed ARM code to be executed from ROM at (very close to) IWRAM speeds. A small 8 KB cache would do wonders for GBA performance and flexibility.

#16867 - tepples - Thu Feb 26, 2004 5:10 am

ampz wrote:
Rather stupid decision to remove the cache... it would have allowed ARM code to be executed from ROM at (very close to) IWRAM speeds.

I think of IWRAM as a software-defined cache. The Atari Jaguar had a similar memory structure; programs running on the "Tom" and "Jerry" CPUs moved code and data in and out of cache manually.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#16871 - poslundc - Thu Feb 26, 2004 5:39 am

tepples wrote:
I think of IWRAM as a software-defined cache. The Atari Jaguar had a similar memory structure; programs running on the "Tom" and "Jerry" CPUs moved code and data in and out of cache manually.


That's an interesting idea. GCC's Thumb code generation is so piss-poor that you may as well generate ARM code for all of your processor-intensive stuff and just DMA the routines into IWRAM before calling them. Could be a good "pre-optimization" technique if it looks like you need to boost a routine's speed but you don't want to have to recode stuff in assembler.

Here's a question: is there a way (preferably from within C, without having to use -S and then count the lines) to determine the size of the object code generated, so you would know how many instructions to DMA in?

Dan.

#16873 - torne - Thu Feb 26, 2004 6:25 am

ampz wrote:
Nintendo would be crazy to develop such a complex ASIC without running a JTAG chain through it. I bet JTAG is there, it is just not available on production GBAs (either the pins are not bonded, or perhaps the pins are simply tied to ground and kept very secret by N).

The chip may well have JTAG for testing purposes (at the fab plant), but it doesn't have the EmbeddedICE core (this is not required to test chip functionality).

Quote:
Regarding the difference between an "ARM chip" and a "synthesisable ARM core": there is no difference! ARM (the company) sells synthesisable ARM cores to whoever wants them. They do not produce any "ARM chips" themselves. Any company is welcome to buy the ARM core and build a chip based on it (like Nintendo have).

There are standard hard cores made by a variety of companies from ARM's synth code. These follow a pinout, etc. specified by ARM and are thus all the same whoever manufactures them. All hard cores are the same; synth cores may be different (for example, all hard cores have a CP15).

Quote:
I know of two additional standard ARM7 features that are not present in the GBA. The GBA memory bus manager cannot handle 16- and 32-bit operations on 8-bit buses. All ARM7s I know of can.

That's not a feature of the chip, but of the memory controller. There is no memory controller in the hard or synthed ARM core. That way, things such as support for multiple-busword reads, unaligned reads, memory protection, and memory mapping are all optional (use a mem controller with as many or as few features as you like). The only parts which are in the core are the settings which dictate which of these features are available; these are normally set through CP15, but on the GBA they are hardwired.

Quote:
Also, there is no cache. A simple cache is generally standard on ARM7. Rather stupid decision to remove the cache: it would have allowed ARM code to be executed from ROM at (very close to) IWRAM speeds. A small 8 KB cache would do wonders for GBA performance and flexibility.

I've used a number of ARM based devices and none have any cache, so it may be less common than you think. As another poster pointed out, IWRAM can be effectively used as an out-of-line cache.

#16874 - torne - Thu Feb 26, 2004 6:29 am

poslundc wrote:
Here's a question: is there a way (preferably from within C, without having to use -S and then count the lines) to determine the size of the object code generated, so you would know how many instructions to DMA in?

Compile the code with -ffunction-sections (you may need a custom link script to be able to link code compiled that way) and then use the linker-provided section start/length/end symbols.
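
As a rough illustration of the section-symbol approach, the sketch below swaps in a slightly different mechanism from the one described above: instead of `-ffunction-sections` plus a custom link script, it places one function in a named section with a GCC attribute and relies on GNU ld's automatic `__start_<name>` and `__stop_<name>` symbols, which it provides for ELF sections whose names are valid C identifiers. The section name `iwram_overlay` and the helper names are hypothetical, and `memcpy` stands in for the DMA transfer. With a real GBA link script, the script would have to export the equivalent start and end symbols itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical section name. It must be a valid C identifier for GNU ld to
   provide the __start_/__stop_ symbols automatically. */
#define IN_OVERLAY __attribute__((section("iwram_overlay"), noinline, used))

/* A routine we would like to run from fast RAM. */
IN_OVERLAY int mix_samples(int a, int b) {
    return (a + b) / 2;
}

/* Provided by the linker, not defined anywhere in C. */
extern const char __start_iwram_overlay[];
extern const char __stop_iwram_overlay[];

/* Size in bytes of everything placed in the section. */
size_t overlay_size(void) {
    return (size_t)(__stop_iwram_overlay - __start_iwram_overlay);
}

/* Copy the section's bytes to dst; memcpy stands in for a DMA transfer. */
size_t overlay_copy(void *dst, size_t dst_size) {
    size_t n = overlay_size();
    assert(n <= dst_size);
    memcpy(dst, __start_iwram_overlay, n);
    return n;
}
```

Dividing `overlay_size()` by four gives the word count to program into a 32-bit DMA transfer. Actually calling the copied routine at its new address needs position-independent code or a link script that assigns the section an IWRAM run address, neither of which is shown here.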

#16878 - tepples - Thu Feb 26, 2004 6:57 am

BSS (uninitialized data) overlays are straightforward: just put all affected global variables into a union of structs.
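
A small sketch of the union-of-structs idea, with made-up scene names and fields. The accessors assert that the matching scene is active, which is one possible guard against reading a struct whose storage has been reused; nothing in the thread says the original code did this.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Variables that are only alive in one scene at a time (hypothetical). */
struct title_vars { uint16_t scroll_x, scroll_y; uint8_t menu_item; };
struct game_vars  { int32_t player_x, player_y; uint8_t tile_cache[256]; };

/* Both structs share the same storage, so RAM use is the larger of the
   two rather than their sum. */
union scene_overlay {
    struct title_vars title;
    struct game_vars  game;
};

enum scene { SCENE_TITLE, SCENE_GAME };

static union scene_overlay g_overlay;
static enum scene g_active = SCENE_TITLE;

/* Switching scenes invalidates the old scene's variables. */
void scene_enter(enum scene s) {
    memset(&g_overlay, 0, sizeof g_overlay);
    g_active = s;
}

struct title_vars *title_state(void) {
    assert(g_active == SCENE_TITLE);
    return &g_overlay.title;
}

struct game_vars *game_state(void) {
    assert(g_active == SCENE_GAME);
    return &g_overlay.game;
}
```

Code would call `scene_enter(SCENE_GAME)` once on a scene change and then use `game_state()->player_x` and friends as if they were ordinary globals.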

Code overlays, on the other hand, need linker support. Jeff F's linker script and the modified version that ships with DevKit Advance are supposed to have support for IWRAM code overlays. The linker will export symbols representing the start and end address of each of ten overlay sections in ROM as well as where the overlays are supposed to go in IWRAM.

No, I haven't seen a demo of IWRAM code overlays.

If you're considering swapping code into IWRAM several times per vblank, consider that a DMA copy from ROM to IWRAM takes five cycles per four bytes (wait, read 16 bits from ROM, wait, read 16 bits from ROM, write 32 bits to IWRAM), or just over one scanline per kilobyte. You may be able to get away with swapping a simple mixer, lossless data decompressor, division/square root package, or something else small like that, but you might not want to try repeatedly swapping something big like a GSM decoder (8 KB).
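
The arithmetic behind that estimate can be checked with a short sketch. The five cycles per 32-bit word is the figure given above and depends on the cartridge wait-state settings; 1232 CPU cycles per scanline is the standard GBA display timing (308 dots of 4 cycles each). DMA start-up overhead is ignored, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    CYCLES_PER_WORD     = 5,     /* figure from the post; wait-state dependent */
    CYCLES_PER_SCANLINE = 1232   /* 308 dots * 4 cycles, including hblank */
};

/* Cycles to DMA `bytes` from ROM to IWRAM using 32-bit transfers. */
uint32_t rom_to_iwram_dma_cycles(uint32_t bytes) {
    assert(bytes % 4u == 0);     /* 32-bit DMA moves whole words */
    return (bytes / 4u) * CYCLES_PER_WORD;
}

/* Convert a cycle count into (fractional) scanlines. */
double cycles_to_scanlines(uint32_t cycles) {
    return (double)cycles / CYCLES_PER_SCANLINE;
}
```

One kilobyte comes to 256 words, so 1280 cycles, or about 1.04 scanlines. An 8 KB GSM decoder comes to 10240 cycles, a little over 8.3 scanlines per swap.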

#16879 - torne - Thu Feb 26, 2004 7:19 am

tepples wrote:
If you're considering swapping code into IWRAM several times per vblank, consider that a DMA copy from ROM to IWRAM takes five cycles per four bytes (wait, read 16 bits from ROM, wait, read 16 bits from ROM, write 32 bits to IWRAM), or just over one scanline per kilobyte. You may be able to get away with swapping a simple mixer, lossless data decompressor, division/square root package, or something else small like that, but you might not want to try repeatedly swapping something big like a GSM decoder (8 KB).

A scanline per kilobyte sounds pretty fast to me. I've not used the overlay support because having a fixed number of large overlays doesn't seem particularly useful. It would be much more flexible to swap individual functions in and out of IWRAM, using a cache manager. All you need to do is build your code with -fPIC and -ffunction-sections, and it will drop every single function into a separate text section in the object file, all of which can be relocated to wherever you like in memory without problems. You could even make the original function symbols point to a routine that loads the given function into the cache if it's not already loaded, and then branch to it after finding the location through the cache manager (which would make use of the cache totally transparent), and you could approximate LRU cache overwrites.
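
A host-runnable sketch of such a cache manager follows, heavily simplified. It assumes a few fixed-size slots rather than variable-sized allocations, uses `memcpy` in place of DMA, and identifies each function by the address of its body in ROM. The names (`code_cache_get`, `SLOT_COUNT`, `g_cache_loads`) are hypothetical, and the transparent trampoline described above is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SLOT_COUNT = 4, SLOT_BYTES = 512 };

typedef struct {
    const void *rom_src;    /* body currently held in this slot; NULL = empty */
    uint32_t    last_used;  /* logical timestamp of the most recent access */
} cache_slot;

/* Word-typed pool so each slot is 4-byte aligned, as ARM code requires. */
static uint32_t   g_pool[SLOT_COUNT][SLOT_BYTES / 4];
static cache_slot g_slots[SLOT_COUNT];
static uint32_t   g_clock;

unsigned g_cache_loads;     /* number of misses, for inspection */

/* Return the fast-RAM copy of the function body at rom_src, loading it
   over the least recently used slot if it is not already cached. */
void *code_cache_get(const void *rom_src, size_t size) {
    assert(rom_src != NULL && size <= SLOT_BYTES);

    int victim = 0;
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (g_slots[i].rom_src == rom_src) {          /* hit */
            g_slots[i].last_used = ++g_clock;
            return g_pool[i];
        }
        /* Empty slots have last_used == 0, so they are picked first. */
        if (g_slots[i].last_used < g_slots[victim].last_used)
            victim = i;
    }

    memcpy(g_pool[victim], rom_src, size);            /* miss: (re)load */
    g_slots[victim].rom_src   = rom_src;
    g_slots[victim].last_used = ++g_clock;
    g_cache_loads++;
    return g_pool[victim];
}
```

On hardware the caller would cast the returned pointer to the function's type and branch to it, which is only safe if the code was built position-independent as described above. This version keeps exact LRU order with a counter; a cheaper approximation, such as a per-slot "recently used" bit, would also fit the description.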