gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

DS development > DS's main processor

#30181 - phantom-inker - Mon Nov 29, 2004 5:18 am

I had one other useful tidbit to post here. The DS's main processor is an ARM946E-S, as shown here. It's running at 67 MHz, or four times the speed of the ARM7TDMI in the GBA.

However, the ARM9 is better pipelined than the ARM7. It includes a read and write cache, which the ARM7 lacks; it has a 32-bit bus; it supports branch prediction; and it adds a few useful instructions such as CLZ that make certain performance-critical operations (such as interrupt processing and audio/video decoding and mixing) much more efficient.
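
To illustrate why CLZ helps with interrupt processing, here is a minimal sketch in C. The names `clz32` and `highest_pending_irq`, and the idea of a 32-bit "pending" mask where a higher bit means higher priority, are assumptions made up for this example and not DS register definitions. The loop is a portable stand-in; on a core that has CLZ, the same count comes from one instruction.

```c
#include <stdint.h>

/* Portable count-leading-zeros for a 32-bit value.
 * Returns 32 for an input of 0, matching the usual CLZ convention. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: index of the highest set bit in a pending-interrupt
 * mask, or -1 if nothing is pending. */
int highest_pending_irq(uint32_t pending)
{
    if (pending == 0)
        return -1;
    return 31 - (int)clz32(pending);
}
```

Without CLZ, a dispatcher has to test the bits one by one (or use a lookup table), which is the loop shown in `clz32`.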

Thus, practically speaking, the DS's main processor can be programmed pretty much the same way as the one you're used to in the GBA, but it is approximately six to eight times faster. The exact ratio can only be determined by programming it, of course, but that's probably a good rough estimate.

Details about the ARM946E-S can be found here. The "ARM946E-S Product Overview" document is a rough overview of the chip, and the "ARM946E-S Technical Reference Manual" goes into a fair amount of depth about everything the processor can do, including the caches, the bus, and the new "protection mode", which can selectively block access to certain regions of memory (and seems to have been designed to prohibit exactly what the homebrew developers want to do).
_________________
Do you suppose if I put a signature here, anyone would read it? No? I didn't think so either.

#30195 - leonard_ - Mon Nov 29, 2004 11:26 am

The technical reference manual says the cache can be anywhere from 0 bytes to 1 MB.

Does anyone know which implementation Nintendo chose? What is the cache size of the ARM9 on the DS?

#30198 - rapso - Mon Nov 29, 2004 11:30 am

leonard_ wrote:
The technical reference manual says the cache can be anywhere from 0 bytes to 1 MB.

Does anyone know which implementation Nintendo chose? What is the cache size of the ARM9 on the DS?

Maybe they call the IWRAM "cache".

greets
rapso

#30199 - leonard_ - Mon Nov 29, 2004 11:57 am

Maybe some new info: it seems the ARM9 doesn't have branch prediction, contrary to what was said in the previous post.
Quote:
The ARM9TDMI and ARM9E-S cores do not implement branch prediction, because
branches on these CPUs are fairly inexpensive in terms of lost opportunity to execute
other instructions.


The branch time seems to be exactly the same as on the ARM7 (i.e. 3 cycles for a taken branch and 1 cycle otherwise), even though the ARM9 has 5 pipeline stages instead of 3.
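
As a rough worked example of those timings, here is a small sketch that counts only the cycles spent on branches. The function names are invented for illustration, and it assumes nothing beyond the 3-cycle/1-cycle figures quoted in this post.

```c
/* Branch timings quoted above: 3 cycles when taken, 1 cycle when not taken. */
enum { BRANCH_TAKEN_CYCLES = 3, BRANCH_NOT_TAKEN_CYCLES = 1 };

/* Total cycles spent on branches, given how many were taken or not taken. */
unsigned long branch_cycles(unsigned long taken, unsigned long not_taken)
{
    return taken * BRANCH_TAKEN_CYCLES + not_taken * BRANCH_NOT_TAKEN_CYCLES;
}

/* A counted loop closed by one conditional backward branch: the branch is
 * taken on every iteration except the last, where it falls through. */
unsigned long loop_branch_cycles(unsigned long iterations)
{
    if (iterations == 0)
        return 0;
    return branch_cycles(iterations - 1, 1);
}
```

For a 100-iteration loop this gives 99 taken branches and one fall-through, or 298 cycles of branch overhead, which is why unrolling tight loops still pays off on both chips.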

#30203 - phantom-inker - Mon Nov 29, 2004 1:14 pm

leonard_ wrote:
Maybe some new info: it seems the ARM9 doesn't have branch prediction, contrary to what was said in the previous post.

No, it's not really new. The ARM946E-S does not support branch prediction, as far as I know; I believe I misspoke above. Most of the ARM9 series does, though, and I'm used to the branch-prediction behavior of its larger brethren: Forward is never taken, and back is always taken. When you get used to that, you tend to code that way on every ARM chip, even when it's actually unnecessary ;)
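
The "forward is never taken, back is always taken" rule can be sketched as a one-line predicate. This is only an illustration of the static rule described here; the function name is made up, and it does not describe anything the ARM946E-S itself does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Static prediction rule: a branch whose target lies at or before the branch
 * itself (a loop-closing branch) is predicted taken; a branch that jumps
 * forward is predicted not taken. */
bool predict_taken(uint32_t branch_addr, uint32_t target_addr)
{
    return target_addr <= branch_addr;
}
```

Coding "that way" means closing loops with a backward conditional branch and keeping the common case on the fall-through path of forward branches.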

#30205 - benjamin - Mon Nov 29, 2004 2:04 pm

http://www.auby.no/dill/specs.txt lists the ARM9's instruction cache as 8KB and its data cache as 4KB, which means, I guess, that the ARM9 uses a Harvard architecture with separate caches for data and instructions.

I do not think the IWRAM is the cache, since that would preclude developers from having a fast scratchpad memory.

#30216 - jgkspsx - Mon Nov 29, 2004 4:45 pm

phantom-inker wrote:
the new "protection mode", which can selectively block access to certain regions of memory (and seems to have been designed to prohibit exactly what the homebrew developers want to do).


How restrictive is Nintendo's implementation, though? Games running from the Game Boy port are locked into user mode, but are we sure DS ROMs aren't granted privileged access? Would homebrew coders necessarily be any more locked in than official developers even if they weren't?

#30231 - ampz - Mon Nov 29, 2004 6:36 pm

Six times faster than the GBA is not realistic.

First of all, a deeper pipeline will decrease performance at a given clock speed.
The few new instructions will not make any measurable impact on performance except for a few very specific tasks optimized in asm code.

I assume the main memory in the DS is SDRAM? SDRAM has good throughput, but poor non-sequential access time. Cache misses are very expensive in terms of performance.

The DS's ARM9 will not be more than 4 times faster than the GBA running a properly written program. I'm thinking closer to 3 times the GBA performance, somewhat depending on the application.
Data-intensive tasks could improve on the DS, due to its larger memory, but it depends... SDRAM non-sequential access times are worse than those of any memory on the GBA. And the card has worse throughput as well as _a lot_ higher non-sequential penalty.
Pure processing tasks will only be about 3 times faster.
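
To put a number on "cache misses are very expensive", the usual average-access-time formula can be sketched as below. The function name and every figure fed into it are hypothetical, chosen only to show the shape of the calculation; none of them are measured DS values.

```c
/* Average cycles per memory access:
 *   hit cost + (fraction of accesses that miss) * (extra cycles per miss).
 * miss_rate is a fraction between 0.0 and 1.0. */
double average_access_cycles(double hit_cycles,
                             double miss_rate,
                             double miss_penalty_cycles)
{
    return hit_cycles + miss_rate * miss_penalty_cycles;
}
```

With an assumed 1-cycle hit, a 5% miss rate, and a 20-cycle miss penalty, the average comes to about 2 cycles per access, so even a small miss rate can halve effective memory speed.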

#30233 - benjamin - Mon Nov 29, 2004 6:47 pm

ampz wrote:
Six times faster than the GBA is not realistic.

I assume the main memory in the DS is SDRAM? SDRAM has good throughput, but poor non-sequential access time. Cache misses are very expensive in terms of performance.


Right, but at least the ARM9 has a cache, one each for data and instructions. Properly taking into account this cache may counteract the drawbacks of the SDRAM's poor non-seq access time.

In addition, there is an opportunity to improve code that was written on the GBA without much concern for locality of reference, which increases the value of the cache.
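
A minimal sketch of what "locality of reference" means in practice: the two functions below compute the same sum over a 2D grid stored row by row, but only the first walks memory in address order. The function names and the flat `uint8_t` grid layout are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Cache-friendly: consecutive reads touch consecutive addresses, so several
 * reads are served from each cache line that gets loaded. */
uint32_t sum_row_major(const uint8_t *grid, size_t rows, size_t cols)
{
    uint32_t sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += grid[r * cols + c];
    return sum;
}

/* Same result, but each read jumps `cols` bytes ahead of the previous one,
 * so a large grid can miss the cache on almost every access. */
uint32_t sum_column_major(const uint8_t *grid, size_t rows, size_t cols)
{
    uint32_t sum = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += grid[r * cols + c];
    return sum;
}
```

On the GBA's uncached memories the two orderings cost about the same, which is why code written there often ignores the difference.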

#30239 - pipomolo42 - Mon Nov 29, 2004 7:38 pm

phantom-inker wrote:
the new "protection mode", which can selectively block access to certain regions of memory (and seems to have been designed to prohibit exactly what the homebrew developers want to do).


Well, to me it sounds just like the protected mode you find on workstation processors (http://www.x86.org/articles/pmbasics/tspec_a1_doc.htm). It's tied to the MMU (memory management unit), whose role is to prevent one process from accessing memory that the operating system allocated to another process.

By the way, here is another guess: I heard that the DS embeds two CPUs, one ARM9 used as the main CPU and one ARM7 used as a DSP (handling graphics and audio)... So maybe GBA compatibility is just an application that runs the game in protected mode (which is used to present the correct memory mapping to the GBA game). If so, it should be fairly easy to "escape" from protected mode with two or three asm instructions embedded in a GBA image...

Of course, this is complete theory, and I may be totally wrong. So feel free to teach me :)

#30240 - pipomolo42 - Mon Nov 29, 2004 7:41 pm

pipomolo42 wrote:
Of course, this is complete theory, and I may be totally wrong. So feel free to teach me :)


Damn, I completely forgot that the ARM processors don't have an MMU... So, it won't be that simple :-(

#30246 - allenu - Mon Nov 29, 2004 8:31 pm

pipomolo42 wrote:

Damn, I completely forgot that the ARM processors don't have an MMU... So, it won't be that simple :-(

Are you sure? This page says there is one: http://www.arm.com/products/CPUs/families/ARM9Family.html I've never directly worked with ARMs, so I don't know myself.

#30247 - benjamin - Mon Nov 29, 2004 8:52 pm

The ARM9s do have an MMU in their reference manual; the question is whether the DS has or uses one. Each hardware vendor implements the ARM9 to its own specification.

I think of MMUs as being useful for virtual memory, which in turn is mostly useful under an operating system, especially when running multiple processes simultaneously and/or loading code dynamically. Since the ARM9s are aimed in large part at Symbian OS, Palm OS, etc., it's no surprise they have MMUs. The question is whether the DS needs one; if the answer is "probably not", that is probably why there isn't one, or why we have heard speculation that there isn't one.

However, as for the protected mode capability, I am not sure how it works, or whether it requires an MMU to do the special memory windowing or is done some other way.

#30249 - allenu - Mon Nov 29, 2004 9:02 pm

benjamin wrote:
However, as for the protected mode capability, I am not sure how it works, or whether it requires an MMU to do the special memory windowing or is done some other way.


Yeah, similarly, if the protected mode on this is similar to the ones found on the x86, I can only see it being useful for multi-threading and virtual mem as well. Seeing as we don't have a disk medium, would virtual mem be useful here? Actually, the protected mode would be useful on its own, I guess, if they do allow you to download apps off the 'net as it will help sandbox things you download.

#30256 - phantom-inker - Mon Nov 29, 2004 10:00 pm

benjamin wrote:
The ARM9s do have an MMU in their reference manual; the question is whether the DS has or uses one.

The ARM946E-S does not. Other ARM9 models do.

benjamin wrote:
However, as for the protected mode capability, I am not sure how it works, or whether it requires an MMU to do the special memory windowing or is done some other way.

The protected mode system is very straightforward.

The system software (BIOS) may define up to eight regions of memory. They may overlap. Regions must be aligned to 4K boundaries, and their sizes must be a power of two, 4K or larger: 4K, 8K, 16K, ... 4M.
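
The region rules as stated in this post (4K-aligned base, power-of-two size of at least 4K) can be checked with a few lines of C. This is a sketch of those stated rules only; the names are invented, no upper size limit is enforced, and the Technical Reference Manual should be consulted for the exact constraints (for instance, whether the base must be aligned to the region's own size rather than just to 4K).

```c
#include <stdbool.h>
#include <stdint.h>

#define REGION_MIN_SIZE 4096u   /* smallest region size given in the post */

/* True if exactly one bit is set. */
bool is_power_of_two(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Checks a candidate region against the rules as described in the post:
 * size is a power of two, at least 4K, and the base sits on a 4K boundary. */
bool region_is_valid(uint32_t base, uint32_t size)
{
    return size >= REGION_MIN_SIZE
        && is_power_of_two(size)
        && (base % REGION_MIN_SIZE) == 0;
}
```

For example, a 4M region at 0x02000000 passes, while a 6000-byte region or a base of 0x02000800 does not.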

Each region may be individually marked as being cacheable (incoming data) or bufferable (outgoing data). In addition, the following read/write flags may be separately applied to each region:

Code:
       System software   User software
       ---------------   -------------
0000   No access         No access
0001   Read/write        No access
0010   Read/write        Read-only
0011   Read/write        Read/write
0100   Undefined         Undefined
0101   Read-only         No access
0110   Read-only         Read-only
0111   Undefined         Undefined
1xxx   Undefined         Undefined


Thus it appears that the regions serve a dual purpose: First, to ensure that certain regions of memory (like VRAM and the registers) don't get cached or buffered; and second, to ensure that user programs are locked inside a sandbox. I would guess that the BIOS region (and you can be sure there is a BIOS-specific region) is marked 0101 --- read-only by the BIOS, and no access at all by user software.
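
The permission table above can be decoded with two small lookup functions. This is a sketch that mirrors the table as posted; the enum and function names are made up for the example, and it says nothing about how the DS BIOS actually sets its regions.

```c
#include <stdint.h>

typedef enum {
    ACCESS_NONE,
    ACCESS_READ_ONLY,
    ACCESS_READ_WRITE,
    ACCESS_UNDEFINED
} access_t;

/* Access granted to system (privileged) software for a 4-bit flag value. */
access_t system_access(uint8_t bits)
{
    switch (bits & 0xFu) {
    case 0x0: return ACCESS_NONE;
    case 0x1:
    case 0x2:
    case 0x3: return ACCESS_READ_WRITE;
    case 0x5:
    case 0x6: return ACCESS_READ_ONLY;
    default:  return ACCESS_UNDEFINED;   /* 0100, 0111, 1xxx */
    }
}

/* Access granted to user software for the same 4-bit flag value. */
access_t user_access(uint8_t bits)
{
    switch (bits & 0xFu) {
    case 0x0:
    case 0x1:
    case 0x5: return ACCESS_NONE;
    case 0x2:
    case 0x6: return ACCESS_READ_ONLY;
    case 0x3: return ACCESS_READ_WRITE;
    default:  return ACCESS_UNDEFINED;   /* 0100, 0111, 1xxx */
    }
}
```

The guessed BIOS setting of 0101 decodes to read-only for system software and no access for user software, matching the description above.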

This is not massively different from the protection circuitry on the GBA, which was eventually bypassed by a BIOS bug; it just appears to be more uniformly enforced.