gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Coding > DMA Timings and Wait States

#169629 - Karatorian - Mon Jul 27, 2009 3:32 pm

I'm working on the core graphics engine for my latest project and I've got some questions about DMA timing. Basically, I want to know how much data I can realistically DMA transfer to VRAM before I run out of v-blank time.

According to GBATEK, the v-blank is 83776 cycles. So, if I understand how the DMA timing works (the only doc I could find on this was the dev'rs FAQ), the theoretical maximums, in bytes per v-blank, are as follows:

Code:

Source      Dest   Total      Speed

IWRAM       VRAM   ~55850      4 bytes every 6 cycles
EWRAM       VRAM   ~33510      2 bytes every 5 cycles
ROM (4/2)   VRAM    23936      2 bytes every 7 cycles
ROM (3/1)   VRAM   ~27925      2 bytes every 6 cycles
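
The totals in the table follow from one division: v-blank cycles times bytes per transfer unit, divided by cycles per unit. A minimal sketch of that arithmetic is below. The function name vblank_bytes is made up for illustration, and the result is only an upper bound under the stated speed, since it ignores DMA startup cost and any CPU work during v-blank.

```c
#include <assert.h>
#include <stdint.h>

/* V-blank length in CPU cycles, as quoted in the post. */
#define VBLANK_CYCLES 83776u

/* Upper bound on the bytes moved in one v-blank when the DMA moves
 * `bytes_per_unit` bytes every `cycles_per_unit` cycles.
 * Hypothetical helper: it assumes the whole v-blank is spent on DMA. */
static uint32_t vblank_bytes(uint32_t bytes_per_unit, uint32_t cycles_per_unit)
{
    assert(cycles_per_unit > 0);
    /* Multiply first so the integer division truncates only once. */
    return (VBLANK_CYCLES * bytes_per_unit) / cycles_per_unit;
}
```

For example, vblank_bytes(4, 6) gives 55850 and vblank_bytes(2, 7) gives 23936, matching the first and third rows.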


However, I think I've calculated most of these values wrong. After the first access, DMA should be using sequential addressing, so it should be faster. More like:

Code:

Source      Dest   Total      Speed

ROM (4/2)   VRAM   ~33510      2 bytes every 5 cycles
ROM (3/1)   VRAM    41888      2 bytes every 4 cycles


I don't know what exactly the timing on the EWRAM is. Is its wait always 2 cycles, or is it faster on sequential access? So anyway, are these numbers correct, or am I misunderstanding something? I'm a little confused because none of the docs I found explain wait states in a comprehensive way, especially with regard to DMA.

The other thing I was wondering about was 32 vs 16 bit DMA. From what I understand, you can use 32 bit DMA even when one of the buses is 16 bit. What I'm not sure about is the case where both of the buses are 16 bit. Can you do it, and is there any drawback to doing so?

If I grok the timing correctly, 16 to 16 with 32 bit DMA should be the same speed as with 16 bit DMA. The reason I'm asking is that 16 to 32 (or vice versa) or 32 to 32 transfers should be faster with 32 bit DMA and it'd simplify things if I could just use 32 bit transfers for everything.

Obviously, these are theoretical maximums and it's probably impossible to actually reach quite these speeds. However, I want to know if I'm doing the math right, so I can figure the timing on what I do need to transfer.

As for what that is, well, most of it's pretty basic stuff, but my engine is a little bit complicated in some areas. First up is 4k of tile map buffers (for mode 0 layers). As I'm using a variable-width font, I have to transfer a bunch of text tiles, and then I need to transfer about 3k of sprite tiles for characters, plus some more for attacks. (Due to the large number of characters and the many possible poses, I can't simply stuff it all in VRAM at once.) Finally, there's the OAM data. I'm also thinking of transferring all or most of the palette data to make palette effects quick and easy.

All told, it'll probably be 20 to 24k of DMA every v-blank. Is that reasonable, or do I have to rethink my design?

#169631 - eKid - Mon Jul 27, 2009 3:57 pm

20-24k seems like a bit much, but it might be reasonable.

EWRAM speed is 3 cycles per 16bit read/write, 1 cycle for the 16bit access plus 2 waitstates. The sequential timing is the same so it's really pretty slow...
With 3,1 waitstates, ROM takes 2 cycles per 16bit hword transferred sequentially. Adding the 1-cycle VRAM write, a transfer to VRAM costs 3 + 1 cycles per 16bit hword from EWRAM and 2 + 1 cycles per hword from ROM (3,1).
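
Those per-halfword costs can be turned into a rough v-blank budget check. This is a minimal sketch, assuming every access is sequential, the VRAM write has no wait states, and "24k" means 24 * 1024 bytes. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* V-blank length in CPU cycles, as quoted earlier in the thread. */
#define VBLANK_CYCLES 83776u

/* Cycles for one 16-bit access: 1 cycle plus the region's wait states. */
static uint32_t access_cycles(uint32_t wait_states)
{
    return 1u + wait_states;
}

/* Estimated cycles to move `bytes` bytes with 16-bit DMA, given the wait
 * states of the source and destination regions. Hypothetical helper that
 * ignores DMA startup cost and the first nonsequential access. */
static uint32_t dma16_cycles(uint32_t bytes, uint32_t src_wait, uint32_t dst_wait)
{
    assert(bytes % 2u == 0u);
    return (bytes / 2u) * (access_cycles(src_wait) + access_cycles(dst_wait));
}
```

Under these assumptions, dma16_cycles(24 * 1024, 2, 0) is 49152 cycles for EWRAM and dma16_cycles(24 * 1024, 1, 0) is 36864 cycles for ROM (3,1). Both are below the 83776-cycle v-blank, though the margin left for other v-blank work is smaller in the EWRAM case.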

When transferring a 32bit word from a 16bit bus it will take two accesses. The first access will be nonsequential and the second one will be sequential. When doing 32bit dma transfers from a 16bit memory region then only the first 16bit will be nonsequential. You will save with 32bit DMA if either the source or the destination area (or both) have a 32bit bus.
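
The saving can be seen with a small cost model. This is a sketch, not a hardware-accurate emulation: it treats a 32-bit access over a 16-bit bus as two 16-bit accesses, counts sequential timing only, and assumes IWRAM is a 32-bit bus with no wait states and VRAM is a 16-bit bus with no wait states. The EWRAM figures come from the post above. All names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* One side of a transfer: bus width in bits (16 or 32) and wait states. */
struct region {
    uint32_t bus_bits;
    uint32_t wait_states;
};

/* Assumed region parameters (see the lead-in). */
static const struct region IWRAM = { 32u, 0u };
static const struct region EWRAM = { 16u, 2u };
static const struct region VRAM  = { 16u, 0u };

/* Cycles for one access of `access_bits` (16 or 32) to region `r`.
 * An access wider than the bus is split into two bus accesses. */
static uint32_t region_cycles(struct region r, uint32_t access_bits)
{
    assert(r.bus_bits == 16u || r.bus_bits == 32u);
    assert(access_bits == 16u || access_bits == 32u);
    uint32_t bus_accesses = (access_bits > r.bus_bits) ? 2u : 1u;
    return bus_accesses * (1u + r.wait_states);
}

/* Cycles to move 4 bytes using DMA units of `unit_bits` (16 or 32). */
static uint32_t cycles_per_4_bytes(struct region src, struct region dst,
                                   uint32_t unit_bits)
{
    uint32_t units = 32u / unit_bits; /* two halfwords or one word */
    return units * (region_cycles(src, unit_bits) + region_cycles(dst, unit_bits));
}
```

In this model IWRAM to VRAM costs 4 cycles per 4 bytes with 16-bit DMA and 3 with 32-bit DMA, while EWRAM to VRAM costs 8 either way, so 32-bit DMA is never slower and only helps when one side has a 32-bit bus.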

#169633 - Karatorian - Mon Jul 27, 2009 4:42 pm

Could you clarify a little more about using 32 bit DMA on a 16 bit bus? I get that the first 16 bits will be non-sequential and that the next 16 bits will be sequential. I assumed that the third, fourth, etc. 16 bits would be sequential also, but what you wrote seems to imply that it alternates every other word? Which interpretation is correct?

#169636 - eKid - Mon Jul 27, 2009 5:37 pm

Er yeah only the very first 16bit hword of the first 32bit word will be nonsequential in the transfer. With a reasonable amount transferred with DMA you don't really need to take the nonsequential timing into account.
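
To see how little the single nonsequential access matters, here is a minimal sketch of the read cost for a run of halfwords where only the first access pays the nonsequential wait. The function name is made up for illustration, and the model ignores any other overhead the hardware may add.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated cycles to read `halfwords` consecutive 16-bit values when the
 * first access uses `nonseq_wait` wait states and every later access uses
 * `seq_wait`. Hypothetical helper covering the read side only. */
static uint32_t burst_read_cycles(uint32_t halfwords,
                                  uint32_t nonseq_wait, uint32_t seq_wait)
{
    if (halfwords == 0u) {
        return 0u;
    }
    return (1u + nonseq_wait) + (halfwords - 1u) * (1u + seq_wait);
}
```

For ROM with 3,1 waitstates, reading 2048 halfwords (4 KiB) comes to 4098 cycles, compared with 4096 if every access were sequential: a difference of 2 cycles over the whole transfer.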

#169641 - Karatorian - Mon Jul 27, 2009 7:03 pm

Thank you.