gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

Coding > DMA question

#3734 - Vortex - Thu Mar 06, 2003 10:56 pm

Hello

I was wondering what the minimum chunk size is for which using a DMA transfer makes sense. For example: is it worth using DMA to copy 2 or 4 words of data?

Thanks

#3735 - DekuTree64 - Thu Mar 06, 2003 11:08 pm

I think it would be faster to do that normally. Since you have to write the src/dest addresses, the count and the control reg, you've already written 2 words and 2 halfwords before the transfer even starts. I think it's at about 32 bytes that DMA really starts to improve over normal copying. Also, there's a SWI for copying large chunks of data 32 bytes at a time (0xC), but I haven't done any speed tests on it myself. It might be even faster than DMA, since it's in 0 wait-state BIOS ROM and copies 8 words at a time. But it would have to push/pop 8 regs on the stack to do it, plus one to use as a loop counter, so I don't know exactly how big a chunk you'd need for it to be worth all that.
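To make the setup cost concrete, here is a minimal C sketch of the four register writes counted above (two words, two halfwords) next to a plain word-copy loop. The `DmaChannel` struct, the `dma_copy32` and `cpu_copy32` helpers, and the bit names are illustrative assumptions, not code from this thread. On real hardware the struct would be overlaid on a channel's register block (commonly documented at 0x040000D4 for DMA3); here it is an ordinary struct so the sketch also compiles on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* One DMA channel's registers: two words and two halfwords.
   Assumed layout; on a GBA this would be mapped onto the hardware block. */
typedef struct {
    volatile uint32_t src;     /* source address */
    volatile uint32_t dst;     /* destination address */
    volatile uint16_t count;   /* number of units to transfer */
    volatile uint16_t control; /* mode bits; the enable bit starts the transfer */
} DmaChannel;

#define DMA_32BIT  (1u << 10)  /* transfer 32-bit units instead of 16-bit */
#define DMA_ENABLE (1u << 15)  /* start the transfer */

/* Hypothetical helper: the four writes needed before DMA moves any data. */
static void dma_copy32(DmaChannel *ch, uint32_t src, uint32_t dst, uint16_t words)
{
    assert(words != 0);        /* this sketch rejects zero-length requests */
    ch->src = src;             /* word write 1 */
    ch->dst = dst;             /* word write 2 */
    ch->count = words;         /* halfword write 1 */
    ch->control = (uint16_t)(DMA_32BIT | DMA_ENABLE); /* halfword write 2 */
}

/* The "normal" alternative: one load and one store per word. */
static void cpu_copy32(uint32_t *dst, const uint32_t *src, unsigned words)
{
    while (words--) {
        *dst++ = *src++;
    }
}
```

For a 2-word copy, `cpu_copy32` performs two loads and two stores, while `dma_copy32` already performs four stores just to describe the transfer, which is the reasoning behind the break-even estimate above.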

#3740 - ampz - Fri Mar 07, 2003 2:16 am

I can't see how any software copying routine could be faster than DMA. Zero wait-state, yes, but that's still 1 cycle per instruction. DMA uses no instructions at all.

#3741 - DekuTree64 - Fri Mar 07, 2003 4:02 am

Yes, probably so, but then why would they have put it there? I guess it would be the next best thing if you were using something like DMA1/2 for sound and DMA0/3 for different things on HBL, and needed a fast way to copy something without DMA. Not sure. I'll do some speed tests sometime and see how it goes. Probably won't get the chance tonight, but maybe tomorrow.

#3748 - tepples - Fri Mar 07, 2003 4:19 pm

DMA may be faster for copying memory (memcpy()), but an unrolled loop of STMIA instructions may be faster for clearing large blocks of memory to a constant value (memset()). DMA must read and re-read the same 32-bit value from the stack (1 cycle) for each 32-bit store, whereas STMIA reads only one instruction (1 cycle) for every set of 8 32-bit stores.
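The fill pattern described above can be sketched in C as an unrolled loop that writes 8 words per iteration. This only illustrates the structure: a real routine would typically be written in ARM assembly so that each group of 8 stores is a single STMIA, and a C compiler is not guaranteed to emit that instruction. The name `fill32_unrolled` is a hypothetical helper, not something from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `words` 32-bit words at dst with `value`, 8 words per iteration.
   The value stays in a register, so nothing is re-read per store. */
static void fill32_unrolled(uint32_t *dst, uint32_t value, size_t words)
{
    assert(dst != NULL);

    size_t blocks = words / 8; /* full groups of 8 words */
    size_t tail = words % 8;   /* leftover words after the last full group */

    while (blocks--) {
        dst[0] = value; dst[1] = value; dst[2] = value; dst[3] = value;
        dst[4] = value; dst[5] = value; dst[6] = value; dst[7] = value;
        dst += 8;
    }
    while (tail--) {
        *dst++ = value;
    }
}
```

Filling 19 words, for example, takes two unrolled iterations plus three single stores for the tail.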
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#3753 - peebrain - Fri Mar 07, 2003 6:03 pm

How do you guys know stuff like that?

~Sean (*wannabe*)
_________________
http://www.pbwhere.com