#177776 - Bregalad - Wed Mar 06, 2013 6:05 pm
I'd like to transfer data from RAM to VRAM during VBlank to avoid tearing or other kinds of artifacts on the screen.
Since I think DMA1 and DMA2 should be reserved for sound and DMA0 for HBlank effects (both for future use), I think the best option is to use DMA3 for all kinds of VRAM updates.
As you can set the DMA channel so that the transfer automatically starts on interrupt, it is very easy to use. The problem is if I want to use multiple DMA transfers. For example if I want to update some graphics and the palette during the same VBlank.
What is the best way to do it ?
1) Start DMA in the VBlank interrupt, poll the complete flag, then start the other DMA.
2) Only do the largest transfer by DMA and do the other transfers in software
3) Use other DMA channels too
Bonus question : What is the advantage of using a 16-bit DMA transfer over a 32-bit one ? Since a 32-bit transfer moves 32 bits at a time, it seems like it would be 2x more efficient than the 16-bit one, wouldn't it ?
#177777 - Dwedit - Wed Mar 06, 2013 6:43 pm
You can have DMA transfers happen immediately, and they will halt the CPU, so you don't have to do anything besides start the transfer. Set the destination, source, word count, and control registers. Then the transfer just happens. No need for any polling anywhere.
So since DMA halts the CPU, you can keep doing DMA copies as much as you want. You only need the other channels for DMA events that happen at particular times. (sound, hblank, etc.)
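A minimal sketch of the immediate transfer described above, assuming the usual GBA register layout (DMA3 at 0x040000D4, with the 32-bit flag in bit 10 and the enable flag in bit 15 of the control register). The names `DmaRegs` and `dma_copy32_now` are made up for this illustration and do not come from any particular library. The channel is passed as a pointer so the same code can be pointed at a plain struct when trying it off hardware.

```c
#include <assert.h>
#include <stdint.h>

/* One DMA channel's registers, in hardware order (12 bytes per channel). */
typedef struct {
    volatile uint32_t sad;    /* source address */
    volatile uint32_t dad;    /* destination address */
    volatile uint16_t cnt_l;  /* number of units to transfer */
    volatile uint16_t cnt_h;  /* control bits */
} DmaRegs;

#define DMA_32BIT  (1u << 10)  /* 32-bit units instead of 16-bit */
#define DMA_ENABLE (1u << 15)  /* timing bits 12-13 left at 0 = start immediately */

/* Assumed hardware mapping: #define DMA3 ((DmaRegs *)0x040000D4) */

/* Program the channel and start the copy right away (hypothetical helper). */
void dma_copy32_now(DmaRegs *ch, const void *src, void *dst, uint16_t words)
{
    assert(words != 0);           /* a count of 0 means the maximum length on hardware */
    ch->cnt_h = 0;                /* disable the channel before reprogramming it */
    ch->sad   = (uint32_t)(uintptr_t)src;
    ch->dad   = (uint32_t)(uintptr_t)dst;
    ch->cnt_l = words;
    ch->cnt_h = (uint16_t)(DMA_32BIT | DMA_ENABLE);  /* writing this starts the transfer */
}
```

On the GBA the CPU is halted as soon as the control register is written, so two calls in a row simply run one after the other with no polling in between.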
You can also use a fast memcpy loop, and it will be 72% as fast as DMA. So if you need all 4 DMA channels for things like HDMA and sound, you can still do fast memory transfers in software. Setting/Clearing memory is faster when done in software than with DMA on the GBA. (But not on the NDS, which has DMA fill mode)
(72% as fast assuming it's a transfer from IWRAM to IWRAM. This margin improves when the memory reads and writes are slower than executing the loop code)
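For comparison, a software copy can be sketched in plain C as below. This is only an illustration of the idea: the really fast versions are usually written in ARM assembly with multi-register loads and stores and run from IWRAM, and the 72% figure refers to that kind of loop, not to this one. The name `copy_words` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `words` 32-bit words from src to dst, four per iteration. */
void copy_words(uint32_t *dst, const uint32_t *src, size_t words)
{
    assert(words == 0 || (dst != NULL && src != NULL));

    /* Unrolled main loop: less loop overhead per word copied. */
    while (words >= 4) {
        dst[0] = src[0];
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
        dst += 4;
        src += 4;
        words -= 4;
    }

    /* Copy the remaining 0 to 3 words one at a time. */
    while (words--) {
        *dst++ = *src++;
    }
}
```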
You can also transfer to VRAM outside of vblank time, but you can get tearing if the parts updated are above the current scanline.
On the NDS, you need to flush the cache before using any DMA, otherwise DMA will pull outdated memory out of RAM instead of the newest data from the data cache. You also need to invalidate the data cache if you will be reading memory back. You don't need to do this on the GBA which has no cache. Also on the NDS, DMA does not halt the CPU if the CPU is not accessing RAM.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
#177778 - elhobbs - Wed Mar 06, 2013 8:16 pm
both the gba and the nds support a flag to auto start dma on vblank (and hblank too) - so you do not need an interrupt handler just to start it at the right time.
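As a rough sketch of that flag on the GBA: the start timing lives in bits 12-13 of the DMA control register (0 = immediately, 1 = VBlank, 2 = HBlank), next to the repeat bit (9), the 32-bit bit (10) and the enable bit (15). The helper below only builds the control value; `dma_control` and the enum names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum DmaTiming {
    DMA_START_NOW    = 0,
    DMA_START_VBLANK = 1,
    DMA_START_HBLANK = 2
};

/* Build a control value for DMAxCNT_H with the channel enabled. */
uint16_t dma_control(enum DmaTiming timing, int use_32bit, int repeat)
{
    assert(timing <= DMA_START_HBLANK);

    uint16_t ctl = 0x8000;                 /* bit 15: enable */
    ctl |= (uint16_t)(timing << 12);       /* bits 12-13: when to start */
    if (use_32bit) {
        ctl |= (uint16_t)(1u << 10);       /* bit 10: 32-bit units */
    }
    if (repeat) {
        ctl |= (uint16_t)(1u << 9);        /* bit 9: re-arm on every trigger */
    }
    return ctl;
}
```

With the repeat bit clear, the channel disables itself after the transfer, so `dma_control(DMA_START_VBLANK, 1, 0)` describes a one-shot copy at the next VBlank.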
#177785 - Bregalad - Wed Mar 06, 2013 10:29 pm
Quote: |
On the NDS, you need to flush the cache before using any DMA, otherwise DMA will pull outdated memory out of RAM instead of the newest data from the data cache. You also need to invalidate the data cache if you will be reading memory back. You don't need to do this on the GBA which has no cache. Also on the NDS, DMA does not halt the CPU if the CPU is not accessing RAM. |
This probably explains why I failed to use DMA on the DS last year... now I'm going to try it on the GBA since it seems much much easier (ironically....)
So in the end it sounds very easy: if I have a single transfer to do in VBlank, I don't need a handler, I can just set up the DMA to happen on the next VBlank. Otherwise I just have to make a handler that spams VRAM updates with a single DMA channel, and that's it.
Sounds really convenient.
EDIT :
Can't the DMA be used to fill a part of the memory by setting the source address to "fixed" ?
EDIT 2 : If I want to use both direct sound channels in the future, do I need to have BOTH DMA1 and DMA2 used for sound, or can a single DMA channel feed both direct sound channels ?
#177787 - Dwedit - Thu Mar 07, 2013 12:04 am
You can use a fixed source address to do memory filling, but it will read that word every time, so using an optimized memset function is actually faster.
I think you need both DMA channels to do two channel DMA sound.
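To make the fill comparison concrete, here is a hedged sketch of both approaches, assuming the standard GBA control bits (source address control in bits 7-8, where 2 means "fixed"). `DmaRegs`, `dma_fill32_now` and `fill_words` are hypothetical names. The DMA version re-reads the source word from memory for every unit it writes, while the software loop keeps the value in a register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One DMA channel's registers, in hardware order. */
typedef struct {
    volatile uint32_t sad;
    volatile uint32_t dad;
    volatile uint16_t cnt_l;
    volatile uint16_t cnt_h;
} DmaRegs;

#define DMA_SRC_FIXED (2u << 7)   /* bits 7-8 = 2: do not advance the source */
#define DMA_32BIT     (1u << 10)
#define DMA_ENABLE    (1u << 15)

/* Fill via DMA: the same word at *value is read once per unit written. */
void dma_fill32_now(DmaRegs *ch, const volatile uint32_t *value,
                    void *dst, uint16_t words)
{
    assert(words != 0);
    ch->cnt_h = 0;                              /* disable before reprogramming */
    ch->sad   = (uint32_t)(uintptr_t)value;
    ch->dad   = (uint32_t)(uintptr_t)dst;
    ch->cnt_l = words;
    ch->cnt_h = (uint16_t)(DMA_SRC_FIXED | DMA_32BIT | DMA_ENABLE);
}

/* Fill in software: the value stays in a register, only writes hit memory. */
void fill_words(uint32_t *dst, uint32_t value, size_t words)
{
    while (words--) {
        *dst++ = value;
    }
}
```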
#177788 - sverx - Thu Mar 07, 2013 1:48 pm
I don't see why a handler should be needed :| Isn't it sufficient to have a main loop that simply does something like this?
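Code: |
for (;;) {
    doEverything();     /* game logic for this frame */
    waitForVBlank();    /* block until VBlank begins */
    dmaTransfer1();     /* immediate DMA copies, one after another */
    dmaTransfer2();
    /* ... */
    dmaTransferN();
} |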
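One possible shape for `waitForVBlank()` is a polling loop on the scanline counter, sketched below under the assumption that the GBA draws lines 0-159 and spends lines 160-227 in VBlank. The counter is read through a function pointer so the loop can be exercised off hardware; `wait_for_vblank`, `VcountReader` and `fake_vcount` are invented names, and `fake_vcount` is only a host-side stand-in for the real register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VISIBLE_LINES 160u  /* lines 0-159 are drawn */
#define TOTAL_LINES   228u  /* lines 160-227 are VBlank */

typedef uint16_t (*VcountReader)(void);

/* Return only at the start of a fresh VBlank period. */
void wait_for_vblank(VcountReader read_vcount)
{
    assert(read_vcount != NULL);
    while (read_vcount() >= VISIBLE_LINES) { }  /* already in VBlank: let it finish */
    while (read_vcount() <  VISIBLE_LINES) { }  /* wait for the next one to begin */
}

/* Host-side stand-in: advances one scanline per read.
   On hardware the reader would return *(volatile uint16_t *)0x04000006. */
uint16_t fake_line;

uint16_t fake_vcount(void)
{
    fake_line = (uint16_t)((fake_line + 1u) % TOTAL_LINES);
    return fake_line;
}
```

A busy loop like this keeps the CPU running the whole time; waiting on the VBlank interrupt instead is the usual alternative.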
_________________
libXM7|NDS programming tutorial (Italiano)|Waimanu DS / GBA|A DS Homebrewer's Diary
#177790 - Bregalad - Thu Mar 07, 2013 2:08 pm
Yes, I imagine it is sufficient, but a handler could still serve two purposes :
1) In the (rare) case when your game is lagging and you'd want the CPU to continue doing calculations and avoid the waitForVBlank() dead time.
2) If you are doing this in a lot of different places in your game engine, it can be very ROM-inefficient to have that list of DMA transfers copied and pasted everywhere. This can also easily be solved by having a function which does this, without any handler.
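The "one function that does all the transfers" idea could look roughly like the queue below: game code registers pending copies during the frame, and a single flush routine (called from the main loop after the VBlank wait, or from a handler) performs them all. Everything here is a hypothetical sketch. The actual copy is passed in as a callback, which on the GBA would program DMA3, and `memcpy_transfer` is just a host-side stand-in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 8

typedef struct {
    const void *src;
    void       *dst;
    uint16_t    words;   /* length in 32-bit words */
} Transfer;

typedef struct {
    Transfer items[MAX_PENDING];
    size_t   count;
} TransferQueue;

typedef void (*TransferFn)(const void *src, void *dst, uint16_t words);

/* Remember a copy to perform at the next flush. Returns 0 if the queue is full. */
int queue_transfer(TransferQueue *q, const void *src, void *dst, uint16_t words)
{
    if (q->count >= MAX_PENDING) {
        return 0;
    }
    q->items[q->count].src   = src;
    q->items[q->count].dst   = dst;
    q->items[q->count].words = words;
    q->count++;
    return 1;
}

/* Perform every pending copy in order, then empty the queue. */
void flush_transfers(TransferQueue *q, TransferFn do_transfer)
{
    assert(do_transfer != NULL);
    for (size_t i = 0; i < q->count; i++) {
        do_transfer(q->items[i].src, q->items[i].dst, q->items[i].words);
    }
    q->count = 0;
}

/* Host-side stand-in for a DMA copy. */
void memcpy_transfer(const void *src, void *dst, uint16_t words)
{
    memcpy(dst, src, (size_t)words * 4u);
}
```

If the flush runs in an interrupt handler, the main code must not be in the middle of adding an entry when the interrupt fires, which this sketch does not guard against.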
#177791 - Dwedit - Thu Mar 07, 2013 2:30 pm
With the right techniques, you can make a game run at a more flexible framerate, so it won't slow down to 30FPS when it can't run at 60FPS. Even when it runs at 55FPS, it still moves smoothly and runs fine. With triple buffering, you can even do this without any tearing at all.