gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Coding > DMA Screen Clearing

#45866 - AkumaATR - Wed Jun 15, 2005 10:34 pm

I know this is nothing new, but after having read through related threads, I'm still stuck.

Initially I wanted to clear my back buffer (mode 4) to 0x00 (black) during each VBLANK. This didn't seem to work: for some reason it was only clearing the screen about once every 4 new frames displayed (not sure why yet), which caused short lines to be drawn instead of points moving across the screen.
So now I'm simply trying to clear the back buffer to black after page flipping.

When I give count as ((240 * 160) / 2) / 2 for the DMA transfer (16-bit), it clears half the screen. However, when I set count to 0x4B00 to clear the entire screen, it doesn't work. It works for clearing 1/3 or 2/3 of the screen as well... so I can't understand why trying to clear all of it only appears to clear like 1/6 - 1/5 of it.

Relevant code:

#define REG_DMA0SAD *(volatile uint32 *)0x40000B0
#define REG_DMA0DAD *(volatile uint32 *)0x40000B4
#define REG_DMA0CNT_L *(volatile ushort16 *)0x40000B8
#define REG_DMA0CNT_H *(volatile ushort16 *)0x40000BA
.
.
.
ushort16 black = 0x0000;
.
.
.
while(TRUE)
{
    for (i = 0; i < TITLE_NUM_STARS; ++i)
    {
        if (stars[i].x == 0)
            stars[i].x = 239;
        else
            --stars[i].x;
    }
    waitVBlank();
    //flipPage();
    REG_DMA0SAD = (unsigned int)&black;
    REG_DMA0DAD = (unsigned int)video_buffer;
    REG_DMA0CNT_L = 0x4B00;
    REG_DMA0CNT_H = 0x8100;
    //clearBackBufferMode4(0);
    for (i = 0; i < TITLE_NUM_STARS; ++i)
        plotPixelMode4(stars[i].x, stars[i].y, 1);
}


Note: I essentially turned off double-buffering by not page flipping and always working out of first half of vram (one page) in order to simplify things to figure out what's going on with my program.

ClearBackBufferMode4 was my ghetto slow routine I'm trying to replace with the DMA transfer.

Thanks Again,
Jason

#45873 - strager - Wed Jun 15, 2005 11:22 pm

0x4B00 is not working because it is above the 0x4000 limit. Use 32-bit transfers to fix this problem.
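
A minimal sketch of the suggested 32-bit fill, not the code the poster ended up with. The struct `dma_regs`, the helper `dma_fill32` and the `DMA_*` constant names are made up for illustration; the register layout follows the `REG_DMA0*` addresses in the first post. Passing the registers as a pointer is only there so the logic can also be checked off-target. It assumes the fill value lives in a word-aligned 32-bit variable in RAM; the 16-bit `black` from the first post supplies only half of each word read.

```c
#include <assert.h>
#include <stdint.h>

/* One DMA channel's registers; DMA0 starts at 0x040000B0. */
struct dma_regs {
    volatile uint32_t sad;   /* source address */
    volatile uint32_t dad;   /* destination address */
    volatile uint16_t cnt_l; /* unit count (14 bits on DMA0) */
    volatile uint16_t cnt_h; /* control */
};

#define DMA_ENABLE    0x8000u /* bit 15: start the transfer */
#define DMA_32BIT     0x0400u /* bit 10: 32-bit units instead of 16-bit */
#define DMA_SRC_FIXED 0x0100u /* bits 7-8 = 2: keep re-reading one source */

/* Hypothetical helper: fill `bytes` bytes at dst_addr with the word at src_addr. */
static void dma_fill32(struct dma_regs *dma, uint32_t src_addr,
                       uint32_t dst_addr, uint32_t bytes)
{
    uint32_t words = bytes / 4;

    assert(bytes % 4 == 0);                /* whole words only */
    assert(words >= 1 && words <= 0x4000); /* 14-bit count limit */

    dma->sad = src_addr;
    dma->dad = dst_addr;
    dma->cnt_l = (uint16_t)(words & 0x3FFFu); /* 0x4000 is written as 0 */
    dma->cnt_h = (uint16_t)(DMA_ENABLE | DMA_32BIT | DMA_SRC_FIXED);
}

/* On hardware (assumed names):
 *   static uint32_t black32 = 0;
 *   dma_fill32((struct dma_regs *)0x040000B0,
 *              (uint32_t)&black32, (uint32_t)video_buffer, 240 * 160);
 */
```

A full mode 4 page is 240 * 160 = 38400 bytes, which is 0x2580 words and fits under the limit, while the same page counted in halfwords (0x4B00) does not.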

#45874 - AkumaATR - Wed Jun 15, 2005 11:24 pm

Interesting. I'll give it a go. I thought that the choice of whether to use 16-bit or 32-bit transfer was based on the type of memory being targeted, and according to the stuff I've been looking over VRAM sits on the 16-bit data bus? Thanks for the info... this was really confusing me.

#45877 - AkumaATR - Wed Jun 15, 2005 11:36 pm

Worked. Just halved 4B00 to 2580 (after changing to 32-bit transfers). Thanks muchos. Much faster now.

The materials I have (Jonathan Harbour's text) claim that video memory can only be written 16 bits at a time. But then they say that the bus is capable of a full 32 bits. So maybe he meant that it can only be written 16 bits OR 32 bits at a time (I understand that his main goal was to point out that it can't be written 8 bits at a time)?

- Jason
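
To illustrate the 8-bit restriction mentioned above: in mode 4 each pixel is one byte, so two pixels share every 16-bit VRAM location, and a single pixel is usually set with a 16-bit read-modify-write. The sketch below is a hypothetical routine (`plot_pixel_mode4` and `MODE4_*` are illustrative names, not the poster's `plotPixelMode4`), assuming the even-x pixel sits in the low byte.

```c
#include <assert.h>
#include <stdint.h>

#define MODE4_WIDTH  240
#define MODE4_HEIGHT 160

/* Hypothetical plot routine; `page` points at the start of a mode 4 page. */
static void plot_pixel_mode4(volatile uint16_t *page, int x, int y,
                             uint8_t color)
{
    volatile uint16_t *cell;
    uint16_t pair;

    assert(x >= 0 && x < MODE4_WIDTH && y >= 0 && y < MODE4_HEIGHT);

    /* Two pixels per halfword, so the halfword index is the pixel index / 2. */
    cell = &page[(y * MODE4_WIDTH + x) / 2];
    pair = *cell; /* 16-bit read */

    if (x & 1)
        pair = (uint16_t)((pair & 0x00FFu) | ((unsigned)color << 8)); /* odd x: high byte */
    else
        pair = (uint16_t)((pair & 0xFF00u) | color);                  /* even x: low byte */

    *cell = pair; /* 16-bit write; the neighbouring pixel is preserved */
}
```

Bulk clears do not need the read-modify-write step, which is why they can use 16-bit or 32-bit writes directly.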

#45879 - AkumaATR - Wed Jun 15, 2005 11:42 pm

For clarity to anyone who might be reading this thread in the future:

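The reason that there is a 0x4000 limit on the transfer count for 16-bit DMA transfers is that the register that stores the number of halfwords to copy only uses 14 bits (bits 0-13). The same limit applies to the number of words in a 32-bit transfer, but each unit then covers twice as many bytes.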

2^14 = 16384

- Jason
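
The arithmetic can be worked through in a few lines. This is a sketch under one assumption: the bits above the 14-bit count field are simply ignored, and a field value of 0 stands for the maximum of 0x4000. The helper name `dma0_effective_units` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DMA0_COUNT_MASK 0x3FFFu /* low 14 bits of the count register */

/* Units actually transferred for a value written to the DMA0 count register. */
static uint32_t dma0_effective_units(uint32_t requested)
{
    uint32_t units;

    assert(requested <= 0xFFFFu); /* the register itself is 16 bits wide */

    units = requested & DMA0_COUNT_MASK;
    return units == 0 ? 0x4000u : units; /* 0 is treated as the maximum */
}
```

Under that assumption, writing 0x4B00 leaves 0x0B00 = 2816 halfwords, or 5632 of the 38400 bytes in a mode 4 page. That is roughly 15% of the screen, close to the "1/6 - 1/5" estimated in the first post. 0x2580 words passes through unchanged.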

#45889 - sajiimori - Thu Jun 16, 2005 12:38 am

Using the CPU to fill is faster than DMA, because (stupidly enough) DMA re-reads the fill value for every word. There have been some threads about this before, but I don't remember their titles.

#45890 - AkumaATR - Thu Jun 16, 2005 1:13 am

fill value? can you elaborate? thanks.

jason

#45891 - AkumaATR - Thu Jun 16, 2005 1:32 am

Just another comment: I built two nearly identical versions, one using DMA and one with a for loop for filling the background. DMA does appear to be much faster.

jason

#45892 - gladius - Thu Jun 16, 2005 1:42 am

The fill value is the value you are using to clear your screen. DMA is actually always doing a copy, so a DMA fill operation is really a copy with the source location fixed. Unfortunately, the DMA controller simply re-reads the source location every time.

If you use something like CpuFastSet(), as opposed to just a for loop for doing the filling, you should find that the CPU is 10-15% faster or so (iirc). The reason a for loop is so slow is that it is not optimised to do one thing (filling) extremely well, and is probably compiled as Thumb code to boot. In any case, it's not something to worry about unless it is truly time critical.
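
For comparison, a CPU fill keeps the value in a register and only issues writes. The sketch below is a plain C illustration of that idea with a hypothetical name (`cpu_fill32`); it is not CpuFastSet() and makes no claim to match its speed, which depends on the compiler, on ARM versus Thumb code, and on where the code runs from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU fill: `value` is loaded once and stored repeatedly. */
static void cpu_fill32(volatile uint32_t *dst, uint32_t value, size_t words)
{
    size_t i;

    assert(words % 8 == 0); /* keeps the unrolled loop simple */

    /* Eight stores per iteration to reduce loop overhead. */
    for (i = 0; i < words; i += 8) {
        dst[i + 0] = value;
        dst[i + 1] = value;
        dst[i + 2] = value;
        dst[i + 3] = value;
        dst[i + 4] = value;
        dst[i + 5] = value;
        dst[i + 6] = value;
        dst[i + 7] = value;
    }
}
```

A mode 4 page is 9600 words, a multiple of 8, so `cpu_fill32(page, 0, 9600)` would clear one page. Measuring on hardware is the only way to know which approach wins in a given program.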

#45902 - Cearn - Thu Jun 16, 2005 9:36 am

sajiimori wrote:
Using the CPU to fill is faster than DMA, because (stupidly enough) DMA re-reads the fill value for every word. There have been some threads about this before, but I don't remember their titles.

Like this one?