gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Beginners > Whats a good way to gain a bit of speed in Mode 3

#31326 - ghostils - Fri Dec 10, 2004 2:25 am

I don't really need sprites for my app (a car mp3 computer control console), but I would like to get my pixels on the screen a tad faster. Could copying images over DMA while in VSYNC wait mode be an effective way to do it?


Thanks,

-ghost[iLs]

#31330 - Miked0801 - Fri Dec 10, 2004 3:27 am

Switch to Mode 4 and blast twice as many pixels at a time :)

#31359 - Lupin - Fri Dec 10, 2004 12:25 pm

use 32 bit writes, in mode 4 you can draw 4 pixels at a time by using 32 bit writes

What exactly are you doing?
_________________
Team Pokeme
My blog and PM ASM tutorials
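
A minimal sketch of the 32-bit write idea from the post above, assuming a Mode 4 frame of 240x160 one-byte palette indices. The names pack4 and fill_mode4 are made up for illustration and are not from the thread; the destination is passed in as a pointer so the same code can be tried on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four 8-bit palette indices into one 32-bit word.
   The GBA is little-endian, so p0 ends up as the leftmost pixel. */
static uint32_t pack4(uint8_t p0, uint8_t p1, uint8_t p2, uint8_t p3)
{
    return (uint32_t)p0 | ((uint32_t)p1 << 8) |
           ((uint32_t)p2 << 16) | ((uint32_t)p3 << 24);
}

/* Fill one Mode 4 sized frame (240x160 bytes) with a single palette
   index, storing four pixels per write. dst must be 4-byte aligned. */
static void fill_mode4(uint32_t *dst, uint8_t index)
{
    uint32_t word = pack4(index, index, index, index);
    size_t i;

    for (i = 0; i < (240u * 160u) / 4u; i++)
        dst[i] = word;
}
```

On hardware, dst would point at the start of VRAM (0x06000000) cast to a 32-bit pointer; the loop then runs 9600 times instead of 38400.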

#31369 - ScottLininger - Fri Dec 10, 2004 4:06 pm

Are you compiling with the "optimize" flags?

If you call GCC with the -O3 parameter, it will attempt to optimize its output, and it can make a huge difference in performance speed. (At least, I think that's the tag... a quick forum search would find some other examples.)

Anyway, I had a Mode 4 program that was performing poorly. I added that one tag to my compile batch file and voila! Instant speed improvement.

It's worth a try.

Also, there's probably just one or two drawing functions that are slowing the whole thing down. I'd post your code from those and let the peanut gallery suggest optimizations. That's been really helpful to me in the past.

Cheers,

Scott

#31434 - ghostils - Fri Dec 10, 2004 11:41 pm

What I need to do is display my main background full screen image faster than it is currently... I'm using the standard nested for/loop method to display it now.... (really slow in mode 3) I've heard you can use DMA to write a full screen to a 2nd buffer in EWRAM... I would like to know how to do this...

I don't need to constantly update the screen with this background; it just needs to load quicker when the ROM boots, that's all. The rest of my routines are satisfactory for what I need to do in mode 3. (The area of the screen that will be updated is a solid color, so my janky code can just replace the 0 bits of my text and small selector bitmaps with the appropriate bgcolor to create pseudo transparency.)
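
The pseudo transparency described above could look roughly like this. It is only a guess at the idea, not the poster's actual routine: blit_keyed and its parameters are hypothetical, and it assumes 16-bit source pixels where 0 means "transparent".

```c
#include <stdint.h>

/* Copy a w x h bitmap into a 240-pixel-wide 16-bit frame at (x0, y0).
   Source pixels equal to 0 are treated as transparent and replaced by
   bg_color, which only looks right over a solid-color background. */
static void blit_keyed(uint16_t *frame, const uint16_t *src,
                       int x0, int y0, int w, int h, uint16_t bg_color)
{
    int x, y;

    for (y = 0; y < h; y++) {
        uint16_t *row = frame + (y0 + y) * 240 + x0;

        for (x = 0; x < w; x++) {
            uint16_t p = src[y * w + x];
            row[x] = p ? p : bg_color;
        }
    }
}
```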


But if you really need to see my bg draw routine here we go:


Code:

void draw_bg(u8 *gfx, u16* palette);

//-Draw a full background with full color:
void draw_bg(u8 *gfx, u16* palette)
{
   int x = 0, y = 0, i = 0;


   for(y = 0; y < 160; y++)
   {
      for(x = 0; x < 240; x++)
      {
         theVideoBuffer[x + y * 240] = palette[gfx[i]];
         i++;
      }
   }
}





Thanks,

-ghost[iLs]


#31448 - Lupin - Sat Dec 11, 2004 12:33 am

why do you use a palette for a full color background? Then you can also use paletted bitmap mode... but if you need full color mode i would store the image as a full color image and copy it with DMA - if you want to do it really fast you can first copy the image to EWRAM and then copy it over to VRAM (though this is only needed if you need to do it more than once)


Code:

#define DMACopyCH3(source,dest,wc,mode)      REG_DMA3SAD = (u32)source; \
                                   REG_DMA3DAD = (u32)dest; \
                                   REG_DMA3CNT = wc | mode;

/////COPY IMAGE TO SCREEN
DMACopyCH3(imgdata,VideoBuffer32,19200,DMA_32NOW);


I don't know how to declare an array in EWRAM using DKA, but I am sure it is possible. Once you have an array in EWRAM, just copy the image there first and then copy it to the video buffer when you need it.
_________________
Team Pokeme
My blog and PM ASM tutorials
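
For readers without the header that defines DMA_32NOW, here is a hedged sketch of what the macro above boils down to. The register addresses and bit positions are the standard DMA3 ones; the names dma3_cnt_value and copy_words, and the GBA_HARDWARE switch, are assumptions made for this example so the same file also builds on a PC.

```c
#include <stdint.h>
#include <string.h>

#define DMA_ENABLE  (1u << 31)              /* start the transfer */
#define DMA_32BIT   (1u << 26)              /* 32-bit units, not 16-bit */
#define DMA_32NOW   (DMA_ENABLE | DMA_32BIT) /* timing bits 0 = immediate */

/* 32-bit units in one Mode 3 frame: 240*160 pixels, 2 bytes each. */
#define MODE3_WORDS ((240u * 160u * 2u) / 4u)

/* Value written to the combined DMA3 count/control register. */
static uint32_t dma3_cnt_value(uint32_t words, uint32_t mode)
{
    return words | mode;
}

static void copy_words(void *dst, const void *src, uint32_t words)
{
#ifdef GBA_HARDWARE /* hypothetical build flag, not from the thread */
    *(volatile uint32_t *)0x040000D4 = (uint32_t)(uintptr_t)src; /* source */
    *(volatile uint32_t *)0x040000D8 = (uint32_t)(uintptr_t)dst; /* dest */
    *(volatile uint32_t *)0x040000DC = dma3_cnt_value(words, DMA_32NOW);
#else
    memcpy(dst, src, (size_t)words * 4u); /* PC stand-in for the DMA */
#endif
}
```

The 19200 in the macro call is MODE3_WORDS: 76800 bytes of frame divided into 4-byte units.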

#31449 - sajiimori - Sat Dec 11, 2004 12:33 am

There are many things to say about that code, but the bottom line is that your data doesn't match the video mode you're using, and the runtime translation from 8-bit palettized graphics to truecolor prevents you from doing any significant optimizations.

Well, the compiler may not notice that x+y*240 is always equal to i, so removing x and y might make a dent. It's insignificant compared to using data that matches the video mode, though.
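
The "remove x and y" suggestion could be sketched as below. This is an illustration only: draw_bg_flat is a made-up name, and the destination is passed as a parameter instead of using the thread's global theVideoBuffer so the function can be tested off-device. The per-pixel palette lookup is still there, which is the part that stays slow.

```c
#include <stddef.h>
#include <stdint.h>

/* Same work as the nested loops, with a single running index:
   x + y * 240 walks 0..38399 in order, so one counter is enough. */
static void draw_bg_flat(uint16_t *dst, const uint8_t *gfx,
                         const uint16_t *palette)
{
    size_t i;

    for (i = 0; i < 240u * 160u; i++)
        dst[i] = palette[gfx[i]];
}
```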

#31457 - ghostils - Sat Dec 11, 2004 12:51 am

I used the -O3 compiler opts as suggested above and that sped it up almost 2x... But I'm still looking for more hehe. As far as the palette goes, I was just using the data that gfx2gba1.03 created for me. As far as I know gfx2gba can only do 8-bit; however, the values it created in the palette array are 16-bit, so the data type should match the mode I'm in. Mode 3 is 16-bit color, correct? Word size shouldn't be an issue there. The only issue that still remains is how to create an EWRAM pointer that I can use to copy my data to via DMA. I found a good doc on the DMA register values. Now to solve the pesky EWRAM pointer.


Thanks,

-ghost[iLs]
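
On the 16-bit question: Mode 3 pixels and palette entries share the same 15-bit color layout, so the palette values do match the mode. The mismatch raised earlier in the thread is that the image itself is 8-bit indices, so every pixel still needs a lookup before it can be written. A small sketch of the color layout, with rgb15 as an illustrative helper name:

```c
#include <stdint.h>

/* Pack 5-bit red, green and blue into the GBA's 15-bit color format:
   red in bits 0-4, green in bits 5-9, blue in bits 10-14. */
static uint16_t rgb15(unsigned r, unsigned g, unsigned b)
{
    return (uint16_t)((r & 31u) | ((g & 31u) << 5) | ((b & 31u) << 10));
}
```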

#31460 - tepples - Sat Dec 11, 2004 12:58 am

Using devkitARM? Make an array in section ".sbss".
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
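
A hedged sketch of the ".sbss" suggestion above, using GCC's section attribute. The macro name IN_EWRAM_BSS and the array name bg_buffer are made up for this example, and the assumption that ".sbss" lands in EWRAM comes from the post above and depends on the devkitARM linker script in use. The attribute is compiled out on non-ARM targets so the declaration can be checked on a PC.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ELF__)
/* Ask GCC to place the variable in the ".sbss" section. */
#define IN_EWRAM_BSS __attribute__((section(".sbss")))
#else
#define IN_EWRAM_BSS /* no-op when building off-device */
#endif

/* One full Mode 3 frame: 240 * 160 pixels, 2 bytes each (76800 bytes). */
IN_EWRAM_BSS uint16_t bg_buffer[240 * 160];
```

Letting the linker place the array avoids picking an EWRAM address by hand.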

#31462 - Lupin - Sat Dec 11, 2004 1:12 am

If you don't want to change to 16 bit (it would look better in 16 bit though...) you can just use your old function to copy into EWRAM then copy from EWRAM to VRAM using DMA, this would actually save you ROM size - at the cost of image quality of course...
_________________
Team Pokeme
My blog and PM ASM tutorials
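
A rough sketch of the two-step approach described above, with the ROM-size arithmetic spelled out. It assumes a full 256-entry palette; expand_once and show_frame are hypothetical names, and memcpy stands in for the DMA copy so the sketch runs on a PC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_PIXELS   (240u * 160u)
#define ROM_BYTES_8BIT (FRAME_PIXELS + 256u * 2u) /* indices + palette */
#define ROM_BYTES_16BIT (FRAME_PIXELS * 2u)       /* raw 16-bit image */

/* Step 1 (slow, done once): expand 8-bit indices into a 16-bit
   staging buffer, which would live in EWRAM on the GBA. */
static void expand_once(uint16_t *staging, const uint8_t *gfx,
                        const uint16_t *palette)
{
    size_t i;

    for (i = 0; i < FRAME_PIXELS; i++)
        staging[i] = palette[gfx[i]];
}

/* Step 2 (fast, repeatable): copy the staged frame to the display.
   On hardware this is where the DMA transfer would go. */
static void show_frame(uint16_t *vram, const uint16_t *staging)
{
    memcpy(vram, staging, FRAME_PIXELS * sizeof(uint16_t));
}
```

Under these assumptions the 8-bit image costs 38912 bytes of ROM against 76800 for the 16-bit version.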

#31470 - ghostils - Sat Dec 11, 2004 2:18 am

The DMA transfer is exactly what I was look'n for. almost no pause before the image is displayed. Also I got the EWRAM buffer working properly as well:

Code:

u16* Ewram_Ptr = (u16*)0x02000000;


Though I can see this creating issues when I try to boot this on a GBA, if I overwrite the program code after multibooting. So it's really that base address + my program size + 10K bytes for my app to grow... that's prolly a little overkill, but it works for now. I tested it with VBA with the pointer starting after the 100KB mark in EWRAM, and it loads and DMAs the data to video memory perfectly.


Thanks for all the help and pointers I've received so far,

-ghost[iLs]
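
The "base address + program size + slack" arithmetic from the post above can be written down as a small check. This is a sketch under assumptions: ewram_buffer_addr is a made-up helper, the program size has to be supplied by hand, and the only hardware facts used are that EWRAM starts at 0x02000000 and is 256 KB long.

```c
#include <stdint.h>

#define EWRAM_BASE  0x02000000u
#define EWRAM_SIZE  (256u * 1024u)
#define FRAME_BYTES (240u * 160u * 2u) /* one Mode 3 frame */

/* Address for a frame buffer placed after the program plus some slack,
   rounded up to a 4-byte boundary so 32-bit DMA can use it.
   Returns 0 if a whole frame would not fit in EWRAM from there. */
static uint32_t ewram_buffer_addr(uint32_t program_bytes,
                                  uint32_t slack_bytes)
{
    uint32_t offset;

    if (program_bytes > EWRAM_SIZE || slack_bytes > EWRAM_SIZE)
        return 0;

    offset = (program_bytes + slack_bytes + 3u) & ~3u;
    if (offset > EWRAM_SIZE - FRAME_BYTES)
        return 0;

    return EWRAM_BASE + offset;
}
```

Starting at the 100KB mark leaves room for the frame (102400 + 76800 bytes is under 256 KB), which is consistent with the VBA test described above. A linker-placed array, as suggested earlier in the thread, avoids having to track the program size by hand.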