gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Coding > sb plz help me to optimize this code further more...

#3658 - jammin.won - Tue Mar 04, 2003 9:18 am

/* I was trying to fill all (or maybe part) of the screen with color as fast as possible in MODE_3. This is the best algorithm I can figure out with what I know, and I guess it is not the best possible, so please help me, bros =) */


void PlotPixel(u16 x,u16 y,u16 color){
    u16* vram=REG_VRAM;
    vram[(y)*240+(x)]=(color);
}

int main(void){

    SET_DISP_MODE(DISP_MODE_3 | DISP_BG2_ON);

    u16 x,y,r,g,b,count;

    for(y=0;y<160;y++){
        r=(int)(y*0.199);
        b=31-r;
        count=4;
        g=0;

        for(x=0;x<240;x++,count++){
            if(count>7){
                g++;
                count=0;
            }
            PlotPixel(x,y,SET_RGB(r,g,b));
        }
    }
}


//thx thx thx =)

#3660 - pelrun - Tue Mar 04, 2003 12:19 pm

Here's my quick stab at it. And it is definitely sub-optimal - further improvements are an exercise for the reader :) You should be able to rework this to only require one loop...

The r variable does not need a multiplication each iteration, since all that happens is it gets incremented by 0.199. The r2 variable is there so I don't have to perform an int cast every pixel when it's only needed once each line.

The count and g variables (and the if statement using them) are unnecessary as g can be directly calculated from x using just an add and a shift.

Code:

void PlotPixel(u16 x,u16 y,u16 color){
    u16* vram=REG_VRAM;
    vram[(y)*240+(x)]=(color);
}

int main(void){
    SET_DISP_MODE(DISP_MODE_3 | DISP_BG2_ON);

    // CHANGE: Remove g and count vars, add r2
    u16 x,y,b,r2=0;
    // CHANGE: Define r as float, initialise to zero.
    float r=0;

    for(y=0;y<160;y++){
        // CHANGE: r=(int)(y*0.199); removed
        b=31-r2;   // use the integer part of r, as the original code did
        // CHANGE: count=4; g=0; dropped

        // CHANGE: count++ removed
        for(x=0;x<240;x++){
            // CHANGE: calculate g from x directly
            PlotPixel(x,y,SET_RGB(r2, (x+4)>>3, b));
        }

        // CHANGE: r+=0.199; r2=(int)r; added
        r+=0.199;
        r2=(int)r;
    }
}
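
One possible next step, as a sketch rather than a definitive answer: the GBA's CPU has no floating-point hardware, so the float accumulator in the code above is emulated in software. The version below keeps the same idea (add a step per line instead of multiplying) but holds r in 16.16 fixed point, and it writes through a running pointer instead of calling PlotPixel. The names fill_gradient_fixed, fill_gradient_reference, R_STEP_FIX and the fb parameter are made up for this example; fb stands in for the Mode 3 frame buffer so the code can also be run on a PC. The reference function is the original poster's loop, kept only for comparison.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W 240
#define SCREEN_H 160

/* Same packing as SET_RGB in the thread: 5 bits per channel, red lowest. */
#define SET_RGB(r, g, b) ((uint16_t)((r) + ((g) << 5) + ((b) << 10)))

/* 0.199 in 16.16 fixed point: round(0.199 * 65536) = 13042. */
#define R_STEP_FIX 13042u

/* fb is a hypothetical SCREEN_W*SCREEN_H buffer; on hardware it would be
   the Mode 3 frame buffer at 0x06000000. */
void fill_gradient_fixed(uint16_t *fb)
{
    uint32_t r_fix = 0;   /* red, 16.16 fixed point */
    uint16_t *p = fb;     /* running pointer: no y*240+x per pixel */

    assert(fb != 0);
    for (int y = 0; y < SCREEN_H; y++) {
        uint32_t r = r_fix >> 16;        /* integer part, 0..31 */
        uint32_t b = 31u - r;
        uint16_t rb = SET_RGB(r, 0, b);  /* red and blue are constant per line */
        for (int x = 0; x < SCREEN_W; x++) {
            /* green comes straight from x: one add, one shift */
            *p++ = (uint16_t)(rb | (((x + 4) >> 3) << 5));
        }
        r_fix += R_STEP_FIX;             /* one integer add per line */
    }
}

/* Reference: the original per-pixel loop, writing to a buffer. */
void fill_gradient_reference(uint16_t *fb)
{
    for (int y = 0; y < SCREEN_H; y++) {
        int r = (int)(y * 0.199);
        int b = 31 - r;
        int count = 4, g = 0;
        for (int x = 0; x < SCREEN_W; x++, count++) {
            if (count > 7) { g++; count = 0; }
            fb[y * SCREEN_W + x] = SET_RGB(r, g, b);
        }
    }
}
```

Filling two buffers, one with each function, and comparing them pixel by pixel is a cheap way to check that the fixed-point step lands on the same red values for all 160 lines.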

#3662 - gb_feedback - Tue Mar 04, 2003 2:19 pm

Or...

Code:
void ClearScreen()
{
   
   u16 paperColour = RGB(255,0,0);

   u16* ScreenBuffer = (u16*)0x6000000;
   //this is the start of video memory

   REG_DM3SAD = (u32) &paperColour;
   REG_DM3DAD = (u32) ScreenBuffer;

   REG_DM3CNT = ((240 * 160) | VAL_DMACNT_SRC_INC_NC |
                     VAL_DMACNT_ENABLE |
                     VAL_DMACNT_SIZE_16 |
                     VAL_DMACNT_START_VBL);

   while (REG_DM3CNT_H & 0x8000)   // wait for completion
      ;
}


...(untested)
_________________
http://www.bookreader.co.uk/

#3665 - jammin.won - Tue Mar 04, 2003 2:31 pm

truly thankful to pelrun and gb =) that really helps a lot

I'll try it out tomorrow (China's tomorrow) ...

#3666 - jammin.won - Tue Mar 04, 2003 4:51 pm

well... I thought about it again and used something from pelrun's reply to form this solution. maybe not the best, but it's far better than the last one. thanks again to pelrun

btw... I really realize now the difference between simple addition/subtraction and multiplication/division; even the difference between multiplication and division is substantial... which I never gave a thought in PC programming =)

and... somebody told me that I can actually create a rectangle of 30*32 and enlarge it to fit the screen... so far I don't know how to do that, and I don't know whether it's even faster than the code below

the complete simple code:
Code:
//brief header definitions
typedef   unsigned short int   u16;
#define REG_DISPCNT      *(u16*)0x04000000   //DISPLAY CONTROL REGISTER
#define REG_VRAM_16B   (u16*)0x06000000   //address of Video RAM,MODE_3,5
#define DISP_MODE_3   0x0003      //BG 2 with 240*160. 16-bits TRUE color. ONLY 1 frame buffer
#define DISP_BG2_ON         0x0400      //BG2 enabled
#define SET_DISP_MODE(_MODE_)   REG_DISPCNT=(_MODE_)   //macro to setup screen mode
#define SET_RGB(r,g,b)   ((r)+((g)<<5)+((b)<<10))   //use bitwise shift to mount to proper position


int main(void){

   SET_DISP_MODE(DISP_MODE_3 | DISP_BG2_ON);
   u16* vram=REG_VRAM_16B;   
   u16 x,y,b,addr;

   addr=0;
   for(y=0;y<32;y++){
      b=31-y;
      for(;addr<y*1200+1200;){
         for(x=0;x<30;x++){
            vram[addr]=SET_RGB(y,x,b);
            vram[addr+1]=vram[addr];
            vram[addr+2]=vram[addr];
            vram[addr+3]=vram[addr];
            vram[addr+4]=vram[addr];
            vram[addr+5]=vram[addr];
            vram[addr+6]=vram[addr];
            vram[addr+7]=vram[addr];
            addr+=8;
         }
      }
   }

   while(1);//does nothing
}

#3676 - peebrain - Tue Mar 04, 2003 10:33 pm

Why exactly aren't you using DMA's?

Edit: oh nevermind

~Sean
_________________
http://www.pbwhere.com

#3705 - chrisrothery - Wed Mar 05, 2003 6:45 pm

Code:
for(;addr<y*1200+1200;){


You should definitely do something with the limits for addr.

Firstly, I think that each time round this loop it's going to compute y*1200+1200 to evaluate the condition. That value is fixed for the duration of that for loop, so do maxaddr = y*1200+1200 once and change the loop to limit on addr < maxaddr.

Second, maxaddr is a fixed set of values (1200, 2400, 3600 etc.), so why calculate it as a function of y anyway. initialise it to 1200 and increment by 1200 during the loop.

Not tested any of this but it's the most obvious thing left after the changes you've made to the original.
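
A minimal sketch of those two changes, under some assumptions: fill_bands and BAND_PIXELS are names invented for this example, and the vram parameter stands in for the Mode 3 frame buffer so the routine can be tried on a PC against a plain array. It also keeps the colour in a local variable instead of reading it back from vram[addr], which is an extra change that was not suggested above.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as SET_RGB in the thread: 5 bits per channel, red lowest. */
#define SET_RGB(r, g, b) ((uint16_t)((r) + ((g) << 5) + ((b) << 10)))

#define BAND_PIXELS 1200u   /* 5 lines of 240 pixels */

/* vram is a hypothetical 240*160 buffer; on hardware it would be the
   Mode 3 frame buffer at 0x06000000. */
void fill_bands(uint16_t *vram)
{
    uint32_t addr = 0;
    uint32_t maxaddr = BAND_PIXELS;   /* end of the current band */

    assert(vram != 0);
    for (uint32_t y = 0; y < 32; y++) {
        uint32_t b = 31u - y;
        while (addr < maxaddr) {      /* no multiply in the loop condition */
            for (uint32_t x = 0; x < 30; x++) {
                uint16_t c = SET_RGB(y, x, b);   /* computed once per 8 pixels */
                for (int i = 0; i < 8; i++) {
                    vram[addr + i] = c;
                }
                addr += 8;
            }
        }
        maxaddr += BAND_PIXELS;       /* 1200, 2400, 3600, ... by addition */
    }
}
```

Each of the 32 bands is 5 lines tall and each colour run is 8 pixels wide, so the pixel at index i should come out as SET_RGB(y, x, 31 - y) with y = (i / 240) / 5 and x = (i % 240) / 8.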

#3706 - jammin.won - Wed Mar 05, 2003 7:25 pm

that's right... I also noticed it when I used the changed code in another program; actually no multiplication is needed =)

and... lots of people reminded me about using DMA... hmmm, I haven't learned it yet... the current stage of my learning is SPRITE... but I must try DMA out because it sounds so sweet

thanks for all the replies, they help very much