gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

Coding > Style Question

#163298 - Talen - Mon Sep 29, 2008 1:42 am

I'm having a bit of a style dilemma.

I'm working on some NDS code involving 64x64-tile BGs,
and I have two ways to write my position-to-memory-index converter.

I just want to know what someone else thinks.

first way is

Code:

#define POS2IDX(x, y)   ((((x/32)+((y/32)+(y/32)))*32*32) + ((x%32) + ((y%32)*32)))

memoryMap[POS2IDX(x,y)] = tileId;


the second way is

Code:

// Number of tiles in a `small' map (i. e. not the full 64x64 one)
const int _TILE_W = 32;
const int _TILE_H = 32;

// Sets a tile in 64x64 map
// It takes care of the fact that the DS puts 4 32x32 maps after each other
void setTile(u16* memoryMap, int x, int y, const u16 tileId)
{
   int n;

   
   if(x >= _TILE_W && y >= _TILE_H)
   {
      n = 3;
      x -= _TILE_W;
      y -= _TILE_H;
   }
   else if(x >= _TILE_W)
   {
      n = 1;
      x -= _TILE_W;
   }
   else if(y >= _TILE_H)
   {
      n = 2;
      y -= _TILE_H;
   }
   else
   {
      n = 0;
   }

   memoryMap[n * _TILE_W * _TILE_H + x + y*_TILE_W] = tileId;
}
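
A quick way to see that the two versions address the same tile is to compute both indices for every position in the 64x64 map. The sketch below is not from the original post: idx_formula, idx_branches and count_mismatches are made-up names, and the branching version returns the index instead of writing to the map so the two can be compared.

```c
/* First way: the arithmetic from the POS2IDX macro, written as a function. */
static unsigned idx_formula(unsigned x, unsigned y)
{
    return (x / 32 + 2 * (y / 32)) * 32 * 32 + x % 32 + (y % 32) * 32;
}

/* Second way: pick the 32x32 sub-map first, the way setTile does. */
static unsigned idx_branches(unsigned x, unsigned y)
{
    unsigned n = 0;
    if (x >= 32) { n += 1; x -= 32; }   /* right-hand sub-maps: 1 and 3 */
    if (y >= 32) { n += 2; y -= 32; }   /* lower sub-maps: 2 and 3 */
    return n * 32 * 32 + x + y * 32;
}

/* Returns how many of the 64*64 positions give different indices. */
static int count_mismatches(void)
{
    int bad = 0;
    for (unsigned y = 0; y < 64; y++)
        for (unsigned x = 0; x < 64; x++)
            if (idx_formula(x, y) != idx_branches(x, y))
                bad++;
    return bad;
}
```

If count_mismatches() returns 0, the choice between the two really is only about style and speed.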

#163301 - Dwedit - Mon Sep 29, 2008 3:16 am

Compile both with -S and see which one executes fewer instructions.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

#163322 - Miked0801 - Mon Sep 29, 2008 4:33 pm

My calcs show that if you change around the if check, they will both run roughly the same speed:
Code:

(2 * (y/32) + x/32) * 1024 + (y & 0x1F) * 32 + (x & 0x1F)

r0 add y,shift 5
r1 add x, shift 5
r0 add r1
shift 10
and Y shift 5
and x 1f
add x,y
mem op


vs
Code:

n = 0;

if(x >= W)
{
    n |= 1024 - W;  // x itself still carries the extra 32
}

if(y >= H)
{
    n |= 1024;      // y*W already contributes the other 1024
}
memMap[n + x + y*W] = tileId;

mov reg,0
cmp r1, W
orrge reg,#
cmp r2,H
orrge reg,#
add x,reg
add reg,y,shift

memop


So both take around 7 cycles to set up in ARM, plus the mem op. If you can change your math a bit so that you aren't relying on integer math to shift away the extra low bits, you can make the 1st version 1 or 2 cycles faster.
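
One reading of that last suggestion (an assumption, it is not spelled out in the post): for coordinates in 0..63, bit 5 of x and y already says which 32x32 sub-map the tile is in, so it can be moved straight to its final position rather than shifted down to 0/1 and back up again. pos2idx_bits is a hypothetical name.

```c
/* Sketch only. Assumes 0 <= x, y < 64 (a 64x64 map). */
static inline unsigned pos2idx_bits(unsigned x, unsigned y)
{
    return ((x & 32) << 5)    /* bit 5 of x becomes +1024 (right-hand sub-maps) */
         + ((y & 32) << 6)    /* bit 5 of y becomes +2048 (lower sub-maps)      */
         + ((y & 31) << 5)    /* row inside the 32x32 sub-map                   */
         +  (x & 31);         /* column inside the sub-map                      */
}
```

Whether this actually saves a cycle depends on what the compiler does with it, so checking the -S output is still the way to find out.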

#163340 - Talen - Tue Sep 30, 2008 1:13 am

I managed to talk to my old professor today. He recommended using the first option with shifts and XOR and, if I can, masks.

It's been real fun teaching myself bitwise operations... since I just had to take C++ classes instead of C, lol.

edit**
my final code ended up being

Code:
#define POS2IDX(x, y)   ((((x>>5)+((y>>5)+(y>>5)))<<10) + ((x&31) + ((y&31)<<5)))

#163544 - furrykef - Sat Oct 04, 2008 6:36 pm

I think you should use an inline function instead of a macro, so it would look something like this:

Code:
inline unsigned int Pos2Idx(unsigned int x, unsigned int y)
{
  return (((x/32)+((y/32)+(y/32)))*32*32) + ((x%32) + ((y%32)*32));
}


I went with the version with multiplies, divisions, and mods because it's clearer and it should compile to the same code (as long as the variables are unsigned).
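
The unsigned caveat matters because, for negative signed values, division and remainder are not the same operations as shift and mask: C99 division truncates toward zero and % takes the sign of the dividend. Tile coordinates are never negative, but with plain int the compiler cannot know that and has to emit extra fix-up code. A small illustration (the function names are made up for this example, and the masked result for -1 assumes a two's-complement machine such as ARM):

```c
/* Signed: x % 32 can be negative, so it is not the same as x & 31. */
static int rem_signed(int x)  { return x % 32; }   /* -1 gives -1 */
static int mask_signed(int x) { return x & 31; }   /* -1 gives 31 */

/* Unsigned: the two always agree, so the compiler is free to use an AND. */
static unsigned rem_unsigned(unsigned x)  { return x % 32; }
static unsigned mask_unsigned(unsigned x) { return x & 31; }
```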

Functions tend to have more intuitive behavior than macros and you should prefer them whenever possible.
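
A concrete case of the less intuitive macro behavior: a macro pastes its arguments in as text, so an argument like a + 1 gets split up by operator precedence. The sketch below uses a cut-down macro and function (COL_MACRO, col_func and the two wrappers are hypothetical names) that compute only the x%32 part of the index.

```c
/* Unparenthesized macro argument, same style as POS2IDX in the first post. */
#define COL_MACRO(x)  (x % 32)

/* The same calculation as a function: the argument is evaluated first. */
static inline unsigned col_func(unsigned x)
{
    return x % 32;
}

/* COL_MACRO(a + 1) expands to (a + 1 % 32), which is a + 1, not (a + 1) % 32. */
static unsigned col_via_macro(unsigned a) { return COL_MACRO(a + 1); }
static unsigned col_via_func(unsigned a)  { return col_func(a + 1); }
```

Writing the macro body as ((x) % 32) fixes this particular case, but in POS2IDX, where x and y each appear more than once, an argument with side effects such as x++ would still be evaluated repeatedly.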

- Kef

#163606 - Cearn - Mon Oct 06, 2008 1:12 pm

Two small notes on what furrykef said. First, GCC is somewhat odd in how it handles the `inline' keyword. With just `inline', a linkable symbol will still be created in each file the function is #included into, giving linker errors concerning multiple definitions. To avoid that, use `static inline'. Second, the basic arithmetic operators (+ - * / %) don't need so many parentheses, and it's probably clearer with fewer of them:

Code:
// NOTE: the *2 is required only for 64x64 BGs.
static inline unsigned int Pos2Idx(unsigned int x, unsigned int y)
{
    return (x/32 + y/32*2)*32*32 + x%32 + y%32*32;
}

Other than that, what he said.

#163612 - furrykef - Mon Oct 06, 2008 4:03 pm

Yeah, I simply copied the line from the macro so I didn't bother removing all those extra parentheses. Mainly because I didn't want to do it wrong and screw the whole thing up, I guess.

Anyway, using "static inline" shouldn't be necessary if you're compiling C++ code, but it seems that it is if you're compiling C code. (Using "static inline" in C++ code probably wouldn't hurt, though.) Nice catch. Is it required when you compile in C99 mode as well? I think GCC still defaults to C89 mode.

- Kef

#163676 - ninjalj - Wed Oct 08, 2008 5:09 pm

Cearn wrote:
First, GCC is somewhat odd in how it handles the `inline' keyword. With just `inline', a linkable symbol will still be created in each file the function is #included into, giving linker errors concerning multiple definitions. To avoid that, use `static inline'.


Well, that's because in C, static symbols are visible only within their compilation unit (.c file), while non-static symbols can be seen from any compilation unit.

Thus, an inline function is compiled inline when called from the current compilation unit, but also gets a non-inline copy for calls from other compilation units. static inline functions don't need the non-inline copy.
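
As a sketch of how this usually looks in practice, the helper goes in a header as static inline, and each .c file that includes it gets its own private copy with nothing exported to the linker. The file name tilemap.h and the set_tile wrapper are made up for this illustration, and the header content is shown in the same block so the example fits in a single file.

```c
/* --- tilemap.h (hypothetical header) --------------------------------- */
#ifndef TILEMAP_H
#define TILEMAP_H

/* static: visible only in the including .c file, so no symbol clashes.
   inline: a hint that calls should be expanded in place. */
static inline unsigned int Pos2Idx(unsigned int x, unsigned int y)
{
    return (x/32 + y/32*2)*32*32 + x%32 + y%32*32;
}

#endif /* TILEMAP_H */

/* --- any .c file that includes the header ---------------------------- */
typedef unsigned short u16;   /* stand-in for the libnds u16 typedef */

static void set_tile(u16 *map, unsigned int x, unsigned int y, u16 tileId)
{
    map[Pos2Idx(x, y)] = tileId;
}
```

On the C99 question: as far as I know, plain inline in C99 mode has the opposite problem (no out-of-line copy is emitted unless one file declares the function extern), so static inline is the simple choice in either mode.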