gbadev.org forum archive

Hi there,

Anybody know a fast way to divide by 24?
The multiply is easy (x<<4 + x<<3), but I can't work out the opposite...

Cheers.

shaymanjw wrote:

Hi there,

Anybody know a fast way to divide by 24?
The multiply is easy (x<<4 + x<<3), but I can't work out the opposite...

Cheers.

You could begin with: (x>>3)/3
There's still a division by 3, but perhaps someone knows how to optimize it.
There is always the solution of using a LUT then... For dividing a 16bit signed number, you would need a LUT of 8kbyte (32768 >> 3 = 4096, 2 bytes for each entry). It could be a valuable optimization on GBA (with LUT in ROM), but on NDS, it's likely a bad idea, due to cache... You'd better use hardware accelerators then.

Last edited by thoduv on Mon Sep 15, 2008 8:50 pm; edited 3 times in total

You could use:

Code:

int div24(int x)
{
int newX = x;
int sum = 0;
do
{
sum += newX >> 5;
newX = x - ((sum << 4) + (sum << 3));
}
while(newX >= 48);
if(newX >= 24) sum++;
return sum;
}

Edit: Forgot code tags.

The hardware divider is asynchronous, which is good. And it's quite fast, too.
_________________
July 5th 08: "Volumetric Shadow Demo" 1.6.0 (final) source released
June 5th 08: "Zombie NDS" WIP released!
It's all on my page, just click WWW below.

I believe this should work:

Code:

unsigned int div24(unsigned int v)
{
return (unsigned int)((unsigned long long)v * 0xaaaaaaab) >> (33 + 3);
}

It's based on the math from this page.

The ARM architecture has a MUL instruction in both ARM and THUMB modes.
The MUL instruction would take 3 cycles to multiply by 24. In ARM mode, two shifted additions would take 2 cycles. So far, so good.
But in THUMB mode, you can't use the shifted addition instruction, so each shift has to be a separate instruction, taking 4 cycles instead of 2 cycles. So using MUL wins out in THUMB mode.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."

Multiply by the inverse with a fixed point number.

Something like:

Code:

static const int inverse24 = 65536 / 24;

int div24(int num)
{
return (num * inverse24) >> 16;
}

You lose a touch of precision, but it is very fast. On DS, just use the hardware divider.

Miked0801 wrote:

Multiply by the inverse with a fixed point number.
Something like:

Code:

static const int inverse24 = 65536 / 24;

int div24(int num)
{
return (num * inverse24) >> 16;
}

You lose a touch of precision, but it is very fast.

You can get a little better accuracy for x/N by using something like (0x10000+N-1)/N for the reciprocal.

Code:

// For N=24 and x=24:
0x10000/24 = 2730 (==0xAAA)
24/24 = 24*0xAAA>>16 = 0xFFF0>>16 = 0 :(

(0x10000+23)/24 = 2730 (==0xAAB)
24/24 = 24*0xAAA>>16 = 0x1800C>>16 = 1 :)

The second version gives identical results to standard integer division up to at least x=8192.

Also, when compiling to ARM code, gcc will automatically convert divisions by a constant to multiplication by its reciprocal, and take care of any signed/unsigned shenanigans to boot.

Not in my experienced the compiler doesn't :P Man I wish we had GCC...

(x/16 + x/32) / 2 = (3x/32) / 2 = (6x/64) / 2 = 3x/64 = x / 24

->

anyint x_div_24 = (((anyint)x >> 4) + ((anyint)x >> 5)) >> 1 ;

3 shifts, one add, no fancy magic numbers for each type of integer.
_________________
Trying to bring more detail into understanding the wireless hardware

Maxxie wrote:

(x/16 + x/32) / 2 = (3x/32) / 2 = (6x/64) / 2 = 3x/64 = x / 24

->

anyint x_div_24 = (((anyint)x >> 4) + ((anyint)x >> 5)) >> 1 ;

3 shifts, one add, no fancy magic numbers for each type of integer.

how accurate is this? am I wrong or does this give 0 for anyint = 24?

Maxxie wrote:

(x/16 + x/32) / 2 = (3x/32) / 2 = (6x/64) / 2 = 3x/64 = x / 24

->

anyint x_div_24 = (((anyint)x >> 4) + ((anyint)x >> 5)) >> 1 ;

3 shifts, one add, no fancy magic numbers for each type of integer.

3/64 is not 1/24; it's closer to 1/21. The calculation here is effectively multiplication by a reciprocal, just with very low precision. 1/24= 0x0.0AA..B = 0.000010101010...11. The pattern is such that you need a fair number of shifted-adds for any kind of accuracy.

Erm yes oups :p
_________________
Trying to bring more detail into understanding the wireless hardware

Are you guys sure this has to be in DS development ?!

@a128 - You might be right - I'm working on a tile based platform game for the DS with odd shaped tiles (32x24) and this seemed like a reasonable place.

Thanks for all the replies (apologies if this is the wrong place - Mods please move to wherever).

If you have 32x24 pixel tiles, then you might want to store your world coordinates as fixed-point fractions of a tile and then multiply by 24 for display.

That said, in an ARM (not Thumb) subroutine, try using the smull or umull instruction to multiply by 0x0AAAAAAB and then taking the upper bit.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

gbadev.org forum archive

DS development > Fast divide by 24

#162890 - shaymanjw - Mon Sep 15, 2008 7:58 pm

#162893 - thoduv - Mon Sep 15, 2008 8:35 pm

#162894 - chuckstudios - Mon Sep 15, 2008 8:38 pm

#162895 - silent_code - Mon Sep 15, 2008 8:54 pm

#162896 - kusma - Mon Sep 15, 2008 9:23 pm

#162897 - Dwedit - Mon Sep 15, 2008 9:30 pm

#162899 - Miked0801 - Mon Sep 15, 2008 9:50 pm

#162902 - Cearn - Mon Sep 15, 2008 11:14 pm

#162911 - Miked0801 - Tue Sep 16, 2008 6:24 am

#162932 - Maxxie - Tue Sep 16, 2008 5:19 pm

#162934 - elhobbs - Tue Sep 16, 2008 5:36 pm

#162935 - Cearn - Tue Sep 16, 2008 5:50 pm

#162936 - Maxxie - Tue Sep 16, 2008 5:55 pm

#162941 - a128 - Tue Sep 16, 2008 8:40 pm

#162944 - shaymanjw - Tue Sep 16, 2008 9:22 pm

#162957 - tepples - Wed Sep 17, 2008 3:46 am