#162890 - shaymanjw - Mon Sep 15, 2008 7:58 pm
Hi there,
Anybody know a fast way to divide by 24?
The multiply is easy (x<<4 + x<<3), but I can't work out the opposite...
Cheers.
#162893 - thoduv - Mon Sep 15, 2008 8:35 pm
shaymanjw wrote: |
Hi there,
Anybody know a fast way to divide by 24?
The multiply is easy (x<<4 + x<<3), but I can't work out the opposite...
Cheers. |
You could begin with: (x>>3)/3
There's still a division by 3, but perhaps someone knows how to optimize it.
There is always the solution of using a LUT then... For dividing a 16bit signed number, you would need a LUT of 8kbyte (32768 >> 3 = 4096, 2 bytes for each entry). It could be a valuable optimization on GBA (with LUT in ROM), but on NDS, it's likely a bad idea, due to cache... You'd better use hardware accelerators then.
Last edited by thoduv on Mon Sep 15, 2008 8:50 pm; edited 3 times in total
#162894 - chuckstudios - Mon Sep 15, 2008 8:38 pm
You could use:
Code: |
int div24(int x)
{
int newX = x;
int sum = 0;
do
{
sum += newX >> 5;
newX = x - ((sum << 4) + (sum << 3));
}
while(newX >= 48);
if(newX >= 24) sum++;
return sum;
}
|
Edit: Forgot code tags.
#162895 - silent_code - Mon Sep 15, 2008 8:54 pm
The hardware divider is asynchronous, which is good. And it's quite fast, too.
_________________
July 5th 08: "Volumetric Shadow Demo" 1.6.0 (final) source released
June 5th 08: "Zombie NDS" WIP released!
It's all on my page, just click WWW below.
#162896 - kusma - Mon Sep 15, 2008 9:23 pm
I believe this should work:
Code: |
unsigned int div24(unsigned int v)
{
return (unsigned int)((unsigned long long)v * 0xaaaaaaab) >> (33 + 3);
}
|
It's based on the math from this page.
#162897 - Dwedit - Mon Sep 15, 2008 9:30 pm
The ARM architecture has a MUL instruction in both ARM and THUMB modes.
The MUL instruction would take 3 cycles to multiply by 24. In ARM mode, two shifted additions would take 2 cycles. So far, so good.
But in THUMB mode, you can't use the shifted addition instruction, so each shift has to be a separate instruction, taking 4 cycles instead of 2 cycles. So using MUL wins out in THUMB mode.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
#162899 - Miked0801 - Mon Sep 15, 2008 9:50 pm
Multiply by the inverse with a fixed point number.
Something like:
Code: |
static const int inverse24 = 65536 / 24;
int div24(int num)
{
return (num * inverse24) >> 16;
}
|
You lose a touch of precision, but it is very fast. On DS, just use the hardware divider.
#162902 - Cearn - Mon Sep 15, 2008 11:14 pm
Miked0801 wrote: |
Multiply by the inverse with a fixed point number.
Something like:
Code: |
static const int inverse24 = 65536 / 24;
int div24(int num)
{
return (num * inverse24) >> 16;
}
|
You lose a touch of precision, but it is very fast. |
You can get a little better accuracy for x/N by using something like (0x10000+N-1)/N for the reciprocal.
Code: |
// For N=24 and x=24:
0x10000/24 = 2730 (==0xAAA)
24/24 = 24*0xAAA>>16 = 0xFFF0>>16 = 0 :(
(0x10000+23)/24 = 2730 (==0xAAB)
24/24 = 24*0xAAA>>16 = 0x1800C>>16 = 1 :)
|
The second version gives identical results to standard integer division up to at least x=8192.
Also, when compiling to ARM code, gcc will automatically convert divisions by a constant to multiplication by its reciprocal, and take care of any signed/unsigned shenanigans to boot.
#162911 - Miked0801 - Tue Sep 16, 2008 6:24 am
Not in my experienced the compiler doesn't :P Man I wish we had GCC...
#162932 - Maxxie - Tue Sep 16, 2008 5:19 pm
(x/16 + x/32) / 2 = (3x/32) / 2 = (6x/64) / 2 = 3x/64 = x / 24
->
anyint x_div_24 = (((anyint)x >> 4) + ((anyint)x >> 5)) >> 1 ;
3 shifts, one add, no fancy magic numbers for each type of integer.
_________________
Trying to bring more detail into understanding the wireless hardware
#162934 - elhobbs - Tue Sep 16, 2008 5:36 pm
Maxxie wrote: |
(x/16 + x/32) / 2 = (3x/32) / 2 = (6x/64) / 2 = 3x/64 = x / 24
->
anyint x_div_24 = (((anyint)x >> 4) + ((anyint)x >> 5)) >> 1 ;
3 shifts, one add, no fancy magic numbers for each type of integer. |
how accurate is this? am I wrong or does this give 0 for anyint = 24?
#162935 - Cearn - Tue Sep 16, 2008 5:50 pm
Maxxie wrote: |
(x/16 + x/32) / 2 = (3x/32) / 2 = (6x/64) / 2 = 3x/64 = x / 24
->
anyint x_div_24 = (((anyint)x >> 4) + ((anyint)x >> 5)) >> 1 ;
3 shifts, one add, no fancy magic numbers for each type of integer. |
3/64 is not 1/24; it's closer to 1/21. The calculation here is effectively multiplication by a reciprocal, just with very low precision. 1/24= 0x0.0AA..B = 0.000010101010...11. The pattern is such that you need a fair number of shifted-adds for any kind of accuracy.
#162936 - Maxxie - Tue Sep 16, 2008 5:55 pm
#162941 - a128 - Tue Sep 16, 2008 8:40 pm
Are you guys sure this has to be in DS development ?!
#162944 - shaymanjw - Tue Sep 16, 2008 9:22 pm
@a128 - You might be right - I'm working on a tile based platform game for the DS with odd shaped tiles (32x24) and this seemed like a reasonable place.
Thanks for all the replies (apologies if this is the wrong place - Mods please move to wherever).
#162957 - tepples - Wed Sep 17, 2008 3:46 am
If you have 32x24 pixel tiles, then you might want to store your world coordinates as fixed-point fractions of a tile and then multiply by 24 for display.
That said, in an ARM (not Thumb) subroutine, try using the smull or umull instruction to multiply by 0x0AAAAAAB and then taking the upper bit.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.