gbadev.org forum archive

Hi,
Does anyone know a fast method of doing and inverse tangent? Basically, I want a direction from one point to another. The X and Y values that would be the parameters are sfp32, 1:15:16, and I just want an integer value returned for the angle, between 0 and 360. Right now I convert to float and use the math function for tan inverse, but doing those float computations very often (I want to use arc tan at least once every cycle) will probably slow things down in the future.

So what I am looking for is this:

u16 ArcTan(sfp32 Y, sfp32 X)
{
u16 Angle

//please help me here, something faster than:
// Angle = (u16)ArcTan(Y/X)
//where Y and X are converted to floats from sfp32

return Angle;
}

Anyone have any thoughts? Thanks in advance.

The fastest method is to use a lookup table. You just make a table of values, and lookup things in it. Something like:

Code:

angle = table[(y * 10) / (x * 10)];

The basic idea of the code above is to keep you from needing a table of 2^64 values, which would take up way too much memory. You'll sacrifice some accuracy, but it shouldn't matter for a simple game. If you need more accuracy, make it "* 100" or more. You'll have to figure out just how many entries your table will need, but that won't be hard to do. As for the table it's self, you can make a very simple C program to make one for you that you can directly import into source code.

If you need more help, I'll be glad to try. Does this help?
_________________
--Michael

Thanks for the reply, mbcook. I use a lookup table of course for the sin and cosine, which works out really well because the input for a sin or cos is in my case an integer and I only need a table of 450 values to do it, easily filled by a simple C program.

But for Arc Tan I was looking for more accuracy. I guess if I am going to cast my result to an int anyway I shouldn't be so picky. The input ratio of X/Y can have infinite values, and I could just trim it down to a reasonable estimate for a lookup table.

I wonder how big trig lookup tables are on the average calculator? I bet Texas Instruments uses some proprietary estimation algorithms instead tables.

Thanks mbcook.
Anybody else have any suggestions?

KashinKoji wrote:

The input ratio of X/Y can have infinite values

That can be worked around. Either the absolute value of Y/X is less than or equal to 1, or the absolute value of X/Y is less than 1. If you make your arctan2() function take both a y and an x value, you need only store values for 0 < y/x < 1, turn the fraction over if abs(y) > abs(x), and linearly interpolate values between the ones stored in the table. The arctan2() function becomes a divide, a multiply, and a few comparisons and additions.

Quote:

I bet Texas Instruments uses some proprietary estimation algorithms [in its calculators].

Proprietary? The graphing calculators probably use piecewise Taylor series approximations. Taylor series were invented before 1715; any patent on Taylor series has long since expired.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

Tepples is right about the taylor seires, but even if it didn't use that, a table would be a waste of rom. While arctan and such may be "slow" on a GBA, when you have a calculator with an operator who can't even react faster than 1/20th of a second (very rough number) the speed doesn't matter as long as it's not much longer than that. It will still appear to be instantaneous. Much cheaper than a table in most cases, I'd assume.
_________________
--Michael

mbcook wrote:

a table would be a waste of rom

A table might be a waste of ROM on a calculator but not in a GBA game. I just coded up a linear interpolated approximation to arctan() in Maple, and even an approximation with a 9-point table (from y/x = 0 to 1 by 1/8) is never off by more than 0.075 degree. Such a small error shouldn't practically affect object direction calculation.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

an atan table needs only 1/4th as many entries as you have angles. 128 integers (given 512 possible angles) is hardly a waste of ROM. still, to use such a table requires a binary search for the correct y/x ratio.

here's what you really want: Fixed-Point Atan2 With Self Normalization

Last edited by dukope on Mon Feb 24, 2003 5:04 am; edited 2 times in total

I agree, I think you guys missed my point though. In a gameboy where you might have to use the table hundreds of times per second, it's not only a waste, it's a good idea. But on a simple scientific calculator that you can buy for $10 where you don't need the speed, it's a waste. I love off topic discussions.
_________________
--Michael

concerning the table array ...

better use powers of 2 (e.g 64, 128, 256, ...) rather than 10 or 100 for premultipliers. The resulting assembly code will be more compact, hence faster. The ARM processor is able to perform a shift (can be used for multiplications by a power of 2) and an addition in a single instruction. this just rocks.

dukope wrote:

...here's what you really want: Fixed-Point Atan2 With Self Normalization

except that it uses a divide which is very slow on gba. Interesting link though :)

cheers

col.

KashinKoji wrote:

...I wonder how big trig lookup tables are on the average calculator? I bet Texas Instruments uses some proprietary estimation algorithms instead tables. ...

Most calculators used to use a variation on the 'cordic' (COrdinate Rotation DIgital Computer) approach - this can be implemented very cheaply in hardware as it uses shifts and adds.

"The CORDIC algorithm is a shift-add algorithm for computing trigonometric, hyperbolic trigonometric and linear functions and their inverses. It can also be used for log, exponent and square root. Common uses are sine and cosine generation, vector magnitude, polar-cartesian conversions, and vector rotation."

do a google on cordic - there's tons of info out there.

cheers

col.

Wow,
Thanks everybody for the input. Here's what I cooked up: (in pseudocode)

sfp16 ArcTanTable[90];

for(i = 0; i < i++)
{
ArcTanTable[90] = tan(RADIAN(i)) << 8SHIFT;
}

//that is how I generate the 90 point table in sfp16, (to be hardcoded) and I use a macro to //get the values from it into sfp32

#define GetTanVal(a) ArcTanTable[(a)] << 8SHIFT;

u16 MyArcTan(sfp32 DeltaX, sfp32 DeltaY)
{
sfp32 Ratio;
u16 Angle;
//check for special cases first, X or Y equalling 0, then get on to the good stuff

if(abs(DeltaX) > abs(DeltaY))
Ratio = abs(DeltaX) / abs(DeltaY);
else
Ratio = abs(DeltaY) / abs(DeltaX);

// now a quick binary sort to find the nearest ratio in the table. getting an upper and lower
//bound than a quick comparison to find which is closest, and setting the angle to that table
//index. I won't bore you with the details, but what it gives me is the nearest acute angle.

//so then I just check which quadrant from the point of the sprite that
//my angle is in by the signs of DeltaX and DeltaY, and determine the final angle based
//on which ratio i used, X/Y or Y/X.

if(DeltaX > 0)
{
if(DeltaY > 0) //quadrant III
{
if abs(DeltaY) > abs(DeltaX)
return 180 - Angle;
else
return 90 + Angle;
}
if(DeltaY < 0) // quadrant II
{
if(abs(DeltaY) > abs(DeltaX))
return 180 + Angle;
else
return 270 - Angle;
}
}
else if(Delta X < 0)
{
if(DeltaY > 0) // quadrant IV
{
if(abs(DeltaY) > abs(DeltaX))
return Angle;
else
return 90 - Angle;
}
else if(DeltaY < 0) //quadrant I
{
if(abs(DeltaY) > abs(DeltaX))
return 360 - Angle;
else
return 270 + Angle;
}
}
}
}

My actual code looks a lot different than that, but you get the idea. This scheme works out rather well for me, since I am only looking for whole angle values. So thanks everybody for the input. (and if you have comments on that pseudocode algorithm I would welcome those as well)

Thanks especially Teppels, it was your comment on switching the y/x and x/y for a ratio always less than 1 that really helped me. The kind of thing that you look at and think, "why didn't I think of that?" Except for the binary sort, which uses division by 2 (actually a left shift of 1) It is just a bunch of comparisons and some addition or subtraction. No interpolation needed at all.

Here's a question out of curiousity:
When I call this function, would it be faster to add a couple of variables to store the absolute value of DeltaX and DeltaY, or would it be faster to continually call the abs function?
Is it the case that the more local variables you have in a function the greater the overhead when that function is called?

Thanks again.
[edited: I had a bunch of spaces in there to make the pseudocode readable, but they got discarded byt he message board, so sorry for the mess.]

Um... no one has suggested BIOS?

Copied from CowBite Spec:

Code:

0x09: ArcTan

Input: r0 = Tangent(angle) (16-bit; 1 bit sign, 1 bit integral, 14 bit decimal)
Output: r0 = "-PI/2<THETA/<PI/2" in a range of 0xC000h-0x4000.
Note: There is a problem in accuracy with "THETA<-PI/4, PI/4<THETA"

0x0A: ArcTan2

Note: I'm unsure about this one, since there is conflicting info about its purpose, and I have not tested it.
Version 1:
Calculates the arctangent of the given point.
Input: r0 = X (signed 16-bit), r1 = Y (signed 16-bit)
Output: r0=arctan

Version 2:
Calculates the arctangent after correction processing.
Input: r0 = Tangent(angle) (16-bit; 1 bit sign, 1 bit integral, 14 bit decimal)
Output: r0 = 0000h-FFFFh for 0<=THETA<2PI

~Sean
_________________
http://www.pbwhere.com

#define PI (4*atan(1))
#define SCALE (1<<26)

S32 Tan2Tab[32];

void maketab(void)
{
S32 *p=Tan2Tab;

for (double i=0;i<32;i++)
{
/* tan(a)=2^-n a = atan(2^-n) */

double j=2.0;
double tmp=pow(j,-i);
tmp=atan(tmp);

*p++=tmp*SCALE;
}
}

/* atan2 function in degs using cordic rotator */

S32 atan2c(S32 y,S32 x)
{
S32 tmp,ang=0,*p=Tan2Tab;
U32 tmp1;

/* 'normalise' for accuracy */

if (x<0)
{
x=-x;y=-y;
ang=PI*SCALE;
}

tmp1=x;

if (y>0)
{
tmp1|=y;
ang=-ang;
}
else tmp1|=-y;

while(!(tmp1 & 0x20000000)) {tmp1<<=1;x<<=1;y<<=1;}

/* 12 is good number */

for (int i=0;i<32;i++)
{
if (y<0)
{
ang-=*p++;
tmp=x-(y>>i);
y+=(x>>i);
x=tmp;
}
else
{
ang+=*p++;
tmp=x+(y>>i);
y-=(x>>i);
x=tmp;
}
}

return ang;
}

S32 acosc(S32 x)
{
S32 y=sqrt(1.0*SCALE*SCALE-1.0*x*x);
return atan2c(y,x);
}

void main(void)
{
maketab();

#if 1 /* atan2 */

for (double ang=0;ang<360;ang+=1)
{
double a=ang*PI/180;

S32 x=cos(a)*SCALE;
S32 y=sin(a)*SCALE;

double angi=atan2c(x,y)*1.0/SCALE;
double angf=atan2(x,y);

printf("int %lf fpu %lf\n",angi,angf);
}
#else /* acos */

for (double ang=0;ang<360;ang+=1)
{
double a=ang*PI/180;

S32 x= cos(a)*SCALE;

double angi=acosc(x)*1.0/SCALE;
double angf=acos(1.0*x/SCALE);

printf("int %lf fpu %lf %lf\n",angi,angf,100.0*angi/angf);
}

#endif
}

peebrain wrote:

Um... no one has suggested BIOS?

Copied from CowBite Spec:

Code:

0x09: ArcTan

Input: r0 = Tangent(angle) (16-bit; 1 bit sign, 1 bit integral, 14 bit decimal)
Output: r0 = "-PI/2<THETA/<PI/2" in a range of 0xC000h-0x4000.
Note: There is a problem in accuracy with "THETA<-PI/4, PI/4<THETA"

0x0A: ArcTan2

Note: I'm unsure about this one, since there is conflicting info about its purpose, and I have not tested it.
Version 1:
Calculates the arctangent of the given point.
Input: r0 = X (signed 16-bit), r1 = Y (signed 16-bit)
Output: r0=arctan

Version 2:
Calculates the arctangent after correction processing.
Input: r0 = Tangent(angle) (16-bit; 1 bit sign, 1 bit integral, 14 bit decimal)
Output: r0 = 0000h-FFFFh for 0<=THETA<2PI

~Sean

Oh yeah bios swi is the key to fast physics calculatios. Here are few SWI math function lookalike inline assembler defines (whee) that I fumbled up.
Just use big integer numbers to get some precision. I use 1024x times bigger than "real" values. So if you are doing physics that needs trigonometry you need 4 things: cos, sin, atan and square root. My advice is to use lookup tables for cos and sin ( lets say like 360 integer entries in array ), then use SWI calls for square root and arc tan. That swi atan func below returns degree as in integer not radian/float. Same goes for square root, eats int pukes int. See those above cowbite specs for more specific declaration for those ints.

-ari rusakko

ps. If this didn't make much sense, I could try to explain it in detail in email or AIM (email: rusakko@lut.fi / AIM: ruzakko)

pps. these SWI calls are _fast_

Code:

#define SWIsqrt(x) (\
{\
u16 __value ;\
u32 __arg = (x);\
__asm volatile (\
" mov r0,%1 \n\
   swi 0x80000 \n\
   mov %0,r0"\
   : "=r" (__value) \
   : "r" (__arg) \
   : "r0");\
__value;\
})

#define SWIatanDEC(x,y) (\
{\
s16 __value ;\
s16 __arg1 = (x);\
s16 __arg2 = (y);\
__asm volatile (\
" mov r0,%1 \n\
   mov r1,%2 \n\
   swi 0xa0000 \n\
   mov %0,r0"\
   : "=r" (__value) \
   : "r" (__arg1) \
   , "r" (__arg2) \
   : "r0", "r1");\
(__value/180);\
})

_________________
Current binary for my pinball game demo:
www.lut.fi/~rusakko/gba/flibu.gba

gbadev.org forum archive

Coding > Fast Fixed Point Arc Tan function

#3378 - KashinKoji - Sun Feb 23, 2003 9:01 pm

#3385 - mbcook - Sun Feb 23, 2003 9:44 pm

#3386 - KashinKoji - Sun Feb 23, 2003 10:06 pm

#3390 - tepples - Sun Feb 23, 2003 10:33 pm

#3396 - mbcook - Mon Feb 24, 2003 12:45 am

#3403 - tepples - Mon Feb 24, 2003 2:53 am

#3404 - dukope - Mon Feb 24, 2003 3:41 am

#3406 - mbcook - Mon Feb 24, 2003 4:08 am

#3421 - zeuhl - Mon Feb 24, 2003 2:34 pm

#3425 - col - Mon Feb 24, 2003 5:10 pm

#3427 - col - Mon Feb 24, 2003 5:16 pm

#3441 - KashinKoji - Mon Feb 24, 2003 10:50 pm

#3458 - peebrain - Tue Feb 25, 2003 8:43 am

#3645 - AnthC - Mon Mar 03, 2003 11:03 pm

#4204 - excessus - Sun Mar 23, 2003 12:43 pm