gbadev.org forum archive

Hey all,

I'm not a new programmer, but I am relatively new to the GBA platform. As I understand it, the ARM CPU does not have native hardware support for floating point mathamatics, but rather has an emulation of it and thus it should be avoided whenever possible since it will slow down performance. I also understand that fixed point math is the middle ground to this delima. What I wanted to know was how fixed point math works as opposed to floating point. How are floating point numbers stored in memory as opposed to fixed point?

I have read some of the ARM application notes on Advanced Risc Machines' web site but most of it consisted of mathamatical formulae to convert from one form to another but I am curious as to the inner workings of these formats. If I can grasp the technical difference, I think I would be more compentant in being able to correctly implement the fixed number format.

Thanks in advance!

M. Knittel

I'll try and explain my understanding of it with a couple of examples, having used fixed points a fair amount recently to do some 3D stuff.

The decimal point in "fixed point" numbers is always in the same place, as the name suggests. You can place it anywhere in an integer, but the usual place is 8 bits in using signed long ints (which have 4 bytes = 32 bits) Though it may be usefull to shift it more or less depending on the data range.

What you do is shift any decimal point number by the number of bits. Instead of having say 0.1, you would have 0.1<<8 or 0.1 * 256... which is 25.6, though as we store it as an int, we lose that bit of decimal pointage to store 25. Then you use this value all the time instead of 0.1. Usually you multiply a fixed point by a big number at some point and want to find the result, or you use some comparisons to find angles, etc.

In the first case you need to shift back down to find the "true" result. So for example 0.1 * 100 = 10. In the fixed notation... 100 * 25 = 2500 then shift down 8... 2500>>8 ~= 2500/256 = 9. More or less right, and for the sort of things fixed points are used for (3D rotations, etc) such rounding errors aren't usually as catastrophic as they appear. If you're careful with ranges,etc.

The second example, if you want to compare a "decimal point number" to a value... say you want something like:

if (x > 0.95)

then if you use fixed point maths, x will be shifted up by 8.. so you would need:

if (x > 243)

because if you simply shift x back down to compare, then it would lose all its data. (if x is less than 1<<8 then x>>8 will be zero)

I've probably missed out something fundamental, I've not mentioned the stuff like what happens when you divide or multiply fixeds because you've probably read that already. They aren't that difficult to grasp once you realise that all fixeds are, are smallish numbers multiplied by a bigish number so that the result can be stored in an int. When you want the smallish number back, you need to divide (i.e. shift down) by the bigish number.

Thanks for informative reply! I understand the concept of shifting bits, so this makes perfect sense. What I also want to know is what goes on at a binary level. Say for instance:

You have an unsigned short int of 16 bits. The bits are represented as:

00000000 00011001 (25 in decimal)

I have heard that fixed point math is kind of like normal decimal point math in that you can put a "point" somewhere along the bits like so:

00000000 0001.1001

which would make it 1 and 1001 out of 1111 possible binary values, or 1 and 9/16, or 1.5625 in decimal notation. What I want to know is how floats are stored and therefore how converting from floats to fixed works. Such as how does one represent 1.975 in fixed point math at a binary level? Are floats stored in a similar way and thus able to be stored into fixed notation relatively easily? How do floats calculate decimal values? If I understood both I could easily manipulate one from the other.

M Knittel

It isn't that complicated. Fixed point really is just "multiply up and lob off what's left over". So 1.975 would just be...

1.975 <<8 = 505 -> 111111001

'course, you wouldn't have 1.975 in floating point format at any point on the GBA, just its fixed representation.

I don't know 100% the binary representation of floats, I've only learned what I needed for GBA programming, but there's stacks of examples on the web.

http://www.google.com/search?q=how+are+floats+stored+in+binary&btnG=Google+Search&hl=en&lr=&ie=UTF-8&oe=utf-8

Thanks for the link. I read a couple of the articles and seem to understand things better. My final question is:

If the ARM processor doesn't natively process floating point units, and thus our need for using fixed point variables, would it impact the performance of the code to do something like:

typedef FIXED signed long int;

FIXED myFixedVariable = (int)(5.23723 << 8)

Seeing that 5.23723 is a floating point number, would this code snippet require the ARM CPU to emulate floating point math and slow things down?

Most compilators calculate constants and store them directly. I hope the GBA-compiler does :)

I'am not sure floats can be shifted.... can they?

" *256.0 " is equivalent of " <<8 "

Floats cannot be shifted.
_________________
You can't beat our meat

How then could one do the equivalent of storing a float constant into a fixed var?

Well, multiplication of course. :) Shift a integer value left (assuming that the left most bit is the most significant bit) by eight steps will result in the value being 256 times bigger, so you could multiply the original value by 256 to get the same result as arithmetic shift left.
_________________
You can't beat our meat

I define a type for my fixed-point values:

typedef long sfxd32; // signed, fixed-point 32-bit.

And then define:

#define I2F(x) ((x) >> 16)
#define F2I(x) ((x) << 16)

To convert between my fixed-point and integer variables.

Also, so I don't confuse myself, I tag all my fixed-points with 'F', e.g.:

int x;
sfxd32 xF;

x = F2I(xF);

To assign constants, you can:

xF = (sfxd32)(543.125 * 256);

And this is all done at compile time.

Addition of fixed-point values is simple.

zF = xF + yF;

Multiplication needs to be corrected:

zF = ((long long)xF * (long long)yF) >> 16;

You can use the ARM SMULL (I think) to do it better in asm.

The shift is required because you're multiplying values that have already been multiplied by 2^16:

xF = (float)x * 2^16.
yF = (float)y * 2^16.
so,
zF = x * y * 2^32.

You want to normalize that back to 2^16, so you >> 16.

Division works the same way, except you'll want to << 16 before the division.

Grumpy.

I see what you mean. Essentially I need to do this?

(assuming myDoubleValue is a constant and not a variable)

typedef FIXED signed short int;
FIXED myFixedVar = (FIXED)(myDoubleValue * (double)(1 << myShiftAmount));

Which would:

1) evaluate 1 << myShiftAmount to the multiplier (ex: myShiftAmount is 8, the multiplier would evaluate to 256)
2) cast the multiplier as a double to make it compatible with myDoubleVar (ex: 256 becomes 256.0)
3) multiply the constant myDoubleValue amount by the multiplier (ex: 256.0) to yield a double constant representing this amount
4) cast the multiplied double value into the FIXED type, truncating the decimal remainder and storing the value as an integer at a binary level.

Correct?

Ok I see the logic to having to compensate for operations on fixed number types... I still remember some grade-school math ;)

animension wrote:

I see what you mean. Essentially I need to do this?

(assuming myDoubleValue is a constant and not a variable)

typedef FIXED signed short int;
FIXED myFixedVar = (FIXED)(myDoubleValue * (double)(1 << myShiftAmount));

Which would:

1) evaluate 1 << myShiftAmount to the multiplier (ex: myShiftAmount is 8, the multiplier would evaluate to 256)
2) cast the multiplier as a double to make it compatible with myDoubleVar (ex: 256 becomes 256.0)
3) multiply the constant myDoubleValue amount by the multiplier (ex: 256.0) to yield a double constant representing this amount
4) cast the multiplied double value into the FIXED type, truncating the decimal remainder and storing the value as an integer at a binary level.

Correct?

Right-on. In fact you don't need step 2, as (1 << myShiftAmount) will be promoted to a float automatically.

Grumpy.

[quote="grumpycat
Right-on. In fact you don't need step 2, as (1 << myShiftAmount) will be promoted to a float automatically.

Grumpy.[/quote]

Sweetness! Thanks for all the help you guys! This is one of the reasons I have come to love developing for this platform: comraderie! :)

gbadev.org forum archive

Coding > Fixed point math... how does it work?

#511 - animension - Wed Jan 08, 2003 8:46 am

#512 - Quirky - Wed Jan 08, 2003 9:11 am

#514 - animension - Wed Jan 08, 2003 9:41 am

#524 - Quirky - Wed Jan 08, 2003 11:22 am

#554 - animension - Wed Jan 08, 2003 8:07 pm

#560 - coelurus - Wed Jan 08, 2003 8:34 pm

#578 - ampz - Wed Jan 08, 2003 9:43 pm

#579 - Touchstone - Wed Jan 08, 2003 9:45 pm

#590 - animension - Wed Jan 08, 2003 10:17 pm

#592 - Touchstone - Wed Jan 08, 2003 10:27 pm

#597 - grumpycat - Wed Jan 08, 2003 10:35 pm

#601 - animension - Wed Jan 08, 2003 10:45 pm

#603 - animension - Wed Jan 08, 2003 10:47 pm

#604 - grumpycat - Wed Jan 08, 2003 10:48 pm

#605 - animension - Wed Jan 08, 2003 10:50 pm