gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

DS development > Using 20:12 fixpoint with Cross / Dot products

#146209 - a128 - Fri Nov 30, 2007 3:06 pm

While using 20:12 fixpoint math ....I sometimes have overflows while computing vector cross products or dot products

multiplications and a sum of i.e.
Code:
a*a + b*b + c*c
will produce overflows sometimes

does someone have an idea how to avoid this ?

a trivial solution is to use "float" ..but is there a better solution?

#146212 - simonjhall - Fri Nov 30, 2007 4:53 pm

err....isn't this the same as http://forum.gbadev.org/viewtopic.php?t=14086 ;-)
Did you managed to sort it out before, of have the overflow problems come back?
_________________
Big thanks to everyone who donated for Quake2

#146214 - keldon - Fri Nov 30, 2007 5:56 pm

Are you remembering to shift afterwards, and keeping tabs on where the point is after an operation? If it is overflowing after the shift then you may need to work with less magnitude and more fractional bits; at least after the operation.

#146307 - a128 - Sun Dec 02, 2007 3:42 pm

keldon wrote:
Are you remembering to shift afterwards, and keeping tabs on where the point is after an operation?


keldon what do you mean with "tabs"

Oh yes ...what I did wrong was

for every a*a+b*b.... ..I did mulf32(a,a)+mulf32(b,b)

this is wrong

I should do
Code:

((long long)a*(long long)a+(long long)b*(long long)b)>>12


but his also tends to not fitting with 20:12 because the 20bit magnitude overflows?!

maybe I am not getting somethink right:
(still searchin for my brain in my apparment..not found yet:-) )[/code]

#146315 - DiscoStew - Sun Dec 02, 2007 6:52 pm

Using mulf32 is a good way to get started with 1:19:12 fixed-point math. You just need to make sure that any value you use is in the same format as well, so that you can shift the same amount each time.

Code:
((long long)a * (long long)a + (long long)b * (long long)b)>> 12


does the exact same thing (as far as results go) as

Code:
((long long)a * (long long)a) >> 12 + ((long long)b * (long long)b)) >> 12    (2 x mulf32 function)


The difference is how they are done.
_________________
DS - It's all about DiscoStew

#146319 - Peter - Sun Dec 02, 2007 7:27 pm

DiscoStew wrote:
Code:
((long long)a * (long long)a + (long long)b * (long long)b)>> 12

does the exact same thing (as far as results go) as
Code:
((long long)a * (long long)a) >> 12 + ((long long)b * (long long)b)) >> 12    (2 x mulf32 function)


I don't think so, for example:
Code:

((long long)a * (long long)a + (long long)b * (long long)b)>> 12

8192 = 2*4096 (2 in fixed poin 1.19.12)
32768 = (8192*8192 + 8192*8192) / 4096

But it should have been 8. The result has actually 24bits fraction, correct would be:
Code:

((long long)a * (long long)a + (long long)b * (long long)b)>> 24

_________________
Kind Regards,
Peter

#146322 - DiscoStew - Sun Dec 02, 2007 7:44 pm

But (32768 >> 12) does equal 8 though. 32768 is 8 in 1:19:12 fixed point.

And using the other method results in the same value

Code:
32768 = (8192 * 8192) >> 12 + (8192 * 8192) >> 12


It would be a 24-bit fraction is that plus sign in the middle was a multiple sign, I think.
_________________
DS - It's all about DiscoStew

#146323 - Peter - Sun Dec 02, 2007 7:56 pm

DiscoStew wrote:
But (32768 >> 12) does equal 8 though. 32768 is 8 in 1:19:12 fixed point.

eeeks, yes of course. What was I thinking.
_________________
Kind Regards,
Peter


Last edited by Peter on Sun Dec 02, 2007 10:57 pm; edited 1 time in total

#146332 - tepples - Sun Dec 02, 2007 10:11 pm

a128 wrote:
keldon wrote:
Are you remembering to shift afterwards, and keeping tabs on where the point is after an operation?

keldon what do you mean with "tabs"

idiom: keep tabs on
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#146383 - a128 - Mon Dec 03, 2007 11:55 am

DiscoStew wrote:


Code:
((long long)a * (long long)a + (long long)b * (long long)b)>> 12


does the exact same thing (as far as results go) as

Code:
((long long)a * (long long)a) >> 12 + ((long long)b * (long long)b)) >> 12    (2 x mulf32 function)


The difference is how they are done.


I guess the main difference is cpu speed ...you shift only 1x

#146522 - a128 - Wed Dec 05, 2007 10:00 am

simonjhall wrote:
err....isn't this the same as http://forum.gbadev.org/viewtopic.php?t=14086 ;-)


Thats correct....

I have one last question regarding to fixed_point 19:13 usage in your Quake1 port .

you do in mathlib.c
Code:
dist2.value = ((long long)p->normal[0].value*(long long)emaxs[0].value + (long long)p->normal[1].value*(long long)emaxs[1].value + (long long)p->normal[2].value*(long long)emaxs[2].value) / FP_SCALE;

why the division ? is it more preceise then shifting ? I guess so?!
Code:
dist2.value = ((long long)p->normal[0].value*(long long)emaxs[0].value + (long long)p->normal[1].value*(long long)emaxs[1].value + (long long)p->normal[2].value*(long long)emaxs[2].value) >>13;

#146523 - kusma - Wed Dec 05, 2007 10:39 am

If FP_SCALE is (1 << 13), my guess would be either because of programming semantics (the compiler generates the same code anyway -- but only for unsigned integer dividends), or to fix the negative value case (with shifts, negative values gets off-by-one compared to the division).

One could argue that conceptually it's a scale, not some bit-manipulation which is what the shift operator is mainly for. But to be honest, I think fixed point math is kind of a special case and that using shifts directly often is clearer since it's the more common way of doing it. And since it's slightly faster (usually compiles to something like "x >> y" instead of "x >> y + (x < 0 ? 1 : 0)") and the error is "acceptable" (1 ULP), I just find the shift even more appealing.

#146524 - simonjhall - Wed Dec 05, 2007 11:11 am

The float- to fixed-point conversion of Quake happened over a period of about six months, and when I started it I just did it willy-nilly. After getting some code from kusma then making an overloaded fixed point class, I got a lot more consistent at how I did it. That's why in later code you'll see me just shifting by FP_SHIFT (or whatever it's called).

Tbh I would hope that when dividing and integer by (the immediate) 8192 the compiler would just stick in a right shift by 13 places. If it doesn't (I'm sure it wouldn't) then it needs to be changed.
I'd imagine that the result would be exactly the same regardless of whether a shift or integer divide were used. I normally only see imprecise divides when using floating-point...
_________________
Big thanks to everyone who donated for Quake2

#146526 - kusma - Wed Dec 05, 2007 11:42 am

simonjhall wrote:

I'd imagine that the result would be exactly the same regardless of whether a shift or integer divide were used. I normally only see imprecise divides when using floating-point...

That's only true for unsigned integers where you shift in zeros at the top (when right-shifting). For signed integers you shift in the sign-bit, and negative values converge at -1 (all bits high), not 0 as you would expect.


Last edited by kusma on Wed Dec 05, 2007 11:44 am; edited 1 time in total

#146527 - Lino - Wed Dec 05, 2007 11:44 am

You can use the asr.

#146528 - kusma - Wed Dec 05, 2007 11:47 am

Lino wrote:
You can use the asr.

A right-shift on a signed value already results in an ASR. Where are you trying to go with this?

#146529 - Lino - Wed Dec 05, 2007 11:55 am

I have said if you have problems with right shift of signed inetger, you can use the asr instruction. Otherwise the logical right shift puts all zeros in the top, you lose the sign.

#146530 - simonjhall - Wed Dec 05, 2007 12:05 pm

Yes that's true. However the type in my fixed point class is an unsigned int, so I guess that's why I've not had weird problems yet ;-)
But yeah, I did discover these "shifting down the top bit" problems the other day at work when doing endian flipping. Took me ages to track down! Grr...
_________________
Big thanks to everyone who donated for Quake2

#146531 - kusma - Wed Dec 05, 2007 12:20 pm

Lino wrote:
I have said if you have problems with right shift of signed inetger, you can use the asr instruction. Otherwise the logical right shift puts all zeros in the top, you lose the sign.

No, because this is already what the compiler does. The compiler uses the right instruction for the right data. If you get an lsr shitf-modifier or instruction, you have a problem in your source code. No need for low-level hackery to get this working correctly.

#146532 - Lino - Wed Dec 05, 2007 12:23 pm

With my compiler if i use
Code:
 
u32 i;
i = (signed long)i >> 16;


it uses asr insetad of lsr. no devkitarm

#146538 - kusma - Wed Dec 05, 2007 2:26 pm

Lino wrote:
it uses asr insetad of lsr. no devkitarm

That is correct and as expected. The cast-operator has higher precedence, so you're shifting a signed value. ASR is for signed values LSR is for unsigned values. As said already, the compilers already handle this correctly, so there's no point in pulling in the assembly instructions in what is already a discussion on the C/C++ level.