gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

C/C++ > Fast Math

#9372 - Domovoi - Sun Aug 03, 2003 12:34 am

So... I'm wondering about math on the GBA. In general, I believe that additions are faster than subtractions which are faster than multiplications which are faster than divisions. Right?

Also, bit shifting should be faster than multiplying. For instance, a * 32 is slower than a << 5, which produces the same outcome, right? Is this also true on GBA?

Is there a way to do fast MOD's? I'm using val = thing%8 a lot.. I'm wondering if there's some clever and faster way to do this. Dividing by eight, for instance, is simply a bitshift to the right: thing>>3. Is there something you can do for mods too?

In short, do all these little math tricks work on the ARM CPU too, or are there exceptions? Or is it all just futile, because the compiler, for instance, already optimizes such calculations for you?

#9373 - tepples - Sun Aug 03, 2003 12:40 am

Domovoi wrote:
So... I'm wondering about math on the GBA. In general, I believe that additions are faster than subtractions which are faster than multiplications which are faster than divisions. Right?

Additions and subtractions take about one cycle. Multiplications take three to six, depending on the second operand. Divisions require a slow SWI call; division by a non-power-of-two constant can often be replaced with an ARM-mode multiply by inverse.

Quote:
Also, bit shifting should be faster than multiplying. For instance, a * 32 is slower than a << 5, which produces the same outcome, right? Is this also true on GBA?

Shifting by a constant takes one cycle. Shifting by a variable (a << b) takes two. There's no difference between one cycle and two if you're running code from ROM and using prefetch.

Quote:
Is there a way to do fast MOD's? I'm using val = thing%8 a lot.. I'm wondering if there's some clever and faster way to do this. Dividing by eight, for instance, is simply a bitshift to the right: thing>>3. Is there something you can do for mods too?

If b is a positive power of 2, then a % b == a & (b - 1). For example, thing % 8 == thing & 7. But as usual, there's an exception: when thing is negative, the two results can differ, because % rounds toward 0 (giving a negative or zero remainder), whereas & rounds toward negative infinity (always giving a result from 0 to b - 1).
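A minimal sketch of the mask trick in C, for illustration only. The helper names mod_pow2 and wrap_pow2 are made up for this example, and the signed variant assumes two's complement integers and C99-style % (truncation toward zero), which is what GCC does on ARM.

```c
#include <stdint.h>

/* Remainder for a power-of-two divisor, computed with a bit mask.
   Assumes b is a power of two (1, 2, 4, 8, ...); hypothetical helper. */
static inline uint32_t mod_pow2(uint32_t a, uint32_t b)
{
    return a & (b - 1u);
}

/* Signed variant: always lands in 0..b-1, even for negative a.
   This is NOT the same as a % b when a is negative:
   -3 % 8 gives -3, while -3 & 7 gives 5. */
static inline int32_t wrap_pow2(int32_t a, int32_t b)
{
    return a & (b - 1);
}
```

For unsigned values, mod_pow2(thing, 8) and thing % 8 agree. For signed values that may go negative (scroll offsets, for instance), the mask wraps around instead of returning a negative remainder, which is often what you want anyway.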
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#9375 - mtg101 - Sun Aug 03, 2003 1:48 am

That mod trick is very clever. I assume there are other tricks like it for other forms of calculation. Does anyone know of a site that lists such tricks?
_________________
---
Speaker for the Dead

#9378 - crossraleigh - Sun Aug 03, 2003 2:28 am

Try Hacker's Delight by Dr. Henry S. Warren, Jr., published by Addison-Wesley. It has all sorts of bit fiddling tricks.

If you spend any amount of time with computer arithmetic, you will love this book.

#9386 - Domovoi - Sun Aug 03, 2003 8:05 am

Indeed, that mod thing is great. I find that when working with tilemode, you get a lot of multiplications by 32 and divisions or mods by eight... both positive powers of two. Excellent.
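As a rough sketch of how those tile-mode calculations might look with shifts and masks: the code below assumes a map that is 32 tiles wide with 8x8-pixel tiles, and the function names tile_index and pixel_in_tile are invented for the example.

```c
#include <stdint.h>

/* Assumed layout: 32 tiles per map row, 8x8-pixel tiles. */

/* Map-entry index of the tile containing pixel (px, py). */
static inline uint32_t tile_index(uint32_t px, uint32_t py)
{
    uint32_t tx = px >> 3;     /* px / 8 */
    uint32_t ty = py >> 3;     /* py / 8 */
    return (ty << 5) + tx;     /* ty * 32 + tx */
}

/* Offset of a pixel coordinate inside its tile (0..7). */
static inline uint32_t pixel_in_tile(uint32_t p)
{
    return p & 7u;             /* p % 8 */
}
```

For pixel (19, 42) this gives tile column 2, tile row 5, so map index 5 * 32 + 2 = 162, with an x offset of 3 inside the tile. A compiler with optimization enabled will usually make the same substitutions for unsigned operands when you write * 32, / 8 and % 8 directly.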

tepples wrote:
Divisions require a slow SWI call; division by a non-power-of-two constant can often be replaced with an ARM-mode multiply by inverse.


Hmmm... Where can I learn more about an ARM-mode multiply by inverse? What's the difference between a normal multiply by inverse and an ARM-mode one? Oh, and what exactly is an SWI call?

Quote:
Shifting by a constant takes one cycle. Shifting by a variable (a << b) takes two. There's no difference between one cycle and two if you're running code from ROM and using prefetch.


Prefetch, huh? I looked it up... Seems like simply enabling a single bit. Are there any drawbacks to using prefetch, or can you just simply enable it all the time?

Thanks a lot for the tips, by the way!

#9391 - tepples - Sun Aug 03, 2003 3:22 pm

Domovoi wrote:
Hmmm... Where can I learn more about an ARM-mode multiply by inverse? What's the difference between a normal multiply by inverse and an ARM-mode one?

Thumb allows only for 32x32=32 bit multiplication, where the lower 32 bits are kept. Multiplying by a reciprocal typically requires the upper 32 bits from a 32x32=64 bit multiplication, and that's an ARM instruction.
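To illustrate the idea, here is a sketch of unsigned division by 10 using a reciprocal multiply. The function name div10 is made up for the example; the constant 0xCCCCCCCD is ceil(2^35 / 10). The (uint64_t) cast asks for the full 64-bit product, which is the part that needs the ARM-mode long multiply.

```c
#include <stdint.h>

/* Unsigned divide by 10 without a division instruction or SWI.
   Multiply by ceil(2^35 / 10) and keep the bits above bit 35,
   i.e. the upper 32 bits of the product shifted right by 3. */
static inline uint32_t div10(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0xCCCCCCCDu;  /* 32x32=64 multiply */
    return (uint32_t)(product >> 35);
}
```

Each divisor needs its own constant and shift. When the divisor is a compile-time constant, GCC will typically generate this kind of sequence on its own if you simply write x / 10, so it is worth checking the generated assembly before hand-coding it.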

Quote:
Oh, and what exactly is an SWI call?

Software interrupt. Read all about them in the CowBite spec, and use Andrew P. Bilyk's library to access them.

Quote:
Prefetch, huh? I looked it up... Seems like simply enabling a single bit. Are there any drawbacks to using prefetch, or can you just simply enable it all the time?

Prefetch makes the GBA draw about 10 percent more current from its batteries.
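For reference, a sketch of what "enabling a single bit" could look like. This assumes the commonly documented layout where prefetch is bit 14 of the 16-bit wait-state control register (WAITCNT, at 0x04000204 on hardware). The macro and function names are invented here, and the register is passed in as a pointer so the logic can also be tried off-device.

```c
#include <stdint.h>

/* Assumption: prefetch buffer enable is bit 14 of WAITCNT. */
#define WAITCNT_PREFETCH (1u << 14)

/* Set or clear the prefetch bit, leaving the wait-state bits alone.
   On hardware, waitcnt would be (volatile uint16_t *)0x04000204. */
static inline void set_prefetch(volatile uint16_t *waitcnt, int enable)
{
    uint16_t value = *waitcnt;
    if (enable)
        value = (uint16_t)(value | WAITCNT_PREFETCH);
    else
        value = (uint16_t)(value & ~WAITCNT_PREFETCH);
    *waitcnt = value;
}
```

Using read-modify-write rather than storing a fixed constant keeps whatever wait-state settings were already in the register.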

#9397 - mtg101 - Sun Aug 03, 2003 8:06 pm

So... prefetch... apart from the batt drain - are there any other things to take into account? Are there any cases, like when you've got lots of jumps in your code, that you may not want to enable it?

#9452 - Cyberman - Tue Aug 05, 2003 5:40 am

Prefetch refers to instruction prefetch. It gets you better performance if you have a long sequence of sequential instructions to execute. However, the queue is emptied on any branch instruction. The reason prefetch draws more power is that the bus is busier and the prefetch hardware is active on the Thumb processor. I'm not certain, but I believe it will affect DMA somewhat as well.

I don't think prefetch will get you as much as loading critical routines (i.e. functions that are called a lot during game execution) into IRAM and switching to ARM mode when they run, instead of running them from ROM in Thumb mode. You won't need to twiddle with prefetch for this.

Cyb