gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

Coding > Cost of basic operations

#26370 - doudou - Mon Sep 13, 2004 9:07 pm

Hi,

I have some performance issues and I would like to know the CPU cost of basic arithmetic operations. I'm sure there are web sites about this, but I didn't find what I was looking for with Google.

I don't need a precise cost on the ARM device, just a good idea of the relative cost of basic arithmetic operations (*, /, sqrt, etc.).

Thanks

#26371 - poslundc - Mon Sep 13, 2004 9:35 pm

Addition, subtraction and multiplication are the only arithmetic operations supported by hardware. (Bit operations such as &, | and ^, bitshifting and whatnot are also supported.)

Time for most arithmetic operations is 1 cycle. Multiplication takes a few more depending on the size of the operands; IIRC as many as 7 if you're doing long-long multiplication.

Floating point numbers are not supported by hardware and must be emulated in software. I don't have any numbers for you, but it's terribly slow.

Things like division and square root aren't supported in hardware, and must be calculated algorithmically. These algorithms may take anywhere between 50 and 200 cycles to process, depending on their efficiency.
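
To give an idea of what "calculated algorithmically" means, here is a minimal sketch of the classic shift-and-subtract method for unsigned division. It is not the routine GCC or the BIOS actually uses; the name soft_udiv and the choice to return 0xFFFFFFFF on a zero divisor are assumptions made for illustration. A plain loop like this runs 32 iterations of a few instructions each, which is one way to see how the cycle count climbs so far above a 1-cycle add.

```c
#include <stdint.h>

/* Hypothetical helper (not a library routine): restoring
 * shift-and-subtract division, producing one quotient bit per pass. */
static uint32_t soft_udiv(uint32_t num, uint32_t den, uint32_t *rem_out)
{
    uint32_t quot = 0;
    uint32_t rem = 0;
    int bit;

    if (den == 0) {
        /* Assumption: report all-ones on division by zero. */
        if (rem_out)
            *rem_out = num;
        return 0xFFFFFFFFu;
    }

    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the numerator. */
        rem = (rem << 1) | ((num >> bit) & 1u);
        /* If the divisor fits, subtract it and set this quotient bit. */
        if (rem >= den) {
            rem -= den;
            quot |= (uint32_t)1 << bit;
        }
    }

    if (rem_out)
        *rem_out = rem;
    return quot;
}
```

For example, soft_udiv(100, 7, &r) returns 14 and leaves 2 in r. Real implementations usually unroll the loop or skip leading zero bits to cut the cost.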

The GBA has a fast divide routine in its BIOS that isn't actually very fast, but beats out the default GCC division operator. It's there to use if you want, but must be called manually (which requires a bit of ASM).

When it comes to division, the compiler will make several smart optimizations for you as well. For example, if you are dividing by a constant that is a power of two, the division will be replaced with a bitshift. If you are dividing by a different constant value, the compiler will precalculate the reciprocal of the divisor and do a multiplication instead. So there are times when it's okay to use GCC's division, so long as you're aware of what's going on.
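
Written out by hand, those two optimizations look roughly like the sketch below. The function names are made up for illustration, and the code only covers unsigned values; for signed operands the compiler adds a small fix-up for negative numbers. The constant 0xCCCCCCCD with a shift of 35 is the usual reciprocal for an unsigned 32-bit divide by 10; what GCC actually emits depends on version and flags.

```c
#include <stdint.h>

/* Division by a power of two: n / 8 is the same as n >> 3
 * for unsigned values. */
static uint32_t div_by_8(uint32_t n)
{
    return n >> 3;
}

/* Division by the constant 10 via a precalculated reciprocal:
 * 0xCCCCCCCD is ceil(2^35 / 10), so the high bits of the 64-bit
 * product give n / 10 for every 32-bit unsigned n. */
static uint32_t div_by_10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}
```

Note that neither trick applies when the divisor is a variable only known at run time; that case still goes through the slow software routine.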

For many advanced mathematical functions (division, square root, and especially periodic functions like the trig functions sin, cos, tan, etc.) it is often desirable to precalculate a lookup table (LUT) and use it to directly look up the result of the calculation instead of performing it on the fly. If such a table is placed in EWRAM it can take as few as 10-12 cycles to obtain a result.
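
A minimal sketch of the LUT idea for sine follows. The table size (16 steps per full circle), the 8.8 fixed-point scale (256 means 1.0) and the names sin_lut, lut_sin and lut_cos are all assumptions chosen to keep the example short; a real table would more likely have 256 or 512 entries and be generated offline. Where the table ends up in memory depends on your toolchain and linker script, which is not shown here.

```c
#include <stdint.h>

#define SIN_LUT_SIZE 16  /* assumed: 16 angle steps per full circle */

/* Precalculated round(sin(2*pi*i/16) * 256) for i = 0..15. */
static const int16_t sin_lut[SIN_LUT_SIZE] = {
       0,   98,  181,  237,  256,  237,  181,   98,
       0,  -98, -181, -237, -256, -237, -181,  -98
};

/* The angle wraps automatically because the size is a power of two. */
static int16_t lut_sin(uint32_t angle)
{
    return sin_lut[angle & (SIN_LUT_SIZE - 1)];
}

/* cos(x) = sin(x + quarter circle), so one table serves both. */
static int16_t lut_cos(uint32_t angle)
{
    return lut_sin(angle + SIN_LUT_SIZE / 4);
}
```

The lookup itself is just a mask, an add and a load, which is why it beats computing the function on the fly by such a wide margin.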

Those're the basics of it...

Dan.