#113319 - Lick - Tue Dec 26, 2006 4:07 pm
I've ported some ActionScript tweening code (originally written by Robert Penner) to C, and it's now time to optimize it.
http://rafb.net/p/7ERmrV74.html - tween.c
http://rafb.net/p/rDb5ug58.html - tween.h
Anyone know where to start? Fixed-point? ARM assembly?
By the way, it works already. So if you want to use it, go ahead!
_________________
http://licklick.wordpress.com
#113322 - strager - Tue Dec 26, 2006 4:57 pm
I'd suggest you start by converting all those floats into fixed-point integers, then using LUTs in place of those costly trig functions, perhaps optimizing the calls to sqrt32 along the way.
After this you could optimize with ARM assembly. THUMB doesn't seem practical to me, with a lot of shifts (for fixed-point conversion) and multiplies everywhere.
Take it one function at a time, and be sure to check that everything works properly after every change. You don't want to learn the hard way...
If you want to use C++, a fixed point class may help in the conversion (convert from float to fixed easily, and vice versa, with simple typecasting). Of course, you can bring it back to C once everything is working fine.
Just some suggestions from an inexperienced programmer. =X
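A minimal sketch of the float-to-fixed conversion suggested above, applied to a linear tween of the form c*t/d + b. The 20.12 format and the names fixed_t, fix_mul, fix_div and tween_linear are assumptions for illustration; they are not taken from tween.c.

```c
#include <stdint.h>

/* Hypothetical 20.12 fixed-point type: 12 fractional bits. */
typedef int32_t fixed_t;
#define FIX_SHIFT 12
#define FIX_ONE   (1 << FIX_SHIFT)

static inline fixed_t fix_from_int(int32_t i) { return i * FIX_ONE; }
static inline int32_t fix_to_int(fixed_t f)   { return f / FIX_ONE; } /* truncates toward zero */

/* Multiply in 64 bits so the intermediate product cannot overflow. */
static inline fixed_t fix_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b) / FIX_ONE);
}

/* Pre-scale the numerator so the quotient keeps its fractional bits.
   b must be non-zero. A 64-bit divide is itself costly on ARM, so
   computing t/d once and reusing it is worth considering. */
static inline fixed_t fix_div(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * FIX_ONE) / b);
}

/* Linear tween: t = current time, b = start value,
   c = total change in value, d = duration. */
fixed_t tween_linear(fixed_t t, fixed_t b, fixed_t c, fixed_t d)
{
    return fix_mul(c, fix_div(t, d)) + b;
}
```

Halfway through a 50-step tween from 10 to 110, tween_linear(fix_from_int(25), fix_from_int(10), fix_from_int(100), fix_from_int(50)) gives fix_from_int(60).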
#113331 - Lick - Tue Dec 26, 2006 8:03 pm
Doing the floating point -> fixed point conversion right now. It seems so backwards to add so many more lines for supposedly faster execution speed.
Hope it really matters.
#113334 - OOPMan - Tue Dec 26, 2006 8:29 pm
For floating-to-fixed I'd guess it does, since software emu of Floating Point is sloooooooooooooow...
I'd try and avoid the SQRTs if possible as well...
_________________
"My boot, your face..." - Attributed to OOPMan, Emperor of Eroticon VI
You can find my NDS homebrew projects here...
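Where a square root can't be avoided (circular easing needs one), one option is an integer-only routine. The sketch below is the usual bit-by-bit method; the name isqrt32 is made up for this example, and it is not the sqrt32 call mentioned earlier in the thread.

```c
#include <stdint.h>

/* Integer square root: returns floor(sqrt(x)) using only shifts,
   adds and compares. Hypothetical helper for illustration. */
uint32_t isqrt32(uint32_t x)
{
    uint32_t result = 0;
    uint32_t bit = 1u << 30;   /* highest power of four that fits in 32 bits */

    /* Start at the highest power of four not greater than x. */
    while (bit > x)
        bit >>= 2;

    while (bit != 0) {
        if (x >= result + bit) {
            x -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;
    }
    return result;
}
```

Whether this beats a hardware or library square root on a given target has to be measured; it is only a starting point.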
#113339 - sajiimori - Tue Dec 26, 2006 9:26 pm
Lick, going by your "lines of code" measurement of speed, which of these functions is the fastest, and which is the slowest? Code: |
int test1(int n)
{
    for(int i = 0; i < 100; ++i)
        n += 5;
}
int test2(int n)
{
    ++n;
    ++n;
    ++n;
    ++n;
    return n;
}
int test3(int n)
{
    int f(int);
    return f(n);
} |
#113343 - Lick - Tue Dec 26, 2006 9:58 pm
Well, it's not lines of code. More of, the amount of operations. A simple multiply * by the compiler vs. all those fixed point operations.
#113346 - Lick - Tue Dec 26, 2006 10:19 pm
Here is the almost fixed-point (I haven't fully converted sine and circular easing yet) binary + source.
I'm going to finish the conversion later this week (will only take a few minutes) and then I'll compare it to the original floating point implementation.
That should be interesting.
- Lick
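For the sine easing that is still unconverted, a lookup table is one way to drop the trig call. This is a rough sketch, assuming a coarse 64-steps-per-turn angle unit and 12 fractional bits; the table size and the names sin_lut and ease_out_sine are illustrative and not from the posted source.

```c
#include <stdint.h>

#define LUT_ONE 4096  /* 1.0 with 12 fractional bits */

/* Quarter-wave sine table: entry k holds sin(k * pi / 32) * 4096, rounded. */
static const int16_t quarter_sine[17] = {
       0,  401,  799, 1189, 1567, 1931, 2276, 2598, 2896,
    3166, 3406, 3612, 3784, 3920, 4017, 4076, 4096
};

/* Sine of an angle given in 1/64ths of a full turn.
   The other three quadrants are mirrored from the first. */
int32_t sin_lut(uint32_t angle)
{
    uint32_t quadrant, idx;

    angle &= 63;            /* wrap to one full turn */
    quadrant = angle >> 4;  /* 16 steps per quadrant */
    idx = angle & 15;

    switch (quadrant) {
    case 0:  return  quarter_sine[idx];
    case 1:  return  quarter_sine[16 - idx];
    case 2:  return -quarter_sine[idx];
    default: return -quarter_sine[16 - idx];
    }
}

/* Sine ease-out, c * sin(t/d * pi/2) + b, on plain integers.
   Assumes 0 <= t <= d and d > 0. */
int32_t ease_out_sine(int32_t t, int32_t b, int32_t c, int32_t d)
{
    uint32_t angle = (uint32_t)((t * 16) / d);  /* 0..16 covers a quarter turn */
    return b + (int32_t)(((int64_t)c * sin_lut(angle)) / LUT_ONE);
}
```

With only 16 steps per quadrant the motion will be visibly stepped; a larger table or linear interpolation between entries would smooth it out.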
#113347 - sajiimori - Tue Dec 26, 2006 10:33 pm
If only using a language with built-in floating-point operations automatically imbued processors with FPUs! Such magic would certainly reduce the cost of manufacturing.
Alas, compilers must produce code that actually runs on the target processors, in this case by inserting calls to functions that perform floating-point math using a long series of integer instructions.
gcc -S is your friend.
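A small file to try that on. The function names are invented for the example, and the 12-bit shift assumes a 20.12 fixed-point format. Compile it with gcc -S using your ARM cross compiler and compare what each function turns into.

```c
#include <stdint.h>

/* On an ARM core without an FPU, this multiply is typically compiled
   into a call to a soft-float helper routine (for example __aeabi_fmul
   or __mulsf3, depending on the toolchain and ABI). */
float scale_float(float value, float factor)
{
    return value * factor;
}

/* The fixed-point version usually becomes a long multiply plus shifts,
   with no library call. Right-shifting a negative value is
   implementation-defined in C; GCC performs an arithmetic shift. */
int32_t scale_fixed(int32_t value, int32_t factor)
{
    return (int32_t)(((int64_t)value * factor) >> 12);
}
```

The exact output depends on compiler version and flags, so treat the comments as what to look for rather than a guarantee.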
#113356 - tepples - Tue Dec 26, 2006 11:59 pm
Lick wrote: |
Well, it's not lines of code. More of, the amount of operations. A simple multiply * by the compiler vs. all those fixed point operations. |
Then make your own C++ class that overloads operator *. The DS has enough RAM for C++'s overhead to be worth it.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#113358 - Lick - Wed Dec 27, 2006 12:04 am
That would be easy to do (much easier), but I restricted myself to C this time. Just to make it more portable. (And perhaps for later use on the ARM7)
#113361 - kusma - Wed Dec 27, 2006 1:36 am
Lick wrote: |
That would be easy to do (much easier), but I restricted myself to C this time. Just to make it more portable. (And perhaps for later use on the ARM7) |
The ARM7 also has no problems running C++-code. As long as you're gcc-based, there's no portability-reason to prefer C over C++ either. Sure, pure C is neat if you consider porting your code to some obscure micro-controller, but I strongly doubt that's the case. Myself, I'm using a templated fixed point class for "seamless" switching between float and fixed point arithmetic, and it's really worth it. One compile-time define, and you'll find out if your bug is due to fixed point range/precision-issues or not. And you'll be able to add assertions on overflows etc. It's REALLY convenient for the non-critical code in your projects.
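For anyone staying in plain C, a similar compile-time switch can be approximated with a typedef and a handful of helpers. This is only a C stand-in for the idea, not the templated class described above; TWEEN_USE_FLOAT, real_t and the helper names are all hypothetical.

```c
#include <stdint.h>

/* Build with -DTWEEN_USE_FLOAT to run the same code on floats. */
#ifdef TWEEN_USE_FLOAT

typedef float real_t;
#define REAL_FROM_INT(i) ((real_t)(i))
#define REAL_TO_INT(r)   ((int)(r))
static inline real_t real_mul(real_t a, real_t b) { return a * b; }
static inline real_t real_div(real_t a, real_t b) { return a / b; }

#else

typedef int32_t real_t;          /* 20.12 fixed point */
#define REAL_SHIFT 12
#define REAL_FROM_INT(i) ((real_t)(i) * (1 << REAL_SHIFT))
#define REAL_TO_INT(r)   ((int)((r) / (1 << REAL_SHIFT)))
static inline real_t real_mul(real_t a, real_t b)
{
    return (real_t)(((int64_t)a * b) / (1 << REAL_SHIFT));
}
static inline real_t real_div(real_t a, real_t b)
{
    return (real_t)(((int64_t)a * (1 << REAL_SHIFT)) / b);
}

#endif

/* Quadratic ease-in, c * (t/d)^2 + b, written once for both builds. */
real_t ease_in_quad(real_t t, real_t b, real_t c, real_t d)
{
    real_t ratio = real_div(t, d);
    return real_mul(c, real_mul(ratio, ratio)) + b;
}
```

If a result looks wrong in the fixed build, rebuilding with the define shows whether range or precision is to blame. Unlike a C++ class, arithmetic has to go through real_mul and real_div by hand.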
#113368 - tepples - Wed Dec 27, 2006 3:00 am
kusma wrote: |
Lick wrote: | That would be easy to do (much easier), but I restricted myself to C this time. Just to make it more portable. (And perhaps for later use on the ARM7) |
The ARM7 also has no problems running C++-code. |
I compiled the source code of one of the simplest libnds examples as C and as C++, and the C++ was 7 times bigger. (Results on GCC targeting Windows, on the other hand, contradicted this.) This is a problem if you're trying to fit your entire ARM7 program into its IWRAM. Do you want me to build a test case?
#113371 - kusma - Wed Dec 27, 2006 3:07 am
tepples wrote: |
kusma wrote: | Lick wrote: | That would be easy to do (much easier), but I restricted myself to C this time. Just to make it more portable. (And perhaps for later use on the ARM7) |
The ARM7 also has no problems running C++-code. |
I compiled the source code of one of the simplest libnds examples as C and as C++, and the C++ was 7 times bigger. (Results on GCC targeting Windows, on the other hand, contradicted this.) This is a problem if you're trying to fit your entire ARM7 program into its IWRAM. Do you want me to build a test case? |
I have, and I've found no size-issues as long as you remember to disable rtti and exceptions.
#113383 - HyperHacker - Wed Dec 27, 2006 6:38 am
sajiimori wrote: |
Lick, going by your "lines of code" measurement of speed, which of these functions is the fastest, and which is the slowest? Code: | int test1(int n)
{
    for(int i = 0; i < 100; ++i)
        n += 5;
}
int test2(int n)
{
    ++n;
    ++n;
    ++n;
    ++n;
    return n;
}
int test3(int n)
{
    int f(int);
    return f(n);
} |
|
Well the first one never returns its result so it's not of much use. The second would be optimized to "return n+4", and I'm not sure what the third even does.
_________________
I'm a PSP hacker now, but I still <3 DS.
#113388 - sajiimori - Wed Dec 27, 2006 7:33 am
HyperHacker, the first example does return, though the compiler may warn, the second example depends on the compiler and options (which is part of the point), and the whole point of the third example is that you don't know how fast it is even if you look at the compiler's output.
#113787 - dXtr - Sat Dec 30, 2006 4:28 pm
tepples wrote: |
I compiled the source code of one of the simplest libnds examples as C and as C++, and the C++ was 7 times bigger. (Results on GCC targeting Windows, on the other hand, contradicted this.) This is a problem if you're trying to fit your entire ARM7 program into its IWRAM. Do you want me to build a test case? |
did you try to strip out all the "junk", like the rtti and exception support?
_________________
go back to coding and stop screaming wolf :)
#113790 - tepples - Sat Dec 30, 2006 4:41 pm
Shouldn't the compiler strip out exception support if there isn't any throw in the project and RTTI support if there aren't any virtual methods in the project?
#113968 - sajiimori - Tue Jan 02, 2007 12:20 am
Stripping out stack unwinding code would be a job for the linker, and whether it's even possible depends on the way exceptions are implemented on the platform. I've always had to specify -fno-exceptions.
Not sure about RTTI; it's worth checking out if you don't need dynamic_cast. (I use it often enough to leave RTTI on.)
#131325 - a128 - Thu Jun 14, 2007 11:18 am
http://www.freewebtown.com/festival2005/fixed.tgz
a fixedpoint C++ class
includes some test code....works fine I guess
includes sin,cos stuff on top of libnds