gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

DS development > DS: Profiling, Optimisation and Things Not To Do

#85319 - MrD - Sun May 28, 2006 11:43 pm

Hey,

I'm converting a prototype engine currently using Windows-based libraries to the DS. The game renders the world and the sprites pixel by pixel (without any HW acceleration or gfx library sprite functions), and I'm trying to recreate the engine using an exrot background (mode 5) on the DS main screen. There are a lot of VRAM writes with this method though, and I think the engine will slow down greatly...
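
(For reference, the mode 5 exrot setup boils down to something like this; a minimal sketch assuming current libnds register names, not the engine's actual code:)
Code:

#include <nds.h>

void setupExrotBg( void ) {
   /* In mode 5, BG2 and BG3 can be extended-rotation bitmap backgrounds */
   videoSetMode( MODE_5_2D | DISPLAY_BG3_ACTIVE );
   vramSetBankA( VRAM_A_MAIN_BG );

   /* 16-bit 256x256 bitmap on BG3 */
   REG_BG3CNT = BG_BMP16_256x256;

   /* Identity affine matrix (8.8 fixed point, so 256 == 1.0) */
   REG_BG3PA = 256;  REG_BG3PB = 0;
   REG_BG3PC = 0;    REG_BG3PD = 256;
   REG_BG3X = 0;     REG_BG3Y = 0;
}

/* Every pixel the engine plots is then a 16-bit VRAM write: */
static inline void plot( int x, int y, u16 color ) {
   BG_GFX[y * 256 + x] = color | BIT(15);   /* bit 15 set = opaque pixel */
}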

At this stage, (almost) everything is running on the arm9... I'm trying to figure out some way of using the arm7 as well, but I'm not sure how I'd synchronise the two so that they work correctly.
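
(The standard way to do this with current libnds is the FIFO API, which postdates this thread; a rough arm9-side sketch, with hypothetical names kickArm7Job/onArm7Reply:)
Code:

#include <nds.h>

static volatile int arm7Done = 0;

/* Called on the arm9 whenever the arm7 sends a value back */
static void onArm7Reply( u32 value, void* userdata ) {
   arm7Done = 1;
}

void kickArm7Job( u32 job ) {
   fifoSetValue32Handler( FIFO_USER_01, onArm7Reply, NULL );
   arm7Done = 0;
   fifoSendValue32( FIFO_USER_01, job );   /* the arm7 side picks this up */
   /* keep rendering on the arm9; check arm7Done when the result is needed */
}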

Does anybody have any hints on common slow-spots in game engines, and could they recommend a method of discovering exactly which functions or actions are taking the most time?

I think trying to rewrite the sprite rendering engine to use the OAM would take too long, but would it be worth it for the speed increase?
_________________
Not active on this forum. For Lemmings DS help see its website.


Last edited by MrD on Mon May 29, 2006 2:42 am; edited 1 time in total

#85323 - Lazy1 - Mon May 29, 2006 12:35 am

A small piece of the profiling code I use every now and then...
Code:

#include <nds.h>
#include <stdio.h>   /* for iprintf */

extern u32 g_timerBaseMS;

/* Milliseconds elapsed since Timer_Init was called (see the timer code below) */
#define MILLISECOND_COUNTER ( TIMER1_DATA + g_timerBaseMS )

#ifndef PROFILE
   /* Profiling disabled: the macros compile away to nothing */
   #define SetupProfile( )
   #define BeginProfile( )
   #define EndProfile( )
#else
   #define SetupProfile( ) int ____startMS_ = 0;
   #define BeginProfile( ) ____startMS_ = MILLISECOND_COUNTER;
   #define EndProfile( ) iprintf( "Function %s took %dms\n", __FUNCTION__, (int)( MILLISECOND_COUNTER - ____startMS_ ) );
#endif

There are probably better ones out there, but that's what I use anyway.
You'll also need to adjust the MILLISECOND_COUNTER macro to however you're counting time.
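
For example, a function instrumented with these macros (building with PROFILE defined) would look something like this; UpdateWorld is just a hypothetical name:
Code:

void UpdateWorld( void ) {
   SetupProfile( );
   BeginProfile( );

   /* ... the work being measured ... */

   EndProfile( );   /* prints e.g. "Function UpdateWorld took 3ms" */
}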

This is how I count time...
Again, it could probably be done better.
Code:

#include <nds.h>

/* Lazy:
 * Since TIMER1_DATA will overflow in 65536 milliseconds, it is necessary
 * to add 65536 to this variable every time TIMER1_DATA overflows.
 * So, the total time elapsed since Timer_Init in milliseconds is g_timerBaseMS + TIMER1_DATA.
 */
u32 g_timerBaseMS = 0;

/* nds_timer1_overflow:
 * Adds 65536 to the base millisecond counter so we
 * don't lose any time when TIMER1_DATA rolls over.
 */
static void nds_timer1_overflow( void ) {
   g_timerBaseMS+= 65536;
}

/* Timer_GetMS:
 * Returns: The time in milliseconds since Timer_Init was called.
 */
int Timer_GetMS( void ) {
   return g_timerBaseMS + TIMER1_DATA;
}

/* Timer_Sleep:
 * Waits roughly ( usec ) microseconds. Note that swiDelay's
 * argument is really a raw delay-loop count rather than
 * microseconds, so treat this as an approximation.
 */
void Timer_Sleep( int usec ) {
   swiDelay( usec );
}

/* Timer_Init:
 * Initialize NDS hardware timers.
 */
void Timer_Init( void ) {
   /* Timer0 will overflow roughly every 0.98 milliseconds:
    * 32768 ticks of the ~33.51 MHz bus clock. (For a tick closer to
    * exactly 1 ms you could reload with TIMER_FREQ(1000) instead.)
    * The reload value must be written before the timer is enabled.
    */
   TIMER0_DATA = 32768;
   TIMER0_CR = TIMER_ENABLE | TIMER_DIV_1;

   /* When timer0 overflows, TIMER1_DATA is incremented by 1.
    * When timer1 overflows, 65536 is added to g_timerBaseMS so we
    * don't lose any time. Note the cascade timer needs TIMER_ENABLE
    * set as well, or it will never count.
    */
   TIMER1_DATA = 0;
   TIMER1_CR = TIMER_ENABLE | TIMER_CASCADE | TIMER_IRQ_REQ;

   /* Set and enable the interrupts for timer 1 */
   irqSet( IRQ_TIMER1, nds_timer1_overflow );
   irqEnable( IRQ_TIMER1 );
}
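
Putting it together, a minimal main sketch (assuming libnds; consoleDemoInit and swiWaitForVBlank are library calls, UpdateWorld is the hypothetical instrumented function from the earlier example):
Code:

#include <nds.h>

int main( void ) {
   consoleDemoInit( );      /* route iprintf output to a visible console */
   irqInit( );              /* needed on libnds versions that don't set up irqs before main */
   irqEnable( IRQ_VBLANK );
   Timer_Init( );

   while( 1 ) {
      UpdateWorld( );       /* instrumented with the profiling macros */
      swiWaitForVBlank( );
   }
}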

#85379 - silent_code - Mon May 29, 2006 5:00 pm

I rewrote my simple (old) QBasic engine, which used direct framebuffer rendering, for the NDS and it worked fine. But then I rewrote it again to make use of the hardware, and I tell ya, I now get more things in higher resolution, with even more special effects, at a constant framerate, compared to software rendering. The speed increase is great! Now I can spend the software rendering budget on thousands of particles and get the backgrounds and (4 times bigger) sprites for free! Try it, it won't hurt! ;)

It will certainly take some time to figure things out, but it's worth it, just believe me! Especially if you draw big sprites, like bosses that take up half the screen: you won't notice any speed issues with hardware sprites, whereas, depending on your implementation, a software renderer can cause a notable performance drop...
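
For illustration, moving a sprite to hardware means copying its graphics into sprite VRAM once and then only touching OAM each frame; a minimal sketch assuming libnds's SpriteEntry type and ATTR* macros (the 64x64, 256-colour format is just an example):
Code:

#include <nds.h>

void setupHwSprite( const u8* gfx, const u16* pal ) {
   videoSetMode( MODE_5_2D | DISPLAY_SPR_ACTIVE | DISPLAY_SPR_1D );
   vramSetBankB( VRAM_B_MAIN_SPRITE );

   /* Copy the 64x64, 256-colour graphics and the palette once.
    * dmaCopy, because VRAM and palette RAM don't take 8-bit writes. */
   dmaCopy( gfx, SPRITE_GFX, 64 * 64 );
   dmaCopy( pal, SPRITE_PALETTE, 256 * 2 );

   /* After this, moving the sprite is a couple of halfword writes to
    * OAM instead of re-blitting 4096 pixels into a bitmap background. */
   SpriteEntry* oam = (SpriteEntry*)OAM;
   oam->attribute[0] = ATTR0_COLOR_256 | ATTR0_SQUARE | 32;   /* y = 32 */
   oam->attribute[1] = ATTR1_SIZE_64 | 64;                    /* x = 64 */
   oam->attribute[2] = 0;                                     /* tile index 0 */
}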

greets,

Rob