#10532 - Opius - Mon Sep 08, 2003 12:22 am
How would you go about calculating CPU usage on the GBA? Is it a simple thing to implement, or do any emulators offer this kind of function?
Thanks
Opius
#10533 - DekuTree64 - Mon Sep 08, 2003 12:36 am
The best way I know is to set up a timer (or cascade two timers if your code takes more than 65535 cycles), call your function, then disable the timers. Multiply the number of ticks by 100, then divide by 280896, which is the number of cycles per frame, and you have your percentage of the total frame time. Multiply by 1000 or 10000 instead of 100 to get more precision if needed.
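A minimal sketch of the timer approach in C. The names `time_call` and `frame_percent_x100` are made up for illustration, and the code assumes the four timer 0/1 registers sit in one block of 16-bit words (on hardware, starting at 0x04000100). It takes the register block as a pointer so the arithmetic can also be tried off-target; treat it as a starting point, not a tested profiler.

```c
#include <stdint.h>

#define CYCLES_PER_FRAME 280896u   /* figure quoted in the post above */
#define TIMER_ENABLE     0x0080u   /* bit 7 of a timer control register */
#define TIMER_CASCADE    0x0004u   /* bit 2: count overflows of the previous timer */

/* tm points at four consecutive 16-bit registers:
   tm[0] timer 0 counter, tm[1] timer 0 control,
   tm[2] timer 1 counter, tm[3] timer 1 control.
   Assumption: on hardware this is (volatile uint16_t *)0x04000100. */
uint32_t time_call(volatile uint16_t *tm, void (*fn)(void))
{
    tm[1] = 0;                               /* make sure both timers are stopped */
    tm[3] = 0;
    tm[0] = 0;                               /* reload values: start counting at 0 */
    tm[2] = 0;
    tm[3] = TIMER_ENABLE | TIMER_CASCADE;    /* timer 1 counts timer 0 overflows */
    tm[1] = TIMER_ENABLE;                    /* timer 0 counts every cycle */
    fn();                                    /* the code being measured */
    tm[1] = 0;                               /* stopping timer 0 freezes timer 1 too */
    return ((uint32_t)tm[2] << 16) | tm[0];  /* combined 32-bit tick count */
}

/* Ticks as hundredths of a percent of one frame (10000 == 100.00%). */
uint32_t frame_percent_x100(uint32_t ticks)
{
    /* 64-bit intermediate so ticks * 10000 cannot overflow */
    return (uint32_t)(((uint64_t)ticks * 10000u) / CYCLES_PER_FRAME);
}
```

The measured count includes the overhead of the call itself and of stopping the timer, so very short routines will read slightly high.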
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku
#10542 - FluBBa - Mon Sep 08, 2003 10:53 am
Or do what people have been doing since the beginning of time: change the background color when you start your routine and change it back when you're finished. On the GBA, though, it's extra important that you start it while you're "on the screen" (raster rows 0-159). This only works if the routine takes less time than the whole frame/screen.
_________________
I probably suck, my not is a programmer.
#11437 - mathieu - Tue Oct 07, 2003 12:06 am
I was writing a mod player, that uses several functions, and I wanted to find a way to compute how much CPU time it took to play my MOD, in a general manner.
I created a little C function:
Code: |
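u32 calibrate(void) {
    u32 i = 0, j = 0;
    // Spin until the first VBlank line (160), counting loop passes.
    while (reg16(REG_VCOUNT) < 160) {
        i++;
        // Count in blocks of 161 passes.
        // this is to prevent overflows... I know, it's lame.
        if (i > 160) { i = 0; j++; }
    }
    return j;  // number of complete blocks of idle passes
}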
|
This function should be called once right at the end of a VBlank, at the beginning of the program (before the mod player starts), to get a "CPU calibration" value.
Then, while the program is running, call this function at the end of every VBlank and divide the return value by the calibration value found the first time. That gives the fraction of CPU time that is still free, which can be displayed on screen (for example using pixel bars).
Is my approach valid? Or does it have flaws? I can see one myself: I don't take into account what happens during VBlank, only while the display is being drawn.
Any other ideas for measuring this "global" free CPU time?
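For illustration, here is one way the calibration value and a per-frame reading could be turned into a percentage and a bar width. `free_cpu_percent` and `free_cpu_bar` are hypothetical helper names, and the 240-pixel maximum is an assumption based on the GBA screen width.

```c
#include <stdint.h>

/* Percentage of the calibration count that is still reached this frame.
   baseline: value returned by calibrate() before the player was started.
   current:  value returned by calibrate() on the frame being measured. */
uint32_t free_cpu_percent(uint32_t baseline, uint32_t current)
{
    if (baseline == 0) return 0;                 /* avoid dividing by zero */
    if (current > baseline) current = baseline;  /* clamp measurement jitter */
    return current * 100u / baseline;            /* counts are small, no overflow */
}

/* Width in pixels of a "free CPU" bar, max_width wide when fully idle. */
uint32_t free_cpu_bar(uint32_t baseline, uint32_t current, uint32_t max_width)
{
    if (baseline == 0) return 0;
    if (current > baseline) current = baseline;
    return current * max_width / baseline;
}
```

As noted in the post, this only reflects time spent while the display is being drawn; work done during VBlank is invisible to it.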
#11440 - tepples - Tue Oct 07, 2003 1:16 am
To measure CPU time, I typically measure VCOUNT before and after running a function. Yes, my mixer is slow as sh** compared to the Apex Audio System mixer.
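A rough sketch of turning two VCOUNT readings into elapsed scanlines and a share of the frame. The helper names are invented for this example. It assumes 228 scanlines per frame (160 visible plus 68 of VBlank), which matches the 280896 cycles per frame quoted earlier at 1232 cycles per line, and it assumes the measured code finishes within one frame.

```c
#define LINES_PER_FRAME 228u   /* 160 visible + 68 VBlank lines */
#define CYCLES_PER_LINE 1232u  /* 228 * 1232 == 280896 cycles per frame */

/* Scanlines elapsed between two VCOUNT readings, allowing for the
   counter wrapping from 227 back to 0 in between. */
unsigned lines_elapsed(unsigned before, unsigned after)
{
    return (after + LINES_PER_FRAME - before) % LINES_PER_FRAME;
}

/* Elapsed scanlines as a whole-number percentage of one frame. */
unsigned lines_to_percent(unsigned lines)
{
    return lines * 100u / LINES_PER_FRAME;
}
```

The resolution is one scanline, so anything shorter than about 1232 cycles will read as zero or one line.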
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#11441 - DekuTree64 - Tue Oct 07, 2003 2:14 am
Yeah, just wait until VCOUNT is 0, set BG color 0 to something, call your function, and set BG color 0 to something else. However much of the screen is covered by the first color is how long it took. Another nice trick: if you put some 8x8 pixel text or something similar near the top of the screen, it's easier to count the number of lines.
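A small sketch of the background-color trick in C. `profile_with_backdrop` and `dummy_work` are hypothetical names, and the register addresses in the comments (VCOUNT at 0x04000006, BG palette entry 0 at 0x05000000) are assumptions about the usual GBA memory map. The registers are passed in as pointers so the logic can also be exercised off-target.

```c
#include <stdint.h>

/* 15-bit color: red in bits 0-4, green in bits 5-9, blue in bits 10-14. */
#define RGB15(r, g, b) ((uint16_t)((r) | ((g) << 5) | ((b) << 10)))

/* Stand-in for the routine being measured (hypothetical). */
void dummy_work(void)
{
    volatile int n = 0;
    for (int i = 0; i < 1000; i++) n += i;
}

/* vcount:   scanline counter, assumed (volatile uint16_t *)0x04000006
   backdrop: BG palette entry 0, assumed (volatile uint16_t *)0x05000000 */
void profile_with_backdrop(volatile uint16_t *vcount,
                           volatile uint16_t *backdrop,
                           void (*fn)(void))
{
    while (*vcount != 0) { }        /* wait for the top of the screen */
    *backdrop = RGB15(31, 0, 0);    /* red while the routine runs */
    fn();
    *backdrop = RGB15(0, 0, 0);     /* back to black when it is done */
}
```

The height of the red band in scanlines, divided by the 228 lines in a frame, gives a rough CPU share, as long as the routine finishes before the visible area ends.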
And yes, all my mixer attempts have been slow as poop compared to AAS as well, so after many hours of thinking, one day I got mad and did some disassembling on it to see if I could find out what I was doing wrong. Absolutely incredible. I wouldn't have come up with that in a million years. Even so, I think it could still be improved a little bit ^_^
I don't really like working off other people's ideas though, so I'm playing around with doing a mixer with linear interpolation. And with interpolation I think the quality would still be good enough with 4-bit samples, since you'd generate the rest of the bits anyway (not to mention you'd save a lot of ROM).
Since you have to load two samples to do interpolation, I figured it might be better to load more than one sample at a time: it doesn't cost any more to load 16 bits than 8 bits from ROM, but you then have to separate them anyway. So I might as well use 4-bit samples, which gives me twice as many samples per load without making the separation any more expensive. So that's where I am, working out the fastest way to separate them and load new ones in. I think it will actually be pretty fast. Maybe 10-15% CPU for 8 channels, but I can't really say yet.
I may not be the best programmer, but if I struggle hard enough I'm bound to come up with a good mixer eventually^^
#11442 - tepples - Tue Oct 07, 2003 3:34 am
DekuTree64 wrote: |
I don't really like working off other people's ideas though, so I'm playing around with doing a mixer with linear interpolation. And with interpolation I think the quality would still be good enough with 4-bit samples, since you'd generate the rest of the bits anyway |
It doesn't actually work that way. Aliasing noise and quantization noise are two separate kinds of noise. Linear interpolation reduces only aliasing noise, not quantization noise. Your 4-bit samples would introduce quantization noise, which could become even more annoying than aliasing.
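A tiny worked example of the difference, using unsigned 8-bit samples and plain truncation to 4 bits. Both are simplifying assumptions for illustration, and the function names are made up. Interpolating between two 4-bit samples produces values in between them, but it cannot bring back the detail that truncation threw away.

```c
/* Keep only the top 4 bits of an unsigned 8-bit sample. */
unsigned quantize4(unsigned s8) { return (s8 & 0xFFu) >> 4; }

/* Expand a 4-bit sample back to the 8-bit range; the low bits stay zero. */
unsigned expand4(unsigned s4) { return (s4 & 0xFu) << 4; }

/* Value halfway between two samples, as linear interpolation would give. */
unsigned midpoint(unsigned a, unsigned b) { return (a + b) / 2u; }
```

With original samples 39 and 47, both truncate to 2 and expand to 32. The interpolated midpoint of the stored data is then 32, while the midpoint of the original samples is 43; that gap is quantization error, and interpolation does nothing to reduce it.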
If you want to pack more samples into less ROM, the best idea is to use a codec, such as my 8ad codec, in decompress-to-EWRAM mode.
Quote: |
since it doesn't cost any more to load 16 bits than 8 bits from ROM |
Unless you're loading an odd-address sample and the following even-address sample.
#11447 - DekuTree64 - Tue Oct 07, 2003 5:00 pm
tepples wrote: |
Aliasing noise and quantization noise are two separate kinds of noise. |
Oh. I heard that SNES used 4-bit samples, and it sounded fine, but I went and looked it up and now it makes sense. I don't think that compression would be too fast without hardware though...
Quote: |
Unless you're loading an odd-address sample and the following even-address sample. |
Actually I was talking about keeping eight 4-bit samples in a register at a time: shift it down as needed and AND it with a register set to 15 to get the current sample, AND the value shifted right by 4 with the same mask to get the next sample for the interpolation, and as soon as you've shifted at least 16 bits off, load a new 16 bits, OR them into the top, and continue.
Unfortunately it ended up taking a huge bunch of stuff to check if you needed to load new samples, and to get them ORed to the right place, so it's back to the drawing board.
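The shift-register scheme described above could look roughly like this in C. This is only an illustration of the idea, not the actual mixer code: `NibbleStream` and the `ns_*` functions are invented names, samples are treated as unsigned nibbles, and the sample data is assumed to be stored as 16-bit words with the earliest sample in the lowest nibble.

```c
#include <stdint.h>

typedef struct {
    uint32_t bits;        /* packed 4-bit samples, current one in bits 0-3 */
    unsigned consumed;    /* nibbles shifted out since the last refill */
    const uint16_t *src;  /* next 16-bit word to load from sample data */
} NibbleStream;

/* Start with 32 bits (eight samples) in the register. */
void ns_init(NibbleStream *s, const uint16_t *src)
{
    s->bits = (uint32_t)src[0] | ((uint32_t)src[1] << 16);
    s->consumed = 0;
    s->src = src + 2;
}

/* Current sample and the following one, as needed for interpolation. */
unsigned ns_current(const NibbleStream *s) { return s->bits & 15u; }
unsigned ns_next(const NibbleStream *s)    { return (s->bits >> 4) & 15u; }

/* Step to the next sample; refill once 16 bits have been shifted off. */
void ns_advance(NibbleStream *s)
{
    s->bits >>= 4;
    if (++s->consumed == 4) {
        s->bits |= (uint32_t)(*s->src++) << 16;  /* OR new word into the top */
        s->consumed = 0;
    }
}
```

Even in this simple form the refill check runs on every step, which hints at the bookkeeping cost mentioned above; a real mixer stepping by a fractional amount per output sample would need more logic still.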
But since I'm making an RPG where speed isn't such an issue, I'd rather have a more high-quality mixer. I don't know if interpolation would really make that big of a difference though. I suppose I could do reverb and/or chorus by just keeping a few frames worth of mixing buffers and playing them as normal sound channels though.
#11448 - tepples - Tue Oct 07, 2003 5:25 pm
DekuTree64 wrote: |
I heard that SNES used 4-bit samples, and it sounded fine, but I went and looked it up and now it makes sense. I don't think that compression would be too fast without hardware though... |
The Super NES sample compression was a delta PCM technique vaguely similar to that used in my 8ad codec.
Quote: |
But since I'm making an RPG where speed isn't such an issue, I'd rather have a more high-quality mixer. I don't know if interpolation would really make that big of a difference though. |
Interpolation would help mainly with samples played back at a low pitch. Unless you use MIP mapping on your samples as well, interpolation won't help much for samples played back at a high pitch.
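To make the low-pitch case concrete, here is a hedged sketch of reading a sample with linear interpolation from a fixed-point position. The 12-bit fraction and the name `lerp_sample` are arbitrary choices for this example. At low pitch the position advances by less than one sample per output, so most outputs fall between two stored samples, and that is where the blend matters.

```c
#include <stdint.h>

#define FRAC_BITS 12
#define FRAC_ONE  (1 << FRAC_BITS)

/* Read a signed 8-bit sample at a fixed-point position (20.12 format),
   blending linearly between the two nearest stored samples.
   The caller must make sure smp[index + 1] is valid. */
int lerp_sample(const int8_t *smp, uint32_t pos)
{
    uint32_t index = pos >> FRAC_BITS;        /* whole sample index */
    int frac = (int)(pos & (FRAC_ONE - 1));   /* fractional part, 0..4095 */
    int a = smp[index];
    int b = smp[index + 1];
    return a + ((b - a) * frac) / FRAC_ONE;   /* a when frac is 0, toward b */
}
```

When the step per output is larger than one sample (high pitch), whole samples are skipped regardless of how the two neighbours are blended, which is consistent with the point above about needing something like MIP mapping there.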
Quote: |
I suppose I could do reverb and/or chorus by just keeping a few frames worth of mixing buffers and playing them as normal sound channels though. |
Yes, keeping a few frames worth of mixing buffers would make an echoey sound. You can do chorus more simply by either 1. taking two voices and playing the same sample on both of them, slightly out of tune, or 2. chorusing the sample itself before you put it in the ROM. I've heard tracked music that uses both techniques.