gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Audio > Optimizing

#35951 - ProblemBaby - Sun Feb 13, 2005 3:34 pm

Hi

I've now tried a lot of techniques to mix channels in stereo as fast as possible, and I can't get below about 2.8% per channel at a mixing rate of 31536 Hz. For 32 channels that's too much: about 90%.
More than 2% is spent on 'Load sample from ROM'
and 'Multiplication with Volume'!

One way to optimize would be to check whether the increment is lower than 1 and then load a sample only if the position has advanced. But is it worth it in most cases?

Do you have any other ideas for speeding it up by reducing the loads and multiplications?

thanks in advance

#35989 - DekuTree64 - Sun Feb 13, 2005 10:13 pm

Predictable as it may be, I shall reply :)

ProblemBaby wrote:
I've now tried a lot of techniques to mix channels in stereo as fast as possible, and I can't get below about 2.8% per channel at a mixing rate of 31536 Hz. For 32 channels that's too much: about 90%.
More than 2% is spent on 'Load sample from ROM'
and 'Multiplication with Volume'!

That's really not half bad, but there's still plenty of room for improvement. Probably around 40-50% is the best you can expect to get for 32 channels.
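To put those percentages in per-sample terms, here is a rough back-of-the-envelope sketch. It assumes the GBA's CPU clock of 16,777,216 Hz (not stated in this thread) and uses the 31536 Hz mixing rate from the question; the function names are made up for illustration.

```c
/* Assumption: GBA CPU clock of 2^24 Hz. The mixing rate is from the post above. */
#define CPU_HZ 16777216.0
#define MIX_HZ 31536.0

/* CPU cycles available between two output samples (about 532). */
static double cycles_per_output_sample(void)
{
    return CPU_HZ / MIX_HZ;
}

/* Cycles one channel may spend per output sample at a given CPU share. */
static double cycles_per_channel_sample(double cpu_percent_per_channel)
{
    return cycles_per_output_sample() * cpu_percent_per_channel / 100.0;
}
```

Under that assumption, 2.8% per channel works out to roughly 15 cycles per channel per output sample, and 45% spread over 32 channels leaves roughly 7.5.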

Quote:
One way to optimize would be to check whether the increment is lower than 1 and then load a sample only if the position has advanced. But is it worth it in most cases?

Yes, that will help. Especially at ~32 kHz, you'll probably be using increments less than 1 more often than not.
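A minimal C sketch of that idea, not anyone's actual mixer: it assumes a 20.12 fixed-point sample position and a 32-bit mono mix buffer, and the names (mix_channel_skip_reload, FRAC_BITS) are hypothetical. The sample is reloaded and re-multiplied only when the integer part of the position changes.

```c
#include <stdint.h>
#include <stddef.h>

#define FRAC_BITS 12  /* assumption: 20.12 fixed-point sample position */

/* Mix one channel into a 32-bit buffer, reloading the source sample only
   when the integer part of the position has advanced.
   The caller must make sure src covers every index the position reaches.
   Returns the updated position. */
static uint32_t mix_channel_skip_reload(int32_t *out, size_t count,
                                        const int8_t *src, uint32_t pos,
                                        uint32_t inc, int32_t vol)
{
    uint32_t last_index = UINT32_MAX;  /* sentinel: forces the first load */
    int32_t scaled = 0;                /* cached sample * volume */

    for (size_t i = 0; i < count; i++) {
        uint32_t index = pos >> FRAC_BITS;
        if (index != last_index) {     /* position advanced: load and multiply */
            last_index = index;
            scaled = src[index] * vol;
        }
        out[i] += scaled;              /* otherwise reuse the cached product */
        pos += inc;
    }
    return pos;
}
```

With an increment of 0.5 (0x800 in 20.12), each source sample is loaded and multiplied once and then reused for the next output sample. In a real mixer the branch has its own cost, so whether this wins depends on how small the increments usually are.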

Quote:
Do you have any other ideas for speeding it up by reducing the loads and multiplications?

Are you using stereo? A classic trick to save on multiplies is to put the left volume in the lower 16 bits of a register, and the right volume in the upper 16 bits. Then you get both multiplies/adds at once. You can also do a similar trick in mono by putting 2 samples into the reg and then multiplying by the single volume.

The problem with those is that your entire mix has to fit in 16 bits, and if you have an 8-bit sample times a 6-bit volume, you get a 14-bit result. That only leaves room for up to 4 channels (2 more bits), unless you shift down, and if you shift, you have to chop off the bottom bits of the top sample so they don't run into the bottom.
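The packed-volume trick and its 16-bit limit can be sketched in C as follows. This is an illustration under stated assumptions, not a tested GBA mixer: samples are taken as unsigned 8-bit (signed value + 128) so that no borrow leaks between the two halves, volumes are 6-bit, and all function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_PACKED_CHANNELS 4  /* 255 * 63 * 4 = 64260 still fits in 16 bits */

/* Pack two 6-bit volumes (0..63): left in the low half, right in the high half. */
static uint32_t pack_volumes(uint32_t left, uint32_t right)
{
    return (left & 63u) | ((right & 63u) << 16);
}

/* Mix one output sample from up to 4 channels with one multiply per channel.
   Assumption: samples[] are unsigned 8-bit (signed value + 128). */
static uint32_t mix_packed(const uint8_t *samples, const uint32_t *packed_vols,
                           size_t channels)
{
    uint32_t acc = 0;
    for (size_t c = 0; c < channels && c < MAX_PACKED_CHANNELS; c++)
        acc += samples[c] * packed_vols[c];  /* left and right products at once */
    return acc;
}

/* Split the packed sum and remove the +128 bias to get signed results. */
static void unpack_mix(uint32_t acc, const uint32_t *packed_vols, size_t channels,
                       int32_t *left, int32_t *right)
{
    uint32_t bias = 0;
    for (size_t c = 0; c < channels && c < MAX_PACKED_CHANNELS; c++)
        bias += 128u * packed_vols[c];
    *left  = (int32_t)(acc & 0xFFFFu) - (int32_t)(bias & 0xFFFFu);
    *right = (int32_t)(acc >> 16)     - (int32_t)(bias >> 16);
}
```

Each product is at most 255 * 63 = 16065, which is the 14-bit result mentioned above, so four of them can be summed in each 16-bit half before the low half would carry into the high one.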

One possibility is to batch the channels 4 at a time, mixing into temporary buffers, and then add up the temporary buffers with full 32-bit accuracy. It would be a little slow doing 2 passes, but for 32 channels the savings might pay off (although you'd need 528 samples * 2 bytes * 2 sides * (32 channels / 4 at a time) = 16896 bytes of temporary buffers for 32 stereo channels).
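The buffer arithmetic and the second pass might look like this in C. It is only a sketch: the layout (one 16-bit buffer per batch and per side, stored back to back) and all names are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

enum {
    SAMPLES_PER_FRAME = 528,                  /* from the post above */
    CHANNELS          = 32,
    BATCH_SIZE        = 4,
    BATCHES           = CHANNELS / BATCH_SIZE, /* 8 batches */
    SIDES             = 2,
    /* 528 * 2 bytes * 2 sides * 8 batches = 16896 bytes */
    TEMP_BYTES = SAMPLES_PER_FRAME * (int)sizeof(uint16_t) * SIDES * BATCHES
};

/* Second pass for one side: add up the per-batch 16-bit sums with 32-bit
   accuracy. temp holds BATCHES buffers of SAMPLES_PER_FRAME entries each,
   stored back to back (an assumed layout). */
static void sum_batches(const uint16_t *temp, int32_t *out)
{
    for (size_t i = 0; i < SAMPLES_PER_FRAME; i++) {
        int32_t total = 0;
        for (size_t b = 0; b < BATCHES; b++)
            total += temp[b * SAMPLES_PER_FRAME + i];
        out[i] = total;
    }
}
```

The first pass would fill each batch buffer with a packed 4-channel mix, and only this second pass needs the wider accumulator.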


And then there's dynamic code. I'm not too good with it yet, but do look for ways to speed things up by replacing instructions/generating whole loops/etc.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku

#35994 - tepples - Sun Feb 13, 2005 10:23 pm

If you're up to 32 channels, many of them music, then you should probably consider some streaming audio compression method.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#35996 - ProblemBaby - Sun Feb 13, 2005 11:09 pm

DekuTree64:

Yes, I've checked and seen that samples are often played with an increment lower than 1.
I've also made one other nice speed improvement: it seems that a CpuFastSet takes hardly any time at all, so I copy from ROM to IWRAM and then load from there. That's about 0.7%.

#36068 - ProblemBaby - Mon Feb 14, 2005 8:31 pm

I am a bit confused.
When I use CpuFastSet it looks like it doesn't take any time at all, but it runs damn slow. Are timers stopped while BIOS functions are being processed, or something?

It works fine in VBA but not in either NG or on a real GBA.
I do a CpuFastSet 8-9 times per frame, and copy 1024 bytes!

Any ideas?

#36107 - Miked0801 - Tue Feb 15, 2005 2:11 am

More than likely, the emulators use their own fast native implementation of these functions, which doesn't reflect actual speed on the target (Nintendo copyright issues prevent direct inclusion of the BIOS code). No$GBA allows you to include a ROM BIOS file to reflect actual performance. Not sure about the others.

#36167 - ProblemBaby - Tue Feb 15, 2005 4:59 pm

Miked0801: Thanks a lot for the reply!
I got it working in No$GBA in no time,
but on a real GBA it was 34% =)
It felt like magic that copying that much data would take no time!
So I will trash that idea.