gbadev.org forum archive

Let's say that I'm timing the hardware to play samples at 11 kHz. Let's say that I'm mixing a couple of channels and that I want to modulate the sound playing on those channels to play back at different frequencies. The hardware playback rate would stay constant. Basically I want to know how to modulate samples so that later I can write a MOD player. Anybody have any clue as to how to go about that? Thx.
_________________
"head straight for your goal by any means
there is a door that you've never opened
there is a window with a view you've never seen
get there no matter how long it takes"

-Theme of Shadow, Sonic Adventure 2

To play back a sample at a rate relative to your mixing rate, use fixed-point math to store the sample's playback rate divided by the mixing rate.
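
As a rough illustration of the fixed-point idea above, here is a minimal C sketch. The 16.16 format and the names `rate_to_step` and `resample_nearest` are assumptions made for this example, not something from the thread. The step is the sample's rate divided by the mixing rate, and the integer part of the running position picks the source sample.

```c
#include <stdint.h>
#include <stddef.h>

#define FRAC_BITS 16  /* assumed 16.16 fixed-point format */

/* Step = sample rate / mixing rate, stored as 16.16 fixed point. */
static uint32_t rate_to_step(uint32_t sample_rate, uint32_t mix_rate)
{
    return (uint32_t)(((uint64_t)sample_rate << FRAC_BITS) / mix_rate);
}

/* Nearest-neighbour resampling: write up to out_len samples, advancing
 * the source position by `step` per output sample. Returns the number of
 * samples written (stops at the end of src).
 * Note: a single 16.16 position only suits sources well under 65536
 * samples; a real mixer would keep the integer part separately. */
static size_t resample_nearest(const int8_t *src, size_t src_len,
                               int8_t *out, size_t out_len, uint32_t step)
{
    uint32_t pos = 0;  /* 16.16 position into src */
    size_t n = 0;
    while (n < out_len && (pos >> FRAC_BITS) < src_len) {
        out[n++] = src[pos >> FRAC_BITS];
        pos += step;
    }
    return n;
}
```

With a step of 2.0 (`rate_to_step(22050, 11025)`) every other source sample is skipped; with a step of 0.5 each source sample is emitted twice.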
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

tepples wrote:

To play back a sample at a rate relative to your mixing rate, use fixed-point math to store the sample's playback rate divided by the mixing rate.

Thx tepples. I know it's been a while since I posted this (not too long) but now I'm actually working on writing a player. So basically this means that if I have a sample at 11025 and I want to play it back at double speed, 22050, I should just skip every other sample since 22050/11025 = 2? And to halve the speed I should play each sample twice in succession since 5512/11025 = 0.5? Thanks!

You got it.

Dan.

Sorry bout the delay in responding, was on much needed vacation and didn't get around to checking this forum while away.

DarkPhantom, it is as simple as that. However, there are special considerations when playing back samples slower than their original sampling rate, namely interpolation. To play back a sample slower than its recorded rate, you must "insert" samples into playback that weren't in the original recording.

The value of the made-up samples can be calculated in several different ways:

No interpolation:
Simply repeat the previous sample until a new sample comes along, as you mentioned in your post. This causes noticeable distortion that grows with the amount of slowdown relative to the original sampling rate, because the playback impulse train becomes flat between recorded samples, making the wave shape of the playback sound more jagged or sharp (like a square wave) rather than curved. This isn't a problem if the shape of the recorded sound wave is jagged or "sharp" to begin with, but for sounds with a round, "soft" wave shape, this distortion becomes painfully noticeable the more you slow the audio down. The nice thing about this method is that there is almost 0 CPU overhead to it.

linear interpolation:
This assumes an imaginary line is drawn between samples, and that the made-up sample would have been a value somewhere along this line. For instance, say sample 1 has a value of 6 and sample 2 has a value of 10. Based on the playback-to-sampling rate ratio, you figure that the made-up sample would have been recorded at 80% of the "distance" in time from sample 1's recording to sample 2's recording. You calculate the difference from sample 1 to sample 2 (10 - 6 = 4), scale it by where the sample would have been (4 * 0.8 = 3.2), and add the result to sample 1 (6 + 3.2 = 9.2), so the sample you insert into playback should have an integer value of 9. This method smooths out playback quite a bit, but can be costly in CPU use.
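
The worked example above can be checked with a small C sketch. The function name `lerp16` and the 16-bit fraction (0 means sample 1, 65536 would mean sample 2) are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Linear interpolation between s1 and s2.
 * frac is a 16-bit fraction of the way from s1 to s2 (0..65535).
 * Relies on >> being an arithmetic shift for negative values, which is
 * the case on the usual ARM compilers but is implementation-defined in C. */
static int32_t lerp16(int32_t s1, int32_t s2, uint32_t frac)
{
    return s1 + (((s2 - s1) * (int32_t)frac) >> 16);
}
```

For samples 6 and 10 at 80% (a fraction of about 52429), this gives 6 + 3 = 9 after truncation.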

cubic spline interpolation:
This is very similar to linear interpolation, but rather than applying a linear function to compute the value of the nonexistent sample you are going to insert into playback, it calculates it from a cubic function. This smooths the audio very well, but is extremely CPU intensive. It isn't worth considering for the GBA, since the extra CPU cycles it would take aren't worth the minuscule benefit in 8-bit playback, but I thought I'd mention it all the same since it is a well-known interpolation method.

Personally, I like to take an in-between approach to no interpolation and linear interpolation by doing this: take the average of sample 1 and sample 2, and use that value for the sample inserted between the two samples. It's less accurate than linear, but also much faster, since it requires only an addition and a shift of 1 bit to the right, something that can be done with only one extra operation per playback sample in assembly.
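
A minimal C sketch of the averaging idea, assuming playback at exactly half speed so that one sample is inserted between each pair. The function name `half_speed_average` is made up for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Half-speed playback with midpoint averaging: each source sample is
 * followed by the average of itself and its successor.
 * out must hold 2 * src_len samples; the last sample is just repeated. */
static void half_speed_average(const int8_t *src, size_t src_len, int8_t *out)
{
    for (size_t i = 0; i < src_len; i++) {
        int s1 = src[i];
        int s2 = (i + 1 < src_len) ? src[i + 1] : s1;
        out[2 * i] = (int8_t)s1;
        out[2 * i + 1] = (int8_t)((s1 + s2) >> 1);  /* one add, one shift */
    }
}
```

For the source samples 6, 10, -4 this produces 6, 8, 10, 3, -4, -4.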

Dunno if this is of any benefit to ya, but thought you might be interested all the same.
_________________
"Beer is proof that God loves us and wants us to be happy."
-- Benjamin Franklin

For most applications on the GBA, and especially games programming (where a mixer must usually be very frugal with its CPU consumption), you shouldn't need anything more than nearest-neighbour sampling (ie. no interpolation).

Keep in mind that you are also limited by the quality of the GBA's speakers and audio processor. Maybe there are people out there who can hear the difference between interpolated and nearest-neighbour samples on the GBA, but I sure can't.

Dan.

Actually, the time taken to load 2 samples for interpolation is about as much of a killer as the actual computation. Those ROM waitstates are a real killer in mixers. With some clever tricks I came up with (never implemented though), you can do it pretty quickly.

Let's see if I can remember how I was planning to do that...
First, you have your interpolation function, with a 16-bit fractional portion between samples:
out = (smp1*((1<<16)-x)+smp2*x)>>16
Where x is between 0 and 65535. That way, when x is 0, you get entirely the first sample, and when it reaches 65536, you would get entirely the second. In practice it wraps around to 0 at that point, smp1 becomes smp2, and smp2 becomes the next sample, so in the end you still get entirely the original smp2.
Then, simplify
out = (smp1*(1<<16)-smp1*x+smp2*x)>>16
out = (x*(smp2-smp1)>>16)+smp1
Now that's a nice equation for ARM!

Then, the trick. You store your channel position in 2 regs, one for the integer portion, and one for the fraction. Say,
rp = position int
rf = frac
ri = increment
rv = volume
rm = mix
r1, r2 = smp1 and smp2
rx, ry = temp values
ru = unused, it's only there to fill space in the smlal instruction. Technically it should be set to 0, but it will make at most a difference of 1 in the interpolated sample, so it shouldn't be noticeable anyway. It will get changed though, so you can't use it for anything else, but it makes things easier.
Also, when first starting a sound, rp is initialized to the data pointer for the sound, so no need to deal with that anymore.
But for the fractional portion, instead of storing it in the lower 16 bits, store it in the upper, so when it overflows, you know you need to load the next sample. That way, you only need to load at most one sample per loop. That is, provided your increment is less than 1, but interpolation isn't really necessary if you're advancing more than one sample at a time anyway, so just make 2 versions, one for inc < 1 and one for inc >= 1. Now we're really using 32-bit fixed point on the fraction, so we have to use a long multiply, which is a cycle slower, but saves us from having to shift the fractional portion down. It also lets us add in the smp1 on the end of that equation at the same time, and shift the result down 32 bits as well, so it's not so bad after all.

Code:

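adds rf, rf, ri          @ advance fraction; carry is set when a sample boundary is crossed
movcs r1, r2             @ on carry, the old smp2 becomes smp1
ldrsbcs r2, [rp], #1     @ ...and the next signed byte is loaded as smp2
sub rx, r2, r1           @ rx = smp2 - smp1
mov ry, r1               @ high word of the accumulator starts at smp1
@ note: smlal treats rf as signed, so a fraction with its top bit set multiplies as negative
smlal ru, ry, rf, rx     @ ry:ru += rf * rx, so ry = smp1 + ((rf * rx) >> 32)
mla rm, ry, rv, rm       @ mix += interpolated sample (ry) * volume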

Some pretty expensive instructions there, but all in all I think it's pretty fast for interpolation. You will need to load in the first 2 samples before you start the loop though, which is more hassle to code, but oh well.
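
For readers who find the register-level version hard to follow, here is a rough C model of the same loop. It is a sketch under assumptions: the struct and function names are invented, the fraction is treated as an unsigned 0.32 value, and the carry out of the add is detected by checking for wraparound. It is not the code from the post.

```c
#include <stdint.h>
#include <stddef.h>

/* Field names mirror the register names used in the post. */
typedef struct {
    const int8_t *rp;  /* next sample to load */
    uint32_t rf;       /* 0.32 fraction between r1 and r2 */
    uint32_t ri;       /* 0.32 increment, i.e. always less than 1.0 */
    int32_t r1, r2;    /* the two samples being interpolated */
} Channel;

/* Preload the first two samples before the loop starts. */
static void channel_start(Channel *c, const int8_t *data, uint32_t inc)
{
    c->r1 = data[0];
    c->r2 = data[1];
    c->rp = data + 2;
    c->rf = 0;
    c->ri = inc;
}

/* Add `count` interpolated samples, scaled by `vol`, into `mix`.
 * The caller must make sure the sample data is long enough. */
static void channel_mix(Channel *c, int32_t *mix, size_t count, int32_t vol)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t old = c->rf;
        c->rf += c->ri;
        if (c->rf < old) {           /* wrapped: same as the carry flag */
            c->r1 = c->r2;
            c->r2 = *c->rp++;        /* at most one load per output */
        }
        int64_t diff = c->r2 - c->r1;
        /* assumes arithmetic >> on negative 64-bit values */
        int32_t s = c->r1 + (int32_t)((diff * (int64_t)c->rf) >> 32);
        mix[i] += s * vol;
    }
}
```

Like the assembly, this advances the fraction before interpolating, so the first output is already one increment into the sound.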
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku

If you are sure that you cannot advance more than 1 byte per sample (no down-sampling), then you can halve the time spent on loads by buffering. Grab a full word into a register (8 cycles from ROM), then shift the low 8 bits into one target and the next 8 bits into the next. That takes 3 cycles, so the first read costs 11 cycles instead of 10, which is a touch slower. But now, when an increment occurs, shift the buffer down 8 bits and repeat the above. On every even source boundary, grab another half-word and shift it into the high 16 bits of the buffer. This way, on average, it takes only 11 cycles in loads for every 2 increments of the source (as opposed to 20 cycles if you do 2 reads per increment). In addition, if no source increment occurs, you don't need to reload the buffer, so that's free.

If you don't need interpolation, the above method works even better. Shift the low value out of the buffer when needed, and reload every 4 increments. That replaces 4 source byte reads of 5 cycles each (20 cycles total) with a single 8-cycle load. All for the cost of a single register :)
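
The no-interpolation variant of the buffering trick might look roughly like this in C. The names `ByteBuffer`, `bb_init` and `bb_next` are hypothetical, the samples are assumed to be unsigned and packed low byte first (as on the little-endian GBA), and the cycle counts quoted above are not modelled.

```c
#include <stdint.h>

typedef struct {
    const uint32_t *src;  /* word-aligned sample data */
    uint32_t buf;         /* up to four packed 8-bit samples */
    unsigned left;        /* samples still waiting in buf */
} ByteBuffer;

static void bb_init(ByteBuffer *b, const uint32_t *src)
{
    b->src = src;
    b->buf = 0;
    b->left = 0;
}

/* Return the next unsigned 8-bit sample. Only one word load is done
 * for every four samples; the other three calls just shift. */
static uint8_t bb_next(ByteBuffer *b)
{
    if (b->left == 0) {
        b->buf = *b->src++;   /* one 32-bit load instead of four byte loads */
        b->left = 4;
    }
    uint8_t s = (uint8_t)(b->buf & 0xFF);
    b->buf >>= 8;
    b->left--;
    return s;
}
```

The extra `left` counter is the kind of bookkeeping overhead that comes with maintaining the buffer state.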

I think my mixer does something like this in the no-down-sampling case, but there is still some additional overhead involved in maintaining the state of the buffer.

It'd be nice if you could enforce a rule of non-zero samples; then you could simply test to see if your buffer is empty at the same time that you shift off the next sample. In fact, now that I think about it, I could easily modify my music encoder to change all zero-samples to one.
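
A small C sketch of how the non-zero rule could remove the counter. This is only an illustration with made-up names (`NzBuffer`, `nz_next`, `remove_zero_samples`), and it assumes unsigned samples packed low byte first. In assembly the emptiness test would come for free from the flags set by the shift; in C it is an explicit comparison.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    const uint32_t *src;  /* word-aligned sample data with no zero bytes */
    uint32_t buf;         /* packed samples; 0 means the buffer is empty */
} NzBuffer;

/* Encoder side: replace every zero sample with one. */
static void remove_zero_samples(uint8_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (data[i] == 0)
            data[i] = 1;
    }
}

/* Player side: no counter needed, because a zero buffer can only mean
 * that all four samples have been shifted out. */
static uint8_t nz_next(NzBuffer *b)
{
    if (b->buf == 0)
        b->buf = *b->src++;
    uint8_t s = (uint8_t)(b->buf & 0xFF);
    b->buf >>= 8;
    return s;
}
```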

Dan.

poslundc wrote:

It'd be nice if you could enforce a rule of non-zero samples; then you could simply test to see if your buffer is empty at the same time that you shift off the next sample. In fact, now that I think about it, I could easily modify my music encoder to change all zero-samples to one.

Changing 0-value samples to 1's would add a LOT of granularity noise to the quiet portions of samples if your samples are signed. The Apple IIGS sound chipset does exactly what you describe, but its samples are unsigned.

tepples wrote:

poslundc wrote:

It'd be nice if you could enforce a rule of non-zero samples; then you could simply test to see if your buffer is empty at the same time that you shift off the next sample. In fact, now that I think about it, I could easily modify my music encoder to change all zero-samples to one.

Changing 0-value samples to 1's would add a LOT of granularity noise to the quiet portions of samples if your samples are signed. The Apple IIGS sound chipset does exactly what you describe, but its samples are unsigned.

Well, for that matter, Mike's technique won't work on signed samples unless you go through the process of sign-extending each sample when you push it off the buffer.

Dan.

Correct, though all samples I've seen have been unsigned. Still, the advantage of buffering was all I meant to point out.

Audio > Frequency Modulation

#17317 - DarkPhantom - Sat Mar 06, 2004 7:44 am

#17339 - tepples - Sat Mar 06, 2004 5:08 pm

#18290 - DarkPhantom - Tue Mar 23, 2004 6:26 pm

#18294 - poslundc - Tue Mar 23, 2004 7:02 pm

#18820 - animension - Tue Apr 06, 2004 2:56 am

#18823 - poslundc - Tue Apr 06, 2004 6:21 am

#18839 - DekuTree64 - Tue Apr 06, 2004 11:35 pm

#19099 - Miked0801 - Sun Apr 11, 2004 5:44 am

#19114 - poslundc - Sun Apr 11, 2004 3:40 pm

#19116 - tepples - Sun Apr 11, 2004 4:07 pm

#19119 - poslundc - Sun Apr 11, 2004 4:28 pm

#19144 - Miked0801 - Sun Apr 11, 2004 11:55 pm