gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

DS development > Coming up with an ADPCM Midi player...

#175932 - Ruben - Fri Mar 04, 2011 1:01 pm

[this is more General Coding, but since it's targeted at the DS, I thought I'd put it here]

Hullo.

I recently got the idea to make a custom ADPCM format to play sequenced music on the DS by mixing on the ARM9 [SMLAxy ftw!] since I'm not fond of being limited to 16 channels. I've come up with a step table that, so far, appears to work great [may be tweaked later on].

The basic idea of the format is:
Code:
//! when initializing
chan.sample = wave.firstSample;
chan.sampPack = ((s32*)wave.data)[0]; //! note the sign
chan.sampNext = &((u32*)wave.data)[1];
chan.stepTab = &stepTab[wave.firstIdx];

//! during mixing
mixOut(chan.sample);
if(needSkipSamples) {
    s32 indexStep = chan.sampPack >> 28; //! ASR 28
    s32 delta = chan.stepTab[indexStep]; chan.stepTab += indexStep;
    chan.sample += delta;
   
    //! last nibble can't be 0 - 0 is reserved for 'end of word'
    if((chan.sampPack <<= 4) == 0) chan.sampPack = *chan.sampNext++;
}


What that means is:
The sound wave starts off with a sample. When it's done with that sample, it reads the next by skipping through the delta step table, using the SIGNED nibble as its index (this allows it to go back/forth). Further, the step table is alternating positive/negative values so it won't have to make humongous leaps to reach the other side. The last nibble is reserved to not be zero as this allows an optimization to the decoder.
Another note: the step table is 16.16 fixed-point, since the sample data is kept in the upper 16 bits anyway so that a saturating add can be used.
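For readers who want to follow the decode step in plain C, here is a minimal sketch of the idea described above. It is only an illustration under assumptions, not the actual mixer: the Chan struct and the names sat_add32, chan_init and chan_advance are made up for this example, the packed word is kept unsigned (with the nibble sign-extended by hand) to stay portable, and there is no bounds checking on the step table pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical channel state, mirroring the fields used in the post. */
typedef struct {
    int32_t         sample; /* current sample, 16.16 (data in the upper 16 bits) */
    uint32_t        pack;   /* remaining nibbles, next one in the top 4 bits */
    const uint32_t *next;   /* next packed word to load */
    const int32_t  *step;   /* current position inside the step table */
} Chan;

/* Portable stand-in for a 32-bit saturating add. */
static int32_t sat_add32(int32_t a, int32_t b) {
    int64_t s = (int64_t)a + b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

static void chan_init(Chan *c, int32_t firstSample, const uint32_t *data,
                      const int32_t *stepTab, int firstIdx) {
    assert(data[0] != 0); /* an all-zero word would mean 'end of word' at once */
    c->sample = firstSample;
    c->pack   = data[0];
    c->next   = &data[1];
    c->step   = &stepTab[firstIdx];
}

/* Consume one nibble: move through the step table, then apply the delta. */
static void chan_advance(Chan *c) {
    int k = (int)((c->pack >> 28) ^ 8u) - 8; /* sign-extend the top nibble to -8..7 */
    c->step  += k;
    c->sample = sat_add32(c->sample, *c->step);
    c->pack <<= 4;
    if (c->pack == 0)          /* word exhausted (last nibble is never 0) */
        c->pack = *c->next++;
}
```

With a toy 4-entry table {0x10000, -0x10000, 0x20000, -0x20000} and the word 0x20F10001, the first four calls to chan_advance walk the table by +2, 0, -1, +1.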

Now, the problem I have at the moment is calculating the step values when encoding (the decoder is complete and works perfectly. Afaik, it is as optimized as it can get - and doesn't even take up the stack pointer =O).

DekuTree and I had tried to come up with a good way of calculating the index steps, but his method had a small flaw that caused BIG problems.

As such, I'm asking for help here.
Any tips/ideas how to calculate the steps?

The step table is 96 values - 48 positives, 48 negatives, all in the form of (for x = 1..48):
tab[2*(x-1)+0] = (x*x * 200h << 16) / (48*48);
tab[2*(x-1)+1] = -tab[2*(x-1)+0];
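As a rough sketch, the formula above can be turned into a small generator like the one below. The names build_step_table, STEP_PAIRS and STEP_COUNT are invented for this example, and it assumes x runs from 1 to 48 with each positive value followed by its negative. With those assumptions the first pair comes out as 0x000038E3 / 0xFFFFC71D and the last as 0x02000000 / 0xFE000000, which matches the table posted further down the thread.

```c
#include <assert.h>
#include <stdint.h>

enum { STEP_PAIRS = 48, STEP_COUNT = 2 * STEP_PAIRS };

/* Fill tab with alternating +/- quadratic steps in 16.16 fixed point. */
static void build_step_table(int32_t tab[STEP_COUNT]) {
    assert(tab != 0);
    for (int x = 1; x <= STEP_PAIRS; x++) {
        /* 64-bit intermediate: x*x*0x200 << 16 no longer fits in a signed
           32-bit value once x reaches 8. */
        int64_t v = (((int64_t)x * x * 0x200) << 16) / (STEP_PAIRS * STEP_PAIRS);
        tab[2 * (x - 1) + 0] = (int32_t)v;
        tab[2 * (x - 1) + 1] = (int32_t)-v;
    }
}
```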

#175933 - elhobbs - Fri Mar 04, 2011 2:44 pm

maybe it would be helpful to show the flawed table and explain why it is flawed?

I can see that
Code:
(x*x * 200h << 16) / (48*48);
will overflow 32 bits even with small values of x.

#175934 - Ruben - Sat Mar 05, 2011 12:09 am

The problem isn't the table - everything is calculated using 64 bits just to be safe.

This is the final step table calculated:
Code:
   .word 0x000038E3,0xFFFFC71D,0x0000E38E,0xFFFF1C72,0x00020000,0xFFFE0000,0x00038E38,0xFFFC71C8
   .word 0x00058E38,0xFFFA71C8,0x00080000,0xFFF80000,0x000AE38E,0xFFF51C72,0x000E38E3,0xFFF1C71D
   .word 0x00120000,0xFFEE0000,0x001638E3,0xFFE9C71D,0x001AE38E,0xFFE51C72,0x00200000,0xFFE00000
   .word 0x00258E38,0xFFDA71C8,0x002B8E38,0xFFD471C8,0x00320000,0xFFCE0000,0x0038E38E,0xFFC71C72
   .word 0x004038E3,0xFFBFC71D,0x00480000,0xFFB80000,0x005038E3,0xFFAFC71D,0x0058E38E,0xFFA71C72
   .word 0x00620000,0xFF9E0000,0x006B8E38,0xFF9471C8,0x00758E38,0xFF8A71C8,0x00800000,0xFF800000
   .word 0x008AE38E,0xFF751C72,0x009638E3,0xFF69C71D,0x00A20000,0xFF5E0000,0x00AE38E3,0xFF51C71D
   .word 0x00BAE38E,0xFF451C72,0x00C80000,0xFF380000,0x00D58E38,0xFF2A71C8,0x00E38E38,0xFF1C71C8
   .word 0x00F20000,0xFF0E0000,0x0100E38E,0xFEFF1C72,0x011038E3,0xFEEFC71D,0x01200000,0xFEE00000
   .word 0x013038E3,0xFECFC71D,0x0140E38E,0xFEBF1C72,0x01520000,0xFEAE0000,0x01638E38,0xFE9C71C8
   .word 0x01758E38,0xFE8A71C8,0x01880000,0xFE780000,0x019AE38E,0xFE651C72,0x01AE38E3,0xFE51C71D
   .word 0x01C20000,0xFE3E0000,0x01D638E3,0xFE29C71D,0x01EAE38E,0xFE151C72,0x02000000,0xFE000000


I believe the problem lies in the following:

The step value is currently calculated by, more or less, doing the following:
Code:
s64 smp = adpcm.smp;
s32 idx = adpcm.idx;
s32 upDown = abs(realDelta) < abs(pastDelta) ? (-1) : (1);
s64 bestDelta = 0x7FFFFFFFLL;
s64 bestIdxChange = 0;

for(int i=0;i != upDown*8;i += upDown) {
    s64 estSample = smp+step[idx+i];
    s64 estDelta = estSample - originalSample;
    if(abs(estDelta) < abs(bestDelta)) bestDelta = estDelta, bestIdxChange = i;
}

The problem with this is:
Say I'm on index 95 [delta = -200h]. If the new delta is +200h [need to move back one], then I get this:

abs(realDelta) < abs(pastDelta) = 200h < 200h = false, move up

By doing that, it would result in a virtually endless addition.
So far, I haven't come up with a way to go around that.. >_>
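One possible way around guessing a direction at all is to simply try every index change a signed nibble can express (-8 to +7) and keep the one that lands closest to the target. The sketch below is only that idea under assumptions, not the encoder that was eventually released: best_index_step and sat_add32 are hypothetical names, samples are 16.16, the search is greedy per sample, and it ignores the rule that the last nibble of a word must not be 0.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for a 32-bit saturating add. */
static int32_t sat_add32(int32_t a, int32_t b) {
    int64_t s = (int64_t)a + b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

/* Return the index change in [-8, 7] whose step brings `cur` closest to
   `target`. Index changes that would leave the table are skipped. */
static int best_index_step(const int32_t *tab, int tabLen,
                           int32_t cur, int idx, int32_t target) {
    assert(idx >= 0 && idx < tabLen);
    int     best    = 0;
    int64_t bestErr = INT64_MAX;
    for (int k = -8; k <= 7; k++) {
        int j = idx + k;
        if (j < 0 || j >= tabLen) continue;
        int64_t err = (int64_t)target - sat_add32(cur, tab[j]);
        if (err < 0) err = -err;
        if (err < bestErr) { /* strict '<': on a tie the earlier k wins */
            bestErr = err;
            best    = k;
        }
    }
    return best;
}
```

In the situation described above (sitting on the last negative entry and needing the positive step of the same size), this picks -1 because that candidate has zero error, regardless of how the two magnitudes compare.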

#175935 - elhobbs - Sat Mar 05, 2011 3:12 am

My first thought would be that you need to flip the sign if the sign bits are different. But then it made me wonder why you were using abs for both values. Can you provide a high level view of how that section should work?

#175936 - Ruben - Sat Mar 05, 2011 6:44 am

It basically keeps checking until it finds the smallest difference (hence the abs()). If the signs differ, then the difference would be quite high, so it would say that the absolute difference is higher, thereby skipping it.

#175937 - elhobbs - Sat Mar 05, 2011 4:54 pm

what if the step table were not symmetrical with respect to the x and x+1 values? so, if you built the table such that it was skewed a little bit towards positive or negative at each step - maybe alternating between skewing towards positive and negative for each pair? essentially the point is to modify the table so that abs(x) will never match abs(x+1)

#175938 - Ruben - Sat Mar 05, 2011 5:16 pm

Even then it doesn't work, which leads me to believe there's something else going on... T_T

This is the function...
Code:
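static inline u32 toAD(s64 smp) {
   //! delta needed to reach the target sample, in 16.16
   s64 rDelta = (smp<<16) - gAD.smp;
   //! search down the table if the needed delta is smaller than the current step, else up
   s32 idxDir = IABS(rDelta) < IABS(ad_Step[gAD.idx]) ? (-1) : (1);
   s64 bstDlt = 0x7FFFFFFF;
   s32 bstChn = 0;
   s32 bstLim = idxDir < 0 ? 9 : 8; //! 9 because neg can use -8
   
   //! try each index change in the chosen direction, staying inside the 96-entry table
   for(int  idxChn  = 0;
      (idxChn != idxDir*bstLim) &&
      (gAD.idx + idxChn >= 0)   &&
      (gAD.idx + idxChn < 96);
       idxChn += idxDir)
   {
      s32 est = Clamp64(gAD.smp + (s64)ad_Step[gAD.idx + idxChn])>>16;
      s32 del = smp - est;
      if(IABS(del) < IABS(bstDlt)) {
         bstDlt = del;
         bstChn = idxChn;
      }
   }
   
   //! commit the best index change and update the decoder state
   gAD.smp = Clamp64(gAD.smp + (s64)ad_Step[gAD.idx += bstChn]);
   
   //! debug: dump the decoded 16-bit samples for comparison with the input
   static FILE *outF = 0;
   if(!outF) outF = fopen("output.sw", "wb");
   
   s32 var = gAD.smp>>16;
   fwrite(&var, 2, 1, outF); //! low 2 bytes of var (assumes little-endian)
   
   return bstChn&0xF; //! index change packed as a 4-bit nibble
}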

I've also uploaded a trimmed [2500 samples] comparison of input/output.
Link

EDIT:
Accidentally pasted the old code. >_>

#175947 - sverx - Mon Mar 07, 2011 5:47 pm

You mean you want to build up a table that contains 96 signed values and that would be used as a delta to add to your previous sample and you want to encode your wave into a 4 bits per sample ADPCM that will use the table above? And adding these 4 bits (as a signed) to locate the next beginning of the 16 values for the next sample? :| Mmm... tricky!

#175969 - Ruben - Sun Mar 13, 2011 2:25 pm

Bah! Does the universe hate me or something? This is the THIRD time I've lost all my data. >_<

Well, here's hoping my next start will not be interrupted as much...

#175971 - headspin - Sun Mar 13, 2011 8:52 pm

Ruben wrote:
Bah! Does the universe hate me or something? This is the THIRD time I've lost all my data. >_<


Backup, backup, backup.

#175972 - Lazy1 - Sun Mar 13, 2011 9:16 pm

I'm guessing you back up the same way I do...
Either not at all or to the same disk/directory.

#176090 - Ruben - Wed Apr 06, 2011 8:13 pm

Well, looks like I've hit the jackpot.
The new delta table contains 32 delta values and it sounds AWESOME.

If someone has a nice sounding public domain song (.mp3 or something) they'd like to share to see the results (can be stereo, too), link me.

EDIT:
After a lot of thinking, I decided 'what the hell' and am no longer programming for the DS. As such, I'm uploading the source for this music player (note: there is no Midi handling at ALL - all the structs are there, but the handling is not, so it's just a sound mixer that can be easily made into a sequence player).

I uploaded the source code, the necessary structs and #defines and the conversion program source.
Linky

No credit needed if you use the code but would be appreciated ;D

#176143 - Ruben - Sat Apr 16, 2011 9:09 pm

Ok, I decided 'what the hell' again and am back to programming on the DS xD

Currently, I've optimized this mixer a LOT and improved the step table (and added linear interpolation!)... but as embarrassing as it is, I've hit a tiny little problem [maybe not so little]... I can't seem to loop samples xD

To be more precise, I don't think I'm calculating things properly which leads to not mixing enough samples and the looping going all over the place.

What I'm doing right now is pre-calculating the number of samples that can be read so that I can remove the checks from the mixing loop (and free another register). However, it appears that I am a total n00b at this as is evidenced by my problems =P

Can anyone shed a little light on this? I've tried looking at Maxmod's source but I couldn't exactly follow. =P
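One common way to pre-calculate the run length is a ceiling division of the remaining distance by the step. The sketch below assumes 16.16 fixed-point positions held in 64-bit integers; samples_until and advance_looped are hypothetical names, and none of this is taken from the released mixer or from Maxmod.

```c
#include <assert.h>
#include <stdint.h>

/* Number of output samples that can be produced before a fixed-point read
   position `pos` reaches `end`, advancing by `step` per output sample. */
static uint32_t samples_until(uint64_t pos, uint64_t end, uint32_t step) {
    assert(step > 0);
    if (pos >= end) return 0;
    return (uint32_t)((end - pos + step - 1) / step); /* ceiling division */
}

/* Advance a looping channel by `count` output samples in check-free runs
   and return the final read position. */
static uint64_t advance_looped(uint64_t pos, uint64_t loopStart,
                               uint64_t loopEnd, uint32_t step,
                               uint32_t count) {
    assert(step > 0 && loopEnd > loopStart);
    while (count > 0) {
        uint32_t run = samples_until(pos, loopEnd, step);
        if (run > count) run = count;
        /* The inner mixing loop would run `run` times here, with no
           end-of-sample checks inside it. */
        pos   += (uint64_t)run * step;
        count -= run;
        while (pos >= loopEnd)            /* wrap, keeping the fraction */
            pos -= loopEnd - loopStart;
    }
    return pos;
}
```

For example, with a step of 3.0 and 10 samples left, samples_until returns 4 (reads at 0, 3, 6 and 9), which is where a plain truncating division would come up one short.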

#176146 - Ruben - Sun Apr 17, 2011 7:29 pm

Alright, nevermind that, I fixed it xD

So far, the entire framework is down and I just need to add functions like "SeqStart", "SeqStop", etc... but here's a small demo showing off the music player + interpolating mixer

Link 1 (demo with visual effect - CPU usage doesn't reflect mixing only)

Link 2 (demo without visual effect - CPU usage is 99% just the mixer + sequence player)

Controls:
-A: Start sequence
-B: Stop sequence (doesn't stop sound channels)

Specs:
-4 music players (ie, one for BGM, three for effects or whatever)
-24 channel polyphony
-Linear interpolation of samples (when delta < 1.0)
-Mixing at 32728Hz
-Completely hand-written ASM
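For anyone unfamiliar with the interpolation item in the specs, linear interpolation between two neighbouring samples generally looks something like the sketch below. It assumes a 16-bit fractional read position; lerp_sample is a hypothetical name and this is plain C for illustration, not the ASM used in the demo.

```c
#include <assert.h>
#include <stdint.h>

/* Blend sample a towards sample b by frac/65536 (frac is the fractional
   part of the read position, 0..0xFFFF). */
static int16_t lerp_sample(int16_t a, int16_t b, uint32_t frac) {
    assert(frac < 0x10000u);
    int64_t diff = (int64_t)b - a; /* up to +/-65535, so the product needs 64 bits */
    return (int16_t)(a + (diff * (int64_t)frac) / 0x10000);
}
```

Limiting this to delta < 1.0 means it only runs when a sample is played below its recorded rate, which is where the extra in-between values are actually needed.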

#176178 - Ruben - Tue May 03, 2011 10:29 am

Right, well, time for an update.

I've been working on the player/mixer a LOT (even though I should be doing school stuff =P). The music player is working VERY well and the mixer is working surprisingly well (CPU stays under 50% at 24 channels under most circumstances).

Here's a list of specs:
-4 music players updating at 120Hz (300bpm)
-24 channels using a custom ADPCM/DPCM format
-16-bit (8.8) envelopes
-Linear interpolation (can be activated for delta < 1.0 or all samples, or none)
-Real-time decoding of samples (ie. no decompression buffers)
-Mixing at 32728Hz
-1344 bytes of channels (just over 1kB)
-17kB of sound buffers (OUCH! Will drastically reduce when I can sync the sound properly)
-1072 bytes of music data (just over 1kB)
-4kB of mixing temporary in DTCM
-1672 bytes of mixing code

Here's the todo list:
-Reverb

EDIT:
Don't need 8-bit DPCM - turns out it was a small bug I had in my code (technically a rounding error).

#176224 - Ruben - Sun May 22, 2011 10:37 am

Right.

I've given up on making a nice demo to go with the music player so this is the final release. =P

Final specs:
-4 music players updating at 120Hz (300bpm)
-24 channels using a custom ADPCM/DPCM format
-11-bit mixing mode (slightly faster, a bit less DTCM usage, more ITCM)
-16-bit envelopes
-Linear interpolation (can be activated for: always, delta < 1.0, never)
-Real-time decoding of samples (ie. no decompression buffers)
-Mixing at 32728Hz
-1kB of channels
-8kB of sound buffers (can be 4kB without surround effect)
-1kB of music data
-2kB of mixing temporary in DTCM (can be 1kB with 11-bit mixing)
-1.5kB of mixing code

Notes:
-Drastically-changing sounds (such as ride cymbals and hi-hats) have the potential to sound VERY bad because the delta can't change fast enough, which results in a lot of noise.
-I gave up on reverb after I tried experimenting even with 32kB of buffers
-Does not use the stack pointer and has a lot of #defines at the top of the mixer code to simplify settings (such as allowing multiple interrupts)
-This sound driver is ripped directly from my custom DS framework, so there may be references to "NitroLib"; it is not the official SDK.
-The song 'Duel of the Fates' is unfinished as I got bored with sequencing it =P

Demo (note: this is using my own DS library and most likely will NOT work on hardware; I had tried porting to libnds but gave up after I couldn't get anything visual working).

Source (note: this is only the ARM9 code with NO initialization; the only thing the ARM7 has to do is capture the sound buffer address and start channel playback [preferably after syncing to V-Blank or something] while the ARM9 has to start a timer/thread/alarm/whatever at 512*272*2 cycles [512 = 16756991/32728, 272 = buffer size, 2 = clock multiplier])
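The timer arithmetic in the note above can be written out as a quick sanity check. The constant and function names here are invented for the example; only the numbers come from the post. The result works out to 278528 cycles per buffer and roughly 120 buffer refills per second, which would line up with the 120Hz player update rate in the specs if the two are tied together.

```c
#include <assert.h>
#include <stdint.h>

enum {
    BASE_CLOCK_HZ  = 16756991, /* clock figure quoted in the post */
    MIX_RATE_HZ    = 32728,    /* mixing rate */
    BUFFER_SAMPLES = 272,      /* buffer size in samples */
    CLOCK_MULT     = 2         /* clock multiplier from the post */
};

/* Integer division gives the 512 cycles per sample mentioned in the post. */
static_assert(BASE_CLOCK_HZ / MIX_RATE_HZ == 512, "expected 512 cycles per sample");

/* Timer period, in cycles, for one full buffer. */
static uint32_t timer_cycles_per_buffer(void) {
    return (uint32_t)(BASE_CLOCK_HZ / MIX_RATE_HZ) * BUFFER_SAMPLES * CLOCK_MULT;
}

/* Approximate number of buffer refills per second (integer part). */
static uint32_t buffer_refills_per_second(void) {
    return MIX_RATE_HZ / BUFFER_SAMPLES;
}
```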