gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies. A new forum can be found here.

DS development > About MP3 Decoding...

#142363 - Lazy1 - Mon Oct 08, 2007 7:55 pm

I'm having a bit of trouble here. I have the helix MP3 decoder running on the arm7, streaming data from the arm9.

The problem is that the output, while recognizable, is still very clicky, and I have tried a few things to get it to sound better with no luck.

Basically what I do is this:

Code:

   // Keep the ARM7 idle
   while ( 1 ) {
      swiWaitForVBlank( );

      if ( LoadMP3 == 1 ) {
         LoadMP3 = 0;

         if ( OpenMP3( MP3Path, &MP3File ) == 1 ) {
            tprint_string( "ARM7: Loaded [" );
            tprint_string( MP3Path );
            tprint_string( "].\n" );

            if ( DecodeMP3Frame( &MP3File ) == 1 ) {
               MP3FillSegment( &MP3File );

               DecodeMP3Frame( &MP3File );
               MP3FillSegment( &MP3File );

               SCHANNEL_REPEAT_POINT( 0 ) = 0;
               SCHANNEL_SOURCE( 0 ) = ( u32 ) MP3File.PlayBuffer;
               SCHANNEL_LENGTH( 0 ) = ( 1152 * 2 ) >> 2;
               SCHANNEL_TIMER( 0 ) = SOUND_FREQ( MP3File.Frame.samprate );
               SCHANNEL_CR( 0 ) = SCHANNEL_ENABLE | SOUND_REPEAT | SOUND_16BIT | SOUND_VOL( 0x7F ) | SOUND_PAN( 0 );

               TIMER0_DATA = SOUND_FREQ( MP3File.Frame.samprate ) * 2;
               TIMER0_CR = TIMER_ENABLE | TIMER_DIV_1;

               TIMER1_DATA = 65535 - ( 1152 );
               TIMER1_CR = TIMER_ENABLE | TIMER_DIV_1 | TIMER_CASCADE;

               TIMER2_DATA = 0;
               TIMER2_CR = TIMER_ENABLE | TIMER_DIV_1 | TIMER_CASCADE;

               do {
                  if ( TIMER2_DATA > f ) {
                     DecodeMP3Frame( &MP3File );
                     MP3FillSegment( &MP3File );

                     f = TIMER2_DATA;
                  }
               } while ( MP3File.IsEOF == 0 && MP3File.IsOutOfData == 0 );

               SCHANNEL_CR( 0 ) = 0;
            }

            CloseMP3( &MP3File );
         }
      }
   }


I would assume that it's entirely ass backwards which is why it doesn't work properly.
Any suggestions?

#142364 - Noda - Mon Oct 08, 2007 7:57 pm

Try first removing the swiWaitForVBlank() call, it's not needed there...

#142366 - Lazy1 - Mon Oct 08, 2007 8:00 pm

While the MP3 is decoding it doesn't wait for vblank.
I'll admit the code is in a cheap hack state, otherwise it would be much neater. ( and done properly )

#142387 - DekuTree64 - Mon Oct 08, 2007 9:40 pm

Two bad things I can see there. First, timer1 data should be set to 65536 - x. Second, this bit of code:
Code:
               do {
                  if ( TIMER2_DATA > f ) {
                     DecodeMP3Frame( &MP3File );
                     MP3FillSegment( &MP3File );

                     f = TIMER2_DATA;
                  }
               } while ( MP3File.IsEOF == 0 && MP3File.IsOutOfData == 0 );

is a bit dangerous, because it's reading timer2 data twice, and the value could change between the two reads. Better to read it once into a temporary, and use that for comparing/setting to f.
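A minimal sketch of the single-read idea, assuming a plain variable `fake_timer2` stands in for TIMER2_DATA so the logic can run off-target. `frames_elapsed` is a hypothetical helper, not part of libnds. It also uses unsigned 16-bit subtraction, which stays correct when the counter wraps from 65535 back to 0, a case the `>` comparison would miss.

```c
#include <stdint.h>

/* Stand-in for the hardware counter (assumption); on the DS this
   would be a read of TIMER2_DATA. */
static volatile uint16_t fake_timer2;

/* Returns how many counts passed since *last and updates *last.
   The counter is read exactly once, so the value compared and the
   value stored can never disagree. The 16-bit subtraction wraps
   modulo 65536, so a rollover still yields the right difference. */
static uint16_t frames_elapsed(uint16_t *last)
{
    uint16_t now = fake_timer2;                 /* single read */
    uint16_t delta = (uint16_t)(now - *last);   /* wrap-safe */
    *last = now;
    return delta;
}
```

A caller would decode one frame per count returned, which also covers the case where more than one count passed while a slow decode was running.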
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku

#142388 - Lazy1 - Mon Oct 08, 2007 9:53 pm

I made those changes and it's still crackly, except now it pauses every once in a while for a tiny amount of time.

#142413 - DekuTree64 - Tue Oct 09, 2007 12:16 am

I just noticed that it seems to be waiting until the buffer is completely empty to start decoding the next frame. Much better to double buffer it (decode one frame while the previous plays).
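One way to picture the double-buffering bookkeeping, as a hedged sketch rather than the code from this thread: two counters track how many frames have been copied in and how many the hardware has finished, and a half may only be filled while fewer than two frames are queued. `DoubleBuffer`, `db_can_fill` and `db_fill` are hypothetical names, and mono 1152-sample frames are assumed.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES 1152   /* samples in one mono MPEG-1 Layer III frame */

typedef struct {
    int16_t  play[2 * FRAME_SAMPLES]; /* two halves, looped by the channel */
    uint32_t decoded;                 /* frames copied into play[] so far */
    uint32_t played;                  /* frames the hardware has finished */
} DoubleBuffer;

/* A half is free while fewer than two frames are queued ahead of playback. */
static int db_can_fill(const DoubleBuffer *db)
{
    return (db->decoded - db->played) < 2;
}

/* Copies one decoded frame into the half that is not being played.
   Returns 1 on success, 0 if both halves are still waiting to play. */
static int db_fill(DoubleBuffer *db, const int16_t *pcm)
{
    if (!db_can_fill(db))
        return 0;
    memcpy(&db->play[(db->decoded & 1) * FRAME_SAMPLES], pcm,
           FRAME_SAMPLES * sizeof(int16_t));
    db->decoded++;
    return 1;
}
```

In this model the frame timer would bump `played`, and the decode loop would call `db_fill` whenever `db_can_fill` reports room.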

#142422 - Lazy1 - Tue Oct 09, 2007 1:42 am

That's what it _should_ be doing.
I made the buffer 2x the size of a mono mp3 frame; when TIMER2_DATA increments, that _should_ be when one half of the buffer has been played.

Full, messy/hacky code:
Code:

#include <nds.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <ipcshared.h>
#include <mp3dec.h>

#define READBUF_SIZE ( 16 * 1024 )
#define PCMBUF_SIZE ( MAX_NCHAN * MAX_NGRAN * MAX_NSAMP * 2 )

typedef struct {
   FILE* Stream;

   int BytesLeft;
   int BytesRead;
   int IsEOF;
   int IsOutOfData;
   int LastError;

   HMP3Decoder Helix;
   MP3FrameInfo Frame;

   unsigned char* ReadBuffer;
   unsigned char* ReadPtr;
   short* PlayBuffer;
   short* PlayCursor;
   short* PCMBuffer;
} MP3Stream;

char MP3Path[ 192 ];
volatile int LoadMP3 = 0;

MP3Stream MP3File;

int OpenMP3( const char* Path, MP3Stream* MP3 ) {
   memset( MP3, 0, sizeof( MP3Stream ) );

   if ( ( MP3->Helix = MP3InitDecoder( ) ) == 0 ) {
      tprint_string( "ARM7: Failed to create helix.\n" );
      return 0;
   }

   if ( ( MP3->Stream = tfopen( Path, "rb" ) ) == NULL ) {
      tprint_string( "Failed to open [" );
      tprint_string( Path );
      tprint_string( "].\n" );

      return 0;
   }

   MP3->ReadBuffer = ( unsigned char* ) tmalloc( READBUF_SIZE );
   MP3->PCMBuffer = ( short* ) tmalloc( PCMBUF_SIZE );
   MP3->PlayBuffer = ( short* ) tmalloc( 1152 * 2 * 2 );
   MP3->PlayCursor = MP3->PlayBuffer;
   MP3->ReadPtr = MP3->ReadBuffer;

   return 1;
}

int CloseMP3( MP3Stream* MP3 ) {
   if ( MP3->Stream != NULL ) {
      MP3FreeDecoder( MP3->Helix ); // free the decoder handle, not the FILE*

      tfclose( MP3->Stream );
      tfree( MP3->ReadBuffer );
      tfree( MP3->PCMBuffer );
      tfree( MP3->PlayBuffer );

      memset( MP3, 0, sizeof( MP3Stream ) );
      return 1;
   }

   return 0;
}

int FillReadBuffer( MP3Stream* MP3 ) {
   int ReadCount = 0;

   memmove( MP3->ReadBuffer, MP3->ReadPtr, MP3->BytesLeft );
   ReadCount = tfillbuffer( MP3->ReadBuffer + MP3->BytesLeft, READBUF_SIZE - MP3->BytesLeft, MP3->Stream );

   if ( ReadCount < ( READBUF_SIZE - MP3->BytesLeft ) )
      memset( MP3->ReadBuffer + MP3->BytesLeft + ReadCount, 0, READBUF_SIZE - MP3->BytesLeft - ReadCount );

   return ReadCount;
}

int DecodeMP3Frame( MP3Stream* MP3 ) {
   int ReadCount = 0;
   int Offset = 0;
   int Error = 0;

   if ( MP3->BytesLeft < ( MAINBUF_SIZE << 1 ) && MP3->IsEOF == 0 ) {
      ReadCount = FillReadBuffer( MP3 );
      MP3->BytesLeft+= ReadCount;
      MP3->ReadPtr = MP3->ReadBuffer;

      if ( ReadCount == 0 )
         MP3->IsEOF = 1;
   }

   if ( ( Offset = MP3FindSyncWord( MP3->ReadPtr, MP3->BytesLeft ) ) < 0 ) {
      tprint_string( "Helix: Could not find syncword\n" );
      MP3->IsOutOfData = 1;
      return 0;
   }

   MP3->ReadPtr+= Offset;
   MP3->BytesLeft-= Offset;

   if ( ( Error = MP3Decode( MP3->Helix, &MP3->ReadPtr, &MP3->BytesLeft, MP3->PCMBuffer, 0 ) ) > 0 ) {
      switch ( Error ) {
         case ERR_MP3_INDATA_UNDERFLOW: {
            MP3->LastError = Error;
            MP3->IsOutOfData = 1;

            return 0;
         }
         case ERR_MP3_MAINDATA_UNDERFLOW: {
            break;
         }
         default:
         case ERR_MP3_FREE_BITRATE_SYNC: {
            MP3->LastError = Error;
            MP3->IsOutOfData = 1;

            return 0;
         }
      };
   }

   MP3GetLastFrameInfo( MP3->Helix, &MP3->Frame );
   return 1;
}

void MP3SwapCursor( MP3Stream* MP3 ) {
   if ( MP3->PlayCursor == MP3->PlayBuffer ) MP3->PlayCursor+= 1152;
   else MP3->PlayCursor = MP3->PlayBuffer;
}

void MP3FillSegment( MP3Stream* MP3 ) {
   memcpy( MP3->PlayCursor, MP3->PCMBuffer, 1152 * 2 );
   MP3SwapCursor( MP3 );
}

//---------------------------------------------------------------------------------
void startSound(int sampleRate, const void* data, u32 bytes, u8 channel, u8 vol,  u8 pan, u8 format) {
//---------------------------------------------------------------------------------
   SCHANNEL_TIMER(channel)  = SOUND_FREQ(sampleRate);
   SCHANNEL_SOURCE(channel) = (u32)data;
   SCHANNEL_LENGTH(channel) = bytes >> 2 ;
   SCHANNEL_CR(channel)     = SCHANNEL_ENABLE | SOUND_ONE_SHOT | SOUND_VOL(vol) | SOUND_PAN(pan) | (format==1?SOUND_8BIT:SOUND_16BIT);
}


//---------------------------------------------------------------------------------
s32 getFreeSoundChannel() {
//---------------------------------------------------------------------------------
   int i;
   for (i=0; i<16; i++) {
      if ( (SCHANNEL_CR(i) & SCHANNEL_ENABLE) == 0 ) return i;
   }
   return -1;
}

int vcount;
touchPosition first,tempPos;

#define KEY_TOUCH ( 1 << 6 )

//---------------------------------------------------------------------------------
void VcountHandler() {
//---------------------------------------------------------------------------------
   static int lastbut = -1;
   
   uint16 but=0, x=0, y=0, xpx=0, ypx=0, z1=0, z2=0;

   but = REG_KEYXY;

   if (!( (but ^ lastbut) & (1<<6))) {
 
      tempPos = touchReadXY();

      x = tempPos.x;
      y = tempPos.y;
      xpx = tempPos.px;
      ypx = tempPos.py;
      z1 = tempPos.z1;
      z2 = tempPos.z2;
      
   } else {
      lastbut = but;
      but |= (1 <<6);
   }

   if ( vcount == 80 ) {
      first = tempPos;
   } else {
      if (   abs( xpx - first.px) > 10 || abs( ypx - first.py) > 10 ||
            (but & ( 1<<6)) ) {

         but |= (1 <<6);
         lastbut = but;

      } else {    
         IPC->mailBusy = 1;
         IPC->touchX         = x;
         IPC->touchY         = y;
         IPC->touchXpx      = xpx;
         IPC->touchYpx      = ypx;
         IPC->touchZ1      = z1;
         IPC->touchZ2      = z2;
         IPC->mailBusy = 0;
      }
   }
   IPC->buttons      = but;
   vcount ^= (80 ^ 130);
   SetYtrigger(vcount);
}

volatile int ARM9Ready = 0;
volatile int ARM7Waiting = 0;

void IRQFifoNotEmpty( void ) {
   u32 Data = 0;

   do {
      Data = REG_IPC_FIFO_RX;

      switch ( Data ) {
         case FMessage_Ready: {
            ARM9Ready = 1;
            break;
         }
         case FMessage_FifoCompleted: {
            ARM7Waiting = 0;
            break;
         }
         case FMessage_LoadMP3: {
            strncpy( MP3Path, ( const char* ) IPCBuffer, sizeof( MP3Path ) - 1 );         
            LoadMP3 = 1;

            fifoIPC_SendCompleted( );
            break;
         }
         default: break;
      };
   } while ( ! ( REG_IPC_FIFO_CR & IPC_FIFO_RECV_EMPTY ) );
}

void VblankHandler( void ) {
}

//---------------------------------------------------------------------------------
int main(int argc, char ** argv) {
//---------------------------------------------------------------------------------
   int ThisTimer2Data = 0;
   int LastTimer2Data = 0;

   fifoIPC_Init( );

   // Reset the clock if needed
   rtcReset();

   //enable sound
   powerON(POWER_SOUND);
   SOUND_CR = SOUND_ENABLE | SOUND_VOL(0x7F);
   IPC->soundData = 0;

   irqInit();
   irqSet( IRQ_FIFO_NOT_EMPTY, IRQFifoNotEmpty );
   irqSet(IRQ_VBLANK, VblankHandler);

   SetYtrigger(80);
   vcount = 80;

   irqSet(IRQ_VCOUNT, VcountHandler);
   irqEnable(IRQ_VBLANK | IRQ_VCOUNT | IRQ_FIFO_NOT_EMPTY );

   // Tell the ARM9 we are ready to go
   fifoIPC_SendReady( );

   // Wait for the ARM9 to be ready
   while ( ARM9Ready == 0 )
      swiWaitForVBlank( );

   // Keep the ARM7 idle
   while ( 1 ) {
      swiWaitForVBlank( );

      if ( LoadMP3 == 1 ) {
         LoadMP3 = 0;

         if ( OpenMP3( MP3Path, &MP3File ) == 1 ) {
            tprint_string( "ARM7: Loaded [" );
            tprint_string( MP3Path );
            tprint_string( "].\n" );

            if ( DecodeMP3Frame( &MP3File ) == 1 ) {
               MP3FillSegment( &MP3File );

               DecodeMP3Frame( &MP3File );
               MP3FillSegment( &MP3File );

               SCHANNEL_REPEAT_POINT( 0 ) = 0;
               SCHANNEL_SOURCE( 0 ) = ( u32 ) MP3File.PlayBuffer;
               SCHANNEL_LENGTH( 0 ) = ( 1152 * 2 ) >> 2;
               SCHANNEL_TIMER( 0 ) = SOUND_FREQ( MP3File.Frame.samprate );
               SCHANNEL_CR( 0 ) = SCHANNEL_ENABLE | SOUND_REPEAT | SOUND_16BIT | SOUND_VOL( 0x7F ) | SOUND_PAN( 0 );

               TIMER0_DATA = SOUND_FREQ( MP3File.Frame.samprate ) * 2;
               TIMER0_CR = TIMER_ENABLE | TIMER_DIV_1;

               TIMER1_DATA = 65536 - ( 1152 );
               TIMER1_CR = TIMER_ENABLE | TIMER_DIV_1 | TIMER_CASCADE;

               TIMER2_DATA = 0;
               TIMER2_CR = TIMER_ENABLE | TIMER_DIV_1 | TIMER_CASCADE;

               do {
                  ThisTimer2Data = TIMER2_DATA;

                  if ( ThisTimer2Data > LastTimer2Data ) {
                     DecodeMP3Frame( &MP3File );
                     MP3FillSegment( &MP3File );

                     LastTimer2Data = ThisTimer2Data;
                  }
               } while ( MP3File.IsEOF == 0 && MP3File.IsOutOfData == 0 );

               swiWaitForVBlank( );
               SCHANNEL_CR( 0 ) = 0;
            }

            CloseMP3( &MP3File );
         }
      }
   }
}

#142425 - DekuTree64 - Tue Oct 09, 2007 2:12 am

Ah, I think it's this line:
Code:
SCHANNEL_LENGTH( 0 ) = ( 1152 * 2 ) >> 2;

Since the stream is 16-bit, that will play 1152 samples before looping. So multiply by 2 again to play a double length buffer.
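The arithmetic can be checked with a small helper. This is a sketch under the assumption, consistent with the `bytes >> 2` in `startSound` above, that the length register counts 32-bit words; `loop_length_words` is a hypothetical name.

```c
#include <stdint.h>

/* Length, in 32-bit words, of a loop buffer that holds `segments`
   blocks of `samples_per_segment` samples, each `bytes_per_sample` wide. */
static uint32_t loop_length_words(uint32_t samples_per_segment,
                                  uint32_t bytes_per_sample,
                                  uint32_t segments)
{
    return (samples_per_segment * bytes_per_sample * segments) >> 2;
}
```

With 16-bit samples, `loop_length_words(1152, 2, 1)` gives 576 words, which is a single frame, while `loop_length_words(1152, 2, 2)` gives 1152 words and covers both halves of the play buffer.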

#142426 - Lazy1 - Tue Oct 09, 2007 2:18 am

You did it!
A million thanks!

The only problem that remains is a small gap/glitch every few seconds.
If you want I can upload a sample.

EDIT:
Gap/glitches disappear if the timer is 65535 - blah.
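For reference, the period implied by each reload value can be worked out as below. This is only the counter arithmetic for a 16-bit up-counter that overflows on the step past 65535, not a diagnosis of the glitch; both helper names are hypothetical.

```c
#include <stdint.h>

/* Ticks a 16-bit up-counter needs to overflow when started from `reload`. */
static uint32_t ticks_until_overflow(uint16_t reload)
{
    return 65536u - reload;
}

/* Reload value that produces an overflow every `ticks` ticks (1..65536). */
static uint16_t reload_for_period(uint32_t ticks)
{
    return (uint16_t)(65536u - ticks);
}
```

So a reload of 65536 - 1152 overflows every 1152 ticks, while 65535 - 1152 overflows every 1153 ticks, one tick longer than a 1152-sample frame.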

#142427 - Noda - Tue Oct 09, 2007 2:26 am

Once you've got a working version, can you post/upload the sources somewhere? I think some people could be interested in a proper mp3 decoding/streaming system on the arm7 ;)

#142429 - Lazy1 - Tue Oct 09, 2007 2:34 am

I think I will, though a complete rewrite of the FIFO code would be in order.
Even better would be if I made that into a library.

One thing is bothering me though, is there a problem reading files during an interrupt?
Currently the arm7 requests a buffer fill by sending a fifo message to the arm9 which handles it in the receive irq handler.

EDIT AGAIN:
It seems that what I said earlier may not be the case.
On another MP3 the small glitch was there but almost unnoticeable.

I'll try to throw together some files so it can be heard first hand.

EDIT YET AGAIN:
Files:
http://lazyone.drunkencoders.com/wordpress/wp-content/uploads/2007/10/mp3tests.zip
http://lazyone.drunkencoders.com/wordpress/wp-content/uploads/2007/10/mp3stream-src.zip

EDIT #99999:
The FIFO system in there is a joke, if anyone has some suggestions please let me know.

#142434 - DragonMinded - Tue Oct 09, 2007 4:17 am

Lazy1 wrote:
One thing is bothering me though, is there a problem reading files during an interrupt?
Currently the arm7 requests a buffer fill by sending a fifo message to the arm9 which handles it in the receive irq handler.


YES. Do NOT do this! The bug present for over a year in DSOrganize that caused random freezes in music was due to me reading files in interrupts. Avoid this at all costs.
_________________
Enter the mind of the dragon.

http://dragonminded.blogspot.com

Seriously guys, how hard is it to simply TRY something yourself?

#142437 - Lazy1 - Tue Oct 09, 2007 4:56 am

I see...
The only other option would be to do this in the main loop however I see two problems:

1) The wolf3d source is FULL of while( ) loops
2) If the renderer takes too much time it's possible that the mp3 player will not get the data fast enough

#142455 - DragonMinded - Tue Oct 09, 2007 9:19 am

Or just buffer the data, then in the interrupt when it requests a chunk, just copy it out of already buffered memory, and flag that it needs to be refilled in the main loop.
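A rough sketch of that split, with all names (`Stage`, `stage_take`, `stage_service`, `demo_read`, `STAGE_SIZE`) being hypothetical: the interrupt side only copies out of RAM and raises a flag, and the main loop is the only place that calls the slow reader. `demo_read` is a stand-in for a real file read so the example runs anywhere.

```c
#include <stddef.h>
#include <string.h>

#define STAGE_SIZE 4096   /* hypothetical staging buffer size */

typedef struct {
    unsigned char data[STAGE_SIZE];
    size_t        avail;         /* bytes currently staged */
    volatile int  needs_refill;  /* set by the handler, cleared by the main loop */
} Stage;

typedef size_t (*ReadFn)(void *dst, size_t max, void *ctx);

/* Interrupt side: copy from memory only, never touch the file system.
   Returns the number of bytes actually handed over. */
static size_t stage_take(Stage *s, unsigned char *dst, size_t want)
{
    size_t n = want < s->avail ? want : s->avail;
    memcpy(dst, s->data, n);
    memmove(s->data, s->data + n, s->avail - n);  /* keep leftovers at the front */
    s->avail -= n;
    s->needs_refill = 1;
    return n;
}

/* Main-loop side: top the buffer up if the handler asked for it. */
static void stage_service(Stage *s, ReadFn read_fn, void *ctx)
{
    if (!s->needs_refill)
        return;
    s->avail += read_fn(s->data + s->avail, STAGE_SIZE - s->avail, ctx);
    s->needs_refill = 0;
}

/* Demo reader standing in for fread(): fills with the byte pointed to by ctx. */
static size_t demo_read(void *dst, size_t max, void *ctx)
{
    memset(dst, *(unsigned char *)ctx, max);
    return max;
}
```

On real hardware the handler could fire partway through `stage_service`, so the bookkeeping (though not the read itself) would need protecting, for example by briefly masking interrupts around the updates to `avail`.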

#142457 - simonjhall - Tue Oct 09, 2007 9:28 am

Lazy1 wrote:
EDIT:
Gap/glitches disappear if the timer is 65535 - blah.
Sounds familiar :-)
Anyway, well done. Reason I again never got back to you was because mine started clicking again! I'm still finding that this stuff slows down the ARM9 a bit too much though. This is due to the endless file access and the extra latency introduced due to ARM9/ARM7 IPC since the decoder is running. Any ideas of how to handle this?
_________________
Big thanks to everyone who donated for Quake2

#142478 - Lazy1 - Tue Oct 09, 2007 2:51 pm

Since the MP3 decoder is very dependent on latency, why not hack libfat to run on the arm7 and have the arm9 access files through IPC?
This would solve the interrupt problem and hopefully the latency one as well.

As for the arm9 slowing down, maybe map some vram to the arm7 and decode/play the sounds from there?

[EDIT]
The click/pop/crackle problem is definitely the result of ARM7<->ARM9 I/O IPC.
I just tried modifying my sources to load the entire MP3 into ram and then fill the buffer with memcpy as the arm7 requests it.
The result is perfect MP3 playing.

This raises a few issues, the latency can possibly be solved by reading twice the read buffer's size then memcpy()ing it on request.
The question is though, with the game taking up 100% of the ARM9's cpu time will that read buffer be filled in time?

#142519 - Lazy1 - Wed Oct 10, 2007 12:07 am

simonjhall:
Did you try reducing the read buffer size?

I just lowered the size by half to 8KB and xtower2.mp3 no longer snaps, crackles or pops.
I'm not sure how much of a solution this is though, I haven't done any tests for 8KB but reading 16KB takes 100-135 hblanks to complete.
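To put those numbers side by side, here is a small back-of-the-envelope calculation. The constants are assumptions taken from the commonly documented DS timings (a 33.513982 MHz bus clock and 2130 cycles per scanline); the 44.1 kHz sample rate and 128 kbps bitrate used in the note below are example values, not measurements from this thread.

```c
#define BUS_HZ          33513982.0  /* assumed DS bus clock */
#define CYCLES_PER_LINE 2130.0      /* assumed: 355 dots * 6 cycles per scanline */

/* Wall-clock time covered by a number of scanlines (one hblank each). */
static double scanlines_to_ms(double lines)
{
    return lines * CYCLES_PER_LINE / BUS_HZ * 1000.0;
}

/* Playback time of one 1152-sample MP3 frame. */
static double frame_ms(double sample_rate_hz)
{
    return 1152.0 / sample_rate_hz * 1000.0;
}

/* How long a read buffer lasts at a given bitrate. */
static double buffer_seconds(double bytes, double kbps)
{
    return bytes * 8.0 / (kbps * 1000.0);
}
```

Under these assumptions, 100 to 135 scanlines works out to roughly 6.4 to 8.6 ms per 16KB read, one 44.1 kHz frame plays for about 26 ms, and 16KB of 128 kbps data lasts about one second.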

#142522 - tepples - Wed Oct 10, 2007 12:32 am

If you're trying different read buffer sizes, you might want to play with speed tester.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#142523 - Lazy1 - Wed Oct 10, 2007 12:58 am

That just made me realize...
The games 'n music cart is horribly slow isn't it?

I'm just worried that no matter how much I buffer in advance the MP3 decoder will catch up to the slow I/O.

In the meantime I'm re-writing my FIFO code to be much better.

#142526 - tepples - Wed Oct 10, 2007 1:42 am

Lazy1 wrote:
I'm just worried that no matter how much I buffer in advance the MP3 decoder will catch up to the slow I/O.

Obviously the card's built-in MP3 player keeps up with the data stream, and I've read that so does MoonShell at least for music.

#142530 - DragonMinded - Wed Oct 10, 2007 2:24 am

DSOrganize also plays all its formats flawlessly on GnM, so it's not the card.

#142538 - Lazy1 - Wed Oct 10, 2007 5:29 am

*slaps forehead*
Next time I should really look at the title of the cart a little more closely.

Still, it'll be interesting to try and update the sound system from the main loop when there are like 50 of them.
Maybe threading would be a better idea?

EDIT:
Chishm said in another thread...
Quote:

Interrupts aren't disabled during an fread. This can cause problems, as in your case, if it interrupts the disc IO in the middle of a sector read/write. Effects vary depending on the card. On some devices it'll just mis-read the sector originally being read. On others it can cause the device to lock up.


Maybe I could just disable interrupts before reading data from the main loop?
All I/O operations in Wolfenstein are wrapped anyway so that might make it easier.
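A sketch of what such a wrapper might look like, assuming every read already goes through one function. `fake_ime` stands in for the interrupt master enable register (REG_IME in libnds) so the example runs off-target, and `read_with_irqs_masked` and `demo_read` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the interrupt master enable register (assumption). */
static volatile unsigned int fake_ime = 1;

typedef size_t (*ReadFn)(void *dst, size_t max, void *ctx);

/* Masks interrupts, performs the read, then restores whatever state
   was in effect before, so nested use does not re-enable too early. */
static size_t read_with_irqs_masked(ReadFn read_fn, void *dst, size_t max, void *ctx)
{
    unsigned int saved = fake_ime;
    size_t got;

    fake_ime = 0;
    got = read_fn(dst, max, ctx);
    fake_ime = saved;
    return got;
}

/* Demo reader: records the interrupt state it ran under and fills the buffer. */
static unsigned int ime_seen_by_reader = 99;

static size_t demo_read(void *dst, size_t max, void *ctx)
{
    (void)ctx;
    ime_seen_by_reader = fake_ime;
    memset(dst, 0x5A, max);
    return max;
}
```

The trade-off is that nothing else gets serviced while the read runs, so a long read would also delay the FIFO handler for its whole duration.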

#142546 - simonjhall - Wed Oct 10, 2007 8:06 am

Ah I'd forgotten about the GnM... Yeah, that may need some testing!
The speed problems I get come from the endless freads throughout the game. I've got a tiny data buffer, and this means it's reading all the time.

Also when the '9 wants to play a sound, it blocks until the '7 picks up the message - well if the processor is busy, it's gonna have to wait longer. This isn't a mega amount of time, but it does make some high-speed stuff slower (eg the nail gun). Also y'know that there will be some idiots who try and play 256kbps music and complain to me that it jumps and/or the game itself is slow ;-)

#142569 - tepples - Wed Oct 10, 2007 12:53 pm

Then maybe you need to preload the weapon firing sounds when the player pulls out a weapon. Not enough RAM? Recommend the RAM build to GnM users.

#142634 - Lazy1 - Thu Oct 11, 2007 12:51 am

Are you using FIFO for your IPC or shared memory?

I know that FIFO is the proper route to go since it keeps the arm7 out of main ram but so far it's been a real pain in the ass.
Should I just go the shared memory route and wait for something official and update my sources in the future?

#142639 - DekuTree64 - Thu Oct 11, 2007 1:48 am

You could try this FIFO system I wrote a while back (what ever became of incorporating it or something similar into libnds?)

Shared memory isn't necessarily evil though. The big advantage of FIFO is that it interrupts the other CPU, so the message is processed immediately. Shared memory is fine if the other CPU can just handle the message next time through its main loop or whatever.
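As an illustration of the polled shared-memory style, here is a minimal single-slot mailbox. It is a sketch, not the FIFO system mentioned above; `Mailbox`, `mailbox_post` and `mailbox_poll` are hypothetical names, and a real DS version would also have to deal with the ARM9 data cache when both CPUs touch the same main RAM.

```c
#include <stdint.h>

typedef struct {
    volatile uint32_t seq;      /* bumped by the sender after the payload is written */
    volatile uint32_t command;
    volatile uint32_t arg;
} Mailbox;

/* Sender side: write the payload first, publish the sequence number last. */
static void mailbox_post(Mailbox *mb, uint32_t command, uint32_t arg)
{
    mb->command = command;
    mb->arg = arg;
    mb->seq = mb->seq + 1;
}

/* Receiver side, called once per main-loop pass. Returns 1 and fills
   the outputs if a message arrived since the last one recorded in *seen. */
static int mailbox_poll(const Mailbox *mb, uint32_t *seen,
                        uint32_t *command, uint32_t *arg)
{
    uint32_t now = mb->seq;
    if (now == *seen)
        return 0;
    *command = mb->command;
    *arg = mb->arg;
    *seen = now;
    return 1;
}
```

With a single slot, a second post before the receiver polls overwrites the first, which is acceptable only when the receiver is known to poll faster than the sender posts.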

#142847 - Lazy1 - Sun Oct 14, 2007 4:00 pm

I have a working IPC system and the MP3 plays perfectly up to 128kbps on the arm7.

The problem is that the test app reads during an interrupt, as expected this really screws up wolf3d.
I could change the IPC to read during the main loop except there are an insane amount of loops to begin with.

I guess I could track down every single loop and add a call to SD_Update but that would really be hacky.
I may have to though unless someone has another suggestion.

Could threads be the answer here, or is that just another problem?