gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies.

Coding > Sound IRQ with heavy DMA loads

#1408 - Kay - Sat Jan 18, 2003 1:53 pm

I'm currently coding an 8-voice VFQ synthesis player in assembler. It uses an IRQ, raised when a buffer finishes playing, to point the DSound FIFO DMA at the next buffer.
This technique works fine standalone.

When I add other parts of the code (VRAM transfers with DMA3, IO register reloads with DMA0 on HBL), the DMA end-of-transfer IRQ is serviced too late, resulting in distorted sound output.
I tried triple buffering, but it consumes too much IWRAM for my project.

Any ideas?


Thanks a lot.

-- Kay

#1430 - Splam - Sat Jan 18, 2003 6:17 pm

Give it a try without the DMA0 stuff going. I know you need it, but it's probably what's causing the problem because it's the highest priority thing that can happen.

Also do you have any other interrupts happening? If so then maybe one of those with a higher priority is stopping the dma irq from happening when it should, or maybe you're not servicing the interrupts correctly so one is triggered and while you're processing that the dma irq triggers but you totally miss it.

#1450 - Kay - Sun Jan 19, 2003 12:21 am

My IRQ handler is fast enough, can handle all events without any problem, and can be nested without stack overflow.

The trouble lies in a rather complex mechanism. You forgot that DMA transfers suspend the CPU after a 1-2 cycle delay by pulling the BR signal high.
DMA3 operations take nearly 55% of overall system resources (57 KB of data at the minimum rate).

I'm looking for an intelligent technique to prevent glitches at the end of the sound buffer. They are caused by the massive DMA use for dynamic screen refresh, which delays the IRQ handler until too long after the DMA1 & 2 transfers end.

DMA0 is used as HDMA to change color 0 each line, nothing else.

I have perfect knowledge of how the GBA hardware works ...


--- Kay

#1452 - Splam - Sun Jan 19, 2003 12:43 am

errr yeah, was just trying to help. Personally I don't have any sound problems when running large dma's + vblank and timer interrupts.

#1498 - Kay - Sun Jan 19, 2003 5:14 pm

Sorry for any inconvenience ... :)

I'm looking for anybody with a brilliant solution to this.

And, as always, this happens only on real hardware, never under an emulator!
All is coded in assembly.

The problem lies in the fact that DMA1 & 2 are used to feed the Sound FIFOs, but because of very long DMA3 transfers (50%+ of system resources) that hang the CPU, the IRQs at the end of the DMA1 & 2 transfers (sound buffer swap) are handled too late, resulting in sound distortion and artefacts.


Thank you all for any help.


-- Kay

#1499 - NEiM0D - Sun Jan 19, 2003 5:17 pm

DMA1/2 has a higher priority than DMA3, so those will be executed first, no matter what.

#1501 - Kay - Sun Jan 19, 2003 5:29 pm

I have no problem with the DMAs themselves, but with the CPU being hung during quasi-permanent DMA transfers.

Playing a continuous sound buffer causes no problem, but when using double or triple buffering, the IRQ at the end of the DMA1 and DMA2 transfers is serviced too late because DMA3 is always hanging the CPU.

But if someone can explain to me how to swap sound buffers without an IRQ, that may help me too ...


Thanx NEiMOD :)


-- Kay

#1503 - NEiM0D - Sun Jan 19, 2003 5:36 pm

Nope, after the timer interrupt you need to switch.
The 'refilling' however can be delayed.

I suggest implementing some other way than using DMA3 all the time, else you've got some serious speed issues.

#1506 - Kay - Sun Jan 19, 2003 6:03 pm

I need DMA3 since the application is time critical. Using LDM & STM isn't fast enough to perform all tasks in 1 VBL, even with tricks like generated code running in IWRAM.

The solution may lie in using another timer, running twice as fast as the buffer playback period, to set the flag for rendering the next buffer ...

I'll try that ... and who cares?


Thanx anyway.



-- Kay

#1519 - tepples - Sun Jan 19, 2003 8:31 pm

Kay wrote:
I need DMA3 since application is time critical. Using LDM & STM isn't fast enough

If your DMA[3] transfers are going to take longer than one sample (about 600 to 1000 cycles), then use DMA[1] instead of an interrupt routine for sound FIFO fills. DMA[0] takes precedence over DMA[1], then DMA[2], then DMA[3], then interrupts.

Typical assignment of tasks to GBA DMA channels:
DMA[0] copying coordinates into BG scroll registers or affine registers
DMA[1] fill FIFO for DSound A
DMA[2] fill FIFO for DSound B
DMA[3] memcpy()
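
For the DMA[1]/DMA[2] FIFO assignment in the list above, a minimal sketch of the control word is shown here. The bit positions follow the commonly documented GBA DMA control layout (GBATEK); the macro and function names are hypothetical and not taken from any post in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bit fields of the 32-bit DMAxCNT register
   (word count in the low half, control bits in the high half). */
#define DMA_DEST_FIXED    (2u << 21)  /* destination stays on the FIFO address */
#define DMA_REPEAT        (1u << 25)  /* re-arm after every request */
#define DMA_32BIT         (1u << 26)  /* 32-bit transfers */
#define DMA_START_SPECIAL (3u << 28)  /* "special" timing: sound FIFO request on DMA1/DMA2 */
#define DMA_ENABLE        (1u << 31)

/* Hypothetical helper: control word for a FIFO-fed Direct Sound channel. */
uint32_t dma_sound_fifo_control(void)
{
    uint32_t cnt = DMA_ENABLE | DMA_START_SPECIAL | DMA_32BIT
                 | DMA_REPEAT | DMA_DEST_FIXED;
    /* The word-count field is left at zero in this sketch. */
    assert((cnt & 0xFFFFu) == 0);
    return cnt;
}
```

The upper half of this value is 0xB640, the figure usually seen in Direct Sound examples. On hardware it would be written to DMA1CNT or DMA2CNT after the source and FIFO destination addresses have been set.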
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#1523 - Kay - Sun Jan 19, 2003 8:51 pm

Hmmmmmmmmmmmm ...

I'm actually not coding an ATARI STF soundtrack player using Timer B.

I don't use an IRQ to fill the FIFOs (even though it's possible, it slows down overall system performance), but to swap between the work and play buffers and to set the VBL mixing flag.
Please read what I posted previously more carefully.

I don't need help with how DMA channels work, but with how to prevent Sound FIFO underrun caused by late IRQ handling.

In any case, I've already found out how to do it by myself, and now it works fine.

Thanx for all answer.


-- Kay

#1530 - tepples - Sun Jan 19, 2003 9:19 pm

Kay wrote:
I don't use an IRQ to fill the FIFOs (even though it's possible, it slows down overall system performance), but to swap between the work and play buffers and to set the VBL mixing flag.

Oops. I had confused it with one of the examples in http://www.belogic.com/gba/'s Direct Sound section.

Quote:
how to prevent Sound FIFO feed underrun because of too late IRQ handling.

You say you already synchronize your program to vblank. My mixer engine makes each of its buffers one vblank long so that it doesn't need to handle an IRQ to switch buffers. To do this, make sure that your buffer length divides evenly into 280896 (the length in cycles of a frame). This does limit your options for mixing frequencies, but a 304 byte buffer provides 18157 Hz mixing, which should sound OK.

Is that how you solved the problem?
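
The buffer-length rule above can be sketched as a small calculation. This is only an illustration: it assumes a 2^24 Hz system clock and the 280896 cycles per frame quoted in the post, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CPU_HZ           16777216u  /* assumed 2^24 Hz system clock */
#define CYCLES_PER_FRAME 280896u    /* frame length quoted in the post */

/* CPU cycles per sample, or 0 if the buffer length does not divide
   the frame evenly (such a buffer would drift against vblank). */
uint32_t cycles_per_sample(uint32_t samples_per_frame)
{
    assert(samples_per_frame > 0);
    if (CYCLES_PER_FRAME % samples_per_frame != 0)
        return 0;
    return CYCLES_PER_FRAME / samples_per_frame;
}

/* Mixing rate in Hz (truncated), or 0 for an unusable buffer length. */
uint32_t mixing_rate_hz(uint32_t samples_per_frame)
{
    uint32_t cps = cycles_per_sample(samples_per_frame);
    return cps ? CPU_HZ / cps : 0;
}
```

With these assumptions, a 304-sample buffer gives 924 cycles per sample and a truncated rate of 18157 Hz, matching the figure in the post, while a length such as 300 is rejected because it does not divide 280896. The cycles-per-sample value is also what the sample-rate timer period would be.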

#1545 - Kay - Mon Jan 20, 2003 12:02 am

Not a bad idea at all!
My documentation shows 279620 cycles per frame, but this isn't important.

At 27360 Hz (samples synced with HBLANK), this gives 456*2=912 bytes to be transferred each VBL from IWRAM to the FIFOs, which takes ((912 / 4) * 2) = 456 DMA cycles per VBL.
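
The arithmetic in this post can be restated as a short sketch. It reads the factor of 2 in "456*2" as two Direct Sound channels with 8-bit samples and takes the 2 cycles per 32-bit word from the post itself; both readings are assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes fed to the FIFOs per frame, assuming one byte per 8-bit sample. */
uint32_t fifo_bytes_per_frame(uint32_t samples_per_frame, uint32_t channels)
{
    return samples_per_frame * channels;
}

/* DMA cycles needed to move `bytes` as 32-bit words, at an assumed
   cost of `cycles_per_word` cycles for each word. */
uint32_t fifo_dma_cycles(uint32_t bytes, uint32_t cycles_per_word)
{
    assert(bytes % 4 == 0);  /* the FIFO is fed in whole 32-bit words */
    return (bytes / 4) * cycles_per_word;
}
```

For 456 samples per frame on two channels this gives 912 bytes and 456 cycles, the same numbers as above.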

This seems to work ...

Thanks a lot.



-- Kay

#1552 - Splam - Mon Jan 20, 2003 1:13 am

SplamSID, my C64 SID chip emulation, uses the vsync method too. I've found it to be the most accurate so far, but as everyone knows, you're limited to certain sampling rates. I don't think that's too much to worry about :) except maybe the PWM. Has anyone tested to see what (if any) the differences are when you're not matching sample rates with the hardware's PWM, or isn't it that much of a problem if you're only running DirectSound?

#1558 - tepples - Mon Jan 20, 2003 4:07 am

Quote:
except maybe the PWM, has anyone tested to see what (if any) the differences are when you're not matching sample rates with the hardwares pwm or isn't it that much of a problem if you're only running DirectSound ?


My game runs DSound plus tone generators. On hardware, it runs PWM at 65536 Hz and DSound at 18157 Hz and 304 samples/vblank. On VisualBoyAdvance, it runs PWM at 44100 Hz and DSound at 18157 Hz. It still sounds fine in both environments for the most part.

#1577 - Kay - Mon Jan 20, 2003 1:34 pm

My hardware doc shows a clock rate of 2^24 Hz, 279620 cycles per VBL and 1226 cycles per scanline.
DMAs work at CLK / 8, meaning that the best-case transfer rate is nearly 138460 bytes per VBL.
That's how I made the DMA timing tables that appear in the CowBite hardware documentation. I've checked those rates, and all seems to work fine. Anyway, nobody has confirmed whether those tables are error free or not ...


-- Kay

#1607 - tepples - Mon Jan 20, 2003 6:31 pm

Kay wrote:
My hardware doc shows a clock rate of 2^24 Hz, 279620 cycles per VBL and 1226 cycles per scanline.

Quoting from the CowBite Virtual Hardware Spec:
Tom Happ wrote:
The GBA has a TFT color LCD that is 240 x 160 pixels in size and has a refresh rate of exactly 280,896 cpu cycles per frame, or around 59.73 hz. Most GBA programs will need to structure themselves around this refresh rate. Each refresh consists of a 160 scanline vertical draw (VDraw) period followed by a 68 scanline blank (VBlank) period. Furthermore, each of these scanlines consists of a 1004 cycle draw period (HDraw) followed by a 228 cycle blank period (HBlank).

Thus you have 1232 cycles per scanline, not 1226. My own tests on a GBA corroborate this.
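
The two sets of figures in this exchange can be checked with a few lines of arithmetic. The sketch below assumes a 2^24 Hz clock and 228 scanlines per frame (160 + 68, from the quoted spec); the function names are hypothetical. The observation that 279620 and 1226 fall out of assuming exactly 60 frames per second is an inference from the numbers, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define SCANLINES_PER_FRAME 228u  /* 160 VDraw + 68 VBlank lines */

/* Frame length in CPU cycles for a given scanline length. */
uint32_t cycles_per_frame(uint32_t cycles_per_scanline)
{
    return SCANLINES_PER_FRAME * cycles_per_scanline;
}

/* Refresh rate in millihertz (truncated) for a given clock and frame length. */
uint32_t refresh_millihertz(uint32_t clock_hz, uint32_t frame_cycles)
{
    assert(frame_cycles > 0);
    return (uint32_t)(((uint64_t)clock_hz * 1000u) / frame_cycles);
}

/* Frame length (truncated) implied by a clock and an assumed integer refresh rate. */
uint32_t frame_cycles_at(uint32_t clock_hz, uint32_t refresh_hz)
{
    assert(refresh_hz > 0);
    return clock_hz / refresh_hz;
}
```

With 1004 + 228 = 1232 cycles per scanline, the frame is 280896 cycles and the refresh rate is about 59.727 Hz. Dividing 2^24 by exactly 60 instead gives 279620 cycles per frame, and 279620 / 228 truncates to 1226 cycles per scanline.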

#1619 - Kay - Mon Jan 20, 2003 7:55 pm

I know what is written in Tom Happ's CowBite hardware documentation, because I'm credited in it for help. :)

We all supposed that the CLK signal is 16.78 MHz because "16.78 MHz" is written on the crystal component. But I'm nearly sure that the crystal is in fact clocked at exactly 2^24 Hz, since one cycle is ~60 ns long.


Quote:
Thus you have 1232 cycles per scanline, not 1226. My own tests on a GBA corroborate this.

I'm curious how you measured this, with only a 6-cycle difference per scanline and no way to stabilise machine timing on the GBA as was possible on older systems (Atari ST/Amiga).

Martin Korth tends to claim the same thing in GBATEK doc under "LCD Dimensions and Timings" chapter:
http://www.work.de/nocash/gbatek.htm
but the timings are estimated on a 16.78 MHz basis!

Last but not least: don't confuse CPU and DMA timings with LCD timings.


This looks like a new challenge: determine the real clock rate of the GBA system! Arf'! :P

Interesting, no?


No more blah blah, I'll code something to measure all that with the best accuracy possible, and post the code with results here.



-- Kay

#1622 - tepples - Mon Jan 20, 2003 8:17 pm

Kay wrote:
I'm curious how you measured [the scanline period], with only a 6-cycle difference per scanline and no way to stabilise machine timing on the GBA as was possible on older systems (Atari ST/Amiga).

Easy. Set up a repeating timer with period 1232 CPU cycles. Change color palette entry 0 at some constant values of the timer. This should create roughly vertical lines between the color areas. If these lines are diagonal, or they drift to the left or to the right over time, then the scanline period isn't 1232 cycles.
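
The reasoning behind this test can be modelled on a host machine. The sketch below is a simplification: it assumes the timer and the first scanline start on the same cycle, a prescaler of 1, and a 16-bit up-counting timer that reloads on overflow; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reload value for a 16-bit up-counting timer that should overflow
   every `period` cycles (prescaler 1). */
uint16_t timer_reload(uint32_t period)
{
    assert(period > 0 && period <= 0x10000u);
    return (uint16_t)(0x10000u - period);
}

/* Cycles from the start of scanline `line` to the next timer overflow.
   A constant result means a vertical edge on screen; a result that
   changes from line to line means a diagonal (drifting) edge. */
uint32_t edge_offset(uint32_t line, uint32_t scanline_cycles,
                     uint32_t timer_cycles)
{
    assert(timer_cycles > 0);
    uint32_t rem = (line * scanline_cycles) % timer_cycles;
    return rem ? timer_cycles - rem : 0;
}
```

If the real scanline period is 1232 cycles, a 1232-cycle timer gives the same offset on every line. If it were 1226 cycles, the offset would grow by 6 cycles per line, which over a couple of hundred lines would be a clearly visible slant.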