gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

DS development > Reducing dswifi latency when using FIFO ARM to ARM sync

#105650 - masscat - Tue Oct 10, 2006 4:21 pm

When playing around with my gdb debugger stub I discovered that there was often a ~50ms gap between packets being sent out.
After some investigation I determined that the ARM9 to ARM7 fifo "not empty" interrupt was not getting generated. This happens because the ARM9 places a second sync message into the fifo before the ARM7 has read the first one. Since the ARM7 code only reads one value from the fifo on an interrupt, the fifo never becomes empty and therefore no more interrupts get generated.

To overcome this, change the fifo interrupt handler, on both the ARM7 and ARM9 for completeness, to something like the following:
Code:
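void fifo_handler() { // check incoming fifo messages
  int syncd = 0; // set once Wifi_Sync() has run during this interrupt
  // drain the fifo completely so that the next write raises a new interrupt
  while ( !(REG_IPC_FIFO_CR & IPC_FIFO_RECV_EMPTY)) {
    u32 value = REG_IPC_FIFO_RX;
    // several sync messages may be queued; one Wifi_Sync() call covers them all
    if ( value == 0x87654321 && !syncd) {
      syncd = 1;
      Wifi_Sync();
    }
  }
}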

#105657 - OOPMan - Tue Oct 10, 2006 5:31 pm

Nice catch masscat :-)
_________________
"My boot, your face..." - Attributed to OOPMan, Emperor of Eroticon VI

You can find my NDS homebrew projects here...

#105666 - Lick - Tue Oct 10, 2006 7:08 pm

Could you post the original function as well? Thanks and nice catch!

- Lick
_________________
http://licklick.wordpress.com

#105667 - masscat - Tue Oct 10, 2006 7:28 pm

The original fifo handler from the ARM7 template.c file from the dswifi example (the ARM9 is similar):
Code:
// interrupt handler to allow incoming notifications from arm9
void arm7_fifo() { // check incoming fifo messages
   u32 msg = REG_IPC_FIFO_RX;
   if(msg==0x87654321) Wifi_Sync();
}

This is installed as the handler for the IRQ_FIFO_NOT_EMPTY interrupt.

#105669 - Lick - Tue Oct 10, 2006 7:55 pm

Someone told me that the IRQ puts any other incoming data on hold (and thus, each message 'calls' the IRQ handler), so arm7_fifo is indeed called twice.
How did you check that it isn't?

#105674 - masscat - Tue Oct 10, 2006 9:19 pm

Lick wrote:
Someone told me that the IRQ puts any other incoming data on hold (and thus, each message 'calls' the IRQ handler), so arm7_fifo is indeed called twice.
How did you check that it isn't?

To be honest I just assumed; it fitted the behaviour and the operation of hardware fifos I have seen.
But I have just done a quick test: you only get the NOT_EMPTY interrupt when the fifo goes from empty to not empty. You will not get the interrupt for additional writes until the fifo has been drained. So if you get an interrupt you must drain the FIFO to ensure you will get another one.

The test was:
The ARM7 writes to its FIFO on vblank.

The ARM9 records how many times its NOT_EMPTY handler is called.
Pressing B drains the FIFO.
Pressing X reads one fifo entry and prints how many times the handler has been called.

Results:
Pressing X repeatedly - the number of handler calls remains constant.
Pressing B n times followed by X - the number of handler calls increases by n.
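
The experiment above can be modelled on a PC with a small C sketch. It is not DS code: SimFifo, sim_fifo_write and the other names are made up for illustration, the 16-word depth is an assumption taken from the usual hardware documentation, and the only behaviour modelled is an interrupt that fires on the empty to not-empty transition.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_CAPACITY 16 /* assumed depth of the IPC FIFO, in words */

typedef struct {
    uint32_t words[FIFO_CAPACITY];
    int head;          /* index of the oldest queued word */
    int count;         /* number of queued words */
    int handler_calls; /* times the simulated NOT_EMPTY handler ran */
} SimFifo;

/* Model a write: the interrupt fires only when the FIFO was empty. */
static int sim_fifo_write(SimFifo *f, uint32_t value) {
    if (f->count == FIFO_CAPACITY) return 0; /* full: the word is dropped */
    if (f->count == 0) f->handler_calls++;   /* empty -> not empty edge */
    f->words[(f->head + f->count) % FIFO_CAPACITY] = value;
    f->count++;
    return 1;
}

/* Model the X button: read a single entry, if there is one. */
static int sim_fifo_read_one(SimFifo *f, uint32_t *out) {
    if (f->count == 0) return 0;
    *out = f->words[f->head];
    f->head = (f->head + 1) % FIFO_CAPACITY;
    f->count--;
    return 1;
}

/* Model the B button: read until the FIFO is empty. */
static void sim_fifo_drain(SimFifo *f) {
    uint32_t ignored;
    while (sim_fifo_read_one(f, &ignored)) { }
}

/* Model a few vblanks on the sending CPU, one write per vblank. */
static void sim_vblanks(SimFifo *f, int n) {
    for (int i = 0; i < n; i++) sim_fifo_write(f, 0x87654321u);
}

/* Handler call count after pressing X the given number of times. */
static int calls_after_x_presses(int presses) {
    SimFifo f = {0};
    sim_vblanks(&f, 4);
    for (int i = 0; i < presses; i++) {
        uint32_t value;
        sim_fifo_read_one(&f, &value);
        sim_vblanks(&f, 4);
    }
    assert(f.count <= FIFO_CAPACITY);
    return f.handler_calls;
}

/* Handler call count after pressing B the given number of times. */
static int calls_after_b_presses(int presses) {
    SimFifo f = {0};
    sim_vblanks(&f, 4);
    for (int i = 0; i < presses; i++) {
        sim_fifo_drain(&f);
        sim_vblanks(&f, 4);
    }
    return f.handler_calls;
}
```

Under this model calls_after_x_presses(5) stays at 1 while calls_after_b_presses(3) reaches 4, which matches the results reported above.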

#105676 - Lick - Tue Oct 10, 2006 9:29 pm

Could it not have to do with this fifo fact? I dunno. Hmm..

#105680 - masscat - Tue Oct 10, 2006 9:46 pm

The fifo does indeed become full (I cannot tap the X button quickly enough).
But if that was the problem then by pressing X you would see the number of handler calls increase. The fifo read allows a write to happen which, if an interrupt is generated per fifo write, would cause the handler call.

But no increase in handler calls means the interrupt does not occur per write.

#105681 - sgstair - Tue Oct 10, 2006 9:59 pm

that's probably not the problem you're making it out to be - please note that wifi_update (which should be called in the vblank handler, every 16.6ms) does the same sort of thing as wifi_sync as far as sending unsent packets out. a 50ms delay is more likely on the arm9 side (where there's a 25/50ms timer)

-Stephen
_________________
http://blog.akkit.org/ - http://www.akkit.org/dswifi/

#105682 - masscat - Tue Oct 10, 2006 10:08 pm

Fortunately I did the fix on both ARMs so the problem was solved, but from those timings, as you say, it would have been the ARM7 to ARM9 FIFO.

#105683 - sgstair - Tue Oct 10, 2006 10:21 pm

Quite the opposite - I was stating that because of the timings it couldn't possibly have been the FIFO; it is very likely the TCP resend or delayed send (Nagle)

-Stephen

#105700 - DekuTree64 - Wed Oct 11, 2006 7:47 am

Ok, this got me curious about the exact behavior of the FIFO, so I ran a few tests of my own. Here is what I found out:

1. FIFO Not Empty interrupt is generated ONLY when transitioning from empty to not empty. If your handler doesn't loop until the FIFO is empty again, then the rest of the data will just sit there and the interrupt will never be generated again. In other words, masscat's original post is correct.

2. Using my interrupt code, it takes about 160 cycles (33MHz) from the time of ARM9 writing a word to the FIFO, until ARM7 has executed its interrupt handler and read that value (ARM9 just does a while(!fifoempty) loop after sending the word). A bit slower than I would expect actually, but not too bad.

3. Depending on how the FIFO is used, there may be an extremely rare potential for missed not-empty interrupts using the libnds interrupt dispatcher.

If you always write a word and then spin until the other processor replies, it should be fine. But if you just write and move on, potentially writing more before the other processor has finished with the first, it could be bad.

The problem is that the dispatcher calls the user handler function (which in this case will empty the FIFO), and then clears the bit for that interrupt in REG_IF after that user handler returns. But there are a few sneaky cycles between the FIFO being for sure empty, and the IF bit being cleared. Consider this sequence of events:

ARM9 writes word to FIFO.
ARM7 fires interrupt, reads FIFO, sees that it is empty, breaks from its loop, and...
ARM9 writes another word to the FIFO.
ARM7 (still in interrupt code, with interrupts blocked) clears the bit in IF.

Bam, data in ARM7's receive buffer, but the IF bit has been cleared before it was processed.

Sooo, the solution is to clear the IF bit BEFORE calling the user handler. That way when ARM9 sends the second word, the IF bit will be left on in the end and trigger another not-empty interrupt as soon as the first one returns. But that will involve fiddling with assembly code in the libnds source.
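
The sequence of events above can be replayed in a few lines of host-side C. This is only a model under stated assumptions: SimState, AckOrder and the function names are invented for the illustration, and the second write is placed by hand at the unlucky moment between the drain and the dispatcher's return.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int fifo_count;  /* words waiting in the ARM7 receive FIFO */
    bool if_pending; /* the FIFO not-empty bit in REG_IF */
} SimState;

typedef enum { ACK_AFTER_HANDLER, ACK_BEFORE_HANDLER } AckOrder;

/* ARM9 writes a word; the IF bit is set on the empty -> not-empty edge. */
static void arm9_write(SimState *s) {
    if (s->fifo_count == 0) s->if_pending = true;
    s->fifo_count++;
}

/* The user handler loops until the FIFO is empty. */
static void user_handler_drain(SimState *s) {
    s->fifo_count = 0;
}

/* Returns true if a word ends up in the FIFO with no interrupt pending. */
static bool word_is_stranded(AckOrder order) {
    SimState s = {0, false};

    arm9_write(&s);        /* first word raises the interrupt */
    assert(s.if_pending);

    if (order == ACK_BEFORE_HANDLER) s.if_pending = false;
    user_handler_drain(&s);

    arm9_write(&s);        /* the race: arrives just after the drain */

    if (order == ACK_AFTER_HANDLER) s.if_pending = false;

    return s.fifo_count > 0 && !s.if_pending;
}
```

With ACK_AFTER_HANDLER the function returns true (the second word is stranded); with ACK_BEFORE_HANDLER it returns false, because the IF bit set by the second write survives and would raise a new interrupt.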
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku

#105702 - sgstair - Wed Oct 11, 2006 8:08 am

AFAIK, the devkitpro interrupt handler does indeed clear IF before calling the user interrupt handler (IE is disabled for that bit, and multiple interrupts are also enabled)

I don't think it's generally a problem - and there are many other paths that lead to reading the next entry in the fifo
Both the wifi_update call (which should be called every vblank) and the wifi's "transmit complete" interrupt check the incoming wifi data fifo for more information to transmit. 2 consecutive fifo messages not triggering the second interrupt will hardly be a problem.

(ok, I do now see the possibility that the interrupts will cease entirely if a dual-fifo interrupt event is missed; so it is probably a good idea to clear out the fifo every vblank or so if it isn't empty. - this point I will accept and I should probably correct this in my next version of the lib examples)
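
One possible reading of the vblank safety net suggested above is a single drain routine shared by the FIFO interrupt handler and the vblank handler. The sketch below runs on a PC, not on the DS: fifo_recv_empty(), fifo_recv(), wifi_sync_stub() and sim_push() are hypothetical stand-ins. On hardware they would correspond to the REG_IPC_FIFO_CR & IPC_FIFO_RECV_EMPTY test, a read of REG_IPC_FIFO_RX and Wifi_Sync().

```c
#include <assert.h>
#include <stdint.h>

#define SYNC_MESSAGE 0x87654321u
#define SIM_DEPTH 16 /* assumed FIFO depth for the model */

static uint32_t fifo_words[SIM_DEPTH];
static int fifo_head;
static int fifo_count;
static int sync_calls; /* counts calls to the Wifi_Sync() stand-in */

/* Test helper: queue a word as the other CPU would. */
static void sim_push(uint32_t value) {
    assert(fifo_count < SIM_DEPTH);
    fifo_words[(fifo_head + fifo_count) % SIM_DEPTH] = value;
    fifo_count++;
}

/* Stand-in for testing the receive-empty flag. */
static int fifo_recv_empty(void) {
    return fifo_count == 0;
}

/* Stand-in for reading the receive register. */
static uint32_t fifo_recv(void) {
    assert(fifo_count > 0);
    uint32_t value = fifo_words[fifo_head];
    fifo_head = (fifo_head + 1) % SIM_DEPTH;
    fifo_count--;
    return value;
}

/* Stand-in for Wifi_Sync(). */
static void wifi_sync_stub(void) {
    sync_calls++;
}

/* Read everything that is queued, syncing at most once per call. */
static void drain_fifo(void) {
    int synced = 0;
    while (!fifo_recv_empty()) {
        if (fifo_recv() == SYNC_MESSAGE && !synced) {
            synced = 1;
            wifi_sync_stub();
        }
    }
}

/* Normal path: runs when the not-empty interrupt fires. */
static void fifo_irq_handler(void) {
    drain_fifo();
}

/* Safety net: runs every vblank, even if an interrupt was missed. */
static void vblank_handler(void) {
    drain_fifo();
    /* the real handler would also call Wifi_Update() here */
}
```

If two sync messages are queued and the interrupt is missed, the next vblank_handler() call empties the FIFO and syncs once, so the stall lasts at most one frame in this model.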

-Stephen

#105703 - Lick - Wed Oct 11, 2006 8:47 am

So the interrupt is running asynch with the FIFO hardware that calls the interrupt? Ooo..

#105710 - masscat - Wed Oct 11, 2006 12:04 pm

After further investigation:
The ARM9 to ARM7 fifo alone can go full which leads to ~16ms inter packet timings. In this case the ARM9 to ARM7 packet transfer is being performed by the call to Wifi_Update() in the Vblank handler.
Both the ARM9 to ARM7 and ARM7 to ARM9 fifo can go full which leads to ~50ms inter packet timings. The ARM9 to ARM7 packet transfer is being performed by the call to Wifi_Update() in the Vblank handler. The ARM7 to ARM9 packet transfer is being performed by the call to Wifi_Timer() from the 50ms timer handler.
Since the ARM sync mechanism is external to dswifi, without the drain the FIFOs never recover and the latency remains.

#105711 - wintermute - Wed Oct 11, 2006 12:23 pm

sgstair wrote:
AFAIK, the devkitpro interrupt handler does indeed clear IF before calling the user interrupt handler (IE is disabled for that bit, and multiple interrupts are also enabled)


Actually it was modified quite some time ago to clear the IF bit on return from the user handler. IIRC you told me that the wifi interrupts would lose data if IF was cleared prior to reading data.

I did some experiments with FIFO transfer a couple of months ago which indicated that a FIFO irq is only generated on transition from an empty queue, and the queue must be emptied before subsequent interrupts are generated.

I was using a FIFO command system in the OPL emulator I was trying to get working on the DS. The FIFO code looked like this.

Code:

//---------------------------------------------------------------------------------
// callback to allow wifi library to notify arm9
//---------------------------------------------------------------------------------
void arm7_synctoarm9() { // send fifo message
//---------------------------------------------------------------------------------
   REG_IPC_FIFO_TX = 0x87654321;
}

enum {
   CMD_WAIT,
   WIFI_INIT,
   MUS_INIT
};
   
u32 fifo_status = CMD_WAIT;

//---------------------------------------------------------------------------------
// interrupt handler to allow incoming notifications from arm9
//---------------------------------------------------------------------------------
void arm7_fifo() { // check incoming fifo messages
//---------------------------------------------------------------------------------
   while ( !(REG_IPC_FIFO_CR & (IPC_FIFO_RECV_EMPTY)) ) {
      u32 msg = REG_IPC_FIFO_RX;

      switch (fifo_status) {
      case WIFI_INIT:
         Wifi_Init(msg);
         Wifi_SetSyncHandler(arm7_synctoarm9); // allow wifi lib to notify arm9
         fifo_status = CMD_WAIT;
         break;
      case MUS_INIT:
         MusRegisterFile((unsigned char*)msg);
         fifo_status = CMD_WAIT;
         break;

      case CMD_WAIT:
         if(msg==0x87654321) Wifi_Sync();

         if(msg==0x12345678) {
            REG_IME = 1;   // allow other interrupts
            fifo_status = WIFI_INIT;
         }

         if( msg >> 24 == 0x88) {
            switch ((msg >> 16)& 0xff) {
            case 0:
               fifo_status = MUS_INIT;
               break;
            case 1:
               MusPlay(msg & 0xff);
               break;
            case 2:
               MusStop();
               break;
            }
         }
      }
   }
}

_________________
devkitPro - professional toolchains at amateur prices
devkitPro IRC support
Personal Blog

#105752 - masscat - Wed Oct 11, 2006 6:45 pm

I have been looking at the libnds interrupt dispatcher and I feel it has some issues:

  1. As DekuTree64 said, in order to avoid missing interrupts and therefore not handling them, the acknowledgement (write to REG_IF) must be before the hand off to the user installed handler (or left up to the user handler).

  2. At the moment the dispatcher saves the value of REG_IE, disables the interrupt being handled, calls the user handler and restores the original REG_IE value on the user handler's return. This means that any interrupts that have been enabled or disabled during the user handler are lost.
    Possible solutions could be:

    1. Leave REG_IE alone: nested interrupt handling is left to the user handler. All current user code would need rewriting. A bit more complexity for the user but gives them full control. The acknowledgement of the interrupt would also be handled by user handler. I prefer this approach as I like to have control over the behaviour.
    2. Disable the interrupt being handled and then only restore this interrupt on the user handler's return: the interrupt source currently being serviced cannot be disabled but there is no change to current user code.
    3. Disable the interrupt being handled and then only restore this interrupt based on the return value from the user handler: all current user code will require changing (the compiler can warn that the function passed to irqSet is not compatible). Allows similar control to solution 1 without the need for handling acknowledgement and nested interrupts.

  3. Last and by all means least, when reading from and writing to REG_IME it is done as a 32bit value. Not really a problem as there is nothing at 0x0400020A but I thought I would mention it.
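
Possible solution 3 from the list above, combined with acknowledging before the handler runs, could look roughly like the following. This is a hypothetical host-side sketch, not the libnds dispatcher: sim_ie and sim_if are plain variables standing in for REG_IE and REG_IF, and IrqHandler, dispatch() and the SIM_IRQ_* bits are invented names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_IRQ_A (1u << 0) /* hypothetical interrupt source bits */
#define SIM_IRQ_B (1u << 1)

static uint32_t sim_ie; /* stands in for REG_IE */
static uint32_t sim_if; /* stands in for REG_IF */

/* Hypothetical handler type: return true to re-enable the source. */
typedef bool (*IrqHandler)(void);

static void dispatch(uint32_t mask, IrqHandler handler) {
    assert(mask != 0);

    /* Acknowledge first, so a condition raised during the handler
       sets the bit again. (The real REG_IF is cleared by writing 1.) */
    sim_if &= ~mask;

    /* Block re-entry from this source only. */
    sim_ie &= ~mask;

    bool reenable = handler();

    /* Touch only this source's bit; any other changes the handler
       made to the enable register are preserved. */
    if (reenable) sim_ie |= mask;
}

/* Example handler that enables another source and keeps itself on. */
static bool handler_enables_b(void) {
    sim_ie |= SIM_IRQ_B;
    return true;
}

/* Example handler that asks to stay disabled. */
static bool handler_disables_self(void) {
    return false;
}
```

Because the dispatcher never restores a saved copy of the enable register, a handler that enables SIM_IRQ_B keeps that change, and a handler that returns false leaves its own source disabled.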

#105759 - DekuTree64 - Wed Oct 11, 2006 7:21 pm

masscat wrote:
At the moment the dispatcher saves the value of REG_IE, disables the interrupt being handled, calls the user handler and restores the original REG_IE value on the user handler's return.

Maybe leave IE as-is, and IME disabled, but still do the CPSR fiddling. Then the user handler can reenable interrupts if it wants to, but by default there will be no nesting.

Another option (somewhat of a tangent here) would be to add a global u32, with each bit saying whether that interrupt should enable nesting. That way for fast non-interruptible things like HBlank, the dispatcher could just jump straight to the user handler without touching CPSR. Doesn't really solve the interrupt-disables-itself problem, but does give the user control over when nesting is allowed without having to mess with IE and IME in user handlers.

#105766 - wintermute - Wed Oct 11, 2006 8:02 pm

masscat wrote:
I have been looking at the libnds interrupt dispatcher and I feel it has some issues:

  1. As DekuTree64 said, in order to avoid missing interrupts and therefore not handling them, the acknowledgement (write to REG_IF) must be before the hand off to the user installed handler (or left up to the user handler).


Untrue.

Clearing the irq before handoff to the user handler will have zero effect on whether or not interrupts are missed. If an interrupt of the same kind occurs during the user handler then there is a problem elsewhere, possibly in the time taken by the user handler.

Quote:
At the moment the dispatcher saves the value of REG_IE, disables the interrupt being handled, calls the user handler and restores the original REG_IE value on the user handler's return. This means that any interrupts that have been enabled or disabled during the user handler are lost.

REG_IE should generally not be manipulated during interrupt code. If you need this kind of complexity you should be writing your own dispatcher. Personally, I can't think of many situations where this might be necessary.

Quote:
Last and by all means least, when reading from and writing to REG_IME it is done as a 32bit value. Not really a problem as there is nothing at 0x0400020A but I thought I would mention it.

Perfectly safe and done this way to simplify code.

I'm not saying the current dispatcher is perfect but it has been quite well tested and was provided to simplify irq handling as much as possible for the end user.

#105776 - masscat - Wed Oct 11, 2006 8:58 pm

wintermute wrote:
masscat wrote:
As DekuTree64 said, in order to avoid missing interrupts and therefore not handling them, the acknowledgement (write to REG_IF) must be before the hand off to the user installed handler (or left up to the user handler).


Untrue.

Clearing the irq before handoff to the user handler will have zero effect on whether or not interrupts are missed. If an interrupt of the same kind occurs during the user handler then there is a problem elsewhere, possibly in the time taken by the user handler.

Between the interrupt being raised and the bit in REG_IF being cleared, zero or more interrupt conditions may occur on the source hardware, none of which will cause an interrupt on the ARM. This is not what I mean by a missed interrupt, and the user interrupt handler should be written to handle all present causes of the interrupt to cover this.

The problem is that if the REG_IF bit is cleared after the user interrupt handler then an interrupt condition may occur between the code of the handler that scans the hardware for causes of interrupts and the REG_IF write. This interrupt condition will be missed and the user handler has no method to detect and handle it.

If the REG_IF bit clear is before the user interrupt handler call then any occurrence of an interrupt condition on the hardware after the REG_IF write will cause the corresponding REG_IF bit to go high and an ARM interrupt to happen when the interrupt is enabled again. Therefore the user handler has the opportunity to handle all interrupt conditions raised by the hardware. The only downside of this approach is that an interrupt may be raised for a condition that has been cleared during the previous handler call.

DekuTree64 illustrated the case for the inter ARM FIFO but interrupts from any source could be missed.

#105803 - sgstair - Wed Oct 11, 2006 11:47 pm

Wintermute: if I did say so I was mistaken; IF should indeed be cleared before handling the interrupt or else there is a window in which an interrupt could occur and not be handled (after the interrupt handler is "complete" and before IF is cleared) - and there is quite nothing the user code can do about it.

-Stephen