gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies.

C/C++ > Weird loop error

#177508 - brave_orakio - Mon Jul 16, 2012 8:21 pm

Hi guys.
I'm getting a seemingly random copy error when I implement a copy loop.
I'm not sure if I brain farted here, but isn't this
Code:

void copy(const u16* src, u16* dest, int cnt)
{
  int i;
  // walk from the last element down to the first
  for(i = cnt - 1; i >= 0; i--)
  {
     dest[i] = src[i];
  }
}


the same as this:
Code:

void copy(const u16* src, u16* dest, int cnt)
{
  int i;
  // walk from the first element up to the last
  for(i = 0; i < cnt; i++)
  {
     dest[i] = src[i];
  }
}

You might be asking, "well orakio, why would you be using such a slow copy anyway?"
Well, the original thing I was going for is
Code:

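void copy(const u16* src, u16* dest, int cnt)
{
  int i;
  // unrolled part: copy 5 halfwords per iteration, from the end downwards
  for(i = cnt - 1; i >= 4; i -= 5)
  {
     dest[i] = src[i];
     dest[i - 1] = src[i - 1];
     dest[i - 2] = src[i - 2];
     dest[i - 3] = src[i - 3];
     dest[i - 4] = src[i - 4];
  }
  // tail: the remaining 0-4 halfwords at the start of the buffer
  for(; i >= 0; i--)
  {
     dest[i] = src[i];
  }
}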

The incrementing version (which works as expected):
Code:

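void copy(const u16* src, u16* dest, int cnt)
{
  int i;
  // unrolled part: copy 5 halfwords per iteration, from the start upwards
  // (i + 5 < cnt leaves the final full group to the tail loop when cnt is a multiple of 5)
  for(i = 0; i + 5 < cnt; i += 5)
  {
     dest[i] = src[i];
     dest[i + 1] = src[i + 1];
     dest[i + 2] = src[i + 2];
     dest[i + 3] = src[i + 3];
     dest[i + 4] = src[i + 4];
  }
  // tail: whatever is left at the end of the buffer
  for(; i < cnt; i++)
  {
     dest[i] = src[i];
  }
}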

is much slower, thanks to the i + 5 < cnt comparison.
_________________
help me
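
As a side note on the `i + 5 < cnt` test above, one way the extra addition per iteration might be avoided is to compute the loop bound once before the loop. The following is only a sketch: the name `copy_up_unrolled` and the `limit` variable are made up for illustration, and `u16` is assumed to be a 16-bit unsigned typedef as in the thread. Whether it is actually faster on the GBA depends on the compiler and on where the code runs, so it would need measuring.

```c
#include <stdint.h>

typedef uint16_t u16; /* assumed to match the u16 typedef used in the thread */

/* Hypothetical variant of the incrementing copy with the bound hoisted. */
void copy_up_unrolled(const u16 *src, u16 *dest, int cnt)
{
    int i;
    int limit = cnt - 4; /* "i < limit" is the same test as "i + 5 <= cnt" */

    /* unrolled part: 5 halfwords per iteration */
    for (i = 0; i < limit; i += 5) {
        dest[i]     = src[i];
        dest[i + 1] = src[i + 1];
        dest[i + 2] = src[i + 2];
        dest[i + 3] = src[i + 3];
        dest[i + 4] = src[i + 4];
    }
    /* tail: the remaining 0-4 halfwords */
    for (; i < cnt; i++) {
        dest[i] = src[i];
    }
}
```

Because the test is `i + 5 <= cnt` rather than `i + 5 < cnt`, a count that is an exact multiple of 5 is handled entirely by the unrolled part.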

#177509 - elhobbs - Tue Jul 17, 2012 1:17 am

I will assume that you are copying to and from main memory and are not sharing this memory between the ARM9 and ARM7. The most likely issue is that the memory is not properly aligned for 16-bit writes. A 16-bit write must target an address that is evenly divisible by 2, and a 32-bit write one that is divisible by 4.
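
The alignment rule above can be checked with a small helper. This is a minimal sketch; `is_aligned` is a made-up name, not something from the thread or from any GBA library.

```c
#include <stdint.h>

/* Returns 1 when ptr is a multiple of align, 0 otherwise.
   align must be a power of two: 2 for 16-bit accesses, 4 for 32-bit ones. */
int is_aligned(const void *ptr, uintptr_t align)
{
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

For example, `is_aligned(dest, 2)` could be asserted in a debug build before a halfword copy like the ones in the first post.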

#177510 - brave_orakio - Tue Jul 17, 2012 4:05 am

Sorry, I wasn't clear on that part. It's actually ROM to VRAM, as part of an LZSS decompression process: this is the step where the actual data is copied to VRAM. The compressed data as a whole is 16-bit aligned, though.

Originally I used a 16-bit DMA copy, which works fine; I was just looking for an alternative copy method that doesn't use DMA.

No problems with sharing between processors, because this is just for the GBA.

But as an aside, those loops I mentioned should technically be the same, right? I'm not sure if I missed something in the decrementing version, and I need someone else's eyes to spot my mistake.

edit: One more thing to add is that the output image isn't completely messed up. Rather, a small region near the bottom of the sprite is off: a few pixels appear where there shouldn't be any.

#177511 - headspin - Tue Jul 17, 2012 7:20 am

Why not use the GBA's built-in BIOS decompression routines for LZ77? Here is an example.
_________________
Warhawk DS | Manic Miner: The Lost Levels | The Detective Game

#177512 - brave_orakio - Tue Jul 17, 2012 9:05 am

Well, mine already works, like I said in my previous post. It's just that weird 16-bit decrementing loop problem. If I use DMA (which is actually faster), it works right as rain. The incrementing loop also works fine, only a heck of a lot slower. I was mainly experimenting with loop copying as an alternative to DMA, trying to make it as fast as possible, when I ran into the strange result. And by DMA I mean: decompress the current data, then copy it to VRAM using DMA, probably around 4-10 halfwords on average per DMA call.

But in light of that, how fast is the GBA's built-in decompression? I built mine primarily because I wanted it to be as fast as possible. If my computations are correct, my decompression rate is around 800k/s, computed from 512 bytes per 10k cycles. The decrementing loop version manages 512 bytes per 15k cycles, and the incrementing loop 512 bytes per 21k cycles.
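
The cycle counts above can be turned into bytes per second as follows. This is a sketch that assumes the counts were taken at the GBA's full CPU clock of 2^24 Hz (about 16.78 MHz); `bytes_per_second` and `GBA_CPU_HZ` are names made up for this example.

```c
#include <stdint.h>

#define GBA_CPU_HZ 16777216u /* 2^24 Hz, about 16.78 MHz */

/* Bytes per second when `bytes` bytes are processed in `cycles` CPU cycles.
   The multiplication is done in 64 bits because 512 * 2^24 overflows 32 bits. */
uint32_t bytes_per_second(uint32_t bytes, uint32_t cycles)
{
    return (uint32_t)(((uint64_t)bytes * GBA_CPU_HZ) / cycles);
}
```

With the figures from the post, 512 bytes in 10k, 15k and 21k cycles come out at roughly 859,000, 573,000 and 409,000 bytes per second, which is consistent with the "around 800k/s" estimate for the fastest version.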

#177513 - headspin - Tue Jul 17, 2012 11:32 am

Check the link; it has the speeds there. Using the VRAM-safe method, you can decompress directly to VRAM.

#177515 - brave_orakio - Wed Jul 18, 2012 1:50 am

Ah yes, 300k-500k per second. Just about on par with my own non-DMA version.

Admittedly, though, my LZSS format isn't compatible with the GBA's built-in decompression, as I modified it to be as fast as possible: it avoids all the bit displacing, so that a single AND is all that is needed to get the number of compressed copies and a single 4-bit shift gets the displacement. My 8-bit compressed format is compatible, but not for the direct-to-VRAM copy, I think.
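
To illustrate the "single AND plus single 4-bit shift" idea, here is a sketch of how such a token could be decoded. The actual bit layout is not given in the thread, so the layout below (count in bits 0-3, displacement in bits 4-15 of a 16-bit token) is purely an assumption, as are the names `lz_token` and `decode_token`.

```c
#include <stdint.h>

typedef struct {
    unsigned count;        /* number of units to copy */
    unsigned displacement; /* distance back into the already-decoded output */
} lz_token;

/* Hypothetical layout: bits 0-3 hold the count, bits 4-15 the displacement. */
lz_token decode_token(uint16_t raw)
{
    lz_token t;
    t.count = raw & 0x000Fu;   /* one AND for the copy count */
    t.displacement = raw >> 4; /* one 4-bit shift for the displacement */
    return t;
}
```

Under this assumed layout, the token `0x1234` decodes to a count of 4 and a displacement of `0x123`.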

#177516 - brave_orakio - Thu Jul 19, 2012 1:25 am

Did some more debugging recently. One more symptom: the problem only shows up when I'm copying from VRAM to VRAM. When I copy from the offset VRAM to the target VRAM in a decrementing loop, the problem is present. But for the part where data is copied from ROM to VRAM, the decrementing loop has no problem.
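
One possible explanation for this symptom, which the thread itself does not confirm: if the VRAM-to-VRAM copy is the LZSS back-reference step and the displacement can be smaller than the copy length, then source and destination overlap, and the two loop directions are no longer equivalent. The sketch below shows the effect on a plain array; `expand_up` and `expand_down` are made-up names, and `u16` is assumed to be a 16-bit unsigned typedef.

```c
#include <stdint.h>

typedef uint16_t u16; /* assumed to match the u16 typedef used in the thread */

/* Expand a back-reference front to back: later reads see earlier writes,
   so a short pattern is repeated when disp < len. */
void expand_up(u16 *out, int pos, int disp, int len)
{
    int i;
    for (i = 0; i < len; i++)
        out[pos + i] = out[pos - disp + i];
}

/* Same expansion back to front: when disp < len, the first reads hit
   slots that have not been written yet and pick up whatever was there. */
void expand_down(u16 *out, int pos, int disp, int len)
{
    int i;
    for (i = len - 1; i >= 0; i--)
        out[pos + i] = out[pos - disp + i];
}
```

Starting from `{1, 2, 0, 0, 0, 0}` with `pos = 2`, `disp = 2`, `len = 4`, the incrementing version produces `{1, 2, 1, 2, 1, 2}`, while the decrementing version produces `{1, 2, 1, 2, 0, 0}`: the end of the run keeps the old buffer contents. When `disp >= len` the ranges do not overlap and both directions give the same result, which would also fit the ROM-to-VRAM copy being unaffected.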