#147905 - Bloodypriest - Sun Dec 30, 2007 5:47 am
Yesterday, I had an odd problem with my sprites. They came out as random garbage. I couldn't figure it out at first, but in the end I realized that it was the dmaCopy that was going awry.
The only reason I could think of was "memory alignment". So I changed my original code from:
Code: |
dmaCopy(spr_tmp,&SPRITE_GFX[_id*16],size);
free(spr_tmp);
|
to:
Code: |
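if ((unsigned int)(spr_tmp) & 3) {
    // Source is not 4-byte aligned: copy with the CPU instead of DMA.
    MemCopy32CPU(spr_tmp,&SPRITE_GFX[_id*16],size); // from moonshell's memtool.cpp
}
else {
    DC_FlushAll(); // write the data cache back to RAM so the DMA reads current data
    dmaCopy(spr_tmp,&SPRITE_GFX[_id*16],size);
}
free(spr_tmp);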
|
and it finally worked. But I'm wondering if that alignment check could have been averted if I had used memalign(4,buffer_size) instead of malloc(buffer_size) to allocate the buffer I was copying from.
Can anybody tell me?
#147907 - sajiimori - Sun Dec 30, 2007 6:23 am
The flush is probably what fixed it, not the alignment check. After all, a 32 bit copy is exactly what doesn't work with unaligned addresses.
Besides, no sane implementation of malloc is going to return an unaligned address.
Also, you only need to flush the source and destination regions, not the whole data cache. In fact, you can just invalidate the destination region, since writebacks will be overwritten by the DMA anyway.
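A minimal sketch of the range flush suggested above, assuming libnds's `DC_FlushRange()` and `dmaCopy()` and a build that defines `ARM9` as the devkitPro templates do. The helper name `upload_with_range_flush` is hypothetical, and the `#else` branch only provides host stand-ins so the snippet compiles off-target. Only the source is flushed here, on the assumption that the destination is VRAM and is not cached.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef ARM9
#include <nds.h>   /* real DC_FlushRange() and dmaCopy() from libnds */
#else
/* Host stand-ins (hypothetical, only so the sketch compiles off-target). */
static void DC_FlushRange(const void *base, uint32_t size)
{
    (void)base;
    (void)size;    /* no data cache to maintain on the host */
}
static void dmaCopy(const void *src, void *dst, uint32_t size)
{
    memcpy(dst, src, size);
}
#endif

/* Hypothetical helper: write back only the source range, then DMA it. */
static void upload_with_range_flush(const void *src, void *dst, uint32_t size)
{
    DC_FlushRange(src, size);  /* push cached source bytes out to main RAM */
    dmaCopy(src, dst, size);   /* the DMA now reads up-to-date data */
}
```

In the original snippet this would stand in for the `DC_FlushAll(); dmaCopy(...)` pair, e.g. `upload_with_range_flush(spr_tmp, &SPRITE_GFX[_id*16], size);`.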
#147919 - Bloodypriest - Sun Dec 30, 2007 4:04 pm
Quote: |
The flush is probably what fixed it, not the alignment check. After all, a 32 bit copy is exactly what doesn't work with unaligned addresses.
|
But MemCopy32CPU should work in any circumstances since it's basically just a for loop (I'm just too lazy to write it in full since I'm also using Moonshell's text system which relies on the functions in memtool.cpp).
Quote: |
Besides, no sane implementation of malloc is going to return an unaligned address.
|
Can someone confirm whether the malloc in DevKitPro returns aligned addresses? And if so, on what boundary?
#147921 - simonjhall - Sun Dec 30, 2007 4:36 pm
Always comes back aligned for me, and it looks like 32-bit alignment.
Can anyone verify this?
#147940 - sajiimori - Sun Dec 30, 2007 8:22 pm
Bloodypriest, 'for' loops aren't immune to alignment requirements. All 32-bit reads and writes must be aligned or you'll get weird results.
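To illustrate the point, here is a rough sketch of a copy routine that only takes the 32-bit path when both pointers and the size are multiples of 4, and otherwise falls back to a byte loop. The names `is_word_aligned` and `copy_checked` are made up for this example and are not from memtool.cpp. One caveat: the byte fallback suits ordinary RAM only, since DS VRAM does not accept 8-bit writes.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when p sits on a 4-byte boundary. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Hypothetical helper: copies size bytes from src to dst.
 * Returns 1 if the 32-bit path was used, 0 if it fell back to bytes. */
static int copy_checked(const void *src, void *dst, size_t size)
{
    if (is_word_aligned(src) && is_word_aligned(dst) && (size & 3u) == 0) {
        const uint32_t *s = src;
        uint32_t *d = dst;
        for (size_t i = 0; i < size / 4; i++) {
            d[i] = s[i];       /* every access here is word-aligned */
        }
        return 1;
    }

    const unsigned char *s = src;
    unsigned char *d = dst;
    for (size_t i = 0; i < size; i++) {
        d[i] = s[i];           /* byte accesses have no alignment rule */
    }
    return 0;
}
```

The return value is only there so a caller can see which path ran; a real routine could just return void.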
#147959 - chishm - Mon Dec 31, 2007 4:04 am
simonjhall wrote: |
Always comes back aligned for me, and it looks like 32-bit alignment.
Can anyone verify this?
|
As long as the implementation of malloc obeys the standard, it is guaranteed to return a pointer that is "suitably aligned so that it may be assigned to a pointer to any type of object". On the ARM CPUs used in the DS, that means it must be aligned to at least 32 bits in order to fulfil this requirement.
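For anyone who wants to check this on their own toolchain, a small sketch along these lines should do. The helper name `malloc_is_aligned` is made up for the example; it allocates a block, tests the address against the requested boundary, and frees it again.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if malloc(size) came back on the given
 * boundary, 0 if it did not, and -1 if the allocation failed. */
static int malloc_is_aligned(size_t size, size_t alignment)
{
    void *p = malloc(size);
    if (p == NULL) {
        return -1;
    }
    int ok = ((uintptr_t)p % alignment) == 0;
    free(p);
    return ok;
}
```

Calling `malloc_is_aligned(buffer_size, 4)` for a few different sizes gives a quick sanity check, though a passing result is only an observation about those particular allocations, not a guarantee.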
#147962 - Bloodypriest - Mon Dec 31, 2007 4:36 am
@sajiimori:
Didn't know that. Thanks for the tip.
@chishm:
Thanks for the clarification. I guess this means that we can safely use malloc for all our memory allocations.
#148019 - sajiimori - Mon Dec 31, 2007 8:12 pm
After reading simonjhall's thread about FIFO DMAs, I guess I should correct myself on two things: Flushing a range is not always faster than flushing the whole cache (which was news to me), and invalidating the cache is dangerous for ranges that aren't aligned to 32 bytes (a cache line).
I admit, I've never used cache invalidation in real life, mostly because I use DMA for the geometry FIFO and little else.
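As a rough sketch of the cache-line caveat, assuming the 32-byte line size mentioned above: the two helpers below (names are hypothetical) test whether a buffer starts and ends on line boundaries, and round a size up to a whole number of lines so an allocation can be padded accordingly.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* data cache line size in bytes, per the post above */

/* Nonzero when the range starts on a line boundary and covers whole lines,
 * so invalidating it cannot discard neighbouring data. */
static int range_is_line_aligned(const void *base, size_t size)
{
    return ((uintptr_t)base % CACHE_LINE) == 0 && (size % CACHE_LINE) == 0;
}

/* Rounds size up to the next multiple of the cache line size. */
static size_t round_up_to_line(size_t size)
{
    return (size + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
}
```

One way to use this would be to allocate DMA buffers with `memalign(32, round_up_to_line(size))`, so that no cache line is shared with unrelated data.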