#20611 - CyberSlag5k - Thu May 13, 2004 6:25 pm
How often should one use DMA? Should it be used only on special occasions, when you want to move a lot of data really quickly? Or should I try to use it as much as possible?
#20613 - sajiimori - Thu May 13, 2004 6:37 pm
Use DMA for any copy of significant size (except to SRAM). It seems to me that even very small copies (in the double digits) would be faster with DMA, because the overhead is so small.
I just heard today that DMA is slower for fills (which was news to me), but you probably read that too.
The DMAx_SAD register should have been used as an immediate value for fills. Seems kinda silly.
#20620 - poslundc - Thu May 13, 2004 8:15 pm
sajiimori wrote: |
I just heard today that DMA is slower for fills (which was news to me), but you probably read that too. |
DekuTree64 does a little analysis and comparison of the various options in this thread.
Quote: |
The DMAx_SAD register should have been used as an immediate value for fills. Seems kinda silly. |
The real kicker is that the CPUFastSet BIOS routines also behave the same way the DMA does in this regard. I was cursing a blue streak for hours trying to figure out why my palette-clear routine wasn't working until I realized it was because I was passing a literal zero to the routine, rather than the address of a variable that held zero. :P
Dan.
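To illustrate the pitfall described above, here is a minimal host-side sketch in C. It only models the "fixed source" behaviour (the fill value is read through a pointer); it is not the BIOS or DMA implementation, and the names model_fixed_source_fill and clear_words are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model of a fixed-source 32-bit fill: the routine is handed the
 * ADDRESS of the fill value and re-reads that one word for every store.
 * This mimics the behaviour described in the thread; it is not BIOS code. */
static void model_fixed_source_fill(const uint32_t *src, uint32_t *dst, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        dst[i] = *src;   /* the source pointer never advances */
    }
}

/* Hypothetical clear helper: the zero lives in a variable so that its
 * address can be passed. Passing a literal 0 instead would be taken as
 * "fetch the fill value from address 0". */
static void clear_words(uint32_t *dst, size_t words)
{
    const uint32_t zero = 0;
    model_fixed_source_fill(&zero, dst, words);
}
```

The point of the sketch is the signature: the fill takes a pointer to the value, so the value has to be stored somewhere addressable first.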
#20625 - Miked0801 - Thu May 13, 2004 10:18 pm
For very small memsets (32 bytes or less), struct-to-struct copies are going to be faster, even in Thumb. That's because there are at least 70 cycles of overhead just in the entry and exit code of the BIOS routines, not even counting setting up the registers for the copy. You also need to set 5 I/O registers for a DMA (with care this can be done as a load multiple (ldm), but C compilers won't do this without serious hand-holding), so that's at least 15 cycles for the loads, plus whatever else it takes to get the info ready. So here's an example:
Code: |
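// 8 words x 4 bytes = 32 bytes, matching the struct's name
typedef struct _foo
{
    u32 word1;
    u32 word2;
    u32 word3;
    u32 word4;
    u32 word5;
    u32 word6;
    u32 word7;
    u32 word8;
} STRUCT_32_BYTES;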
// Somewhere in code:
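void contrivedCopy(u32 *src, u32 *dest, u32 size)
{
    // size is unused here: the struct type fixes how many bytes are copied
    STRUCT_32_BYTES *src32, *dest32;
    src32 = (STRUCT_32_BYTES *) src;
    dest32 = (STRUCT_32_BYTES *) dest;
    // Assign through the struct pointers so the whole struct is copied,
    // not just a single u32
    *dest32 = *src32;
}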
// Or
*((STRUCT_32_BYTES *) dest) = *((STRUCT_32_BYTES *) src);
|
In Thumb, this will compile into 3 ldm/stm pairs; in ARM, it turns into a single ldm/stm pair for maximum efficiency. Yes, it does eat some registers, but for the most part the spilling won't hurt performance as badly as the overhead of getting to the BIOS or DMA.
Mike
Last edited by Miked0801 on Wed May 19, 2004 7:00 pm; edited 1 time in total
#20643 - NoMis - Fri May 14, 2004 8:08 am
I didn't know that the cycle overhead for BIOS functions is this high. What was the purpose of making them, then? With that overhead, I can't really see a good use for most of the BIOS functions.
NoMis
#20676 - Miked0801 - Fri May 14, 2004 6:08 pm
Neither do I, which is why I almost never use them :)
We use them for stuff where we don't care about speed or early in development when we didn't have our own routines ready.
#20900 - LOst? - Wed May 19, 2004 2:30 pm
Code: |
*((STRUCT_32_BYTES *) dest) = *((STRUCT_32_BYTES *) src);
|
Mike, I know you are really good at programming, but is that possible? Copying data with a =?
It is not possible for me to copy structs by doing = without making an operator= (slow C++ way).
#20903 - Cearn - Wed May 19, 2004 3:23 pm
LOst? wrote: |
Code: |
*((STRUCT_32_BYTES *) dest) = *((STRUCT_32_BYTES *) src);
|
Mike, I know you are really good at programming, but is that possible? Copying data with a =?
It is not possible for me to copy structs by doing = without making an operator= (slow C++ way). |
Yes, it works. Though maybe it helps to see it without all the pointers
Code: |
typedef struct tagFOO
{
int pa, pb, pc, pd;
} FOO;
FOO src= { 0xda, 0x9ba, 0x15, 0x1337 }, dest;
dest= src;
// contents of dest are now the same as src
|
This kind of stuff is actually why all C books tell you to use pointers to structs as function arguments instead of structs: in the latter case you'll just create a full copy.
I had actually been using this for my hand-written tiles for a while, until I saw the off-putting generated assembly and switched to DMA. But now it seems I'm going to put it back in again. (I'll probably also have to rethink the use of software interrupts ... 70 cycles overhead, you gotta be kidding me)
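For anyone who wants to check both points on a PC compiler, here is a small sketch in plain C. The names Block32, copy_block32 and bump_first_by_value are made up for illustration, and the pointers passed to copy_block32 are assumed to be 4-byte aligned.

```c
#include <stdint.h>

typedef struct {
    uint32_t w[8];   /* 8 x 4 bytes = 32 bytes */
} Block32;

/* Copies one 32-byte block with plain struct assignment.
 * Both pointers are assumed to be word aligned. */
static void copy_block32(const void *src, void *dest)
{
    *(Block32 *)dest = *(const Block32 *)src;
}

/* Takes the struct BY VALUE: the callee gets its own full copy,
 * so the caller's struct is left untouched. */
static uint32_t bump_first_by_value(Block32 b)
{
    b.w[0] += 1;
    return b.w[0];
}
```

The first function is the struct-assignment copy from earlier in the thread; the second shows why passing a struct by value costs a full copy, which is what a pointer argument avoids.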
#20908 - sajiimori - Wed May 19, 2004 6:11 pm
Yeah, I was wondering why Mike mentioned BIOS at all. I thought everybody used macros or static inlines for DMA.
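As a rough idea of what such static inlines could look like, here is a sketch. The register layout, the control bit values and the channel 3 address in the comment are assumptions taken from commonly available GBA documentation, not from this thread, and the names DmaChannel, dma_copy32 and dma_fill32 are hypothetical. The channel is passed as a pointer so the same code can be pointed at a dummy struct when building on a PC.

```c
#include <stdint.h>

/* One DMA channel's registers: source address, destination address,
 * control. Layout is an assumption, see the note above. */
typedef struct {
    volatile uint32_t sad;
    volatile uint32_t dad;
    volatile uint32_t cnt;
} DmaChannel;

/* Assumed control bits (not from the thread). */
#define DMA_ENABLE    0x80000000u
#define DMA_32BIT     0x04000000u
#define DMA_SRC_FIXED 0x01000000u

/* On hardware, channel 3 is assumed to live at 0x040000D4:
 *   #define DMA3 ((DmaChannel *)0x040000D4)                  */

/* Copy `words` 32-bit words; the caller keeps words within 1..0xFFFF. */
static inline void dma_copy32(DmaChannel *ch, const void *src, void *dst,
                              uint32_t words)
{
    ch->sad = (uint32_t)(uintptr_t)src;
    ch->dad = (uint32_t)(uintptr_t)dst;
    /* Control is written last: setting the enable bit starts the transfer. */
    ch->cnt = DMA_ENABLE | DMA_32BIT | words;
}

/* Fill: the source is the ADDRESS of the value, held fixed by the channel. */
static inline void dma_fill32(DmaChannel *ch, const volatile uint32_t *value,
                              void *dst, uint32_t words)
{
    ch->sad = (uint32_t)(uintptr_t)value;
    ch->dad = (uint32_t)(uintptr_t)dst;
    ch->cnt = DMA_ENABLE | DMA_32BIT | DMA_SRC_FIXED | words;
}
```

Because these are static inlines, the compiler can reduce each call to three stores, which is the appeal over going through a BIOS call.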
#20912 - poslundc - Wed May 19, 2004 7:45 pm
sajiimori wrote: |
I thought everybody used macros or static inlines for DMA. |
Not me!
(Never quite seen the point, frankly.)
Dan.