#134801 - simonjhall - Tue Jul 17, 2007 12:08 am
Me again ;-)
So I've replaced as many strbs, and all the funny variants of strb, as I could in my code, and removed all the memcpys, memsets, strcpys, strncpys etc etc...but I'm still getting dodgy results.
My problems begin in one function which has no strbs, yet runs fine from normal memory. Here's the guts of the function (I have cut a lot away to replicate this funny condition btw):
Code: |
void Mod_LoadPlanes (lump_t *l)
{
    int i, j;
    mplane_t *out;
    dplane_t *in;
    int count;
    int bits;

    for ( i=0 ; i<count ; i++)
    {
        for (j=0 ; j<3 ; j++)
        {
            out->normal[j] = LittleFloat (in->normal[j]);
        }
    }
} |
Again, it doesn't function correctly, but this code here will still generate weirdness.
The prototype for LittleFloat is
Code: |
extern float (*LittleFloat) (float l); |
ie, it's a function pointer, so it gets called through a bx to a register.
Normal is of type fixed_point, and there's operator overloading which converts from one to the other. The fixed_point type is four bytes in size.
Yet the disassembly looks like this:
Code: |
void Mod_LoadPlanes (lump_t *l)
201bbb4: e92d4070 stmdb sp!, {r4, r5, r6, lr}
201bbb8: e3a06000 mov r6, #0 ; 0x0
201bbbc: e24dd008 sub sp, sp, #8 ; 0x8
201bbc0: ea000013 b 201bc14 <_Z14Mod_LoadPlanesP6lump_t+0x60>
201bbc4: e3a05000 mov r5, #0 ; 0x0
201bbc8: e7950004 ldr r0, [r5, r4]
201bbcc: e59f3054 ldr r3, [pc, #84] ; 201bc28 <.text+0x1b9e8>
201bbd0: e593c000 ldr ip, [r3]
201bbd4: e1a0e00f mov lr, pc
201bbd8: e12fff1c bx ip
201bbdc: e28d4004 add r4, sp, #4 ; 0x4
201bbe0: e1a01000 mov r1, r0
201bbe4: e59f3040 ldr r3, [pc, #64] ; 201bc2c <.text+0x1b9ec>
201bbe8: e1a00004 mov r0, r4
201bbec: e1a0e00f mov lr, pc
201bbf0: e12fff13 bx r3
201bbf4: e0850004 add r0, r5, r4
201bbf8: e1a01004 mov r1, r4
201bbfc: e2855004 add r5, r5, #4 ; 0x4
201bc00: e3a02004 mov r2, #4 ; 0x4
201bc04: eb012e41 bl 2067510 <memcpy> <---- memcpy for four bytes!
201bc08: e355000c cmp r5, #12 ; 0xc
201bc0c: 1affffed bne 201bbc8 <_Z14Mod_LoadPlanesP6lump_t+0x14>
201bc10: e2866001 add r6, r6, #1 ; 0x1
201bc14: e1560004 cmp r6, r4
201bc18: baffffe9 blt 201bbc4 <_Z14Mod_LoadPlanesP6lump_t+0x10>
201bc1c: e28dd008 add sp, sp, #8 ; 0x8
201bc20: e8bd4070 ldmia sp!, {r4, r5, r6, lr}
201bc24: e12fff1e bx lr
201bc28: 020ac70c andeq ip, sl, #3145728 ; 0x300000
201bc2c: 02076114 andeq r6, r7, #5 ; 0x5 |
Using a bit of objdump and nm tells me that 20ac70c is the function pointer LittleFloat, so the ldr at 201bbd0 followed by the bx at 201bbd8 does the call to LittleFloat.
2076114 (the target of the second bx) is the function which promotes float types to my fixed_point class (size 4 bytes). This fixed_point object is then stored in normal[j] (normal is of type fixed_point *).
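To make the sequence described above easier to follow, here is a minimal host-side sketch of the same pattern. The real fixed_point class isn't shown in this post, so the struct below is a hypothetical stand-in (assumed: one 32-bit member in 16.16 format and a converting constructor from float), and `identity`, `load_normal` and `check_load_normal` are made-up names for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the fixed_point class (the real one is not shown).
// Assumed: a single 32-bit member holding a 16.16 value.
struct fixed_point {
    std::int32_t raw;
    fixed_point() : raw(0) {}
    fixed_point(float f) : raw(static_cast<std::int32_t>(f * 65536.0f)) {}
};

// Stand-in for the LittleFloat function pointer; here it returns its argument.
static float identity(float f) { return f; }
float (*LittleFloat)(float) = identity;

// Same shape as the inner loop of Mod_LoadPlanes.
void load_normal(fixed_point *out, const float *in) {
    for (int j = 0; j < 3; j++) {
        // 1. call through the function pointer        (the first bx)
        // 2. build a fixed_point temporary from float  (the second bx)
        // 3. copy the 4-byte temporary into out[j]     (where the memcpy shows up)
        out[j] = LittleFloat(in[j]);
    }
}

// Host-side check that the conversion behaves as assumed.
void check_load_normal() {
    const float in[3] = {1.0f, -0.5f, 2.25f};
    fixed_point out[3];
    load_normal(out, in);
    assert(out[0].raw == 65536);
    assert(out[1].raw == -32768);
    assert(out[2].raw == 147456);
}
```

Reading the disassembly this way, step 3 is the copy of the stack temporary into its destination, which is a plausible place for the compiler to emit a block copy.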
So (if that made any sense)...what's with the four-byte memcpy? This is the code generated with -Os - I don't get the memcpy if I compile it without the option.
I wouldn't normally care about these memcpys too much, but they are breaking my slot-2 shenanigans.
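One thing that might be worth trying is sketched below. It has not been tested against this toolchain, and the struct is again a hypothetical stand-in for fixed_point (assumed 16.16, one 32-bit member). The idea is to write the assignment operators by hand, so that at the source level the copy is a plain 32-bit member store rather than an implicit copy of the whole object. Whether that changes what -Os generates would need checking in the disassembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for fixed_point (assumed 16.16, single 32-bit member).
struct fixed_point {
    std::int32_t raw;

    fixed_point() : raw(0) {}
    fixed_point(const fixed_point &other) : raw(other.raw) {}
    fixed_point(float f) : raw(static_cast<std::int32_t>(f * 65536.0f)) {}

    // Hand-written copy assignment: a scalar member store instead of
    // the implicitly generated whole-object copy.
    fixed_point &operator=(const fixed_point &other) {
        raw = other.raw;
        return *this;
    }

    // Assign straight from a float, so no temporary has to be copied at all.
    fixed_point &operator=(float f) {
        raw = static_cast<std::int32_t>(f * 65536.0f);
        return *this;
    }
};

// Host-side check of both assignment paths.
void check_assign() {
    fixed_point a, b;
    a = 1.5f;   // picks operator=(float)
    b = a;      // picks operator=(const fixed_point &)
    assert(a.raw == 98304);
    assert(b.raw == 98304);
}
```

With `operator=(float)` in place, a line like `out->normal[j] = LittleFloat (in->normal[j]);` would select the float overload directly instead of constructing a temporary first.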
So to reiterate:
- how do I get rid of four-byte memcpys? The whole point of me doing the floating/fixed point stuff was to make it fast - extra code ain't gonna help
- if I can't get rid of the memcpy, how can I tell it to use a different (ie my) memcpy? I could replace the first instruction of memcpy with a branch to my memcpy, but that's a bit hacky.
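For the second point, one less hacky route than patching memcpy's first instruction is GNU ld's `--wrap` option: linking with `-Wl,--wrap=memcpy` resolves undefined references to `memcpy` to `__wrap_memcpy`, and makes the original reachable as `__real_memcpy`. Below is a minimal sketch of such a wrapper. `copy16` is a hypothetical helper, and it assumes a little-endian target where only byte stores are the problem (16-bit stores and reads of the destination are taken to be fine).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical replacement copy: writes dst using 16-bit stores only.
// Assumes little-endian and that reading back from dst is safe.
static void *copy16(void *dst, const void *src, std::size_t n) {
    std::uintptr_t d = reinterpret_cast<std::uintptr_t>(dst);
    const std::uint8_t *s = static_cast<const std::uint8_t *>(src);
    while (n > 0) {
        // Halfword containing the current destination byte.
        volatile std::uint16_t *h = reinterpret_cast<volatile std::uint16_t *>(
            d & ~static_cast<std::uintptr_t>(1));
        if ((d & 1) == 0 && n >= 2) {
            // Aligned and at least two bytes left: one full halfword store.
            *h = static_cast<std::uint16_t>(s[0] | (s[1] << 8));
            d += 2; s += 2; n -= 2;
        } else {
            // Single byte: read-modify-write the halfword around it.
            std::uint16_t v = *h;
            if (d & 1)
                v = static_cast<std::uint16_t>((v & 0x00FF) | (s[0] << 8));
            else
                v = static_cast<std::uint16_t>((v & 0xFF00) | s[0]);
            *h = v;
            d += 1; s += 1; n -= 1;
        }
    }
    return dst;
}

// Name required by ld's --wrap=memcpy; calls to memcpy land here.
extern "C" void *__wrap_memcpy(void *dst, const void *src, std::size_t n) {
    return copy16(dst, src, n);
}

// Host-side check (little-endian host assumed).
void check_wrap() {
    const std::uint8_t src[4] = {1, 2, 3, 4};
    std::uint16_t dst[2] = {0xFFFF, 0xFFFF};
    __wrap_memcpy(dst, src, 4);
    assert(dst[0] == 0x0201);
    assert(dst[1] == 0x0403);
}
```

Note that `--wrap` only redirects references the linker has to resolve, so a call made from inside the object file that defines memcpy itself would not go through the wrapper.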
Ta all.
PS: if there are mistakes, it's cos I'm tired!
_________________
Big thanks to everyone who donated for Quake2