#7928 - Lupin - Sat Jun 28, 2003 7:37 pm
At the moment I'm just using a simple cmp instruction with rsblt to negate the value, but I figured out that an OR/ADD operation would also do what I want. I think this would get the absolute value of an s32:
(NUM | 0x80000000) + 0x7FFFFFFF
Which way would you use, and which would take fewer cycles?
#7935 - DekuTree64 - Sat Jun 28, 2003 10:31 pm
I'd use the cmp/rsblt. cmp and rsb take one cycle each, so you can't really beat that. Also, 0x7fffffff can't be created from an 8-bit value plus a shift, so you'd have to load it with mvn r, #0x80000000, which would make it a total of 3 cycles.
#7954 - Lupin - Sun Jun 29, 2003 12:45 pm
Thanks! Hm, I figured out that the method above would still need a cmp instruction though :)
#8027 - beelzebub - Mon Jun 30, 2003 11:39 pm
Another way of doing abs() without a cmp/branch is as follows (>> is an arithmetic shift here)...
r1 = r0 >> 31
r0 ^= r1
r0 -= r1
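For readers who want to try the three steps outside of assembly, here is a minimal C sketch. The function name `abs_branchless` is made up for illustration, and the code assumes that `>>` on a negative `int32_t` is an arithmetic shift (implementation-defined in C, though GCC and Clang behave this way on ARM). It does not handle INT32_MIN, whose absolute value does not fit in an s32.

```c
#include <stdint.h>

/* Branchless absolute value, following the three steps in the post.
   Assumption: >> on a negative int32_t is an arithmetic shift. */
int32_t abs_branchless(int32_t x)
{
    int32_t mask = x >> 31;   /* 0 for x >= 0, -1 (all ones) for x < 0 */
    x ^= mask;                /* flips every bit when x is negative */
    x -= mask;                /* subtracting -1 adds 1, completing the negation */
    return x;
}
```

For a non-negative input the mask is 0, so both the XOR and the subtraction leave the value unchanged.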
#8071 - Lupin - Tue Jul 01, 2003 4:15 pm
but this seems to be slow as hell :(
#8075 - DekuTree64 - Tue Jul 01, 2003 5:31 pm
Actually it should be the same speed. Just use
Code: |
eor r1, r0, r0, ASR #31
sub r0, r1, r0, ASR #31
|
I don't think it's possible to do in one instruction, so take your pick^^ Beelzebub's may be faster in THUMB though.
Code: |
asr r1, r0, #31
eor r0, r1
sub r0, r1
bx lr
|
as opposed to
Code: |
cmp r0, #0
bgt positive
neg r0, r0
positive:
bx lr
|
which is also 3 instructions, but the branch is taken when the number is already positive, and a taken branch costs a few extra cycles to refill the pipeline.
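The compare-and-negate approach can be written in C roughly as below. This is only a sketch of the same logic as the cmp/neg sequence; `abs_compare` is a hypothetical name, and which instructions a compiler actually emits for it (a branch, a conditional rsb, or something else) depends on the compiler and target. Like the assembly, it does not handle INT32_MIN.

```c
#include <stdint.h>

/* Compare-and-negate absolute value, the C counterpart of cmp + neg.
   The negation of INT32_MIN does not fit in an int32_t, so that one
   input is not supported. */
int32_t abs_compare(int32_t x)
{
    if (x < 0) {
        x = -x;   /* same effect as a reverse subtract from zero */
    }
    return x;
}
```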
#8485 - Archeious - Fri Jul 11, 2003 9:47 pm
Couldn't you just or off the signed bit.
Value = Value and 0x80000000
Or am I missing something big here?
Archeious
#8487 - tepples - Fri Jul 11, 2003 10:44 pm
Archeious wrote: |
Couldn't you just or off the signed bit.
Value = Value and 0x80000000 |
No.
You probably think that the system works like this: 0x00000000 = 0, 0x00000001 = 1, 0x00000040 = 64, 0x80000001 = -1, 0x80000040 = -64, etc. That's the "sign bit" (sign-magnitude) representation, which no modern architecture uses for integers.
The most common representation of integers is two's complement. A negative number -n is represented as 0x1 0000 0000 minus n. Thus, 0xffffffff = -1, and 0xffffffc0 = -64.
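A small C sketch can make the difference concrete. The helper names `bits_of` and `clear_sign_bit` are invented for this example, and the mask 0x7FFFFFFF is an assumption about what "taking off the sign bit" was meant to do.

```c
#include <stdint.h>

/* Bit pattern of a signed 32-bit value, as an unsigned word.
   Conversion to uint32_t is defined modulo 2^32, which is exactly
   the two's complement encoding described above. */
uint32_t bits_of(int32_t x)
{
    return (uint32_t)x;
}

/* The sign-bit idea from the question: clear bit 31 and keep the rest. */
uint32_t clear_sign_bit(int32_t x)
{
    return bits_of(x) & 0x7FFFFFFFu;
}
```

Here `bits_of(-64)` is 0xFFFFFFC0, so clearing bit 31 leaves 0x7FFFFFC0 (2147483584) rather than 64. The mask only gives the right answer for values that were already non-negative.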
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#8494 - Archeious - Sat Jul 12, 2003 12:57 am
I guess that's what happens when you never get under the hood.
#8749 - Maddox - Sat Jul 19, 2003 2:15 am
It is.
_________________
You probably suck. I hope you're is not a game programmer.
#8752 - sgstair - Sat Jul 19, 2003 3:39 am
Here's another way to do an abs in ARM, which has the virtue of not using an extra register.
Code: |
mov r0,r0
rsbmi r0,r0,#0
|
For THUMB, the code quoted earlier is probably the best way:
Code: |
asr r1, r0, #31
eor r0, r1
sub r0, r1
|
The ARM implementation will take 2 cycles to execute from IWRAM regardless of the value of r0; the THUMB version will take 3 cycles from IWRAM.
Assuming, however, that we are running from the cartridge with 3 nonsequential wait states and 1 sequential, it will take 8 cycles for the ARM version and 6 cycles for the THUMB version (if the cart is set to the default waitstate of 4-2, it will be 12 for ARM and 9 for THUMB).
-Stephen
#8760 - Dev - Sat Jul 19, 2003 6:03 am
A few corrections to the ARM side of things:
You need to use MOVS, not MOV, otherwise the status bits won't be set, and the RSBMI won't execute properly.
Obviously, if the status bits were set by a previous instruction that operated on R0, you don't even need the MOVS.
Also, it'll be 10 cycles from ROM at 3/1 (8 cycles for THUMB), but could be lower if prefetched.
Dev.
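One way to arrive at the 10 and 8 figures is the rough model sketched below. It is not from the thread: it assumes a 16-bit ROM bus (so each ARM instruction needs two halfword fetches), that only the first fetch is nonsequential, and that the prefetch buffer is off. The function name `rom_fetch_cycles` and its parameters are hypothetical.

```c
/* Rough fetch-cycle estimate for straight-line code run from GBA ROM.
   Assumptions (not from the thread): 16-bit ROM bus, first halfword
   fetch nonsequential, all later fetches sequential, no prefetch. */
int rom_fetch_cycles(int instructions, int bytes_per_instruction,
                     int nonseq_waits, int seq_waits)
{
    int halfwords = instructions * bytes_per_instruction / 2;
    if (halfwords <= 0) {
        return 0;
    }
    /* One nonsequential access, then (halfwords - 1) sequential ones;
       each access costs 1 cycle plus its wait states. */
    return (1 + nonseq_waits) + (halfwords - 1) * (1 + seq_waits);
}
```

Under these assumptions, `rom_fetch_cycles(2, 4, 3, 1)` models the two ARM instructions and `rom_fetch_cycles(3, 2, 3, 1)` the three THUMB instructions at 3/1 wait states.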
#8770 - sgstair - Sat Jul 19, 2003 1:46 pm
Yeah, sorry :)
-Stephen
#8776 - Lupin - Sat Jul 19, 2003 4:09 pm
Wouldn't a cmp or tst 0x8000 be faster than movs r0, r0?
#8780 - sgstair - Sat Jul 19, 2003 6:05 pm
Lupin wrote: |
Wouldn't a cmp or tst 0x8000 be faster than movs r0, r0? |
How so? They both take 1S cycle.
-Stephen