gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

Beginners > Bit shifting to clear bits

#44357 - Ultima2876 - Tue May 31, 2005 11:37 pm

Code:
collmask_target << shiftamount;
collmask_target_temp = (collmask_target << (64 - shiftamount)) >> (64 - shiftamount);


This'll be a short one, but it's been bothering me for hours because it's stopping me putting the code in iwram (and it needs it, otherwise the thing runs too slow). Is there any trick to achieve the same effect as that second line of code (which chops off some dodgy, unwanted bits) - or at least can someone help me to get it to compile into iwram? I think it's the length of the instructions or something, but I've tried splitting it up into separate lines, and although I can get the code to compile, I can't get it to do the same thing as the original line.

Thanks in advance.

#44360 - strager - Tue May 31, 2005 11:50 pm

Use a complemented AND:
Code:

filtered = original & ~filter;

Example:
original = 01010110
filter   = 00010011
filtered = original & ~filter;
filtered = 01010110 & ~00010011;
filtered = 01010110 &  11101100;
  01010110
& 11101100
----------
  01000100
filtered = 01000100


filtered is the result.
original is the original data to be filtered.
filter is the bits you want removed.
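
For reference, the same complemented AND as a small C helper. This is only a sketch: the name clear_bits and the 8-bit width are assumptions chosen to match the example values, not anything from the original code.

```c
#include <stdint.h>

/* Clear every bit of `value` that is set in `filter`.
   Bits that are 0 in `filter` pass through unchanged.
   (clear_bits is a hypothetical helper name.) */
uint8_t clear_bits(uint8_t value, uint8_t filter)
{
    /* ~filter is promoted to int, so cast the result back to 8 bits. */
    return (uint8_t)(value & ~filter);
}
```

With the values from the example, clear_bits(0x56, 0x13) gives 0x44, which is 01000100 in binary.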

Clear enough?

#44367 - DekuTree64 - Wed Jun 01, 2005 12:42 am

Hmm, is collmask_target a 64-bit int? Those are known to do squirrely things sometimes, since they're not natively supported.
Also, for speed critical code, it's generally best to work in 32-bit ints and 'emulate' 64-bit yourself, so you know exactly what's being done.

As for the actual problem, it looks like you're wanting to clear all the bits at and above bit shiftamount, keeping only the low shiftamount bits? That may be the fastest way there, but you could also do it by AND masking, like
Code:
collmask_target_temp = collmask_target & (((u64)1 << shiftamount) - (u64)1);
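
A minimal C sketch of the two approaches side by side, assuming the goal is to keep the low shiftamount bits of an unsigned 64-bit value. The function names are made up for illustration. The guards are there because shifting a 64-bit value by 64 is undefined in C, so the edge cases 0 and 64 need handling by hand.

```c
#include <stdint.h>

/* Keep the low n bits of x; clear bit n and everything above it.
   (keep_low_bits is a hypothetical name.) */
uint64_t keep_low_bits(uint64_t x, unsigned n)
{
    if (n >= 64)
        return x;                        /* nothing to clear; avoids 1 << 64 */
    return x & (((uint64_t)1 << n) - 1); /* n == 0 gives mask 0 */
}

/* The shift pair from the first post, guarded so it never shifts by 64. */
uint64_t keep_low_bits_shift(uint64_t x, unsigned n)
{
    if (n == 0)
        return 0;                        /* would otherwise shift by 64 */
    if (n >= 64)
        return x;
    return (x << (64 - n)) >> (64 - n);  /* logical shifts: x is unsigned */
}
```

Both functions should agree for every n from 0 to 64, so either can be used to cross-check the other.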

Let's see what could be done in assembly, for 64 bits...
First method, doing it by shifting:
Code:
rTarget_lo = collmask_target, bottom 32 bits
rTarget_hi = collmask_target, top 32 bits
rShift = shiftamount

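rsb   rShift, rShift, #64                  // rShift = 64 - shiftamount
mov   rTarget_hi, rTarget_hi, lsl rShift   // a register shift of 32 or more leaves 0
mov   rTarget_hi, rTarget_hi, lsr rShift
subs  rShift, rShift, #32                  // rShift = 32 - shiftamount
movgt rTarget_lo, rTarget_lo, lsl rShift   // lo only needs trimming when shiftamount < 32
movgt rTarget_lo, rTarget_lo, lsr rShift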

10 cycles (2 for each shift by register, 1 for each sub).
Using masking:
Code:
Starting regs same as before.

mvn   r3, #0
bic   rTarget_lo, rTarget_lo, r3, lsl rShift   // lo &= ~(0xffffffff << shiftamount)
rsb   rShift, rShift, #64
and   rTarget_hi, rTarget_hi, r3, lsr rShift   // hi &= (0xffffffff >> (64 - shiftamount))

6 cycles, but uses a temporary register. Worth the trade, I think, but I doubt a compiler would be smart enough to generate code like that.
Heck, I'm not even sure it would work myself :)
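
One way to check the masking idea without a GBA at hand is to model it in C. The sketch below is an assumption-laden model, not the real routine: the helper names are invented, and lsl_reg/lsr_reg imitate the ARM rule that a shift by register of 32 or more produces 0 (plain C leaves such shifts on 32-bit values undefined).

```c
#include <stdint.h>

/* Model ARM shifts by register: amounts of 32 or more yield 0. */
uint32_t lsl_reg(uint32_t x, unsigned n) { return n < 32 ? x << n : 0; }
uint32_t lsr_reg(uint32_t x, unsigned n) { return n < 32 ? x >> n : 0; }

/* Keep the low n bits (n in 0..64) of the 64-bit value split across
   *lo (bottom 32 bits) and *hi (top 32 bits). */
void keep_low_bits_2x32(uint32_t *lo, uint32_t *hi, unsigned n)
{
    uint32_t ones = 0xFFFFFFFFu;       /* mvn r3, #0 */
    *lo &= ~lsl_reg(ones, n);          /* bic lo, lo, r3, lsl rShift */
    *hi &= lsr_reg(ones, 64 - n);      /* rsb, then and hi, hi, r3, lsr rShift */
}
```

Under that model the two masks cover every shift amount from 0 to 64: below 32 the high word is zeroed and the low word trimmed, and from 32 up the low word is left alone and only the high word is trimmed.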

I'm guessing this is for pixel-perfect collision detection? I'd suggest doing the inner loop in assembly to begin with. ARM assembly is so nice that sometimes it's easier to write directly than to try to convince a compiler to generate fast code for you.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku

#44374 - Ultima2876 - Wed Jun 01, 2005 1:06 am

It turns out I'm stupid - long longs are signed by default, which was the cause of my shifts producing the horrible F's. I made them unsigned, and now I don't need to do any fixing up at all, I merely shift and use the value straight away.
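
For anyone hitting the same thing, a small sketch of the difference (function names are invented for illustration). Right-shifting a negative signed value is implementation-defined in C; GCC performs an arithmetic shift that copies the sign bit into the top, which is where the F's come from.

```c
#include <stdint.h>

/* Unsigned right shift: zeros always come in at the top. */
uint64_t shr_unsigned(uint64_t x, unsigned n) { return x >> n; }

/* Signed right shift: for negative x the result is implementation-defined;
   GCC sign-extends, filling the top bits with 1s. */
int64_t shr_signed(int64_t x, unsigned n) { return x >> n; }
```

So shr_unsigned(0x8000000000000000, 4) gives 0x0800000000000000, while the signed version of the same bit pattern typically comes back with the top bits set.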

Yep, it is for pixel perfect collisions using bitmasks. Putting that code in iwram was a final optimisation - I now have it running at ~86 FPS (and this is worst case - I've forced it to create all the objects I'll ever really need). However, if I find that I need to optimise further, I know to take a look at those long longs.

Thanks for the info DekuTree, and I appreciate that you looked into it so much. Sorry for wasting your time like that x.x

Anyhow, I'm fairly sure it'll come in useful later on, these things usually do for me. Same to you strager, that AND stuff looks like it wouldn't be out of place in a dozen other bits of my game's code.

Thanks again, both of you.