#19589 - isildur - Thu Apr 22, 2004 2:52 pm
I'm fairly new to ARM assembly, and I read somewhere that it's better to avoid branching as much as possible because branches are slow. I was wondering which of these two versions of the same code (taken from Dooby's lib) is faster.
This is the original, no branching:
Code:
    tst    r0, #0x1               @ halfword aligned?
    @ not aligned
    ldrneh r1, [r0, #-1]!         @ grab short holding byte we're writing
    bicne  r1, r1, #0xff00        @ blank pixel we're writing
    orrne  r1, r1, r2, lsl #8     @ insert pixel we're writing
    strneh r1, [r0], #2           @ write & leave aligned to next halfword
    @ aligned
    ldreqh r1, [r0]               @ grab halfword
    biceq  r1, r1, #0x00ff        @ clear left pixel
    orreq  r1, r1, r2, lsr #24    @ insert left pixel
    streqh r1, [r0]               @ plot
    bx     r14
And this modification with just one branch and no conditions in the instructions:
Code:
    tst    r0, #0x1               @ halfword aligned?
    beq    L_Aligned
    @ not aligned
    ldrh   r1, [r0, #-1]!         @ grab short holding byte we're writing
    bic    r1, r1, #0xff00        @ blank pixel we're writing
    orr    r1, r1, r2, lsl #8     @ insert pixel we're writing
    strh   r1, [r0], #2           @ write & leave aligned to next halfword
    bx     r14
    @ aligned
L_Aligned:
    ldrh   r1, [r0]               @ grab halfword
    bic    r1, r1, #0x00ff        @ clear left pixel
    orr    r1, r1, r2, lsr #24    @ insert left pixel
    strh   r1, [r0]               @ plot
    bx     r14
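For reference, here is roughly what I understand both versions to do, written as a C sketch. This is only my reading of the assembly: the names plot_byte, vram, offset and colour are made up for illustration, and I'm assuming r2 holds the colour byte repeated across the word (which would explain why one path uses lsl #8 and the other lsr #24).

```c
#include <stdint.h>

/* Hypothetical C model of the routine (not from Dooby's lib).
 * vram   : memory that is only read and written as halfwords
 * offset : byte offset of the pixel (plays the role of r0)
 * colour : 32-bit word in r2, assumed to hold the colour byte replicated
 */
static void plot_byte(uint16_t *vram, uint32_t offset, uint32_t colour)
{
    uint16_t *p = &vram[offset >> 1];   /* halfword containing the pixel */
    uint16_t h = *p;                    /* ldrh: read the whole halfword */

    if (offset & 1u) {
        /* odd address: pixel is the high byte (little-endian) */
        h = (uint16_t)((h & ~0xff00u) | (colour << 8));
    } else {
        /* even address: pixel is the low byte */
        h = (uint16_t)((h & ~0x00ffu) | (colour >> 24));
    }

    *p = h;                             /* strh: write the halfword back */
}
```

The sketch leaves out the pointer update that the unaligned path does with the post-indexed store (r0 ends up at the next halfword); it only models the memory write.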
So which one is faster/better and why?