#170357 - Ruben - Fri Sep 18, 2009 8:58 pm
So I decided to start writing my FFIV sequel from scratch again, and I'm close to running out of cycles, so I'm thinking about speeding up the only 'optimizable' part of the sound mixing code: the 16-bit to 8-bit conversion, 4 samples at once. (That is, quickly making 4x 8-bit samples from 4x 16-bit samples.)
What I have at the moment is this...
Code:
@ r0: Destination (8-bit, word aligned)
@ r1: Source (16-bit, word aligned)
@ r2: Count
@ r4: Mask 0x00FF00FF
.LMix:
	ldmia r1!, {r5-r8}           @ RRRRLLLL
	and   r5, r4, r5, lsr #0x08  @ 00RR00LL
	and   r6, r6, r4, lsl #0x08  @ RR00LL00
	orr   r5, r5, r6             @ RRRRLLLL
	mov   r6, r5, lsr #0x10      @ 0000RRRR
	bic   r5, r5, r6, lsl #0x10  @ 0000LLLL
	mov   r6, r6, ror #0x10      @ RRRR0000
	and   r7, r4, r7, lsr #0x08  @ 00RR00LL
	and   r8, r8, r4, lsl #0x08  @ RR00LL00
	orr   r7, r7, r8             @ RRRRLLLL
	orr   r5, r5, r7, lsl #0x10  @ LLLLLLLL
	orr   r6, r6, r7, lsr #0x10  @ RRRRRRRR
	mov   r6, r6, ror #0x10
	str   r6, [r0, #BUFFER_LEN*BUFCNT]
	str   r5, [r0], #0x04
	subs  r2, r2, #0x04
	bne   .LMix
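In case the register shuffling is hard to follow, here's a rough C model of what one pass of that loop should produce. It's only a sketch: `pack_hi_bytes`, `dst_l` and `dst_r` are names made up for illustration (the right buffer stands in for the `[r0, #BUFFER_LEN*BUFCNT]` store), and it assumes each source word is laid out as RRRRLLLL (right sample in the upper halfword, left in the lower), like the comments say.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical reference model of the .LMix loop; not the actual mixer code.
 * Assumes each 32-bit source word is RRRRLLLL and that count (the number of
 * stereo samples) is a multiple of 4. */
static void pack_hi_bytes(uint32_t *dst_l, uint32_t *dst_r,
                          const uint32_t *src, size_t count)
{
    const uint32_t mask = 0x00FF00FFu;  /* same value as r4 */

    for (size_t i = 0; i < count; i += 4) {
        /* Keep the top byte of every 16-bit sample, two source words at a time. */
        uint32_t a = ((src[i + 0] >> 8) & mask) | (src[i + 1] & (mask << 8)); /* R1 R0 L1 L0 */
        uint32_t b = ((src[i + 2] >> 8) & mask) | (src[i + 3] & (mask << 8)); /* R3 R2 L3 L2 */

        /* Regroup the halfwords so each output word holds a single channel. */
        *dst_l++ = (a & 0x0000FFFFu) | (b << 16);   /* L3 L2 L1 L0 */
        *dst_r++ = (a >> 16) | (b & 0xFFFF0000u);   /* R3 R2 R1 R0 */
    }
}
```

So feeding it the four words 0xA1B2C3D4, 0x11223344, 0x55667788, 0x99AABBCC should give 0xBB7733C3 for the left word and 0x995511A1 for the right one, which is what any faster version of the asm has to match.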
I know there's gotta be a way to speed this up by getting rid of the ror's but... HOW? Can anyone help me here? Please? ><'