#84537 - DiscoStew - Tue May 23, 2006 5:25 am
This is something I've put off for a while now. I've got the MODE 7 perspective working for my demo, but a lot of processing time goes into calculating the "lambda" for each scanline (one value per scanline, though not computed during the scanline itself), and more specifically into the divide used for it. Here's the line of code from my demo:
lam = Asm_Divide(Cam_Pos.y << 12, BG_Cam_Incs.y);
...where "Cam_Pos.y" is the camera's height, and "BG_Cam_Incs.y" is a running value that starts from an initial value and is incremented/decremented by "BG_Cam_YAxis.y", which depends on the cosine of theta. The bit-wise shift is only there for extra precision. The only optimization I've made so far is an added check for when the camera's theta is 128, i.e. looking straight down. In that case the layer looks as if MODE 7 weren't being used at all, because at theta = 128 "BG_Cam_Incs.y" does not increment/decrement, so the divide reduces to:
lam = Cam_Pos.y >> 3;
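To make the setup concrete, here is a rough sketch of the per-frame loop as described so far. It is not the actual demo code: the function name, the parameter layout, and `divide_stub` (a plain C division standing in for Asm_Divide) are placeholders, and the divisor is assumed to never reach zero.

```c
#include <stdint.h>

#define SCREEN_LINES 160

/* Hypothetical stand-in for Asm_Divide: plain signed integer division. */
static int32_t divide_stub(int32_t num, int32_t den)
{
    return num / den;
}

/* Fill lam_out with one lambda per scanline.
   cam_y : camera height (Cam_Pos.y), constant for the whole frame
   inc0  : starting value of BG_Cam_Incs.y (assumed to be computed elsewhere)
   step  : per-scanline change of the divisor (BG_Cam_YAxis.y)
   theta : camera pitch, 128 = looking straight down */
void fill_lambda(int32_t lam_out[SCREEN_LINES], int32_t cam_y,
                 int32_t inc0, int32_t step, int theta)
{
    int32_t inc = inc0;
    for (int line = 0; line < SCREEN_LINES; line++) {
        if (theta == 128)
            lam_out[line] = cam_y >> 3;                     /* no divide needed */
        else
            lam_out[line] = divide_stub(cam_y << 12, inc);  /* the costly part */
        inc += step;
    }
}
```

The divide sits inside the loop, so it runs up to 160 times per frame whenever theta is not 128.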
So far, I've tried other optimizations, such as linker settings and putting all of the code for the needed calculations into IWRAM. The only other idea I've come up with (not implemented yet) is a LUT: if "Cam_Pos.y" and theta do not change, the lambda values can be computed in one frame and reused in the following frames. If either variable changes, though, the LUT has to be filled in again.
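The LUT idea could look roughly like the sketch below. Everything here is hypothetical (the `LambdaCache` struct, the function name, and the plain C division used in place of Asm_Divide); it only shows the "rebuild when height or theta changes, otherwise reuse" logic, and it assumes the divisor never reaches zero.

```c
#include <stdint.h>
#include <stdbool.h>

#define SCREEN_LINES 160

/* Hypothetical cache: the lambda table plus the inputs it was built from. */
typedef struct {
    int32_t lam[SCREEN_LINES];
    int32_t cam_y;   /* Cam_Pos.y used for the last rebuild */
    int     theta;   /* theta used for the last rebuild */
    bool    valid;   /* false until the table has been filled once */
} LambdaCache;

/* Call once per frame. Rebuilds the table only when the camera height or
   theta differs from the cached values. Returns true if a rebuild happened. */
bool lambda_cache_update(LambdaCache *c, int32_t cam_y, int theta,
                         int32_t inc0, int32_t step)
{
    if (c->valid && c->cam_y == cam_y && c->theta == theta)
        return false;                      /* reuse last frame's values */

    int32_t inc = inc0;                    /* starting BG_Cam_Incs.y */
    for (int line = 0; line < SCREEN_LINES; line++) {
        c->lam[line] = (theta == 128) ? (cam_y >> 3)
                                      : (cam_y << 12) / inc;
        inc += step;                       /* BG_Cam_YAxis.y */
    }
    c->cam_y = cam_y;
    c->theta = theta;
    c->valid = true;
    return true;
}
```

This only saves time on frames where the camera's height and pitch stay put; any frame that changes either one still pays for the full set of divides.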
What I'm looking for is one single thing: getting rid of that divide per scanline. It isn't so bad if it's used a couple of times, but at up to 160 times per frame it really limits anything else I can do. Even with the LUT idea, any change to the camera's height or theta means the lambda values have to be recalculated, and that cost could show up alongside everything else being processed.
The only other option I can see is a different approach to calculating lambda from the same input values. However, I have no idea where to begin, which is why I'm asking here. What I can tell you is that throughout the lambda calculation the camera's height is constant, and "BG_Cam_Incs.y" starts at a calculated value and then increments/decrements at a constant rate per scanline.
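One direction that fits those constraints is a reciprocal table: since the height is constant, the divide can be replaced by a lookup of 1/divisor followed by a multiply. This is a general fixed-point trick, not something from the demo. The table size, the shift amounts, and all names below are assumptions, and the sketch assumes a positive divisor and a non-negative height.

```c
#include <stdint.h>

#define LUT_SIZE   1024  /* assumed: covers divisors below LUT_SIZE << INC_SHIFT */
#define INC_SHIFT  6     /* assumed: low bits of the divisor dropped for indexing */
#define RECIP_BITS 16    /* table entries hold 2^16 / index */

static uint32_t recip_lut[LUT_SIZE];

/* Build once at startup. The table depends on neither camera height nor
   theta, so it never needs refilling when the camera moves. */
void recip_lut_init(void)
{
    recip_lut[0] = 0;    /* unused: index 0 would mean a divide by zero */
    for (int i = 1; i < LUT_SIZE; i++)
        recip_lut[i] = (1u << RECIP_BITS) / (uint32_t)i;
}

/* Approximates (cam_y << 12) / inc with one lookup and one multiply.
   Requires 0 < (inc >> INC_SHIFT) < LUT_SIZE and cam_y >= 0. */
int32_t lambda_approx(int32_t cam_y, int32_t inc)
{
    uint32_t r = recip_lut[inc >> INC_SHIFT];
    /* The 64-bit intermediate avoids overflow; it could be narrowed to
       32 bits if the value ranges are known to be small enough. */
    return (int32_t)(((int64_t)cam_y * r) >> (RECIP_BITS + INC_SHIFT - 12));
}
```

In the per-scanline loop the divide would then become `lam = lambda_approx(Cam_Pos.y, BG_Cam_Incs.y);`. The trade-off is accuracy: dropping the low bits of the divisor hurts most when the divisor is small, so the shift and table size would need tuning against the real value range.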
If you have any thoughts, or even a formula for a faster lambda calculation with minimal error, please feel free to post.
_________________
DS - It's all about DiscoStew