#862 - bomberman - Sat Jan 11, 2003 1:36 pm
Hello there,
If I understood correctly, the ARM runs at 16 MHz and the screen refreshes 60 times per second. A quick calculation gives about 266,000 ARM instructions executed during one refresh.
So I tested this
volatile u16* scanLine = (volatile u16*) 0x4000006;
int main(void)
{
    int count = 0;
    while(1)
    {
        // Wait for the refresh to begin (scanline 0)
        while(*scanLine != 0);
        // Count until the last scanline (227) is reached
        while(*scanLine != 227)
        {
            // A simple counter
            count++;
        }
        count = 0;
    }
}
When a full refresh of the screen has completed, my variable count holds a mere 5400. OK, one C statement compiles to more than one ARM instruction, but here the ratio is about 50. I doubt that each of my while and count++ statements is translated into 50 ARM instructions.
I had a look in GDB, displaying Source + Assembly, and we are far from this ratio...
Does anyone have an explanation for this?
Thanks very much.
#864 - Splam - Sat Jan 11, 2003 2:55 pm
I can explain it: your C is probably becoming a mass of instructions because compilers are generally quite stupid :) Add to that the fact that not all instructions take 1 cycle... You just can't use code to test something like this, especially not C.
GBA has 16777216 cycles per second = 279620 per frame (at 60 frames per second). No doubt some of this will be taken up by system overhead, plus the fact that if the compiler has turned your code into anything that accesses RAM, you have the RAM waitstates to add to those instructions.
Also, did you compile to arm or thumb? If arm did you locate it to ram or run it from rom? All of these can make a drastic difference on speed of execution due to the different waitstates for the instructions in ram/rom + arm instructions being 32bit while thumb are 16 (prefetch in ram executes arm at the same speed as thumb in rom) etc etc
#896 - bomberman - Sat Jan 11, 2003 11:14 pm
OK... I am quite new to GBA...
I compiled with HAM. If I have a look at the master.mak, it uses normal instead of thumb.
But it also compiles with -mthumb-interwork... What happens in this case?
How do you control which code is arm and which is thumb?
What does it imply to run code from ROM or RAM (maybe RAM is faster?)? How do you control whether code executes in RAM or ROM?
When I see that it is so slow in C, do you use it to write your C64 emulator???
#918 - jaymzjulian - Sun Jan 12, 2003 6:01 am
bomberman wrote: |
If I understood correctly, the ARM runs at 16 MHz and the screen refreshes 60 times per second. A quick calculation gives about 266,000 ARM instructions executed during one refresh.
(code deleted)
When a full refresh of the screen has completed, my variable count holds a mere 5400. OK, one C statement compiles to more than one ARM instruction, but here the ratio is about 50. I doubt that each of my while and count++ statements is translated into 50 ARM instructions.
|
Your assertion that one instruction == one cycle is incorrect. Only simple arithmetic on basic registers takes that amount. Any arithmetic on r15 takes 2S cycles more (S is a sequential memory access; IWRAM is 1 cycle for this, EWRAM is I think 3, and ROM is more again, but my numbers might be wrong). A conditional branch, which is your while loop, takes 1N+2S (N is a nonsequential memory access, and in ROM those are really very slow, I think 6 or 8 cycles off the top of my head, but it might be more). Reading from memory, which includes reading the scanline, takes 1N+1S+1 cycles, which in this case should probably be 3 cycles, since I *think* IO memory is a single-cycle area, but I'm not sure about this. And, of course, you're running this out of ROM, and since you haven't specified -mthumb as well, it's going to be ARM code, which means that reading *each instruction* is going to lose you at least 3 more cycles, because your prefetch won't be filled, and probably more.
Look at the GBA dev'rs FAQ and the ARM7TDMI reference manual, specifically pages 6-5 and 6-6. The timing is not nearly as simple as you think.
- jj
#1062 - tepples - Tue Jan 14, 2003 7:19 am
Splam wrote: |
GBA has 16777216 cycles per second = 279620 per frame. |
Actually, 1232 cycles per scanline * 228 scanlines per frame = 280896 cycles per frame, that is, 59.7275 fps. I have verified this with timers and audio playback.