#108139 - ollhak - Sun Nov 05, 2006 9:12 pm
Alrighty, I'm trying to get a rough idea of how slow it is to simulate sprites with 3D quads, but I'm not sure what to measure. I guess the problem is that I don't know exactly how the DS does its drawing *scratches head*...
Alright, so it takes some time to issue the scene drawing commands every frame, but won't I also get fewer frames per second because the rendering itself takes longer?
Getting my supercard any day now so I can try it on hardware myself but I'm just curious to hear if anyone has tried it already.
#108140 - Sausage Boy - Sun Nov 05, 2006 9:28 pm
You usually update your game logic and prepare your graphics. Then you wait for vblank, and update the graphics registers and copy stuff to vram. Vblanks come at a steady pace (about 60 times per second), and if your data preparing and game logic updating takes too long, you might end up missing the vblank, which is bad. As long as you don't miss the vblank (too often), your game will run fine.
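To put a rough number on "missing the vblank", here is a simplified model rather than anything measured on hardware. It assumes the commonly documented DS timing of 355 dots x 263 lines x 6 bus cycles per frame at 33.513982 MHz (about 59.8 frames per second), and it assumes the work starts right at a vblank. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed DS timing: 355 dots * 263 lines * 6 cycles at 33.513982 MHz. */
#define DS_BUS_HZ            33513982u
#define DS_CYCLES_PER_FRAME  (355u * 263u * 6u)   /* 560190 cycles */

/* How many vblank periods one pass of the main loop occupies when the
   loop does `work_cycles` of work and then waits for the next vblank. */
uint32_t vblanks_per_loop(uint32_t work_cycles)
{
    if (work_cycles == 0)
        return 1;
    return (work_cycles + DS_CYCLES_PER_FRAME - 1) / DS_CYCLES_PER_FRAME;
}

/* Resulting loop rate in tenths of a loop per second (598 means 59.8). */
uint32_t loops_per_second_x10(uint32_t work_cycles)
{
    uint64_t cycles = (uint64_t)vblanks_per_loop(work_cycles) * DS_CYCLES_PER_FRAME;
    return (uint32_t)(((uint64_t)DS_BUS_HZ * 10u) / cycles);
}
```

In this model, going even one cycle over the frame budget halves the rate, because the loop can only finish on a vblank boundary.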
_________________
"no offense, but this is the gayest game ever"
#108142 - Payk - Sun Nov 05, 2006 10:00 pm
Well, you can use the vblank interrupt...
The best way is to count the loops of your game/app
and compare that with the vblank count. As Sausage Boy already said, the interrupt is called 60 times per second.
Webez showed me a nice method to do this:
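Code: |
#include <nds.h>

// Shared with the interrupt handler, so they are declared volatile.
volatile int frameCounter = 0;   // vblanks since the last sample
volatile int loopCounter = 0;    // main-loop passes since the last sample
volatile int elapsedFrames = 0;  // total vblanks since startup
volatile int frameold = 0;       // main-loop passes during the last 60 vblanks

void vBlank(void) {
    ++elapsedFrames;
    ++frameCounter;
    if (frameCounter >= 60) {    // roughly once per second
        frameold = loopCounter;
        frameCounter = 0;
        loopCounter = 0;
    }
}

int main(void) {
    irqSet(IRQ_VBLANK, vBlank);  // we give the interrupt a function to call
    irqEnable(IRQ_VBLANK);       // make sure the vblank interrupt is enabled
    while (1) {
        render(stuff);           // placeholder for your own drawing code
        loopCounter++;
    }
} |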
Just print "frameold" and you know how fast it is.
Caution: only results above 30 FPS are correct!
It took me a long time to reach 60 FPS... Be sure to use v16/t16/f32 as much as you can and avoid float! (And even that wasn't enough to reach 60 FPS for me...)
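For anyone unsure what replacing float looks like in practice, here is a minimal host-side sketch of 20.12 fixed point, the layout that libnds' f32 helpers use. The `fx_` names are hypothetical and not part of libnds, and the multiply assumes an arithmetic right shift for negative values, as GCC does on ARM.

```c
#include <stdint.h>

/* 20.12 fixed point: 12 fractional bits, so 4096 represents 1.0. */
typedef int32_t fx32;
#define FX_SHIFT 12
#define FX_ONE   (1 << FX_SHIFT)

/* Convert between plain integers and fixed point. */
static inline fx32 fx_from_int(int n) { return (fx32)(n * FX_ONE); }
static inline int  fx_to_int(fx32 v)  { return v / FX_ONE; }

/* Multiply: widen to 64 bits so the intermediate product cannot overflow,
   then drop the extra 12 fractional bits. */
static inline fx32 fx_mul(fx32 a, fx32 b)
{
    return (fx32)(((int64_t)a * b) >> FX_SHIFT);
}

/* Divide: scale the numerator up first so the fraction survives. */
static inline fx32 fx_div(fx32 a, fx32 b)
{
    return (fx32)(((int64_t)a * FX_ONE) / b);
}
```

For example, 1.5 is stored as 6144 and 2.5 as 10240; `fx_mul(6144, 10240)` gives 15360, which is 3.75 in this format.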
#108150 - sajiimori - Sun Nov 05, 2006 11:05 pm
Use the hardware timers instead of counting VBlanks. Anything less is practically guesswork. Code: |
Main loop:
1. Start timer
2. Do all the work for this game loop
3. Stop timer, get result
4. Wait for VBlank |
Do separate measurements on subsets of step 2 to find out where the CPU time is going. Get as fine-grained as you have to.
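Steps 1 and 3 boil down to reading a free-running 16-bit counter twice and converting the difference. The sketch below covers only that arithmetic, so it runs on a PC. It assumes the timers are fed from the 33.513982 MHz bus clock through a prescaler of 1, 64, 256 or 1024; the function names are hypothetical, and the actual register reads are left out.

```c
#include <stdint.h>

#define DS_TIMER_HZ 33513982u   /* assumed timer input clock */

/* Ticks between two reads of a free-running 16-bit counter.
   Unsigned 16-bit subtraction stays correct across one wraparound. */
uint16_t timer_elapsed_ticks(uint16_t start, uint16_t end)
{
    return (uint16_t)(end - start);
}

/* Convert ticks to microseconds for a given prescaler (1, 64, 256, 1024). */
uint32_t timer_ticks_to_usec(uint32_t ticks, uint32_t prescaler)
{
    return (uint32_t)(((uint64_t)ticks * prescaler * 1000000u) / DS_TIMER_HZ);
}
```

Under these assumptions a tick at prescaler 1024 is about 30.6 microseconds and the counter wraps after roughly 2 seconds, which is plenty for timing a single game loop. Use a smaller prescaler when measuring small subsets of step 2.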
#108153 - ollhak - Sun Nov 05, 2006 11:33 pm
Alright. So it's all about losing time between the vblanks then, right? I guess there should be quite a difference between setting all the matrix modes, rendering each quad and so on, compared to just updating the OAM information when doing normal sprites... time for some testing.
#108154 - Lick - Mon Nov 06, 2006 12:17 am
*Interested in the results.*
I think the 2D hardware is faster though. I mean, it's a compact and well-designed piece of hardware, while the 3D chips are probably rushed and far from decent anyway. (If only we had texture filtering..)
_________________
http://licklick.wordpress.com
#108155 - tepples - Mon Nov 06, 2006 12:29 am
Lick wrote: |
I think the 2D hardware is faster though. I mean, it's a compact and well-designed piece of hardware, while the 3D chips are probably rushed and far from decent anyway. (If only we had texture filtering..) |
There's no 2D texture filtering either.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#108158 - sajiimori - Mon Nov 06, 2006 12:46 am
It makes little sense to ask whether the 2D or 3D rendering hardware is faster. Both pipelines will happily render whatever is in their buffers (or at least attempt to) every VDraw, with zero interaction from the CPU.
How long does it take for the DS to render a 3D scene whose complexity is the absolute maximum that the DS can handle? Exactly as long as it takes for REG_VCOUNT to go from 0 to 192, period.
How many CPU cycles does that use? Zero. The rendering hardware runs in the background.
Uploading OAMs and quads to the hardware takes CPU time, but neither of those is likely to be a bottleneck compared to loading graphics from the cart or copying them to VRAM. If you don't have to do either of those things dynamically, you're likely to have a lot of unused CPU time for other parts of your game.
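As a back-of-the-envelope comparison of the upload sizes, the sketch below counts bytes only and says nothing about bus or FIFO stalls. The OAM figure follows from 128 entries of 8 bytes each. The per-quad figure is an assumption: one BEGIN_VTXS parameter, then for each corner one TEXCOORD word and a two-word VTX_16, written as parameter words only. Check the word counts against a hardware reference before relying on them.

```c
#include <stdint.h>

#define OAM_ENTRIES      128u
#define OAM_ENTRY_BYTES  8u    /* three attribute halfwords plus one shared with affine data */

/* Bytes copied when the whole OAM of one 2D engine is refreshed. */
uint32_t oam_bytes(void)
{
    return OAM_ENTRIES * OAM_ENTRY_BYTES;
}

/* Assumed 32-bit parameter words for one textured quad:
   1 (BEGIN_VTXS) + 4 corners * (1 TEXCOORD + 2 VTX_16). */
uint32_t quad_words(void)
{
    return 1u + 4u * (1u + 2u);
}

/* Bytes pushed to the geometry engine for `quads` separate quads. */
uint32_t quads_bytes(uint32_t quads)
{
    return quads * quad_words() * 4u;
}
```

Under those assumptions 128 quads come to 6656 bytes against 1024 bytes for a full OAM copy. That is several times more, but still small next to streaming graphics into VRAM.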
#108159 - Payk - Mon Nov 06, 2006 1:05 am
^^Yep, I can confirm that. Another interesting thing is culling on the NDS.
When not using it you can render a 2048-polygon mesh at 60 FPS.
If you then activate culling, about 1024 polys are left (depending on the mesh, of course). If you now think you can render extra polys in place of the culled ones, you will be surprised (or not): the framerate goes down. I think the 3D hardware timing was designed to receive vertices for exactly 2048 polys, plus their texture coords and normals. If you send more (it doesn't matter whether they are rendered or not), the timing doesn't match anymore.
I think it's a good idea to use the 3D hardware for sprites!
128 sprites aren't much for some projects. But you could render 1024 instead... and the best part is that with vertex colors you can do some nice stuff!
The general problem is... how to avoid looking 3D? Does anyone have an idea how Zelda: Phantom Hourglass did that?
#108189 - Sunray - Mon Nov 06, 2006 12:50 pm
I really hate working with the 2D engines. Seriously... I think they should be replaced by a better 3D chip. In my game (much like Quake as a 2D platformer) I use the 3D chip to render everything: tilemap, sprites and 3D objects (such as rotating health boxes etc.).
#108192 - ollhak - Mon Nov 06, 2006 1:43 pm
Lick: I'll post any results I find.
sajiimori & Payk: I kind of suspected that - thanks for confirming it. Guess all I have left to test is how much time it takes to push quads to HW compared to updating the OAM.
Payk: Agreed, vertex colors, easy blending (between sprites anyway), rotating and scaling can be great for better visuals. Not sure what you mean by "avoid looking 3D" - I checked some screenshots of the game and it didn't look like anything but 3D to me.
#108228 - Payk - Mon Nov 06, 2006 10:02 pm
I mean that parallel lines appear to converge on a shared point (the English word escapes me).
I saw a video of Zelda where everything looked 2D until Link talked to an NPC. The camera rotated and you were able to see that it's in fact 3D... let me have a look.
Here, this screen:
[img]http://www.zelda-world.com/v3/img/zeldaph/gallerie/NTR_ZeldaDS_ss06.bmp[/img]
^^Take a close look... closer... yeah, that's what I mean... the grass stands straight up!... Normally only the grass in the middle should look like that... the grass on the right side should lean a bit to the right... same for the walls of the cave... you should be able to see the side walls... but you can't... That's 2D?!?! Nope... it's made with polys/quads... I didn't find the video, but there was some kind of mode switch where you could see that it's really all 3D...
Last edited by Payk on Mon Nov 06, 2006 10:06 pm; edited 1 time in total
#108229 - Payk - Mon Nov 06, 2006 10:06 pm
Ahh, here is that video:
It's 2D style until Link starts talking to the character. Again, take a close look ;) http://youtube.com/watch?v=yrDDaYhrFbc
(time: 0:12 till ~0:24)
#108233 - Lick - Mon Nov 06, 2006 10:31 pm
Payk, no, Zelda is not isometric. If you look closely at the video, when Link is in the dungeon you can see the walls on the right side; when Link passes them, they're visible on the left side. There is perspective, but a very near-isometric one.
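The difference being argued about can be shown with one coordinate. In the sketch below (hypothetical helper names, plain integer maths), a perspective projection moves a point toward the screen centre as its depth grows, which is what makes side walls visible. An orthographic projection ignores depth entirely, so the top and bottom of a wall land on the same screen x.

```c
#include <stdint.h>

/* Perspective: screen x shrinks toward the centre as depth z grows.
   `focal` and `z` use the same units; z must be greater than zero. */
int32_t project_perspective_x(int32_t x, int32_t z, int32_t focal)
{
    return (x * focal) / z;
}

/* Orthographic: depth has no effect on the screen position. */
int32_t project_ortho_x(int32_t x, int32_t z)
{
    (void)z;
    return x;
}
```

With a focal length of 256, a point at x = 100 lands at 100 when z = 256 but at 50 when z = 512. With a much longer focal length of 4096 and the same 256 units of extra depth it only moves to 94. A distant camera with a narrow field of view gives that kind of "very near-isometric" perspective.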
#108253 - Payk - Tue Nov 07, 2006 1:14 am
Hmm, in that video it appeared to be like that... well, on the other hand the roofs of the houses are fully 3D... so maybe they just wanted an old style made with 3D.
#108260 - HyperHacker - Tue Nov 07, 2006 3:18 am
Sunray, using the 2D chips has an advantage: They use a lot less power.
As for performance measurement I use a very simple but effective method. I have a timer going on ARM7 counting milliseconds; every time it interrupts I increment the millisecond counter in IPC so ARM9 can see it too. Both CPUs record the current millisecond count at the beginning of their main loop. At the end they subtract the recorded count from the current count and store the difference in IPC. ARM9, at the end of its loop, draws some nice CPU bars, one pixel high and ~8 wide per millisecond counted. This gives you a simple indication of how long each CPU's main loop is taking while using very little power to actually display the bars. (I also threw in a quick check that if the counter is too high it drops to about 30; this prevents the bars from wrapping past the edge of the screen. If they're getting that far at all something's wrong anyway.)
I actually wanted the bars to show CPU usage percent, but I'm not sure how to measure that. My method does the job for what I need it for though.
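One possible reading of "CPU usage percent" is the time the main loop was busy divided by the length of one frame. The sketch below shows that next to the clamped bar width described above. The frame period of 16715 microseconds (about 59.83 Hz) is an assumption, the names are invented, and a millisecond counter would be too coarse for a useful percentage, so the busy time is taken in microseconds.

```c
#include <stdint.h>

#define BAR_PX_PER_MS  8u       /* bar grows ~8 pixels per millisecond */
#define BAR_MAX_MS     30u      /* clamp: 30 * 8 = 240 px, inside a 256 px screen */
#define FRAME_USEC     16715u   /* assumed length of one frame in microseconds */

/* Width of the CPU bar for a loop that took `elapsed_ms` milliseconds. */
uint32_t cpu_bar_width_px(uint32_t elapsed_ms)
{
    if (elapsed_ms > BAR_MAX_MS)
        elapsed_ms = BAR_MAX_MS;
    return elapsed_ms * BAR_PX_PER_MS;
}

/* Busy time as a percentage of one frame; values above 100 mean the
   loop overran the frame. */
uint32_t cpu_usage_percent(uint32_t busy_usec)
{
    return (uint32_t)(((uint64_t)busy_usec * 100u) / FRAME_USEC);
}
```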
_________________
I'm a PSP hacker now, but I still <3 DS.
#108285 - Payk - Tue Nov 07, 2006 2:24 pm
Also a nice way...
Did you find out that ARM9 and ARM7 aren't 100% in sync?
That's why sending sound buffers from ARM9 to ARM7 creates those "clicks"... And the funny thing is that every restart of the DS changes the clicks... on some starts there are no clicks at all...
It would be interesting to measure this...
#108288 - tepples - Tue Nov 07, 2006 2:43 pm
Try making a single sound buffer or pair of sound buffers, at least 4096 samples in size, that's set to play at 32768 Hz in a loop.
#108291 - Payk - Tue Nov 07, 2006 3:21 pm
I've been playing sound on the ARM7 for some weeks now... and yeah, I tried that:
double buffering and a bigger buffer. Bigger buffer = fewer clicks.
But they would still be there. A ring buffer could avoid that, I guess.
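For reference, a ring buffer of samples could look roughly like the host-side sketch below. It shows only the data structure, with one writer and one reader. A real ARM9 to ARM7 version would also need shared memory, cache handling and volatile accesses, none of which is shown here. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define RING_SIZE 4096u   /* capacity in samples; must be a power of two */

typedef struct {
    int16_t  data[RING_SIZE];
    uint32_t head;   /* total samples written (touched by the producer only) */
    uint32_t tail;   /* total samples read (touched by the consumer only) */
} SampleRing;

/* Samples waiting to be read, and free slots left for writing. */
size_t ring_available(const SampleRing *r) { return r->head - r->tail; }
size_t ring_free(const SampleRing *r)      { return RING_SIZE - (r->head - r->tail); }

/* Write up to n samples; returns how many actually fit. */
size_t ring_write(SampleRing *r, const int16_t *src, size_t n)
{
    size_t space = ring_free(r);
    if (n > space)
        n = space;
    for (size_t i = 0; i < n; i++)
        r->data[(r->head + i) & (RING_SIZE - 1)] = src[i];
    r->head += (uint32_t)n;
    return n;
}

/* Read up to n samples; returns how many were available. */
size_t ring_read(SampleRing *r, int16_t *dst, size_t n)
{
    size_t avail = ring_available(r);
    if (n > avail)
        n = avail;
    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(r->tail + i) & (RING_SIZE - 1)];
    r->tail += (uint32_t)n;
    return n;
}
```

Because the writer only advances `head` and the reader only advances `tail`, the two sides never have to swap whole buffers at the same instant. Misaligned swaps between CPUs that are slightly out of sync are one plausible source of the clicks.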