#139966 - brave_orakio - Wed Sep 12, 2007 3:44 am
I'm a bit confused about how to move multiple sprites, or only one sprite, without much change in speed.
Currently, if I run 1 or 2 sprites, there is no noticeable change. If I run 3 or more, the slowdown is very noticeable. Even if I remove my sorting and collision engine, no noticeable change is seen.
Also, why doesn't this work:
while(REG_VCOUNT<=160){};
while(REG_VCOUNT>160)
{
dma3_cpy16(OAM_MEM, OAM_BUFFER, numberofSprites*8);
}
I have to do struct copies instead using:
while(numberofSprites)
{
while(REG_VCOUNT>160 && numberofSprites)
{
numberofSprites--;
OAM_MEM[index] = OAM_BUFFER[numberofSprites];
}
}
_________________
help me
#139985 - gmiller - Wed Sep 12, 2007 11:40 am
The DMA as you have written it will copy numberofSprites*8 shorts (DMA_16); it should be numberofSprites*4, so you're copying too much. The DMA call also should not be in the loop, because it only needs to be done once.
If your DMA is executed immediately, then as soon as the function returns all data has been copied. You could set the options to do the copy during the next VBlank (assuming the function you are using supports this); then you could avoid the loops altogether (assuming again that the asynchronous behavior works in your environment).
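To make the count arithmetic concrete, here is a minimal sketch that can be compiled on a PC. It assumes the usual 8-byte OAM entry layout; the `ObjAttr` type and the two helper names are hypothetical and are not taken from the poster's library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One OAM entry: three attribute halfwords plus one filler halfword
   (8 bytes in total). The struct name is hypothetical. */
typedef struct {
    uint16_t attr0;
    uint16_t attr1;
    uint16_t attr2;
    int16_t  fill;
} ObjAttr;

/* Count to pass to a copy routine that counts 16-bit units. */
static size_t oam_halfwords(size_t sprites)
{
    assert(sprites <= 128);   /* OAM holds at most 128 entries */
    return sprites * sizeof(ObjAttr) / sizeof(uint16_t);
}

/* Count to pass to a copy routine that counts 32-bit units. */
static size_t oam_words(size_t sprites)
{
    assert(sprites <= 128);
    return sprites * sizeof(ObjAttr) / sizeof(uint32_t);
}
```

So `numberofSprites*8` is only the right count if the routine expects bytes; a halfword-counting routine wants `*4` and a word-counting routine wants `*2`.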
#139986 - Cearn - Wed Sep 12, 2007 11:45 am
brave_orakio wrote: |
I'm a bit confused about how to move multiple sprites, or only one sprite, without much change in speed.
Currently, if I run 1 or 2 sprites, there is no noticeable change. If I run 3 or more, the slowdown is very noticeable. Even if I remove my sorting and collision engine, no noticeable change is seen.
Also, why doesn't this work:
Code: | while(REG_VCOUNT>160)
{
dma3_cpy16(OAM_MEM, OAM_BUFFER, numberofSprites*8);
} |
|
That will copy to OAM again and again (and nothing else) as long as you're in the VBlank period. " while(REG_VCOUNT>160) " is part of waiting for the next VBlank but you're not supposed to do anything while you're waiting. The normal structure of the main loop is this:
Code: |
while(1) // main loop
{
// Wait for next VBlank (don't reverse these)
while(REG_VCOUNT>=160); // Wait till next VDraw
while(REG_VCOUNT<160); // Wait till next VBlank
// Do other stuff
// Update OAM
dma3_cpy16(OAM_MEM, OAM_BUFFER, numberofSprites*8);
}
|
Waiting for VBlank using interrupts is the preferred method, by the way. Also, check how the third parameter of dma3_cpy16 is used: is it the byte-count or halfword-count? If the latter, you're copying twice as much data as you should.
#140061 - brave_orakio - Thu Sep 13, 2007 3:00 am
Thanks, but it's still not working. It doesn't seem to be updating at all!
Is it really difficult to use the DMA with OAM?
When I reset the numberOfSprites to 0 after the DMA, no update happens.
The routine consists of calling the sprite functions in an infinite loop. Then within those functions, give their respective OAM to the OAM functions and increment the numberOfSprites running. After the sprite functions, the sorter is called and the OAM is updated. After update numberOfSprites is reset to 0. If I don't reset it, something appears but not the result that I want.
I don't understand why it won't update, even after setting the DMA to copy on the next vblank, or even when the wait-for-next-vblank and wait-for-next-vdraw functions are used.
#140077 - Cearn - Thu Sep 13, 2007 11:51 am
brave_orakio wrote: |
Thanks, but it's still not working. It doesn't seem to be updating at all!
Is it really difficult to use the DMA with OAM?
|
It shouldn't be, no. But you do have to be sure it's done right. Check that the order of the arguments and the number of transfers is correct -- if not, then there's trouble.
Start with this:
Code: |
// Testing dma3_cpy16
int main()
{
int ii;
u16 foo[128*4]; // Temp OAM buffer
u16 *bar= (u16*)OAM_MEM;
for(ii=0; ii<128*4; ii++)
{
foo[ii]= ii;
bar[ii]=0xDEAD;
}
while(REG_VCOUNT<160); // Writing to OAM doesn't work in VDraw
dma3_cpy16(OAM_MEM, foo, 128);
while(1);
}
|
If dma3_cpy16() does the right thing, OAM should have the numbers 0 through 0x00FF (see VBA memory viewer). If you're copying too much, you should see higher numbers as well. If you see 0xDEAD, you'll know the copy didn't work at all and if there are zeroes, it's done something but not what you wanted. Once you're sure it works, try it again with all the other stuff.
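The same sentinel idea can be tried on a PC before touching hardware. The sketch below is only an illustration: `copy16` is a hypothetical stand-in that uses `memcpy` and counts halfwords, so it says nothing about how the real `dma3_cpy16()` interprets its third argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SENTINEL 0xDEADu
enum { OAM_HALFWORDS = 128 * 4 };

/* Hypothetical stand-in for a DMA copy whose count is in halfwords. */
static void copy16(uint16_t *dst, const uint16_t *src, size_t halfwords)
{
    memcpy(dst, src, halfwords * sizeof(uint16_t));
}

/* Fill the source with 0, 1, 2, ... and the destination with the
   sentinel, run the copy, then return how many leading halfwords of
   the destination now hold the expected counting pattern. */
static size_t count_copied(size_t count)
{
    uint16_t src[OAM_HALFWORDS];
    uint16_t dst[OAM_HALFWORDS];
    size_t i;

    assert(count <= OAM_HALFWORDS);
    for (i = 0; i < OAM_HALFWORDS; i++) {
        src[i] = (uint16_t)i;
        dst[i] = SENTINEL;
    }
    copy16(dst, src, count);

    for (i = 0; i < OAM_HALFWORDS && dst[i] == (uint16_t)i; i++)
        ;
    return i;
}
```

If the value returned differs from the count you asked for, the routine is counting in a different unit than you assumed.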
#140190 - brave_orakio - Fri Sep 14, 2007 4:03 am
Thanks for the tip, I had no idea VBA could do something like that!
When I tried it as is, it worked until 3F then DEAD all the way.
Then, I modified my code to do the same thing as yours.
With DMA, zeroes are seen in the areas where the OAM values should be, then DEAD after that.
With the other loop, the expected values are there.
I'll try to mess around with the code a little.
By the way, about my other question regarding the slowdown: what is the standard used by developers when moving multiple sprites? I simply used an increase in the delay counter with each added sprite so the time to update for each sprite will be quicker. What technique do you guys use?
#140284 - tepples - Sat Sep 15, 2007 12:40 am
brave_orakio wrote: |
By the way, about my other question regarding the slowdown: what is the standard used by developers when moving multiple sprites? I simply used an increase in the delay counter with each added sprite so the time to update for each sprite will be quicker. What technique do you guys use? |
Fixed-point arithmetic, for one. It's like floats, but faster on the DS. Look it up on Wikipedia and Google.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
#140397 - brave_orakio - Sun Sep 16, 2007 2:43 am
I think I have to rephrase my question. A sprite doesn't update every scanline, right? Each sprite has a certain delay before its OAM values are updated. The same is true for its image: the frames aren't updated immediately, or we would only see a quick blur of something moving very quickly.
So with each added sprite these delays all add up, creating artificial slowdown (my multiple sprites aren't using complicated AI, just moving left to right), although the processor isn't really doing much yet. So then how do you guys deal with this? Also, how do you delay the sprite update? Is it by counting the number of scanlines, then updating after a certain number of scans?
By the way, thanks for the fixed-point arithmetic, tepples. I'm sure I will be using this when things get more complicated in my game.
#140455 - tepples - Sun Sep 16, 2007 5:15 pm
The rotation/scaling hardware on the GBA and Nintendo DS uses what the Wikipedia article calls Q8 fixed-point, where numbers are represented as integers with an implied denominator of 2^8 = 256. Games may store things other than hardware registers, such as positions of game objects, in Q8 or in other formats such as Q12 or Q16 (which have more fractional bits, that is, a larger denominator).
The following fragment illustrates how to update the sprite every frame while moving the sprite by a fraction of a pixel per frame.
Code: |
/* this function converts a Q8 number to an integer, always rounding down */
#define fixfloor(x) ((x) >> 8)
/* this function rounds a Q8 number to the nearest integer */
#define fixtoi(x) fixfloor((x) + 128)
xVel = 64; /* that is, 64/256 pixels per frame to the right */
yVel = -32; /* that is, 32/256 pixels per frame up */
while (!done) {
x += xVel;
y += yVel;
moveSpriteTo(fixfloor(x), fixfloor(y));
waitForNextFrame();
}
|
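For anyone who wants to try the Q8 arithmetic on a PC first, here is a small self-contained sketch. The type and helper names (`fix8`, `int_to_fix`, `fix_floor`, `fix_round`, `advance`) are hypothetical, and the right shift assumes the compiler uses an arithmetic shift for negative values, which GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

#define FIX_SHIFT 8
#define FIX_ONE   (1 << FIX_SHIFT)   /* 256 represents 1.0 */

typedef int32_t fix8;                /* Q8: value = raw / 256 */

/* Convert a whole number of pixels to Q8. */
static fix8 int_to_fix(int32_t i) { return i * FIX_ONE; }

/* Q8 to integer, rounding down (assumes arithmetic right shift). */
static int32_t fix_floor(fix8 x) { return x >> FIX_SHIFT; }

/* Q8 to nearest integer. */
static int32_t fix_round(fix8 x) { return fix_floor(x + FIX_ONE / 2); }

/* Add the velocity to the position once per frame. */
static fix8 advance(fix8 pos, fix8 vel, int frames)
{
    assert(frames >= 0);
    while (frames-- > 0)
        pos += vel;
    return pos;
}
```

With a velocity of 64 (a quarter pixel per frame), the on-screen position only changes every fourth frame even though the stored position changes every frame.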
#140527 - brave_orakio - Mon Sep 17, 2007 3:00 am
Interesting, that is certainly one way (a very elegant way!) of doing it. So no delays using vblank as a timer though? I read somewhere that vblank is often used as a timer.
By the way, I tried messing with the DMA but still no progress. The DMA function itself works, because I use it to copy my sprite images. Also the data is aligned so there should be no problem there. Any more ideas?
#140553 - tepples - Mon Sep 17, 2007 12:35 pm
brave_orakio wrote: |
So no delays using vblank as a timer though? I read somewhere that vblank is often used as a timer |
You can have an NPC walk toward a given position for 240 vblanks, then stop for 120 vblanks. Or you can have a particular animation change cels every 6 vblanks.
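A rough sketch of that idea, assuming the game keeps a frame counter that is incremented once per vblank. The function names and the four-cel animation length used in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FRAMES_PER_CEL 6     /* change animation cel every 6 vblanks */
#define WALK_FRAMES    240   /* walk for 240 vblanks...              */
#define REST_FRAMES    120   /* ...then stand still for 120          */

/* Which animation cel to show on a given frame. */
static unsigned cel_for_frame(uint32_t frame, unsigned num_cels)
{
    assert(num_cels > 0);
    return (unsigned)((frame / FRAMES_PER_CEL) % num_cels);
}

/* Nonzero while the NPC is in the walking part of its 360-frame cycle. */
static int npc_is_walking(uint32_t frame)
{
    return (frame % (WALK_FRAMES + REST_FRAMES)) < WALK_FRAMES;
}
```

Each pass through the main loop would call `cel_for_frame(frame, 4)` and `npc_is_walking(frame)` and then increment `frame` after the single vblank wait, so nothing ever blocks inside the sprite code.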
#140555 - brave_orakio - Mon Sep 17, 2007 2:04 pm
I see, so those are a few of the uses of a vblank timer.
But in your example, let's say all of the sprites use that same routine to move. If a lot of the sprites are active, then all that processing adds up.
There could be a huge difference between, say, 1 sprite being active (the sprite moves very quickly) and 3 sprites being active (there is a noticeable change in speed).
So how do you synchronize movement? Meaning, if there is 1 sprite or 3 sprites or maybe 6, there is no noticeable change in their speed. Of course if too many sprites are moving, things will slow down.
I'm talking about an average of maybe 1 to 6 or even 8 sprites without noticeable speed differences. What is the usual solution? I'm curious as to what you and other people do in this case.
My solution to this problem was to, lets use your code as an example:
Code: |
xVel = 64+numBerOfSpritesActive; /* that is, 64/256 pixels per frame to the right */
yVel = -32+numBerOfSpritesActive; /* that is, 32/256 pixels per frame up */
while (!done) {
x += xVel;
y += yVel;
moveSpriteTo(fixfloor(x), fixfloor(y));
waitForNextFrame();
}
|
Well, something like that. Meaning the one-pixel increase will be reached faster, to compensate for the extra processing caused by the increase in the number of sprites. Of course the above is not optimised; it's just a general idea of how my solution works.
#140561 - gauauu - Mon Sep 17, 2007 3:02 pm
I think you're missing something.
Each time vblank occurs we'll call 1 frame. So you set up one big master loop in your program, so that the whole game logic, EVERYTHING (with a few exceptions maybe) gets executed, then it waits for vblank, then starts over.
Each time through, each frame, you'll check every sprite, and move it or not move it.
Then, unless you add way too much code so that you run out of time between vblanks, your processor spends most of its time waiting for vblanks, and it doesn't matter how many sprites you have*, as they each get updated 1 frame for each vblank.
Sort of a pseudocodey example:
Code: |
while (true)
{
doAllGameLogic(); //update all game-logic stuff, everything but
//the stuff drawn on the screen
waitForVBlank(); //wait for Vblank
updateSprites(); //now update ALL your sprites
}
void updateSprites()
{
for (int i = 0; i < MAX_NUM_SPRITES; i++)
{
if (spriteIsActive(getSprite(i)))
{
moveSprite(....);
}
}
}
|
If you start waiting for vblank at other times during your code, other than the main loop, you start running into problems.
Does that make sense?
*To a reasonable extent...as long as you have less than the max # of sprites supported by the gba, and as long as you don't do ridiculous amounts of processing for each one. There are other areas in this post where I grossly generalize something to make my point. The point is to help brave_orakio understand this thing, not to debate every possible place that you might occasionally wait for a vblank.
#140600 - brave_orakio - Tue Sep 18, 2007 1:48 am
Ah, yes thank you that does make sense. I was thinking of the same thing too. Only one vblank in the whole game loop to synchronize.
I was just wondering if others put a vblank in other functions in their loop.
Although I would imagine that would slow things down quite a bit.
But basically you don't add some sort of compensation in cases of plenty of active sprites? Looks like I will need to do some optimizations in my code then.
#140601 - gauauu - Tue Sep 18, 2007 2:04 am
I don't add any sort of compensation, but if you had a truly computationally heavy game, you could lock it at 2 vblanks per game frame, and double the amount of time you have.
But unless you're doing some pretty fancy stuff, or doing things horribly wrong, you won't need it.
#140616 - brave_orakio - Tue Sep 18, 2007 10:06 am
I am about to give up trying to use DMA on OAM. I tried using 16-bit and 32-bit, and different configurations of each. Still nothing. All I see is 0x00000000 in the copied areas. Does anybody know why this is happening?
#140622 - tepples - Tue Sep 18, 2007 12:58 pm
brave_orakio wrote: |
Ah, yes thank you that does make sense. I was thinking of the same thing too. Only one vblank in the whole game loop to synchronize.
I was just wondering if others put a vblank in other functions in their loop. |
A game SHOULD use only one or two vblank waits per "main loop", but there may be more than one "main loop" in a program. For example, a "main loop" for game mode, one for pause mode, one for the title screen, one for the menus, etc.
Quote: |
But basically you don't add some sort of compensation in cases of plenty of active sprites? Looks like I will need to do some optimizations in my code then. |
If there are so many active sprites that computation may overflow the time for one frame (280896 ARM7 cycles on the GBA or 560190 ARM7 cycles on the DS), then you can move the sprites twice as far and wait for 2 vblanks.
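As a rough illustration of that budget on the GBA: the 280896 figure is 228 scanlines of 1232 cycles each. The helper names below are hypothetical, and the cycle cost of a game step is something you would have to measure yourself.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SCANLINES_PER_FRAME = 228,    /* 160 visible + 68 in VBlank */
    CYCLES_PER_SCANLINE = 1232,
    CYCLES_PER_FRAME    = SCANLINES_PER_FRAME * CYCLES_PER_SCANLINE
};

/* How many vblanks one game step has to span, given its cycle cost. */
static int vblanks_needed(uint32_t cycles)
{
    uint32_t n = (cycles + CYCLES_PER_FRAME - 1) / CYCLES_PER_FRAME;
    return n ? (int)n : 1;
}

/* Scale a per-vblank velocity so on-screen speed stays the same when
   the game logic only runs once every `vblanks_per_step` vblanks. */
static int32_t step_velocity(int32_t vel_per_vblank, int vblanks_per_step)
{
    assert(vblanks_per_step >= 1);
    return vel_per_vblank * vblanks_per_step;
}
```

The point is that the compensation depends on how many vblanks a step takes, not on how many sprites happen to be active.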
#140695 - brave_orakio - Wed Sep 19, 2007 4:53 am
Can the DMA copy 128 OAM entries in one vblank? I tried messing with the DMA again, copying all 128 entries instead of just how many are needed, and some of the sprites appeared.
I looked at the memory viewer and found out that every 14th entry had the correct entry. Has anybody experienced this? Is it an alignment problem or something else entirely?
EDIT:
Alright here is more information on the problem.
Code: |
dma3_cpy32(OAM_MEMORY,OAM_BUFFER, numOfSprites*2);
|
If OAM_BUFFER is declared as a global, everything works fine. If it is declared as a local variable, it gets messed up. I would like to avoid declaring it as a global variable because it eats up IWRAM space. Can anybody help?
#140748 - Miked0801 - Wed Sep 19, 2007 6:09 pm
Sounds like a ptr alignment issue. And yes, there is plenty of time to DMA the entire OAM during vblank.
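One way to check that theory is to test the buffer address directly. This is only a sketch: `is_word_aligned` and `local_buffer_is_aligned` are hypothetical helpers, the `aligned` attribute is a GCC/Clang extension, and alignment may well not be the actual cause of the problem described above.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if p can be used as the source or destination of a 32-bit
   transfer, i.e. the address is a multiple of 4. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Declares a local halfword buffer with forced 4-byte alignment and
   reports whether its address really is word aligned. */
static int local_buffer_is_aligned(void)
{
    uint16_t buffer[128 * 4] __attribute__((aligned(4)));
    buffer[0] = 0;
    return is_word_aligned(buffer);
}
```

An `assert(is_word_aligned(OAM_BUFFER))` just before the `dma3_cpy32()` call would show quickly whether the local buffer is the problem.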
#141004 - brave_orakio - Fri Sep 21, 2007 7:21 am
Yes it does sound like an alignment issue. I tried aligning it again(the typedef of the struct is already aligned) but to no avail.
Anyway about the delays/sync question again, is it alright to sync with the dma_vblank function?
Adding a wait_for_vblank function sounds redundant because the dma_vblank already does that and the dma_vblank is the last function to be called anyway before the loop begins again.
What are your opinions on this?
#141395 - Miked0801 - Mon Sep 24, 2007 11:14 pm
Are you averse to writing your own VBlank interrupt handler? The DMA call should sit within VBlank (or an HBlank if you want to be really tricky/silly).
#141418 - brave_orakio - Tue Sep 25, 2007 4:03 am
I haven't gotten into interrupts yet. And I'm a bit wary of using them since my design would require lots of DMA, and I read that DMA can screw up interrupts.