#161087 - brave_orakio - Mon Jul 28, 2008 3:31 am
I have a very strange problem. I have a sprite sorting function that works when I run it from ROM, compiled to either Thumb or ARM code (naturally ARM code is slower from ROM). But when I place the function in IWRAM, it breaks completely.
Now here's the strange part. If I remove the loop where the OAM buffer is sorted before being copied to OAM memory, it works. Of course the sprites aren't sorted because I removed that part. Here is the function that has problems:
Code: |
OBJ_ATTR update[128]; // Global array
OAM_SORT oam_buffer[128]; // Also global. Basically an array of OBJ_ATTR with an extra u32 variable to indicate sorted position
int i, i2;
i = index;
while(i)
{
    i--;
    i2 = i;
    while(i2)
    {
        i2--;
        if((oam_buffer[i].OAM.attr0&255) < (oam_buffer[i2].OAM.attr0&255))
            oam_buffer[i].position++;
        else
            oam_buffer[i2].position++;
    }
    update[oam_buffer[i].position] = oam_buffer[i].OAM;
}
// dma3_cpy16_Vblank(OAM_MEMORY,update ,index*4);
dma3_cpy32_vblank(OAM_MEMORY,update ,index*2);
index = 0;
|
The while loop is the problem there. If I remove it, the dma copy works fine. Can you guys give me a suggestion for the source of this error?
_________________
help me
#161090 - brave_orakio - Mon Jul 28, 2008 3:52 am
Also some additional info
Code: |
OBJ_ATTR objInit;
objInit.attr0 = 0xDEAD;
objInit.attr1 = 0xDEAD;
objInit.attr2 = 0xDEAD;
objInit.fill = 0xDEAD;
index = MAX_SPRITES;
while(index)
{
    index--;
    update[index] = objInit;
}
dma3_cpy32_vblank(OAM_MEMORY,update ,MAX_SPRITES*2);
|
The code above is a modified version of my sprite_init function. When I check the update array in memory, it doesn't have the 0xDEAD value. Also, the update array is not successfully copied into OAM (at least I don't think so, but I have no way to find out because the update array doesn't even have the 0xDEAD value).
_________________
help me
#161091 - eKid - Mon Jul 28, 2008 4:17 am
What do you mean by "break completely"?
#161092 - brave_orakio - Mon Jul 28, 2008 4:24 am
None of the sprites showed up at all.
_________________
help me
#161093 - Miked0801 - Mon Jul 28, 2008 5:09 am
I am going to guess that you are going out of bounds on your update or oam_buffer arrays (leaning towards update from glancing at code). If the code lives in ROM, you are wiping out something else and you may not notice and the code could continue to work ok. But, as the code now lives adjacent to the buffer (more than likely), an array overwrite will now wipe-out code causing your bug.
I am suspicious of pre-decrementing your array in your bubble? sort routine.
FYI, if any programmer on my team had vars named index, i, and i2 in the same function, they would be in trouble.
#161095 - DiscoStew - Mon Jul 28, 2008 5:16 am
brave_orakio wrote: |
Of course the sprites aren't sorted because I removed that part. |
What do you mean by this? Are you saying that if you remove the while loop, the DMA works and you see the unsorted sprites, or do you mean that, in general, removing it leaves you with unsorted sprites? At first glance, it looked like you said the sprites show but are unsorted, but now that I look at it again, I'm thinking otherwise.
_________________
DS - It's all about DiscoStew
#161096 - brave_orakio - Mon Jul 28, 2008 5:39 am
To Miked0801:
Checked the IWRAM memory and as far as I can tell, the code goes first then the array next. Although I will look into that. It might be the problem.
To DiscoStew:
Sorry if I didn't say that clearly. With the loop, nothing shows up at all. If I remove it, the sprites show up but they are unsorted, as the loop does the sorting.
_________________
help me
#161098 - DensitY - Mon Jul 28, 2008 6:54 am
Looks like you're doing a bubblesort there. Although there are faster sorting algos, let's keep things simple.
Below is a _100% untested_ bubblesort implementation without the oddness.
Code: |
const int MAX_SPRITES = 128;
int i,k;
OBJ_ATTR temp;
OBJ_ATTR update[128];
for(i = 0; i < MAX_SPRITES - 1; i++)
{
    for(k = 0; k < MAX_SPRITES - 1 - i; k++)
    {
        // Parentheses are required: '<' binds tighter than '&'
        if((update[k+1].attr0 & 255) < (update[k].attr0 & 255))
        {
            temp = update[k];
            update[k] = update[k+1];
            update[k+1] = temp;
        }
    }
}
|
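A note on comparing masked values in a sort like the one above: in C, `<` binds more tightly than `&`, so each masked operand needs its own parentheses. The sketch below is only an illustration; both function names are made up for this example, and the second function spells out how the compiler groups the unparenthesized expression `a & 255 < b & 255`.

```c
/* Correct form: compares the low 8 bits of a against the low 8 bits of b. */
int low_byte_less(unsigned a, unsigned b)
{
    return (a & 255u) < (b & 255u);
}

/* What "a & 255 < b & 255" actually means once precedence is applied:
 * the comparison (255 < b) happens first, then both ANDs. */
int low_byte_less_unparenthesized(unsigned a, unsigned b)
{
    return a & (255u < b) & 255u;
}
```

With `a = 0x104` and `b = 0x10A` the low bytes are 4 and 10, so the first function reports "less" while the second does not.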
#161099 - brave_orakio - Mon Jul 28, 2008 7:17 am
Hi guys, an update on the problem.
I tried Miked0801's suggestion and made the loop go upwards instead of down. Still the same behavior. Tried putting the global arrays into EWRAM, and still the same behavior. I did confirm that the arrays went into EWRAM with the 0xDEAD test.
Anyway I'll try your suggestion DensitY
_________________
help me
#161100 - DiscoStew - Mon Jul 28, 2008 8:30 am
Just a few more questions.
In the first section, I assume that "index" is the number of sprites that will be loaded into OAM. Have you made sure that the value of this is correct? Having some odd number can result in going out-of-bounds with your array, bringing in values not meant for it, or overwriting values assigned to other aspects of the data/code. Also consider the limitation of only 128 entries, in case you happen to exceed it.
Is the "position" value of the oam_buffer being reset every time prior to getting to the sort portion? By not resetting the positions to 0, they'll continue to increment into out-of-bounds areas. Same reading/writing of data/code explained above.
Is there a reason for that last bit in the first section, with "index = 0", and is it altered when you go back through to the sort function?
Other than that, and not having a much larger portion of your project to examine the problem, I would say to revert all your stuff back to its working form in ROM, and slowly move it to IWRAM, testing here and there with each compile to see when the problem begins.
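The checks suggested above (a valid sprite count, and "position" reset before every sort) can be sketched as a small host-testable function. This is only a sketch under assumptions: the `OBJ_ATTR` and `OAM_SORT` layouts are guessed from the first post, and `sort_sprites` is a hypothetical name, not the original poster's function. It keeps the same comparison as the posted loop, so sprites end up ordered by descending Y (low 8 bits of attr0).

```c
#include <stdint.h>

#define MAX_SPRITES 128

/* Hypothetical stand-ins for the thread's types; the real layouts may differ. */
typedef struct { uint16_t attr0, attr1, attr2, fill; } OBJ_ATTR;
typedef struct { OBJ_ATTR OAM; uint32_t position; } OAM_SORT;

/* Rank-sorts the first `count` entries of `buf` into `out`.
 * Returns 0 on success, -1 if `count` or a computed rank is out of range. */
int sort_sprites(OAM_SORT *buf, OBJ_ATTR *out, int count)
{
    int i, j;

    if (count < 0 || count > MAX_SPRITES)
        return -1;

    /* Reset every rank so stale values from an earlier frame cannot pile up. */
    for (i = 0; i < count; i++)
        buf[i].position = 0;

    /* Each pair is compared once; exactly one of the two gains a rank. */
    for (i = 1; i < count; i++) {
        for (j = 0; j < i; j++) {
            if ((buf[i].OAM.attr0 & 255) < (buf[j].OAM.attr0 & 255))
                buf[i].position++;
            else
                buf[j].position++;
        }
    }

    /* Scatter into the output, refusing any rank that would write out of bounds. */
    for (i = 0; i < count; i++) {
        if (buf[i].position >= (uint32_t)count)
            return -1;
        out[buf[i].position] = buf[i].OAM;
    }
    return 0;
}
```

Because the ranks are rebuilt from zero on every call, the function gives the same result no matter what "position" held beforehand, which makes it easy to rule that variable out while hunting the real bug.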
EDIT:
Another thought. You say that if you remove the sort loop, you see the sprites, but they are unsorted, but when you put back in the sort, they are gone. Now, I could be wrong on this, but for this case, wouldn't that mean that in some other part of your code, you are loading your sprite data to the "update" array rather than the "oam_buffer" array? What I mean is that with your sort, your data in "oam_buffer" gets sent to the proper places in "update", but should you not have that happen, "update" doesn't get updated, so how could you have your sprites visible unless you've got your data already in "update" in the first place by some other means?
Understand where I'm coming from? Perhaps, you are placing the info into "update" instead of "oam_buffer", leaving "oam_buffer" empty, and in the sort, you are actually placing in *empty* data into "update", therefore, you get nothing. It's a guess, but hey, it got me thinking.
_________________
DS - It's all about DiscoStew
#161101 - brave_orakio - Mon Jul 28, 2008 9:17 am
To DiscoStew:
Yes sir, you are correct about "index". It is basically the number of sprites to be loaded. Every time I put an OAM entry into the buffer, "index" increases. After the sorting loop and DMA copy to OAM memory is finished, I reset "index" back to 0 and the process begins again.
As for the "position", I checked again and yes it does reset to 0 when I place it into the buffer.
The thing is, I removed all other existing functions and left only the sprites and sorting. When I compile as ROM, everything is ok. When I compile to IWRAM, again nothing appears. This is a very strange problem.
_________________
help me
#161104 - silent_code - Mon Jul 28, 2008 10:17 am
This is just a vague guess:
There is 256kB of work RAM (not counting on-chip RAM).
This RAM includes the user stack.
Now, depending on how much data and code you store, adding the loop may cause the code to exceed the limits of the work RAM. Depending on the implementation of the memory mapping (I don't know if it works that way), memory addresses may wrap (like VRAM block addresses) beyond the limits of the RAM, thus overwriting other content at the beginning. My guess is also that the stack is located at the end and grows towards lower addresses, so it may be overwritten.
This is just a theory and I don't know about the technical detail. You would have to check for it yourself.
Good luck!
_________________
July 5th 08: "Volumetric Shadow Demo" 1.6.0 (final) source released
June 5th 08: "Zombie NDS" WIP released!
It's all on my page, just click WWW below.
#161137 - brave_orakio - Tue Jul 29, 2008 2:42 am
To silent_code:
The code isn't too long so that may not be the problem. Also I tried doing the loop from 0 going upwards and still the same behavior.
Another symptom of the problem is that the "update" buffer isn't getting the data from the "oam_buffer" buffer. This part of the code
Code: |
update[oam_buffer[i].position] = oam_buffer[i].OAM;
|
isn't changing what's inside the "update" buffer.
_________________
help me
#161144 - brave_orakio - Tue Jul 29, 2008 5:43 am
I finally fixed it! Although the fix does raise some questions. In my previous post I said that this line wasn't changing the "update" buffer:
Code: |
update[oam_buffer[i].position] = oam_buffer[i].OAM;
|
So I changed it to
Code: |
dma3_cpy32(&update[oam_buffer[i].position], &oam_buffer[i].OAM, 2);
|
Technically it's the same, but is this slower or faster than the previous method? Isn't it a bit slow to copy only 2 words using DMA (because of the start-up time), or is it actually faster?
And thanks to all who posted. I appreciate the effort from you guys! On a side note though, I didn't notice any performance difference between running the code from ROM and from IWRAM. Maybe my sorter is really inefficient?
_________________
help me
#161163 - Miked0801 - Tue Jul 29, 2008 4:51 pm
The sorter is very inefficient - but it depends on how many entries you are sorting. Less than 15 or so and you'll probably not notice much difference switching to a more efficient sort.
The DMA call is VERY in-efficient in comparison to the assignment. One is a single half-word assign. One is a crapload of setup code to do next to nothing. I am very suspicious that your change makes any difference at all. Perhaps you had a run-away DMA elsewhere and your using DMA here is resetting it? Or perhaps just the added code size for the DMA is re-aligning your code such that an array overwrite elsewhere isn't affecting you anymore. When debugging, did you step through the code in assembler to make sure the opcodes weren't getting whacked? Also, did you specifically watch the upload occur?
Your solution raises too many questions. You need to investigate it further.
#161168 - Dwedit - Tue Jul 29, 2008 5:28 pm
"in-efficient" and "inefficient" parse oppositely when I read them, so try to spell works correctly.
_________________
"We are merely sprites that dance at the beck and call of our button pressing overlord."
#161173 - Miked0801 - Tue Jul 29, 2008 6:21 pm
I'll make sure to work on my works :)
#161180 - silent_code - Tue Jul 29, 2008 8:40 pm
Well, if that function is the only thing you have in work RAM, then it should not be a problem, but if there are other things (like data fields! [arrays]) and routines there, too, then an overwrite seems totally possible.
_________________
July 5th 08: "Volumetric Shadow Demo" 1.6.0 (final) source released
June 5th 08: "Zombie NDS" WIP released!
It's all on my page, just click WWW below.
#161187 - brave_orakio - Wed Jul 30, 2008 2:03 am
To Miked0801:
I'm only sorting 7 objects, so I guess either something is wrong with my sorting function or something else is slowing down the whole thing! Unfortunately I have no idea how to debug using Programmer's Notepad 2 (is it even possible?).
Although an array overwrite is possible, I feel it is highly unlikely because I usually set my buffers and such to a fixed size of 128, I changed my loops to start from 0 going up, and I limit the loops to the number of objects, which is currently 7.
To silent_code:
I think the only things residing in IWRAM currently are the sorting function that I mentioned and the 2 buffers I use for sorting, so an overwrite is possible but highly unlikely, as I explained above.
One of my suspects for slowing down the program though is the collision function. Admittedly it is slow, but I was hoping that I would see a significant increase in efficiency when I moved the sorting to IWRAM, and then just optimize later on.
Another suspect is this piece of code
Code: |
while(REG_VCOUNT<=160){}
|
I call this before the sorting function to synchronize by frame (VSYNC). Is this piece of code necessary? Do you guys actually use something like this? Because if I remove this, the program runs significantly faster, though I have no idea if it is in sync because the only call that waits for VBLANK is my DMA-at-vblank call.
edit: I messed around with the code again and made another fix. I changed the DMA copy back to
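For reference, a single loop like `while(REG_VCOUNT<=160){}` returns immediately when it is entered during VBlank, so more than one pass of the main loop can run within the same frame. A common busy-wait pattern uses two loops instead: first wait for any current VBlank to end, then wait for the next one to start. The sketch below is hedged accordingly. To keep it runnable off-hardware, the register read is replaced by a hypothetical function pointer (`vcount_reader`), and `sim_read_vcount` is a made-up simulator that advances one scanline per read over the GBA's 228 lines per frame.

```c
#include <stdint.h>

/* Hypothetical hook: on hardware this would read the volatile VCOUNT register
 * (0x04000006 on the GBA). A function pointer lets the logic run anywhere. */
typedef uint16_t (*vcount_reader)(void);

/* Blocks until the start of the next VBlank (scanline 160). */
void wait_for_vblank_start(vcount_reader read_vcount)
{
    while (read_vcount() >= 160) { }  /* already in VBlank: wait for it to end */
    while (read_vcount() < 160) { }   /* then wait for the next one to begin */
}

/* Simulated scanline counter for off-hardware testing only. */
uint16_t sim_line = 0;

uint16_t sim_read_vcount(void)
{
    uint16_t line = sim_line;
    sim_line = (uint16_t)((sim_line + 1) % 228);  /* 160 visible + 68 blank lines */
    return line;
}
```

Called once per iteration of the main loop, this limits the loop to one pass per frame, at the cost of spinning the CPU for the whole wait.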
Code: |
update[oam_buffer[i].position] = oam_buffer[i].OAM;
|
But I also declared both buffers to be volatile. Now it works again (still at the same speed though). Apparently the compiler got smart on me. However, the above questions still apply.
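The thread never pins down why `volatile` helped. One plausible explanation, which is an assumption here and not something confirmed above, is that the DMA routine only touches hardware registers, so the compiler has no way of knowing that the buffer is read by the transfer and may delay or reorder the stores into it. Under that assumption, an alternative to marking the whole buffer `volatile` is a compiler barrier just before the transfer starts. The sketch below uses GCC/Clang inline-assembly syntax; `fake_dma_copy`, `fake_oam`, and `stage_and_upload` are hypothetical stand-ins so the example runs off-hardware.

```c
#include <stdint.h>
#include <string.h>

/* GCC/Clang-specific: the compiler may not move memory accesses across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

typedef struct { uint16_t attr0, attr1, attr2, fill; } OBJ_ATTR;  /* assumed layout */

OBJ_ATTR update[128];    /* staging buffer, deliberately not volatile */
OBJ_ATTR fake_oam[128];  /* stand-in for OAM so the sketch runs anywhere */

/* Stand-in for a DMA transfer. On hardware this would be a few register
 * writes, and the copy itself would be invisible to the compiler. */
void fake_dma_copy(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
}

/* Fills the staging buffer, then uploads it. Returns -1 on a bad count. */
int stage_and_upload(const OBJ_ATTR *sprites, int count)
{
    int i;

    if (count < 0 || count > 128)
        return -1;

    for (i = 0; i < count; i++)
        update[i] = sprites[i];

    COMPILER_BARRIER();  /* stores to update[] are emitted before the transfer */
    fake_dma_copy(fake_oam, update, (size_t)count * sizeof(OBJ_ATTR));
    return 0;
}
```

The barrier constrains only the compiler's ordering at that one point, so the sorting loop itself can still be optimized freely, which a fully `volatile` buffer would prevent.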
_________________
help me
#161613 - brave_orakio - Thu Aug 07, 2008 8:23 am
Hi guys, a follow up to my previous question.
I read up on interrupts, and since I was able to turn the OAM sorter into ARM code and place it into IWRAM successfully, I then worked on putting this function into the VBLANK interrupt.
Now the problem is that it flickers, and sometimes there is this weird shadow effect (like a trailing image of the sprite) from time to time when the sprites move. The code for the sorter is of course the same as the one that I posted earlier.
Another question: now that I have the OAM sorter in the VBLANK interrupt, do I still need this:
Code: |
while(REG_VCOUNT<160){}
|
in my main loop?
_________________
help me
#161619 - Kyoufu Kawa - Thu Aug 07, 2008 8:05 pm
Nobody would need that! Any slightly sane compiler would optimize it right out of there! I'm not the right person to try and tell you how you should wait for VBlank though, no matter where.
#161631 - brave_orakio - Fri Aug 08, 2008 10:07 am
OK, so can anybody tell me the proper way to vsync?
I'd like to hear what other guys do to vsync, and especially what commercial games do.
_________________
help me
#161639 - Kyoufu Kawa - Fri Aug 08, 2008 7:16 pm
Okay, I lied -- I for one use the VBlankIntrWait BIOS call.