gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback Machine copies.

Coding > 128 sprites limit, again

#22806 - pentagram - Tue Jun 29, 2004 9:22 pm

I've been working on a sprite multiplexing system to get more than 128 sprites per frame. I'm overwriting the OAM 6 times each frame from within the VCount ISR, and it works: the OAM gets updated. Why? Everywhere it is said that OAM (and VRAM and palette) can only be accessed during VBlank or HBlank, so it seems there's something I don't understand. Any thoughts?

#22811 - poslundc - Tue Jun 29, 2004 10:09 pm

Have you tried it on hardware? If it works at all, you will probably get artifacts (tearing) as a result.

Dan.

#22821 - pentagram - Tue Jun 29, 2004 11:06 pm

It works perfectly on hardware. The VCount interrupt is triggered about 5 scanlines above the topmost sprite's y coordinate for each group, although the copy loop takes less than a single scanline (pure asm ldmia and stmia). Even if I were getting artifacts (which is not the case), I would still wonder how I can access the OAM outside the allowed access period...
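
For readers following along, the scheduling described in this post can be sketched in C. This is only a minimal sketch under assumptions, not pentagram's code (which was ARM assembly and was never posted): MuxSprite, mux_plan and LEAD_LINES are hypothetical names, and the 5-line lead is simply the figure mentioned in the post. It computes the VCount line at which each group of up to 128 sprites would be copied into OAM; the copy itself and the interrupt setup are left out, so it can be tried on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define OAM_SLOTS  128  /* hardware OAM entries on the GBA */
#define LEAD_LINES 5    /* assumed safety margin, the figure quoted in the post */

/* Hypothetical sprite record: only the fields the scheduler needs. */
typedef struct {
    int16_t y;       /* top scanline of the sprite */
    int16_t height;  /* height in pixels */
} MuxSprite;

static int cmp_by_y(const void *a, const void *b)
{
    const MuxSprite *sa = a, *sb = b;
    return (sa->y > sb->y) - (sa->y < sb->y);
}

/*
 * Sorts the sprites top to bottom, splits them into groups of up to 128 and
 * writes one VCount trigger line per group into trigger[].
 * Returns the number of groups, or -1 if a group would have to be loaded
 * while a sprite of an earlier group is still on screen (or if there are
 * more groups than max_groups).
 */
int mux_plan(MuxSprite *sprites, size_t count, int *trigger, size_t max_groups)
{
    assert(sprites != NULL && trigger != NULL);
    qsort(sprites, count, sizeof sprites[0], cmp_by_y);

    size_t groups = 0;
    int prev_bottom = 0;  /* first scanline not covered by earlier groups */

    for (size_t first = 0; first < count; first += OAM_SLOTS) {
        if (groups == max_groups)
            return -1;
        size_t last = (first + OAM_SLOTS < count) ? first + OAM_SLOTS : count;

        int line = sprites[first].y - LEAD_LINES;
        if (line < 0)
            line = 0;  /* the first group can simply be loaded during VBlank */
        if (groups > 0 && line < prev_bottom)
            return -1; /* would overwrite OAM entries that are still needed */

        trigger[groups++] = line;

        for (size_t i = first; i < last; i++) {
            int bottom = sprites[i].y + sprites[i].height;
            if (bottom > prev_bottom)
                prev_bottom = bottom;
        }
    }
    return (int)groups;
}
```

On hardware, each trigger line would presumably be written to the V-Count setting in DISPSTAT (bits 8-15) from inside the ISR, so that the next interrupt fires at the next group's line. A return value of -1 corresponds to the case where the sprites cannot be split cleanly by scanline.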

#22822 - Miked0801 - Tue Jun 29, 2004 11:43 pm

Because, unlike on the GBC, as long as the sprites aren't displayed on the lines where you are doing the reload, things should be perfectly fine. The GBA doesn't lock the bus while it is drawing from OAM.

#22837 - abilyk - Wed Jun 30, 2004 2:07 am

pentagram wrote:
Everywhere it is said that OAM (and VRAM and palette) can only be accessed during VBlank or HBlank, so it seems there's something I don't understand. Any thoughts?


Actually, you've kinda got it backwards. OAM cannot be accessed during HBlank unless the H-Blank Interval Free bit (bit 5) in the REG_DISPCNT register (0x4000000) is set to 1. Setting that bit allows HBlank manipulation of OAM but at a price -- the number of OBJ rendering cycles per line is cut from 1210 to 954.

See GBATEK for more details.

Edit: Also, to further clarify, as far as I know the only denied access period for OAM is during HBlank. As mentioned above, that denial can be overridden, so you should be able to access OAM at any time.
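
The trade-off described in this post can be written down as a tiny C helper. This is a sketch only: the macro and function names are made up for illustration, and the cycle counts are the ones quoted in the post (GBATEK gives them as 304*4-6 and 240*4-6). It takes the DISPCNT value as a parameter instead of reading the register, so it runs off-target too.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 5 of DISPCNT (0x4000000): "H-Blank Interval Free". */
#define DISPCNT_HBLANK_INTERVAL_FREE (1u << 5)

/*
 * OBJ rendering cycles available per scanline for a given DISPCNT value,
 * using the figures quoted in the post: 1210 with the bit clear,
 * 954 with the bit set.
 */
int obj_cycles_per_line(uint16_t dispcnt)
{
    assert(304 * 4 - 6 == 1210 && 240 * 4 - 6 == 954);
    return (dispcnt & DISPCNT_HBLANK_INTERVAL_FREE) ? 954 : 1210;
}
```

So setting the bit costs 256 OBJ rendering cycles per line, which matters most for games with many large or rot/scale sprites on the same scanline.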

#22838 - poslundc - Wed Jun 30, 2004 2:24 am

pentagram wrote:
It works perfectly on hardware. The VCount interrupt is triggered about 5 scanlines above the topmost sprite's y coordinate for each group, although the copy loop takes less than a single scanline (pure asm ldmia and stmia). Even if I were getting artifacts (which is not the case), I would still wonder how I can access the OAM outside the allowed access period...


The artifacts would only be on the same scanline. As abilyk says, HBlank is the only "prohibited" region. HDraw is generally considered "prohibited" too, but only because you would get artifacts on the same scanline. As long as you can coordinate your sprites' positions so that your multiplexer stays ahead of the game, you should be fine.

Dan.

#22840 - jd - Wed Jun 30, 2004 3:01 am

Here's the relevant snippet from GBATEK:

Quote:

VRAM, OAM, and Palette RAM Access
These memory regions can be accessed during H-Blank or V-Blank only
(unless display is disabled by Forced Blank bit in DISPCNT register).
There is an additional restriction for OAM memory: Accesses during
H-Blank are allowed only if 'H-Blank Interval Free' in DISPCNT is set (which'd reduce number of display-able OBJs though).
The CPU appears to be able to access VRAM/OAM/Palette at any time, a
waitstate (one clock cycle) being inserted automatically in case that the display controller was accessing memory simultaneously. (Ie. unlike
as in old 8bit gameboy, the data will not get lost.)


It seems to contradict itself though - in the first sentence it says that OAM can only be accessed in H-Blank or V-Blank, but in the penultimate one it says that OAM can be accessed at any time. I assume from this that it is technically ok to access OAM during H-Draw but doing so will be slower than in H-Blank/V-Blank and might cause minor graphical glitches. Can someone confirm whether or not this interpretation is correct?

#22847 - wintermute - Wed Jun 30, 2004 12:56 pm

I've been having a look at a few things with GBA video hardware recently which seem to indicate that the VRAM is dual ported. I suspect that, knowing how cheap Nintendo are, I'm actually seeing writes being delayed to hblank or vblank. I still need to do some speed tests to confirm this though.

It would also seem possible that writing isn't blocked/delayed at all but rather the video hardware loads what it needs for a given scanline and you take your chances with the timing :)

From experiments so far, it's beginning to look as if the restriction of writes to hblank/vblank applies only to DMA and not to the CPU.

#22850 - R0d Longfella - Wed Jun 30, 2004 1:54 pm

Perhaps a bit of a curious question, but why would you need 6 times more sprites? Ain't 128 sprites a lot already?

And besides the number of sprites that can be displayed, have you already thought about where you're going to store the video data of 768 sprites? Isn't there only 32 KB of space?

#22851 - poslundc - Wed Jun 30, 2004 2:13 pm

One of the most common uses for sprite multiplexing is variable-width text systems, such as the one used by the Golden Sun games.

Dan.

#22853 - Miked0801 - Wed Jun 30, 2004 3:22 pm

Which can be (and is, by us) done with BG layers or with intelligent sprite management. But as for where you'd put those sprites: you can multiplex sprite VRAM as easily as OAM data - not as fast, but still useful. Reload the OAM, reload the VRAM, instant new sprite. But truthfully, this technique is better for pre-loaded sprites (like Space Invaders).

#22856 - Lord Graga - Wed Jun 30, 2004 4:09 pm

Golden Sun did its text by writing 2 letters to one 16x8 sprite when the write function was called.

#22857 - poslundc - Wed Jun 30, 2004 4:15 pm

Man, that's crazy... I am having a hard time dreaming up a game-design scenario that would run into the kind of hardware limitations that would be best overcome by muxing both OAM and sprite VRAM.

Personally, I use text BGs for my variable-width font stuff. But then I use a lot of large rot/scale sprites, so sprite rendering cycles are very valuable to me. Although another good reason to use a text BG is twice the VRAM; you can have all four BGs going with more than enough room to spare if you are frugal about it.

Dan.

#22859 - tepples - Wed Jun 30, 2004 4:33 pm

Sure, but proportional-width text sits on a baseline and changes slowly. What would you do when porting a Neo-Geo shooter that uses up to 380 sprites? And what about the Robotron Writ Large that is Very Serious RoboDOOM? I'd do it by muxing OAM to draw the baddies and muxing sprite cel VRAM to draw them closer and closer (no, can't use sprite scaling with that many characters on screen).
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.

#22860 - poslundc - Wed Jun 30, 2004 4:40 pm

It'd be tough to negotiate the logistics and the hardware in something like that. What do you do if the majority of your sprites share a scanline?

This is the problem I have with my RPG: it's not the general situation where characters are scattered all over the place, but rather the exceptional situation where they line up that I will run out of sprite rendering cycles. (Which is why I'm implementing a software scaler to handle about half the load.)

Dan.

#22873 - ampz - Wed Jun 30, 2004 6:20 pm

poslundc wrote:
It'd be tough to negotiate the logistics and the hardware in something like that. What do you do if the majority of your sprites share a scanline?

This is the problem I have with my RPG: it's not the general situation where characters are scattered all over the place, but rather the exceptional situation where they line up that I will run out of sprite rendering cycles. (Which is why I'm implementing a software scaler to handle about half the load.)

Dan.

Easy.
Since it is, as you say, an exceptional situation, a perfectly good solution is to let the sprites flicker when there are more than 128 sprites sharing the same scanline.
Let's say 200 sprites share the same scanline. You draw the first 100 sprites during one frame, and then the other 100 during the next frame.
The sprites will flicker somewhat, but it will be perfectly playable, and with 200 sprites on the same scanline, they will overlap each other anyway.

I have seen several games using this solution.
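
The alternation described in this post could look roughly like this in C. It is a sketch under assumptions, not taken from any particular game: flicker_pick is a hypothetical name, and the window rotation is just one simple way to give every sprite its turn. With 200 sprites and a limit of 100, it draws sprites 0-99 on even frames and 100-199 on odd frames, as in the example from the post.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Chooses which of `count` sprites to draw this frame when at most `limit`
 * can be shown. Writes the chosen sprite indices to out[] (which must hold
 * at least `limit` entries) and returns how many were chosen.
 * The window of drawn sprites advances by `limit` every frame and wraps
 * around, so every sprite is drawn regularly.
 */
size_t flicker_pick(size_t count, size_t limit, unsigned frame, size_t *out)
{
    assert(out != NULL && limit > 0);

    if (count <= limit) {           /* everything fits: no flicker needed */
        for (size_t i = 0; i < count; i++)
            out[i] = i;
        return count;
    }

    size_t start = ((size_t)frame * limit) % count;
    for (size_t i = 0; i < limit; i++)
        out[i] = (start + i) % count;
    return limit;
}
```

When the count is not a multiple of the limit (say 150 sprites with a limit of 100), the window simply wraps, so some sprites are visible on two frames out of three rather than every other frame.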

#22875 - poslundc - Wed Jun 30, 2004 6:23 pm

I thought about doing this but opted not to. I figured it would happen frequently and unpredictably enough to be distracting to the gameplay. :P

Dan.

#22878 - pentagram - Wed Jun 30, 2004 6:56 pm

I have made a couple of tests:

Copy during HBlank, using CPU ldmia/stmia
Copy during HBlank, using immediate DMA
Copy during VCount, using immediate DMA
Copy during VCount, using HBlank DMA (real transfer performed during HBlank)

In all cases it worked, so it seems that both CPU and DMA have full access to OAM at any time. Wintermute, I would like to know about your experiments. Can you confirm my results, or are you having problems with DMA transfers during hblank/hdraw? If this finally gets confirmed, I wonder what the real purpose of bit 5 in DISPCNT is...
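
The difference between the "immediate" and "HBlank" DMA variants in these tests comes down to the start-timing field of the DMA control halfword. The sketch below builds that halfword in C; the bit positions follow the DMAxCNT_H layout in GBATEK, while the macro and function names are hypothetical, and the choice of 32-bit transfers is an assumption (the post does not say which unit size was used).

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the DMA control halfword (DMAxCNT_H), per GBATEK. */
#define DMA_32BIT        (1u << 10)  /* transfer unit: 0 = 16 bit, 1 = 32 bit */
#define DMA_START_NOW    (0u << 12)  /* start timing: immediately */
#define DMA_START_HBLANK (2u << 12)  /* start timing: at the next H-Blank */
#define DMA_ENABLE       (1u << 15)

#define OAM_BYTES 1024u              /* 128 entries x 8 bytes */

typedef enum { COPY_IMMEDIATE, COPY_AT_HBLANK } OamCopyTiming;

/* Control value for copying a full OAM shadow buffer with 32-bit units. */
uint16_t oam_dma_control(OamCopyTiming timing)
{
    assert(timing == COPY_IMMEDIATE || timing == COPY_AT_HBLANK);
    uint16_t start = (timing == COPY_AT_HBLANK) ? DMA_START_HBLANK
                                                : DMA_START_NOW;
    return (uint16_t)(DMA_ENABLE | DMA_32BIT | start);
}

/* Number of 32-bit units needed to cover all of OAM. */
uint16_t oam_dma_unit_count(void)
{
    return (uint16_t)(OAM_BYTES / 4u);
}
```

With the repeat bit left clear, as here, an HBlank-timed transfer fires once at the next H-Blank and then disables itself, which matches the "real transfer performed during HBlank" case in the list.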

#22886 - Miked0801 - Wed Jun 30, 2004 8:22 pm

Perhaps a legacy of the GBC, where it mattered?

#22902 - tepples - Wed Jun 30, 2004 10:32 pm

poslundc wrote:
I thought about doing this but opted not to. I figured it would happen frequently and unpredictably enough to be distracting to the gameplay.

Go play Teenage Mutant Ninja Turtles II or III for NES, which had a 64 pixel per scanline limit. Then tell me whether flicker, while noticeable, is actually distracting in practice.

#22906 - poslundc - Wed Jun 30, 2004 10:42 pm

I remember TMNT2 quite well, and yes, I believe it can work. But TMNT2 was a fairly straightforward action game, versus my game, which is an action-based RPG, in which the user may be required to pay more attention to the "specifics" of the sprites - because of the wider variety of actions - than they are in a game like TMNT.

It is also an aesthetic choice. One of the reasons I've developed a raycasting Mode 7 engine is so that I can use the camera's movement to create a more dramatic effect. I think that flickering sprites would detract from this a lot more than they would a simple sidescroller.

Besides which, flickering was always my failsafe option, but if I can scale in software instead, I may as well, no?

Dan.

#22907 - ampz - Wed Jun 30, 2004 10:43 pm

tepples wrote:
poslundc wrote:
I thought about doing this but opted not to. I figured it would happen frequently and unpredictably enough to be distracting to the gameplay.

Go play Teenage Mutant Ninja Turtles II or III for NES, which had a 64 pixel per scanline limit. Then tell me whether flicker, while noticeable, is actually distracting in practice.

Or play "Life Force" for NES. Same thing there. Most notable example: the arms of the boss on the first level.
Yes, it flickers, but it only happens in graphically messy situations and it is not very distracting.

And really, there is no perfect solution to the problem.
You can use software to eliminate sprites that are entirely overlapped by other sprites. This would be computationally intensive, and it still would not solve the problem entirely.
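
A rough idea of what such a culling pass might look like, as a C sketch. Loud assumptions: sprites are treated as fully opaque rectangles (real sprites have transparent pixels, so this over-culls unless the art fills its box), the array is ordered front to back, and only coverage by a single sprite is detected. SpriteRect and cull_covered are hypothetical names. The nested loop also shows why the post calls this computationally intensive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounding box of a sprite, in screen pixels. */
typedef struct { int x, y, w, h; } SpriteRect;

/* True if `outer` completely contains `inner`. */
static bool rect_covers(const SpriteRect *outer, const SpriteRect *inner)
{
    return outer->x <= inner->x &&
           outer->y <= inner->y &&
           outer->x + outer->w >= inner->x + inner->w &&
           outer->y + outer->h >= inner->y + inner->h;
}

/*
 * s[] is ordered front to back (index 0 is drawn on top).
 * Sets hidden[i] to true when some visible sprite in front of sprite i
 * covers it entirely, and returns the number of hidden sprites.
 * Cost grows with the square of the sprite count.
 */
size_t cull_covered(const SpriteRect *s, size_t n, bool *hidden)
{
    assert(s != NULL && hidden != NULL);
    size_t culled = 0;

    for (size_t i = 0; i < n; i++) {
        hidden[i] = false;
        for (size_t j = 0; j < i; j++) {
            if (!hidden[j] && rect_covers(&s[j], &s[i])) {
                hidden[i] = true;
                culled++;
                break;
            }
        }
    }
    return culled;
}
```

As the post says, this does not solve the problem entirely: sprites that merely overlap, or that are hidden only by a combination of several others, still count against the per-line limit.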

#22913 - pentagram - Thu Jul 01, 2004 12:51 am

Mike, you mean that bit 5 is there only for backward compatibility? If so perhaps this is not the only case, and there are more 'hardware features' that are only used when plugging GB/GBC carts. Also I bet that VRAM and palette memory accesses are not as restrictive as described in gbatek and other docs out there.

#22918 - tepples - Thu Jul 01, 2004 1:28 am

I have demonstrated, with a flash cart, that a program can read or write VRAM or palette memory at any time. One consequence of this is that I often use writes to PALRAM[0] (the backdrop color) for graphical profiling of a function.