#11149 - poslundc - Sat Sep 27, 2003 2:40 pm
More adventures with my raycasting/Mode7 engine...
Looks great so far. However when I tried it on hardware I noticed a few anomalous pixels on the occasional scanline, and the problem got worse as I increased the viewing angle (tilted the camera up).
I went back and checked my math carefully, and it turns out I've got a tiny imprecision in my trig calculations. Basically to determine the length of the ray I divide by cosine of the ray's angle (multiply by a secant LUT) and then multiply by cosine of the angle to the field of view to correct for fisheye. Secant and cosine obviously complement each other for the same angles; for example, when looking straight down the ray length should divide out to one. It doesn't, though; it divides out to 0.9999... (0x0000FFFF in my fixed-point math.)
It's literally the difference of a single bit in the end-result, but it shows up on screen.
I'm already using 8.24 LUTs and 64-bit math, so I'm not sure how I can improve on this.
Currently I'm thinking of calculating all of my ray lengths, then doing a REALLY basic FIR filter on them ((left adjacent + 2 * current + right adjacent) >> 2) before going and calculating the rest of my values, but I am hesitant to add that many calculations to an already somewhat-complicated routine.
Any advice on how to deal with this problem would be greatly appreciated.
Thanks,
Dan.
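Editor's sketch: the filter Dan describes is a 3-tap [1 2 1] / 4 kernel run over the per-scanline ray lengths. Below is a minimal version under stated assumptions: the name `smooth_121` and the edge handling (repeating the edge sample) are illustrative choices, not something taken from his engine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smooth n samples with the 3-tap kernel [1 2 1] / 4.
   At the two ends the missing neighbour is replaced by the edge sample
   (an assumption; the post does not say how the ends are handled).
   `in` and `out` must be different arrays. */
void smooth_121(const int32_t *in, int32_t *out, size_t n)
{
    assert(in != out);
    for (size_t i = 0; i < n; i++) {
        int32_t left  = in[i > 0 ? i - 1 : 0];
        int32_t right = in[i + 1 < n ? i + 1 : n - 1];
        /* 64-bit sum so large distances cannot overflow before the shift */
        out[i] = (int32_t)(((int64_t)left + 2 * (int64_t)in[i] + right) >> 2);
    }
}
```

For 160 scanlines this costs two adds and two shifts per line, but it also blurs genuine changes in distance, so it hides the symptom rather than removing the cause.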
#11152 - regularkid - Sat Sep 27, 2003 5:50 pm
I'm currently writing a raycaster as well, although mine is a Wolfenstein-type engine. Anyway, I ran into the same sort of precision problem that you are having. The way I fixed it was to round my numbers at the end, so if I got the number 0.9999 like you are getting, it would round to 1 (some of my previous posts are about rounding, if you want to take a look). Basically, I just add 2048 (0.5 in my fixed-point format, which has 12 fractional bits) to my final number before shifting back down to the integer. This works for both positive and negative numbers. Hope that helps!
_________________
- RegularKid
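Editor's sketch: the effect discussed above can be reproduced on a PC without any trig tables. Assume 8.24 fixed point, a cosine of exactly 0.75, and a secant entry of 4/3 that was truncated when the table was built. The function names are hypothetical; the point is only to compare a truncating multiply with one that adds half an LSB before shifting.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 24                        /* 8.24 fixed point */
#define FX_ONE    ((int32_t)1 << FRAC_BITS) /* 1.0 */

/* Multiply two 8.24 values and truncate the extra fractional bits. */
int32_t fx_mul_trunc(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> FRAC_BITS);
}

/* Same multiply, but add 0.5 LSB first so the result rounds to nearest. */
int32_t fx_mul_round(int32_t a, int32_t b)
{
    int64_t half = (int64_t)1 << (FRAC_BITS - 1);
    return (int32_t)(((int64_t)a * b + half) >> FRAC_BITS);
}

/* sec * cos should be exactly 1.0, but the truncated table entry
   leaves the product one LSB short unless the multiply rounds. */
void sec_cos_demo(void)
{
    int32_t cos_fx = 3 * (FX_ONE / 4); /* 0.75 exactly, 0x00C00000 */
    int32_t sec_fx = 22369621;         /* floor(4/3 * 2^24) */
    assert(fx_mul_trunc(sec_fx, cos_fx) == FX_ONE - 1); /* 0x00FFFFFF */
    assert(fx_mul_round(sec_fx, cos_fx) == FX_ONE);     /* 0x01000000 */
}
```

Rounding only repairs errors smaller than half an LSB of the final format, which matches the later report in this thread that it cleans up the simple cases but not everything.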
#11153 - tepples - Sat Sep 27, 2003 5:54 pm
Here's what I did for TOD: Define a far clipping plane distance, and if a ray's distance is greater than that, just don't draw anything on the corresponding scanline.
_________________
-- Where is he?
-- Who?
-- You know, the human.
-- I think he moved to Tilwick.
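Editor's sketch: a far clipping plane of the kind tepples describes could look roughly like this. The names `apply_far_clip` and `visible`, and the idea of storing a per-scanline flag, are assumptions for illustration; his TOD code is not shown in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define SCANLINES 160

/* Mark each scanline as drawn (1) or skipped (0) depending on whether its
   ray length is within far_clip. dist[] and far_clip share the same
   fixed-point units. Returns the number of scanlines that will be drawn. */
int apply_far_clip(const int32_t *dist, int32_t far_clip, uint8_t *visible)
{
    int drawn = 0;
    assert(far_clip >= 0);
    for (int i = 0; i < SCANLINES; i++) {
        visible[i] = (uint8_t)(dist[i] <= far_clip);
        drawn += visible[i];
    }
    return drawn;
}
```

The skipped lines are where roundoff is worst, since a one-bit error in a long ray moves the sampled texel the furthest.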
#11163 - poslundc - Sat Sep 27, 2003 8:08 pm
regularkid: Hm... rounding certainly seems to help the more basic cases... for example, it certainly cleans up the no-perspective view, and it's a helluva lot cheaper to implement than the FIR filter, so thanks for that. The problem is still persisting, though; I'm not entirely certain what's causing it anymore.
tepples: although the problem is naturally worse the further away from the camera the rays are being cast (and to that end I fully intend to clip the plane and have a horizon), I'm still encountering the problem within the intended display area.
I'm a visual person, so I've put a picture on my website with a sample background for anyone who feels like helping out:
[Image: linked screenshot, not preserved]
Please note that while the problem does not seem very severe in the emulator, it is much more noticeable on hardware.
Thanks both... any more suggestions still welcome!
Dan.
#11167 - DekuTree64 - Sat Sep 27, 2003 10:32 pm
That looks exactly like what my texture mapped tri-filler does. Check out the screenshot of my demo on the demo poll: look at the edge of the circling fire thing near the center of the screen, and you can see some pixels that do that every-other-line-off thing. I never did find out why, though. It's definitely related to affine mapping, because there was another guy writing a texture mapper a while back with the same problem too.
Higher accuracy doesn't seem to make any difference though, so I think the problem lies somewhere else. Good luck finding it though, I'm still pretty curious as to why it happens too.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku
#11169 - tepples - Sat Sep 27, 2003 10:39 pm
It looks like you're suffering from two problems: you may be losing a few bits of precision somewhere, and you may be throwing all your imprecision on one side of the screen.
Without a look at your code, I can't help you much with the first problem.
Here's how I solved the second problem in one of my software rot/scale engines: GBA rot/scale takes left side and offset vector coordinates. It may be better to 1. calculate the map coordinates at the center of the screen and then 2. subtract 120 times the offset, and use that as the origin of the scanline. This way, instead of having all the roundoff errors accumulate at the right side of the screen, the errors will be evenly spread on both sides, where they're less noticeable. I'm pretty sure Super Mario Kart for Super NES does its floor raycasting in this way, as straight lines are more likely to become randomly jagged (in the manner that's a telltale sign of roundoff error) on the sides than in the center. Then make sure you actually round the offset instead of truncating it.
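Editor's sketch: the two steps tepples lists (anchor the scanline at the screen center, and round the per-pixel offset) can be written as below. The function names are hypothetical, and a single coordinate is shown for brevity; the same applies to x and y.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_W 240

/* Drop `shift` fractional bits from v, rounding to nearest
   instead of truncating. */
int32_t round_shift(int64_t v, int shift)
{
    assert(shift > 0 && shift < 63);
    return (int32_t)((v + ((int64_t)1 << (shift - 1))) >> shift);
}

/* Map coordinate to hand to the hardware for the left edge, chosen so
   that column 120 lands exactly on `center`. */
int32_t scanline_origin(int32_t center, int32_t step)
{
    return center - (SCREEN_W / 2) * step;
}

/* Map coordinate the hardware ends up sampling at screen column x. */
int32_t sample_at(int32_t origin, int32_t step, int x)
{
    return origin + x * step;
}
```

If the step is off by one LSB, anchoring at the left edge puts an error of 239 LSBs on the rightmost column, while anchoring at the center gives about 120 LSBs at each edge and none in the middle.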
#11171 - poslundc - Sun Sep 28, 2003 12:07 am
tepples wrote: |
It looks like you're suffering from two problems: you may be losing a few bits of precision somewhere, and you may be throwing all your imprecision on one side of the screen.
|
Ooh, I think that might just be the screencap I did. The problem is pretty much uniform wherever on the map/screen I go to; that was just a spot that showed up well.
tepples wrote: |
Without a look at your code, I can't help you much with the first problem.
|
Sorry, I didn't want to waste space by reposting too much that I had already posted a day or so ago (although I guess the code has changed a fair bit since then). Anyway, ask and ye shall receive:
Code: |
slice = gCamera.fov / 160;
phiCos = gCosLUT[gCamera.phi >> 7];
phiSin = gCosLUT[((gCamera.phi >> 7) + 128) & 511];
// calculate the length of the ray (ie. zoom factor) for each scanline
for (i = 0; i < 160; i++)
{
    angle = gCamera.theta + (gCamera.fov >> 1) - (i * slice);
    if (angle < 0)
        angle += (256 << 8);
    // mulTemp is of type (long long)
    mulTemp = ((long long)gCamera.z * gSecLUT[angle >> 7]);
    angle -= gCamera.theta;
    if (angle < 0)
        angle += (256 << 8);
    // mulTemp is now 32.32
    mulTemp = ((mulTemp >> 8) * (gCosLUT[angle >> 7]));
    // mulTemp is 16.48
    dist = mulTemp >> 40;
    // dist is 24.8, units are pixels
    // round to nearest 7 bits
    if (dist & (1 << 6))
        dist += 1 << 6;
    // we now have distance in PIXELS. we need it in SCALE FACTORS
    za[i] = dist >> 7; // divide by 128 (pixel distance to viewport)
}
...
// use the zoom factor just calculated to determine the rot/scale parameters for the background
for (i = 0; i < 160; i++)
{
    zoom = za[i];
    cX = gCamera.x - ((120 * (long long)zoom * phiCos) >> 24) - ((80 * (long long)zoom * phiSin) >> 24);
    cY = gCamera.y - ((80 * (long long)zoom * phiCos) >> 24) + ((120 * (long long)zoom * phiSin) >> 24);
    bgTransform[i].pa = ((long long)zoom * phiCos) >> 24;
    bgTransform[i].pb = ((long long)zoom * phiSin) >> 24;
    bgTransform[i].pc = ((long long)zoom * -phiSin) >> 24;
    bgTransform[i].pd = ((long long)zoom * phiCos) >> 24;
    bgTransform[i].x = cX + ((((i * (long long)zoom)) * phiSin) >> 24);
    bgTransform[i].y = cY + ((((i * (long long)zoom)) * phiCos) >> 24);
}
|
I should probably clarify that the LUTs are 8.24 and based on a 512-degree circle.
Anyway, it works fine except for the imperfections in the rendering. Please let me know if you can identify anything that needs fixing.
DekuTree64 wrote: |
Higher accuracy doesn't seem to make any difference though, so I think the problem lies somewhere else. Good luck finding it though, I'm still pretty curious as to why it happens too.
|
I agree that accuracy doesn't seem like the problem... insofar as while it may be the ACTUAL theoretical source of the problem, there are diminishing returns for increasing the precision of the variables such that you can't really fix the problem at the source.
Anyway, I will keep you all posted if I make an unexpected breakthrough...
Dan.
#11207 - poslundc - Mon Sep 29, 2003 1:05 am
OK, I've managed to eliminate the problem pretty much entirely... in software. So long as I round my values carefully, VBA seems to give me the correct output.
Hardware is another matter, though. I am still getting the misplaced scanlines, although it is now very clear where they are occurring. I have doctored up a VBA screencap to reflect what is happening on hardware:
[Image: linked screenshot, not preserved]
This is an extreme closeup shot of the right-hand side of the map, without any wraparound. (It's not actually at the top of the map; the horizon is artificial.)
Basically, the first scanline of every set of different zoom values mysteriously jumps ahead to the next set. You can see that it's a problem that gets worse further along the right-hand side of the screen, since that's where the difference in zoom values accumulates.
Tepples, I know that you mentioned something similar to this, but as you can see from my code (mostly unchanged from what I posted earlier) I am already centering the map to the screen. To my knowledge, scaling from the upper-left corner of the screen is not something that can be changed, since it's how the hardware processes it. (If you know of a way, I'd love to hear it.)
Also, the problem scanlines only occur if there is a change in zoom values. When there is no perspective (ie. camera looking straight down at the map) there are no anomalies, and as the perspective increases so do the problem scanlines.
If anyone can suggest a reason that this might be happening, I would really appreciate it!
Thanks,
Dan.
#11230 - FluBBa - Mon Sep 29, 2003 6:01 pm
How are you putting all the values into their regs?
You're not using CPU interrupts to do it, are you?
_________________
I probably suck, my not is a programmer.
#11235 - DekuTree64 - Mon Sep 29, 2003 6:40 pm
Hmm, well that screen shot DOES look more like an accuracy problem. Maybe it would help if you wait until the end to do your divide by 128 pixels to viewport, because that's cutting it down to 1 fractional bit on your distance, which really doesn't seem like enough.
If all else fails, you could try plugging in values and working out the math yourself, and see if you can find where things start to get screwy.
#11239 - poslundc - Mon Sep 29, 2003 7:55 pm
FluBBa wrote: |
How are you putting all the values into their regs?
You're not using CPU interrupts to do it, are you? |
I am using HBlank interrupts to activate DMA0 (I didn't want to just set DMA0 to auto-reload because I need it for other things as well, like HBlank palette changes).
Is this a potential problem?
Dan.
#11240 - poslundc - Mon Sep 29, 2003 7:58 pm
DekuTree64 wrote: |
Hmm, well that screen shot DOES look more like an accuracy problem. Maybe it would help if you wait until the end to do your divide by 128 pixels to viewport, because that's cutting it down to 1 fractional bit on your distance, which really doesn't seem like enough.
If all else fails, you could try plugging in values and working out the math yourself, and see if you can find where things start to get screwy. |
I don't think that is the problem, mainly because the division-by-128 still leaves it as a 24.8 number (it just scales that number down to the GBA hardware). I've played around with changing it later on, and it doesn't seem to make a difference.
I've already gotten screwy with the math... but I imagine I will have to get even more screwy with it before I'm through with this. :)
Dan.
#11590 - AnthC - Sun Oct 12, 2003 7:43 pm
Hi
I think you will find that those 'notches' you describe are a result of you not defining what a pixel is, so your math is off. A common problem, it's called sux pixel accuracy.
Hope this helps
Anth
#11608 - poslundc - Tue Oct 14, 2003 1:25 am
AnthC wrote: |
Hi
I think you will find that those 'notches' you describe are a result of you not defining what a pixel is, so your math is off. A common problem, it's called sux pixel accuracy.
Hope this helps
Anth |
Well the problem is long since solved, but I must confess I have no idea what you're talking about.
What do you mean, exactly, by my math being off on account of me not defining what a pixel is?
And what is sux pixel accuracy? (I would've assumed you misspelled sub-pixel accuracy except for the distance between the "b" and "x" keys on a qwerty keyboard.)
Thanks,
Dan.
#11611 - tepples - Tue Oct 14, 2003 3:05 am
poslundc wrote: |
(I would've assumed you misspelled sub-pixel accuracy except for the distance between the "b" and "x" keys on a qwerty keyboard.) |
B and X touch on Dvorak.
#11613 - col - Tue Oct 14, 2003 11:11 am
poslundc wrote: |
Well the problem is long since solved, but I must confess I have no idea what you're talking about.
What do you mean, exactly, by my math being off on account of me not defining what a pixel is?
And what is sux pixel accuracy? (I would've assumed you misspelled sub-pixel accuracy except for the distance between the "b" and "x" keys on a qwerty keyboard.)
|
wow - the guy was just trying to help !
If you have started a thread like this, and used the time and knowledge of people on the list, the least you could have done is posted that you found the solution to your problem, and what the solution was!
That way, you stop other people wasting their time on the already solved problem, and you will possibly help others who experience the same difficulties.
You really need an attitude review.
cheers
Col
#11616 - poslundc - Tue Oct 14, 2003 3:43 pm
col wrote: |
wow - the guy was just trying to help ! |
And I was just trying to understand his post. I genuinely didn't know what he meant, and until tepples explained that "b" and "x" touch on the dvorak keyboard, for all I knew he might be describing a different method that I could use for greater accuracy than that which I currently am.
Try rereading the post without forcing a sarcastic tone on it and you will see that I was only looking for clarification.
As for me posting that the problem had been solved, it wasn't solved for another couple of weeks and I didn't feel it would be especially polite for me to bump a topic that had long since died a natural death, just to say it had gone away.
The last thing I have ever intended to do is offend anyone on this board, and I apologize to anyone who has ever taken offence at my posts.
Dan.
#11617 - DekuTree64 - Tue Oct 14, 2003 5:09 pm
Actually I was wondering what AnthC meant myself...
But anyway, I'm curious as to how you fixed it too, cause my texture mapped tri-filler still has a similar problem, and I don't really have any idea why (not that I've made any attempt to fix it, but still).
#11621 - col - Tue Oct 14, 2003 6:27 pm
Sorry, I misunderstood the tone of your post.
It would still be good to post the solution though :)
These forums are often used as a reference resource, so it's good to have a resolution to the thread. (IMO of course)
cheers
Col
#11622 - poslundc - Tue Oct 14, 2003 7:17 pm
Well, the other reason that I didn't post the solution was that it stemmed from something unrelated to the discussion...
If it's any help to others constructing similar engines, the main problem seemed to be rooted in my HBlank ISR. I'm not certain if it was the length of the ISR (which I know WAS extending past the HBlank period) or if it was my use of DMA0 (DMA has always been problematic for me on hardware, for some reason).
The problem is symptomatic of an ISR that runs too long, but I'm quite certain that the background memory updates were finished well before I ran out of HBlank time. On the other hand, I've noticed that gcc with -O3 has a tendency to interleave your statements in an unpredictable way, so who knows? On the other-other hand, I still had the problem with my original HBlank ISR which was much shorter, and should easily have fit into HBlank time, which is why I think it might be the DMA. In any case, when I rewrote the ISR in ASM and got rid of the DMA, the problem up and vanished.
It's been my observation that VisualBoy Advance either does not properly time HBlank and VBlank, or that it doesn't "cut off" and start drawing the line/frame when it should. Either way, it seems that many of the problems I've encountered when moving to hardware have been caused by me overrunning the length of a VBlank/HBlank cycle and VBA "forgiving" me for it, even though I wish it wouldn't.
Dan.
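Editor's sketch: a rough way to reason about whether an HBlank handler fits its window. The GBA draws 240 visible dots and spends 68 dots in HBlank on every scanline, at 4 CPU cycles per dot, so the nominal HBlank is 272 cycles out of 1232. The function name and the split into "entry" and "handler" cycles are assumptions for illustration; real counts depend on wait states and on where the code and data live, and the usable window is somewhat shorter than the nominal figure.

```c
#include <assert.h>

enum {
    CYCLES_PER_DOT  = 4,
    HDRAW_DOTS      = 240,
    HBLANK_DOTS     = 68,
    HBLANK_CYCLES   = HBLANK_DOTS * CYCLES_PER_DOT,               /* 272 */
    SCANLINE_CYCLES = (HDRAW_DOTS + HBLANK_DOTS) * CYCLES_PER_DOT /* 1232 */
};

/* Returns how many cycles the handler runs past the end of the nominal
   HBlank, or 0 if it fits. entry_cycles stands for interrupt dispatch
   overhead and handler_cycles for the body; both are estimates the
   caller must supply (for example from counting instructions by hand). */
int hblank_overrun(int entry_cycles, int handler_cycles)
{
    assert(entry_cycles >= 0 && handler_cycles >= 0);
    int total = entry_cycles + handler_cycles;
    return total > HBLANK_CYCLES ? total - HBLANK_CYCLES : 0;
}
```

Any nonzero overrun means the registers may be written after the hardware has already started the next line, so an update can take effect one scanline late.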
#11624 - tepples - Tue Oct 14, 2003 8:15 pm
poslundc wrote: |
it seems that many of the problems I've encountered when moving to hardware have been caused by me overrunning the length of a VBlank/HBlank cycle and VBA "forgiving" me for it, even though I wish it wouldn't. |
This is largely because VBA doesn't emulate wait states right. Any code running in EWRAM or ROM will run faster on VBA than on hardware.
I've seen this issue quite a bit in this forum and in gbadev@yahoogroups. Added to FAQ: Testing your code.
#11645 - AnthC - Wed Oct 15, 2003 2:35 am
tepples wrote: |
poslundc wrote: | (I would've assumed you misspelled sub-pixel accuracy except for the distance between the "b" and "x" keys on a qwerty keyboard.) |
B and X touch on Dvorak. |
Yup thanks Tepples - next time I will proof read the damn thing!
#11646 - AnthC - Wed Oct 15, 2003 2:45 am
Sub pixel is a bit of a hard topic to explain - but it's really obvious when you know it :)
You would think that when you draw a triangle, you start with your first texture position tx,ty as stored in your vertex then you interpolate down. This is wrong: it causes pixel jitter and notches!
By very careful definition of what a "pixel" is, you can avoid this jitter.
An easy way to understand this is to draw a triangle on graph paper, then defining a pixel as the center of a graph square, you can see that you run into problems with texture rounding. The idea is to only take pixels from inside the triangle at consistent positions.
Say for example (these are the _real_ left hand co-ordinates of your texture span)
Line 1 x=0.1 y=4.5
Line 2 x=0.4 y=3.5
Line 3 x=0.5 y=2.5
You can see if we round these positions, our integer x positions are
Line 1 x=0 y=4 (x=0 is outside the triangle!)
Line 2 x=0 y=3 (x=0 is outside the triangle!)
Line 3 x=1 y=2 (x=1 is inside the triangle)
So what we need to do is take our texture positions from the pixel centers.
Line 1 x=0 y=4 (x=0 is outside the triangle! We fix up our texture by (0.5-0.1)*inner dtx)
Line 2 x=0 y=3 (x=0 is outside the triangle! We fix up our texture by (0.5-0.4)*inner dtx)
Line 3 x=1 y=2 (x=1 is inside the triangle) We fix up our texture by (0.5-0.5)*inner dtx
That corrects the notches and the bouncing. You need to fix up your y texture positions as well. This is done once per vertex.
Here's some crude code to demonstrate sub pixel rounding to draw a correct flat shaded triangle. (assume Mode 3 screen setup)
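/* S32 and u16 were not defined in the original post; these typedefs are assumed */
typedef signed int S32;
typedef unsigned short u16;
typedef signed int S1616; /* 16.16 fixed point */
typedef struct
{
    S1616 x;
    S1616 y;
} POINT2D;
#define FPS 16        /* number of fractional bits */
#define FPM (1<<FPS)  /* 1.0 in 16.16 */
#define FPH (FPM/2)   /* 0.5 in 16.16 */
/* round a 16.16 value to the nearest integer */
S32 FPR(S1616 n)
{
    return (S32)((n+FPH)>>FPS);
}
/* 16.16 multiply through a 64-bit intermediate */
S32 FPMUL(S1616 a,S1616 b)
{
    long long tmp=a;
    tmp*=b;
    return (S1616)(tmp>>FPS);
}
/* 16.16 divide through a 64-bit intermediate */
S32 FPDIV(S1616 a,S1616 b)
{
    long long tmp=a;
    tmp<<=FPS;
    tmp/=b;
    return (S1616)(tmp);
}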
int TopY(POINT2D *p)
{
    int topi=0;
    S1616 topy=p[0].y;
    S1616 topx=p[0].x;
    for (int i=1;i<3;i++)
    {
        if (p[i].y > topy)
        {
            topi=i;
            topy=p[i].y;
            topx=p[i].x;
        }
    }
    return topi;
}
void DrawTri(POINT2D *p)
{
    int la,lb,ra,rb;
    S1616 lx,rx,ldx,rdx;
    lb=rb=TopY(p);
    int ry=FPR(p[lb].y)-1;
    int exit=0;
    int rl=1,rr=1;
    u16 *ps=((u16 *)0x6000000)+(160-ry)*240;
    /* declared outside the loop: lh and rh must keep their values between iterations */
    int h,lh=0,rh=0;
    for (;;)
    {
        if (rl)
        {
            for (;;)
            {
                la=lb--;
                if (lb<0) lb+=3;
                if (lb==rb) exit++;
                lh=1+ry-FPR(p[lb].y);
                if (lh<=0)
                {
                    if (exit) goto fini;
                    else continue;
                }
                else break;
            }
            rl=0;
            lx=p[la].x;
            ldx=FPDIV(p[lb].x-p[la].x,p[la].y-p[lb].y);
            S1616 tmp=p[la].y-((ry<<FPS)+FPH);
            lx+=FPH+FPMUL(ldx,tmp); // sub pixel correction
        }
        if (rr)
        {
            for (;;)
            {
                ra=rb++;
                if (rb>=3) rb=0;
                if (rb==lb) exit++;
                rh=1+ry-FPR(p[rb].y);
                if (rh<=0)
                {
                    if (exit) goto fini;
                    else continue;
                }
                else break;
            }
            rr=0;
            rx=p[ra].x;
            rdx=FPDIV(p[rb].x-p[ra].x,p[ra].y-p[rb].y);
            S1616 tmp=p[ra].y-((ry<<FPS)+FPH);
            rx+=FPH+FPMUL(rdx,tmp); // sub pixel correction
        }
        if (lh<rh) {h=lh;rl=1;rh-=h;}
        else {h=rh;rr=1;lh-=h;if (lh<=0) rl=1;}
        ry-=h;
        for (;;)
        {
            S32 l=lx>>FPS;
            S32 r=rx>>FPS;
            S32 w=r-l;
            lx+=ldx;rx+=rdx;
            u16 *pl=ps+l;ps+=240;
            for (;w-->0;) *pl++=0x7fff; // Hline() ?
            if (--h<=0) break;
        }
        if (exit) break;
    }
fini:
    ;
}
void ClearScreen(void)
{
    u16 *p=(u16 *)0x6000000;
    for (int y=0;y<160;y++)
    {
        /* 15 iterations of 16 pixels each clear one 240-pixel line */
        for (int x=0;x<240/16;x++)
        {
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
            *p++=0x0;*p++=0x0;
        }
    }
}
void DrawIt(void)
{
    POINT2D pts[3]={{0*FPM,16*FPM},{16*FPM,-16*FPM},{-16*FPM,-16*FPM}}; /* clockwise */
    for (;;)
    {
        POINT2D tmp[3];
        for (int i=0;i<3;i++)
        {
            tmp[i]=pts[i]; /* C arrays cannot be assigned whole, so copy per element */
            tmp[i].x+=240*65536/2;
            tmp[i].y+=160*65536/2; /* fix to screen co-ordinates */
            pts[i].x-=pts[i].y/256;
            pts[i].y+=pts[i].x/256; /* crude rotate */
        }
        ClearScreen();
        WaitVBL(); /* assumed to be defined elsewhere: waits for vertical blank */
        DrawTri(tmp);
    }
}
Hope this helps.
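Editor's sketch: the texture fix-up AnthC describes above (advance the starting texture coordinate by the distance from the real span edge to the pixel center, times the per-pixel texture step) can be isolated into a few lines of 16.16 math. The names are hypothetical, and the convention used here, "the first pixel is the first one whose center lies at or to the right of the edge", is an assumption; his listing folds the same idea into the edge setup of `DrawTri`.

```c
#include <assert.h>
#include <stdint.h>

#define FX_SHIFT 16
#define FX_ONE   (1 << FX_SHIFT) /* 1.0 in 16.16 */
#define FX_HALF  (FX_ONE / 2)    /* 0.5 in 16.16 */

/* Index of the first pixel whose center (n + 0.5) is at or to the
   right of the real edge position x. Assumes x >= 0. */
int32_t first_pixel(int32_t x)
{
    assert(x >= 0);
    return (x - FX_HALF + FX_ONE - 1) >> FX_SHIFT;
}

/* Distance in 16.16 from the real edge to that pixel's center. */
int32_t centre_delta(int32_t x)
{
    return first_pixel(x) * FX_ONE + FX_HALF - x;
}

/* Amount to add to the starting texture coordinate so that it is
   sampled at the pixel center rather than at the real edge. dtx is
   the texture step per screen pixel, in 16.16. */
int32_t prestep(int32_t x, int32_t dtx)
{
    return (int32_t)(((int64_t)centre_delta(x) * dtx) >> FX_SHIFT);
}
```

With an edge at x = 0.25 and a texture step of 2.0 per pixel, the first pixel is 0, its center is 0.25 away, and the starting texture coordinate moves forward by 0.5. An edge at exactly 0.5 needs no correction.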