gbadev.org forum archive

This is a read-only mirror of the content originally found on forum.gbadev.org (now offline), salvaged from Wayback machine copies. A new forum can be found here.

C/C++ > Sqrt

#56812 - NighTiger - Tue Oct 11, 2005 7:41 pm

Hi guys,
I would like to know whether devkitARM (devkitPro) has a sqrt function, and if so, where to find it.
Thanks

#56830 - Miked0801 - Tue Oct 11, 2005 9:26 pm

What do you need it for?

There are some ARM sqrt functions out and about, but I was curious...

#56835 - NighTiger - Tue Oct 11, 2005 9:47 pm

Code:

sf VectorMagnitude (psVector pxVector)
{
  return ((sf)sqrt (pxVector->x*pxVector->x + pxVector->y*pxVector->y + pxVector->z*pxVector->z));
}


;-)

#56839 - Joat - Tue Oct 11, 2005 10:01 pm

You can always use math.h, but I don't recall if it includes an integer sqrt (which may not matter; I don't know whether your sf type is a signed float or signed fixed-point). There is also a BIOS sqrt function, and while it's faster than anything in math.h, it's still not great. If you need to do a lot of these fast, google for "arm sqrt" and you should find a website with a lot of different implementations, one of which you can stuff into internal RAM.

If you're just doing one or two per frame, any solution will work.

Also, if you're comparing distances, you can skip the sqrt entirely and compare the squared distances.
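
A minimal sketch of the squared-distance comparison, in C. The `Vec3i` type and both function names are made up for illustration (the thread never shows the layout behind `psVector`), and the code assumes each coordinate difference stays below 2^30 in magnitude so the 64-bit sum cannot overflow.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical integer vector type, not the poster's actual struct. */
typedef struct {
    int32_t x, y, z;
} Vec3i;

/* Squared distance between two points. The arithmetic is widened to
   64 bits; assumes each coordinate difference is below 2^30 in magnitude. */
int64_t dist_sq(const Vec3i *a, const Vec3i *b)
{
    int64_t dx = (int64_t)a->x - b->x;
    int64_t dy = (int64_t)a->y - b->y;
    int64_t dz = (int64_t)a->z - b->z;
    return dx * dx + dy * dy + dz * dz;
}

/* True when p is strictly closer to a than to b. No sqrt is needed:
   for non-negative values, d1 < d2 exactly when d1*d1 < d2*d2. */
bool closer_to_a(const Vec3i *p, const Vec3i *a, const Vec3i *b)
{
    return dist_sq(p, a) < dist_sq(p, b);
}
```

The same idea covers a radius check: compare `dist_sq(p, centre)` against `radius * radius` instead of taking the root.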
_________________
Joat
http://www.bottledlight.com

#56842 - DiscoStew - Tue Oct 11, 2005 10:19 pm

Code:

@Asm_Sqrt(u32 number) // u32 is an unsigned long
Asm_Sqrt:
      MOV     r1,#(3 << 30)
      MOV     r2,#(1 << 30)

      CMP      r0,r2
      SUBHS      r0,r0,r2
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 1)
      SUBHS      r0,r0,r2,ROR #(2 * 1)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 2)
      SUBHS      r0,r0,r2,ROR #(2 * 2)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 3)
      SUBHS      r0,r0,r2,ROR #(2 * 3)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 4)
      SUBHS      r0,r0,r2,ROR #(2 * 4)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 5)
      SUBHS      r0,r0,r2,ROR #(2 * 5)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 6)
      SUBHS      r0,r0,r2,ROR #(2 * 6)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 7)
      SUBHS      r0,r0,r2,ROR #(2 * 7)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 8)
      SUBHS      r0,r0,r2,ROR #(2 * 8)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 9)
      SUBHS      r0,r0,r2,ROR #(2 * 9)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 10)
      SUBHS      r0,r0,r2,ROR #(2 * 10)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 11)
      SUBHS      r0,r0,r2,ROR #(2 * 11)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 12)
      SUBHS      r0,r0,r2,ROR #(2 * 12)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 13)
      SUBHS      r0,r0,r2,ROR #(2 * 13)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 14)
      SUBHS      r0,r0,r2,ROR #(2 * 14)
      ADC      r2,r1,r2,LSL #1

      CMP      r0,r2,ROR #(2 * 15)
      SUBHS      r0,r0,r2,ROR #(2 * 15)
      ADC      r2,r1,r2,LSL #1

      BIC     r0,r2,#(3 << 30)
      BX      LR


This ASM function is what I use, and it seems pretty fast compared to other sqrt functions
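
For checking results on a PC, here is a portable C sketch of the usual bit-by-bit integer square root. It is not a line-for-line translation of the assembly above, and the name `isqrt32` is just picked for this example; it returns the floor of the square root of a 32-bit unsigned value.

```c
#include <stdint.h>

/* Bit-by-bit integer square root: returns floor(sqrt(n)).
   Each pass decides one bit of the result, from the top down,
   so the loop always runs 16 times for a 32-bit input. */
uint32_t isqrt32(uint32_t n)
{
    uint32_t root = 0;
    uint32_t bit = (uint32_t)1 << 30; /* highest power of four in 32 bits */

    while (bit != 0) {
        if (n >= root + bit) {
            /* This bit belongs in the result: subtract the trial
               value from the remainder and record the bit. */
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2; /* move down to the next power of four */
    }
    return root;
}
```

Running both routines over the same inputs is a cheap way to confirm the assembly was copied correctly.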
_________________
DS - It's all about DiscoStew

#56927 - IRbaboon - Wed Oct 12, 2005 8:27 am

I personally use the Sqrt SWI BIOS call (0x08).
I'm not sure how fast it is compared to the other functions already listed, but it's hardware-supported so it can't be that bad.

#56954 - poslundc - Wed Oct 12, 2005 4:47 pm

IRbaboon wrote:
I'm not sure how fast it is compared to the other functions already listed, but it's hardware-supported so it can't be that bad.


It's not really "hardware supported". It's just a function Nintendo wrote and placed in fast memory (same speed/bus width as IWRAM). Fast memory is a good thing, but it's marred by a couple of things:

- Nintendo typically optimizes for space instead of speed, so it's not hard to improve on their routines.

- There is a significant time cost to entering any of the BIOS routines, where Nintendo's quasi-system-software wastes a bunch of cycles shuffling around memory and registers. While it's not enough to render them useless, it's enough to hurt their value considerably.

If you've got an ARM assembly function that you know works well and you don't mind it consuming IWRAM, you can easily outperform the BIOS.

Dan.

#56985 - NighTiger - Wed Oct 12, 2005 10:01 pm

Thanks, all :)

#56989 - NighTiger - Wed Oct 12, 2005 10:32 pm

DiscoStew wrote:
Code:

@Asm_Sqrt(u32 number)


This ASM function is what I use, and it seems pretty fast compared to other sqrt functions


Excuse me Disco, I'm very much a beginner with assembly.
How can I use it?

#57166 - SevenString - Fri Oct 14, 2005 1:08 am

http://www.finesse.demon.co.uk/steven/sqrt.html

http://www.finesse.demon.co.uk/steven/invsqrt.html

enjoy



btw, those two links were on the first page as

ARM code inverse square root routines

and

ARM code square root routines

when I did a routine google search for square root code


<SMARTASS>
that newfangled internet thing is just amazing, isn't it? ;)
</SMARTASS>
_________________
"Artificial Intelligence is no match for natural stupidity."

#57463 - SevenString - Sun Oct 16, 2005 5:09 am

The "sqrtf32" function in LIBNDS/math.h returned incorrect results for me. The fix was to alter the function to shift-left (<<) by 13, NOT 12.

Normally, I tend to use tables, or avoid sqrts altogether when I can. But the problem that led to this discovery was that when using the libnds "glRotatef", the normalization function was not returning a unit vector. So I had to drill down to find the culprit.

I'm not sure if this is due to an emulation bug in Ideas, or if this is an actual bug in libnds. To be honest, I'm too busy coding 3D stuff to bother drilling down into what the HW is actually doing vs. what the emulator may or may not be doing wrong.

However, if your 3D geometry is warping unpleasantly as you rotate, and you're using glRotatef, you might try altering sqrtf32 to use the << 13 trick. However, you'll need to rebuild the libnds source for this to work.

As a side note, many of the demos looked funky on the Ideas emulator until I recompiled with this "fix".

** UPDATE... LONG OVERDUE **

This is NOT a bug in libnds as the glRotatef/sqrtf32 code works fine on the hardware. In another thread, I describe some build techniques to deal with discrepancies between emulators and hardware.

http://forum.gbadev.org/viewtopic.php?t=7368
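
The arithmetic behind the shift can be sketched in plain C, assuming the usual 20.12 fixed-point format (a value v is stored as v * 4096). This is a software stand-in, not the libnds source, and `isqrt64`, `fix_sqrt_shift` and `fix_sqrt` are names invented for the example. Since sqrt((v * 4096) * 4096) = sqrt(v) * 4096, pre-shifting the input by 12 leaves the result in 20.12 again.

```c
#include <stdint.h>

#define FIX_SHIFT 12 /* assumed 20.12 fixed point: 1.0 == 1 << 12 */

/* Bit-by-bit integer square root of a 64-bit value: floor(sqrt(n)). */
uint32_t isqrt64(uint64_t n)
{
    uint64_t root = 0;
    uint64_t bit = (uint64_t)1 << 62; /* highest power of four in 64 bits */

    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)root;
}

/* Fixed-point sqrt with an explicit pre-shift, so different shift
   amounts can be compared. The caller must pass a non-negative value. */
int32_t fix_sqrt_shift(int32_t a, unsigned shift)
{
    return (int32_t)isqrt64((uint64_t)a << shift);
}

/* Shift by FIX_SHIFT: input and result are both 20.12. */
int32_t fix_sqrt(int32_t a)
{
    return fix_sqrt_shift(a, FIX_SHIFT);
}
```

With a shift of 12, the square root of 4.0 comes out as 2.0. With a shift of 13 every result is scaled by sqrt(2), which is consistent with the hardware-correct code not needing the change.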
_________________
"Artificial Intelligence is no match for natural stupidity."


Last edited by SevenString on Tue Nov 08, 2005 8:17 pm; edited 1 time in total

#57660 - Miked0801 - Mon Oct 17, 2005 5:53 pm

Just wanted to add that very often, one does not actually need the magnitude of a vector. The square of the magnitude can suffice in many instances. Consider radius checks: comparing the squares of the magnitudes works better than comparing the magnitudes themselves. In addition, many other times the magnitude will "cancel out" depending on the math.

#60297 - jumperwillow - Tue Nov 08, 2005 7:26 pm

SevenString wrote:
http://www.finesse.demon.co.uk/steven/sqrt.html

http://www.finesse.demon.co.uk/steven/invsqrt.html

enjoy



btw, those two links were on the first page as

ARM code inverse square root routines

and

ARM code square root routines

when I did a routine google search for square root code


<SMARTASS>
that newfangled internet thing is just amazing, isn't it? ;)
</SMARTASS>


Kudos for the links.
_________________
-h