#148552 - naleksiev - Mon Jan 07, 2008 3:16 am
I'm preparing for job interview and going through some interview tests. One of the question is how to write function
unsigned int div7(unsigned int val);
without using division, mod or multiply. I figured out if it can be optimized someone here will know how :)
#148555 - DekuTree64 - Mon Jan 07, 2008 4:30 am
So the goal is to write a function that returns the same value as this:
Code: |
unsigned int div7(unsigned int val)
{
return val / 7;
} |
Without using division?
The obvious way is repeated subtraction, but that's no fun, and gets very time consuming with large numbers. The "standard" division algorithm is to use bitshifts so there's at most one iteration per bit of the values. Here's a generic unsigned 32-bit divde off the top of my head:
Code: |
unsigned int div(unsigned int numer, unsigned int denom)
{
unsigned int result = 0;
int shift = 0;
// Figure out how high we can shift it without overflow
while(!((denom << shift) & BIT31))
shift++;
for (; shift >= 0; shift--)
{
if (numer >= (denom << shift))
{
numer -= denom << shift;
result += 1 << shift;
}
}
return result;
} |
You can specialize it for the number 7 if you want. Also for an interview test you might not want to assume that an unsigned int is 32 bits :)
Oh, and another technique you can use is fixed-point reciprocal multiplication. But you have to be aware of the range of numbers you're working with to make sure it's fully accurate.
_________________
___________
The best optimization is to do nothing at all.
Therefore a fully optimized program doesn't exist.
-Deku
#148557 - keldon - Mon Jan 07, 2008 5:20 am
Hmm, well they just said not to use div or mul, so do just that! I really don't think it's a trick question, but you can store a multiplication table, such as 7 multiplied by all powers of 2 (or even 10), etc, making this a whole lot faster. There's even an optimisation for finding the shift using 2 steps of a binary search followed by an 8-byte lookup.
But like I said, I don't think it's a trick question - this might just be a question to see you code!
(can't sleep :()
#148560 - naleksiev - Mon Jan 07, 2008 5:45 am
The reciprocal doesn't work because you can't do multiplication.
I was trying to google it. Apparently there are similar threads but not with a good answer.
I'm not sure if this is a tricky question. The previews question is multiplyBy123. There are 61 more questions in the test. I do believe there should be a simpler answer.
p.s. Please stop with the interview tips. I just thought it will be fun to brainstorm it :)
#148562 - keldon - Mon Jan 07, 2008 6:10 am
61 questions pretty much says that they just want 61 quick answers to me!
#148569 - Jim e - Mon Jan 07, 2008 7:36 am
naleksiev wrote: |
The reciprocal doesn't work because you can't do multiplication. |
Nonsense, I come from assembly where you don't even have a mul instruction, and we did reciprocal multiplication just fine.
Code: |
//-----------------
// Divide by 7 using reciprocal multiplication
// (2^32)/7 = 100 100 100 100 100 100 100 100 100 100 . 1
// round up that last digit for accurracy.
unsigned int div7(unsigned int val) {
unsigned long long num = 0;
int i;
for(i=0;i<10;i++) {
num = num<<1;
num += val;
num = num<<2;
}
num += val;
num = num>>32;
return num;
} |
This WILL lose accuracy on larger integers(off by one or 2). It is possible to try to correct for it though by checking the range of the number.
#148581 - naleksiev - Mon Jan 07, 2008 5:00 pm
Jim e wrote: |
This WILL lose accuracy on larger integers(off by one or 2). It is possible to try to correct for it though by checking the range of the number. |
I tried your solution for range 0 to 1 billion there's no a single error
#148584 - naleksiev - Mon Jan 07, 2008 6:10 pm
Hey Jim e,
I kind of understand your solution.
I'll inline to make it more readable for me ... sorry.
Code: |
static uint div7(uint val)
{
ulong v = val;
return (uint)((val + (val << 2) + (val << 5) + (val << 8) + (val << 11) + (val << 14) + (val << 17)) >> 20);
} |
I don't know how you come up with the val + .... If it's up to me I'll write it as (val >> 1) + (val << 2) + ... and I'll make a mistake.
Actually, now when I'm writing this post I think I figure it out. Is the other 3.5 before devision with 7 is coming from the half that I need to add when converting fixed point to int?
#148585 - Lick - Mon Jan 07, 2008 6:43 pm
Isn't long long 64-bit? A single long is 32-bit? Not sure for 64-bit machines.
_________________
http://licklick.wordpress.com
#148589 - wintermute - Mon Jan 07, 2008 7:10 pm
naleksiev wrote: |
I don't know how you come up with the val + .... If it's up to me I'll write it as (val >> 1) + (val << 2) + ... and I'll make a mistake.
|
What he's doing here is factoring 1/7 in powers of two.
(1 + 4 + 32 + 256 + 2048 + 16384 + 131072) / 1048576 = ~ 1/7
basically you take a large(ish) power of two, divide by 7, remove the fraction part, multiply your numerator by this value & divide by the power of two you used earlier. This can be a good approximation for pretty much any division.
Multiplies are much easier to do, you can factor any integer in powers of 2. With the mul123 example for instance
64 + 32 + 16 + 8 + 2 + 1 = 123
Code: |
uint mul123 ( uint val ) {
return ( val << 6 ) + ( val << 5 ) + ( val << 4 ) + ( val << 3 ) + ( val << 1) + val;
}
|
_________________
devkitPro - professional toolchains at amateur prices
devkitPro IRC support
Personal Blog
#148590 - wintermute - Mon Jan 07, 2008 7:26 pm
keldon wrote: |
61 questions pretty much says that they just want 61 quick answers to me! |
Yeah, but this sort of binary arithmetic is a quick answer if you already know the answer. Not so much if you've never seen any of this kind of thing before.
Do things like this get taught on any programming courses anywhere? I remember a lot of this kind of theory from when I learned z80 a long, long time ago but I've worked with several graduate programmers who didn't have a firm grasp on binary logic never mind this sort of manipulation.
_________________
devkitPro - professional toolchains at amateur prices
devkitPro IRC support
Personal Blog
#148592 - Miked0801 - Mon Jan 07, 2008 7:35 pm
Or, if you can subtract, you can reduce your mul further to :
128 - 4 - 1
Code: |
return (val << 7) - (val << 2) - val;
|
Which is probably the answer they wanted. Bit-shifting and binary operators are kinda a lost art anymore. It's this lack of knowledge that makes most new programmers have no clue about how fixed point arithmetic works.
#148594 - Miked0801 - Mon Jan 07, 2008 7:42 pm
BTW, Jim E, I loved your solution. I immediately went to the cmp, adc, sbc solution, but yours is much nicer/quicker for the problem at hand. Factoring like that is brilliant, and I should have seen it earlier. I never realized that a divide by seven had such a regular pattern in binary. Very pretty.
A new technique has been added to my toolbox :)
#148595 - simonjhall - Mon Jan 07, 2008 7:44 pm
I've suggested this dividing question to the head of my group so they can stick it in their programming test. It's gotta be more interesting than macro questions that have '++' in them :-)
...but yeah, I do believe bit shifting and bit manipulation etc is becoming a lost art. But there's a reason for this - it nakes difficult to read and understand code!
I recently converted a piece of heavily-optimised assembler that I'd written to C (yeah...) and this C version was pretty unreadable as a result! It does make you look pretty 1337 though, and makes it harder (or easier?) for them to fire you.
_________________
Big thanks to everyone who donated for Quake2
#148606 - kusma - Mon Jan 07, 2008 9:22 pm
simonjhall wrote: |
But there's a reason for this - it nakes difficult to read and understand code! |
Not to mention that it's slow on modern CPUs :)
#148607 - Jim e - Mon Jan 07, 2008 9:23 pm
Quote: |
Do things like this get taught on any programming courses anywhere? I remember a lot of this kind of theory from when I learned z80 a long, long time ago but I've worked with several graduate programmers who didn't have a firm grasp on binary logic never mind this sort of manipulation. |
Most of the programming teachers I've met don't have a firm grasp of such low levelness. People just seem to take division, multiplication and square roots for granted.
Code: |
//-----------------
// Jim e
// Divide by 7 using reciprocal multiplication
// (2^32)/7 = 100 100 100 100 100 100 100 100 100 100 . 1
unsigned int div7(unsigned int val) {
unsigned long long num = val; // 0
num <<= 3;
num += val; // 1
num <<= 3;
num += val; // 2
num <<= 3;
num += val; // 3
num <<= 3;
num += val; // 4
num <<= 3;
num += val; // 5
num <<= 3;
num += val; // 6
num <<= 3;
num += val; // 7
num <<= 3;
num += val; // 8
num <<= 3;
num += val; // 9
num <<= 2;
//Correct accuracy pass 1/3 mark
if (val < 0x55555555 ) num += val;
else num += ((val>>1)+(val>>3));
return num>>32;
} |
I added correction for the accuracy, which happens to fail at 1.5 billion or so. So now its identical to divide by 7. Assembler its generally a good idea to unroll loops with few iterations, it saves registers and potentially a good chunk of time.
#148608 - Cearn - Mon Jan 07, 2008 9:46 pm
Divide-and-conquer is also fun here:
Code: |
/* N = (2^20+7-1)/7 = 149797
100100100100100101 =
1001001001001001*100+1 =
1001001 * 1000000001*100+1 =
(((1000+1)*1000+1)*(1000000000+1)*100+1
*/
unsigned int div7(unsigned int x)
{
unsigned int y;
y= (x<<3) + x;
y= (y<<3) + x;
y= (y<<9) + y;
y= (y<<2) + x;
return y >> 20;
}
|
Note that ARM can do x*(1 + 2^n) in one instruction, so this should be pretty fast (if multiplication is not allowed, of course).
Theoretically, the lower limit for an accurate result with 20 bits is roughly 350000, but you already overflow 32 bits on the 'multiplication' at 28672. Ironically, you get double the safe range by using less bits in this case, since the limit for 19bits is 57344.
#148638 - naleksiev - Tue Jan 08, 2008 3:31 am
wintermute wrote: |
What he's doing here is factoring 1/7 in powers of two.
(1 + 4 + 32 + 256 + 2048 + 16384 + 131072) / 1048576 = ~ 1/7
|
I'll try to explain again what I didn't understand.
actualy (0.5 + 4 + 32 + 256 + ...) >>32 should be more accurate
check the Jim e's post
100100100100100100100100100100 . 1
I did understand how it works from the explanation then I tried to write it myself without looking how Jim e implemented it and mine was wrong about 14% errors +/- 1. Then I run Jim e's implementation and I found no errors.
That's why I'm asking how he come up with the +1 instead of +0.5
#148642 - Jim e - Tue Jan 08, 2008 4:31 am
This value comes from saying (2^32) / 7 = 100100100100100100100100100100 . 1. However if you try 100100100100100100100100100100 . 1 * 7 it will not equal (2^32).
This is all 32 bits, plus that one decimal bit.
001 001 001 001 001 001 001 001 001 001 00 . 1
So that multiplied by 7 equals
111 111 111 111 111 111 111 111 111 111 11 . 1
No matter what, it will never fully restore to 2^32 unless you round up. Basically its the same problem calculators face with 1/3 = 0.3333333...
So I then have to make a choice of what digit to round up from. Logically I should add val>>1 + val>>4 + val>>7, but it just doesn't work out.
So for smaller numbers I round up the first digit in the fraction part, That's where num += val; when val is less than (2^32)/3 comes from.
When the number is larger I need a bit more accuracy to round up from.
Thats where num += ((val>>1)+(val>>3)); comes from, its a rounded up form of (val>>1) + (val>>4).
#148675 - Cearn - Tue Jan 08, 2008 8:04 pm
Miked0801 wrote: |
BTW, Jim E, I loved your solution. I immediately went to the cmp, adc, sbc solution, but yours is much nicer/quicker for the problem at hand. Factoring like that is brilliant, and I should have seen it earlier. I never realized that a divide by seven had such a regular pattern in binary. Very pretty.
A new technique has been added to my toolbox :) |
It seems that for every 2^n-1 you have such a pattern. For any integer B, you have:
Code: |
// No math symbols *sigh*
SUM(i=0..N, (1/B)^i ) = (1 - (1/B)^(N+1))/(1 - 1/B)
= (B - B^(-N))/(B-1)
= B/(B-1) - B^(-N)/(B-1)
For N -> inf :
SUM(i=0..inf, (1/B)^i) = B/(B-1)
or
SUM(i=1..inf, (1/B)^i) = 1/(B-1)
|
With B=8, you get 1/7 = SUM( (1/8)^i ) = 0.001001001... (binary)
With B=16, you get 1/15 = SUM( (1/16)^i ) = 0.000100010001... (binary)
With B=10, you get 1/9 = 0.111... (decimal). Multiply by 9 and you get 1 = 0.9999... :)
#148694 - Miked0801 - Tue Jan 08, 2008 11:27 pm
I had though the same thing. Probably something similiar with n+1 as well from 1/11 = 0.0909... and such.