This article is Part 5 in a 7-Part Series.
- Part 1 - How I calculate similarities in cookit?
- Part 2 - How to calculate 17 billion similarities
- Part 3 - Independent code in performance optimizations
- Part 4 - Using bit masks for high-performance calculations
- Part 5 - This Article
- Part 6 - Dividing a bit in two for performance
- Part 7 - Understanding OutOfMemoryException
This post was inspired by a discussion on Reddit that followed my previous post.
In this post, I will cover a suggestion by BelowAverageITGuy that cut the total execution time down by almost one hour.
Recap
The last implementation of GetNonZeroIndexes looked like this:
public IList<int> GetNonZeroIndexes()
{
    List<int> ret = new List<int>();
    for (var i = 0; i < _array.Length; i++)
    {
        if (_array[i] != 0L) // this one check saves me 64 per-bit checks for empty longs
        {
            if (IsBitSet(_array[i], 0)) ret.Add(i * 64 + 0);
            if (IsBitSet(_array[i], 1)) ret.Add(i * 64 + 1);
            if (IsBitSet(_array[i], 2)) ret.Add(i * 64 + 2);
            if (IsBitSet(_array[i], 3)) ret.Add(i * 64 + 3);
            // ... you get the idea ...
            if (IsBitSet(_array[i], 61)) ret.Add(i * 64 + 61);
            if (IsBitSet(_array[i], 62)) ret.Add(i * 64 + 62);
            if (IsBitSet(_array[i], 63)) ret.Add(i * 64 + 63);
        }
    }
    return ret;
}
With IsBitSet looking like this:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsBitSet(long b, int pos)
{
    return (b & (1L << pos)) != 0;
}
It already uses bit operations, which are very fast, so what can be done better?
Bit mask precalculation
His suggestion was to get rid of the bit shifting and have the masks precalculated. This way I can drop IsBitSet entirely and end up with GetNonZeroIndexes looking like this:
public IList<int> GetNonZeroIndexes()
{
    List<int> ret = new List<int>();
    for (var i = 0; i < _array.Length; i++)
    {
        var a = _array[i];
        if (a != 0UL)
        {
            if ((a & 1UL) != 0UL) ret.Add(i * 64 + 0);
            if ((a & 2UL) != 0UL) ret.Add(i * 64 + 1);
            if ((a & 4UL) != 0UL) ret.Add(i * 64 + 2);
            if ((a & 8UL) != 0UL) ret.Add(i * 64 + 3);
            // ... you get the idea ...
            if ((a & 1152921504606846976UL) != 0UL) ret.Add(i * 64 + 60);
            if ((a & 2305843009213693952UL) != 0UL) ret.Add(i * 64 + 61);
            if ((a & 4611686018427387904UL) != 0UL) ret.Add(i * 64 + 62);
            if ((a & 9223372036854775808UL) != 0UL) ret.Add(i * 64 + 63);
        }
    }
    return ret;
}
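Since each hardcoded constant is supposed to be exactly 1UL << pos, the generated masks are easy to sanity-check in code. A minimal sketch (the class name MaskCheck is mine; the constants come from the snippet above):

```csharp
using System;

static class MaskCheck
{
    static void Main()
    {
        // Each hardcoded mask must equal 1UL << pos exactly;
        // a rounded value sneaking in would fail this check.
        Console.WriteLine((1UL << 50) == 1125899906842624UL);    // True
        Console.WriteLine((1UL << 60) == 1152921504606846976UL); // True
        Console.WriteLine((1UL << 63) == 9223372036854775808UL); // True

        // The subtly-rounded value is NOT a power of two:
        Console.WriteLine((1UL << 50) == 1125899906842620UL);    // False
    }
}
```

A check like this would have caught the bug described in the next section before it ever crashed the program.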
Excel
I generated this code in Excel, and after pasting in all the generated if statements, the program started to crash. I wonder if you can spot the error:
Power of 2 | Value
---|---
47 | 140 737 488 355 328
48 | 281 474 976 710 656
49 | 562 949 953 421 312
50 | 1 125 899 906 842 620
51 | 2 251 799 813 685 250
See it?
I had hit the precision limit of Excel (and Google Sheets, for that matter): it keeps only about 15 significant digits, so from the 50th power of 2 onwards the values were rounded in a strange way. Instead of getting 1 125 899 906 842 624 I got 1 125 899 906 842 620 :| There go two hours lost on debugging the application.
Results
How did it impact performance? The sample execution time went down to 308.22 seconds. Extrapolating to all recipes I get:
(297 / 2199) * 182184 ≈ 6.8 hours (starting from 34 hours)
This shaved off almost one hour(!) from the previous best time. It also shows that even operations as fast as bit shifts can sometimes be replaced with something faster.