Making bits faster


This post was inspired by a discussion on Reddit that followed my previous post.

In this post, I will cover a suggestion by BelowAverageITGuy that cut down the total execution time by almost one hour.

Saga

Before I go further, some background: the problem of calculating similarities, and then optimizing the calculation, grew into a few posts. Here are all of them:

Recap

The last implementation of GetNonZeroIndexes looked like this:

public IList<int> GetNonZeroIndexes()
{
    List<int> ret = new List<int>();
    for (var i = 0; i < _array.Length; i++)
    {
        if (_array[i] != 0L) // this single check can skip 64 bit checks
        {
            if (IsBitSet(_array[i], 0)) ret.Add(i * 64 + 0);
            if (IsBitSet(_array[i], 1)) ret.Add(i * 64 + 1);
            if (IsBitSet(_array[i], 2)) ret.Add(i * 64 + 2);
            if (IsBitSet(_array[i], 3)) ret.Add(i * 64 + 3);
            // ... and so on for bits 4 through 60 — you get the idea ...
            if (IsBitSet(_array[i], 61)) ret.Add(i * 64 + 61);
            if (IsBitSet(_array[i], 62)) ret.Add(i * 64 + 62);
            if (IsBitSet(_array[i], 63)) ret.Add(i * 64 + 63);
        }
    }
    return ret;
}

With IsBitSet looking like this:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsBitSet(long b, int pos)
{
    return (b & (1L << pos)) != 0;
}

It uses bit operations, which are very fast, so what can be done better?

Bit mask precalculation

His suggestion was to get rid of the bit shifting and have the masks precalculated. This way I get rid of IsBitSet and end up with GetNonZeroIndexes looking like this:

public IList<int> GetNonZeroIndexes()
{
    List<int> ret = new List<int>();
    for (var i = 0; i < _array.Length; i++)
    {
        var a = _array[i];
        if (a != 0UL)
        {
            if ((a & 1UL) != 0UL) ret.Add(i * 64 + 0);
            if ((a & 2UL) != 0UL) ret.Add(i * 64 + 1);
            if ((a & 4UL) != 0UL) ret.Add(i * 64 + 2);
            if ((a & 8UL) != 0UL) ret.Add(i * 64 + 3);
            // ... and so on for bits 4 through 59 ...
            if ((a & 1152921504606846976UL) != 0UL) ret.Add(i * 64 + 60);
            if ((a & 2305843009213693952UL) != 0UL) ret.Add(i * 64 + 61);
            if ((a & 4611686018427387904UL) != 0UL) ret.Add(i * 64 + 62);
            if ((a & 9223372036854775808UL) != 0UL) ret.Add(i * 64 + 63);
        }
    }
    return ret;
}
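For completeness, the same precomputed-mask idea can also be written as a loop over a mask table instead of 64 unrolled ifs. This is my own sketch (the class and method names are mine, not from the original code base) — the unrolled version above avoids the inner loop's overhead entirely, which is exactly why it wins:

```csharp
using System;
using System.Collections.Generic;

static class MaskTableSketch
{
    // Precomputed masks, one per bit position, built once with exact
    // integer shifts (no spreadsheet involved).
    static readonly ulong[] Masks = BuildMasks();

    static ulong[] BuildMasks()
    {
        var masks = new ulong[64];
        for (var bit = 0; bit < 64; bit++)
            masks[bit] = 1UL << bit;
        return masks;
    }

    // Loop variant of GetNonZeroIndexes over a caller-supplied array.
    public static IList<int> GetNonZeroIndexes(ulong[] array)
    {
        var ret = new List<int>();
        for (var i = 0; i < array.Length; i++)
        {
            var a = array[i];
            if (a == 0UL) continue; // skip 64 bit checks at once
            for (var bit = 0; bit < 64; bit++)
                if ((a & Masks[bit]) != 0UL)
                    ret.Add(i * 64 + bit);
        }
        return ret;
    }
}
```

For example, `GetNonZeroIndexes(new ulong[] { 0b1010UL })` returns the indexes 1 and 3.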

Excel

I generated this code in Excel, and after generating all the if statements the program started to crash. I wonder if you can spot the error:

Power of 2    Value
49            140 737 488 355 328
50            281 474 976 710 656
51            562 949 953 421 312
52            1 125 899 906 842 620
53            2 251 799 813 685 250

See it?
I’ve hit the limits of Excel’s (and Google Docs’, for that matter) numeric precision: it keeps only 15 significant digits, so the value in the row for the 52nd power of 2 was rounded in a strange way. Instead of getting 1 125 899 906 842 624 I got 1 125 899 906 842 620 :| There go two hours lost on debugging the application.
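A safer way to generate the constants is to let the compiler compute them. Here is a throwaway snippet (my sketch, not the code actually used for the post) that prints all 64 if-lines with exact ulong masks:

```csharp
using System;

class GenerateIfs
{
    static void Main()
    {
        // 1UL << bit is exact 64-bit integer arithmetic, unlike
        // Excel's 15-significant-digit floating point, so no mask
        // gets silently rounded.
        for (var bit = 0; bit < 64; bit++)
        {
            var mask = 1UL << bit;
            Console.WriteLine(
                $"if ((a & {mask}UL) != 0UL) ret.Add(i * 64 + {bit});");
        }
    }
}
```

In particular, `1UL << 50` comes out as the exact 1125899906842624, the value Excel mangled.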

Results

How did it impact performance? It brought the sample execution time down to 308.22 seconds. For all recipes I get:

(297 / 2199) * 182184 ≈ 6.8 hours (down from the 34 hours I started with)

This shaved almost an hour(!) off the previous best time. It also shows that even operations as fast as bit shifts can be made faster.
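A quick sanity check of the extrapolation above, assuming the sample of 2 199 recipes scales linearly to all 182 184:

```csharp
using System;

class Extrapolation
{
    static void Main()
    {
        // Scale the sample time to the full data set.
        var totalSeconds = 297.0 / 2199 * 182184;
        var hours = totalSeconds / 3600; // roughly 6.8 hours
        Console.WriteLine(hours);
    }
}
```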

Understanding OutOfMemoryException

An analysis of why OutOfMemoryException happens, plus a riddle about why it happened this time.

Dividing a bit in two for performance (published on December 20, 2016)

Using bit masks for high-performance calculations (published on December 13, 2016)