This post was inspired by a discussion on Reddit that followed my previous post

In this post, I will cover a suggestion by BelowAverageITGuy that cut down the total execution time by almost one hour.

## Saga

Before I go further here are some link to the previous posts on the problem of calculating similarities and then optimizing it grew to few post. Here are all of them:

- How I calculate similarities in cookit?
- How to calculate 17 billion similarities
- Independent code in performance optimizations
- Using bit masks for high performance calculatons
- Making bits faster
- Dividing a bit in two for performance
- Understanding OutOfMemoryException

## Recap

The last implementation of `GetNonZeroIndexes`

looked like this:

```
public IList<int> GetNonZeroIndexes()
{
List<int> ret= new List<int>();
for (var i = 0; i < _array.Length; i++)
{
if (_array[i] != 0L)// this one check saves me 64 ones
{
if (IsBitSet(_array[i], 0)) ret.Add(i * 64 + 0);
if (IsBitSet(_array[i], 1)) ret.Add(i * 64 + 1);
if (IsBitSet(_array[i], 2)) ret.Add(i * 64 + 2);
if (IsBitSet(_array[i], 3)) ret.Add(i * 64 + 3);
.
.
.
//You get the idea
.
.
.
if (IsBitSet(_array[i], 61)) ret.Add(i * 64 + 61);
if (IsBitSet(_array[i], 62)) ret.Add(i * 64 + 62);
if (IsBitSet(_array[i], 63)) ret.Add(i * 64 + 63);
}
}
return ret;
}
```

With `IsBitSet`

looking like this:

```
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsBitSet(long b, int pos)
{
return (b & (1L << pos)) != 0;
}
```

It is using bit operations that are very fast, so what can be done better?

## Bit mask precalculation

His suggestion was to get rid of bit shifting and have the mask ready. This way I will get rid of `IsBitSet`

and end up with `GetNonZeroIndexes`

looking like this:

```
public IList<int> GetNonZeroIndexes()
{
List<int> ret= new List<int>();
for (var i = 0; i < _array.Length; i++)
{
var a = _array[i];
if (a != 0UL)
{
if ((a & 1UL) != 0UL) ret.Add(i * 64 + 0);
if ((a & 2UL) != 0UL) ret.Add(i * 64 + 1);
if ((a & 4UL) != 0UL) ret.Add(i * 64 + 2);
if ((a & 8UL) != 0UL) ret.Add(i * 64 + 3);
.
.
.
.
.
if ((a & 1152921504606846976UL) != 0UL) ret.Add(i * 64 + 60);
if ((a & 2305843009213693952UL) != 0UL) ret.Add(i * 64 + 61);
if ((a & 4611686018427387904UL) != 0UL) ret.Add(i * 64 + 62);
if ((a & 9223372036854775808UL) != 0UL) ret.Add(i * 64 + 63);
}
}
return ret;
}
```

## Excel

I generate this code in Excel and after generating all the if statemenent the program started to crash. i wonder if You can spot the error:

Power of 2 | Value | |
---|---|---|

49 | 140 737 488 355 328 | |

50 | 281 474 976 710 656 | |

51 | 562 949 953 421 312 | |

52 | 1 125 899 906 842 620 | |

53 | 2 251 799 813 685 250 |

See it?

I’ve hit Excel (and Google docs for that matter) precision and in the 52th power of 2, it was rounded in a strange way. So instead of getting `1 125 899 906 842 624`

I’ve got `1 125 899 906 842 620`

:| There goes two hours lost on debugging the applications.

## Results

How did it impact performance? It allowed the sample execution time to go down to 308.22 seconds. For all recipes I get:

```
(297 / 2199) * 182184 ~ 6,8 hours (starting from 34 hours)
```

This shaved off almost 1 hour(!) from the best time. It also shows that even things as fast as bit shifts can be made faster.