One of the steps in cookit is calculating similar recipes. These are the recipes you see on the left side of a recipe page.

For the sake of clarity and manageability, it's scheduled as separate Hangfire jobs. Because cookit runs 5 workers, similarities are calculated for 5 websites concurrently.
The process uses cosine similarity, so it allocates a huge list at the start and then calculates the similarities. It is a very CPU-heavy operation.
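
For readers who have not met cosine similarity before, here is a minimal sketch in Python. It is not cookit's code: the sparse dictionary representation, the ingredient weights, and the names `cosine_similarity` and `most_similar` are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    # Iterate over the smaller vector; only shared keys contribute to the dot product.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(recipes, target_id, top_n=3):
    """Rank every other recipe by similarity to the target (hypothetical helper)."""
    target = recipes[target_id]
    scored = [
        (cosine_similarity(target, vector), recipe_id)
        for recipe_id, vector in recipes.items()
        if recipe_id != target_id
    ]
    # Highest score first; ties are broken by recipe id to keep the order stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [recipe_id for _, recipe_id in scored[:top_n]]


# Made-up data: each recipe is a bag of ingredients with weight 1.
recipes = {
    "pancakes": {"flour": 1, "egg": 1, "milk": 1},
    "crepes": {"flour": 1, "egg": 1, "milk": 1, "butter": 1},
    "salad": {"lettuce": 1, "tomato": 1},
}
print(most_similar(recipes, "pancakes", top_n=2))
```

Comparing every recipe with every other one is quadratic in the number of recipes, which is why a job like this keeps the CPU busy.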

Continue reading...

This post is written more as a warning.

Anyone who has solved HackerRank problems where performance matters has seen this phrase in the assignment: “watch out for slow IO”.
We are used to thinking of files, databases and the like as potentially slow IO, but the Console?
Yes, and you will be amazed by how much.

Continue reading...

One of the main processes in cookit deals with extracting recipe information from raw HTML. I know it isn’t the most elegant solution, but it is the only universal one.

But to the point.

Every web page goes through a process involving HTML parsing, stemming, parsing, and n-gram token matching. The result is then saved to SQL Server and, after transformation, to Solr. So there is a lot of string manipulation, math calculations, and from time to time a GC, mostly gen 0.
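
To give a rough idea of what n-gram token matching can look like, here is a small Python sketch. The excerpt does not describe cookit's actual algorithm, so the greedy longest-match strategy, the `known_phrases` set, and the function names are assumptions for illustration only.

```python
def ngrams(tokens, n):
    """All contiguous runs of n tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def match_ngrams(tokens, known_phrases, max_n=3):
    """Greedy longest-first match of token runs against a set of known phrases."""
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest n-gram that still fits, then shorter ones.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in known_phrases:
                matches.append(" ".join(candidate))
                i += n  # skip past the matched tokens
                break
        else:
            i += 1  # nothing matched at this position
    return matches


# Hypothetical dictionary of ingredient phrases, already tokenized.
known_phrases = {("olive", "oil"), ("sea", "salt"), ("salt",)}
tokens = "add olive oil and sea salt".split()
print(match_ngrams(tokens, known_phrases))
```

Each slice and each join creates short-lived strings and tuples, which is the kind of allocation pattern that shows up as frequent young-generation collections in a managed runtime.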

Continue reading...

In the previous post I wrote about the new features in Neo4j. One of the new game-changing features was stored procedures. But, as I experienced, getting them to run in a Windows / .NET environment wasn’t that easy, and I was seeing “There is no procedure with the name …” more often than I wished for. So here is a short how-to. I hope it saves you some googling.

Continue reading...

Last week I had the opportunity to attend Graph Connect Europe. Many great sessions, but one thing topped them all - Neo4j 3.0 is out!

As with the previous major release (which introduced Cypher), there are many bug fixes, tweaks, and speed improvements, but here are my personal favorites:

Continue reading...