Warning this post contains some math. Better still, it shows how to use it to solve real-life problems.

This post describes how I calculate similarity between recipes in my pet project cookit.pl. For those not familiar with it, cookit is a search engine for recipes. It crawls websites extracting recipes, then parses them and tries to create a precise ingredient list replete with amounts and units.

By the time of writing it had:

  • 182 184 recipes
  • 2936 ingredients

This scale may not seem huge, but trust me - It’s enough to bring a slew of problems to light. And that cookit runs on a crappy server, partly by choice, can make things all the more complicated.

Continue reading...

This post is covering a subset of what I am talking in my talk How I stopped worrying and learned to love parallel processing (currently only in polish).

This will cover on how, in terms of performance, AsParallel can kick you in a place where it hurts a lot, simultaneously being a blessing in terms of… performance. How is that? Let’s look at some

History

AsParallel was introduced as an extension to LINQ with TPL in .NET 4.0. In theory, it’s God’s sent. The promise was that it will:

  • parallelize the LINQ query.
  • take care of all thread management and synchronization.
  • adjust the number of Tasks automatically.
  • not require any additional code changes except for .AsParallel()

And in the vast majority of cases, this promise was kept! For example look at this code:

Continue reading...

Diagnosing high memory usage can be tricky, here is the second part of how I found what was hogging to much memory in our system. In the previous post I’ve wrote how to create a memory dump and how many possibilities of catching just the right moment for it ProcDump has. When trying to analyze memory leaks, or high memory usage (not necessary meaning a leak) we have a few ways to approach it:

Continue reading...

I’m taking a short break from Hangfire series, but I will get back to it.

This time - Where did my memory go ? Or to be more exact: Why is this using so much memory?

The story starts with one IIS application pool using around 6 Gigabytes of memory on one of our test environments. It was several times above the values that we expected it to use, so we decided to investigate.

Without much thinking we fired up Visual Studio installed on the test server, and attached to the process. Since the application was build in Debug mode we had all the pdb files in the website folder.

Do I have your attention now? The above paragraph is of curse a joke and a bunch of anti patterns. Don’t do any of them!

Continue reading...

Parts 3, 4, and 5 covered the BackgroundJob class responsible for enqueuing single jobs (fire and forget). This post will cover RecurringJob class exposing API for recurring jobs (as the name suggests).

Recurring job

Before we go into the API, let’s take a look what is a recurring job in Hangfire. Recurring job is a timer that enqueues a job at specific time intervals defined with a cron expression. What is important is, that it does not execute the job. Only enqueues an ordinary Hangfire job. This implementation is very elegant, but it also means that if the queue is full the job will have to wait for its turn. So there is no guarantee about the time it will actually execute.

Continue reading...