Fast as Fuck Django, Part 1: Using a Profiler

by Rich Jones on Sep 10, 2012.
_________________

If you haven't noticed, we've made some serious updates to gun.io!

As our search results are now rendered on the client side rather than on the server, we had to write a whole new API to get the data into the client. Once we had written it, we found that it was far slower than what we considered to be acceptable - results were averaging at 1.7s - and we shoot for sub 200ms response times. Not good! But what was going wrong? Why was Django being so slow? It actually wasn't easy to see immediately, and the normally wonderful Django Debug Toolbar wasn't any use as it doesn't work on AJAX requests.

Using a Profiler

What we needed was a way to profile our code through the browser. Much digging around on the internet found a very old snippet of a Django profiling middleware which seemed perfect for the tasks. Unfortunately, it didn't work with modern versions of Django, but I quickly fixed it up and am including it below.

To use this middleware, save this file in your application as middleware.py. Then, in the MIDDLEWARE_CLASSES of your settings.py, include the line

'yourproject.yourapp.middleware.ProfileMiddleware',

(with the 'yourproject' and 'yourapp' changed to your appropriate values, of course.) You will also need to install the 'hotshot' package, so if you don't have that installed already then pip install hotshot.

Reading the Output

Now, to use the profiler, simply add the string '?prof' to the end of your URL, and you'll see the profile of that page. Awesome! But.. what does this wall of voodoo mean?

The profiling output is not very intuitive if you haven't seen it before, so let's dive in!

The columns it's showing you are the number of times a function is called, the "total time" that function took, the time per call, the cumulative time of that function, and the ratio of cumulative time to primitive calls. We're interested in 'cumtime', the fourth column. The results aren't sorted, so you'll have to look through the list and see which functions are taking the most time. Anything greater than 0.100 or even .050 could be worth investigating.

If the function has a low per-call time but a high cumulative time, your code is probably spending too much time in a loop - see if you can find a more efficient algorithm, use less nested loops or way to break earlier! The string associated with your problem line may actually not be immediately helpful, but hopefully it will serve as a good starting point for you - if you see many expensive calls to 'sql', it could be your database code, if you see many expensive calls to 'template', it could be your templating code, etcetera etcetera.

In our case, we had forgotten to use a 'selected_related()' on one of our database calls. Now our API is super fast, hooray!

Once you're done, remember to remove the middleware from your settings!

Hopefully this was helpful for you! If you liked this article, leave a comment below and I'll write some more posts on how to speed up Django, maybe one about all of the ways to use caching, one about speeding up the templating and one about building fast models and how to access them properly. What do you think?

Learn and Earn!

Sign up for great tutorials, guides, rants, raves and opportunities to earn more money!