Blog: Fixing Umbraco 100% CPU spikes and crashes
If you are reading this you’ve probably been in that horrible position where your website is down or slow and you don’t know why. The site doesn’t seem particularly busy – but it is down and not coming back until your technical guy gets onto the server and restarts the site.
At Moriyama we’ve helped a number of customers overcome these kinds of situations, so we’ve decided to share some advice. Alternatively, you could simply get us to help: we offer a fixed cost service to diagnose and troubleshoot these problems (call us for details).
What is going on?
Assuming that your site isn’t underpowered for the number of visitors (in hardware terms), 100% CPU spikes are normally due to one of a few reasons.
- Concurrent threads accessing a dictionary/collection that isn’t thread safe
- Sleeping and locking threads
- Poor memory management and the subsequent impact on garbage collection.
Older releases of Umbraco have seen instances of the first two of these.
What can I do about it?
The first and most obvious solution is to upgrade Umbraco. At Moriyama we offer fixed price Umbraco upgrades.
If at all possible you should upgrade to the latest version of Umbraco 7, but if not – Umbraco 6.2.6 has been released recently. It contains contributions from Moriyama which help to resolve an issue that could cause 100% CPU spikes.
What if an Upgrade doesn’t help?
We’ve also seen 100% CPU spikes in Umbraco where log4net logging is used excessively and log4net levels are set to DEBUG or similar in production. Disabling log4net completely can often provide a temporary solution, after which you can reduce the amount of logging throughout the code base.
Umbraco also ships with a custom log4net appender, Umbraco.Core.Logging.AsynchronousRollingFileAppender. We’ve often seen better performance when using the out-of-the-box log4net appenders instead.
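As a rough illustration of both points, a log4net configuration along these lines raises the root level above DEBUG and swaps in the standard RollingFileAppender. This is only a sketch: the appender name, log file path and layout pattern are assumptions for the example, so keep whatever your existing configuration uses and change only the level and the appender type.

```xml
<?xml version="1.0"?>
<log4net>
  <root>
    <!-- WARN keeps DEBUG and INFO noise out of production logs -->
    <level value="WARN" />
    <appender-ref ref="RollingFile" />
  </root>

  <!-- Standard log4net appender used in place of the Umbraco asynchronous one -->
  <appender name="RollingFile" type="log4net.Appender.RollingFileAppender">
    <!-- Assumed path; keep the path from your existing config -->
    <file value="App_Data\Logs\UmbracoTraceLog.txt" />
    <appendToFile value="true" />
    <rollingStyle value="Date" />
    <datePattern value=".yyyy-MM-dd" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date [%thread] %-5level %logger - %message%newline" />
    </layout>
  </appender>
</log4net>
```

Measure before and after the change; whether the standard appender is faster will depend on your logging volume and disk.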
Also, in Umbraco 6 we’ve seen 100% CPU spikes when media and content trees contain hundreds or thousands of child nodes. You should sort media and content into date-based or alphabetical folders. Umbraco 7 has resolved these issues when expanding the trees.
What if I still have a problem?
The three common causes of 100% CPU spikes listed above may also be present in the code that you or your agency added to Umbraco.
Check the code for Thread.Sleep and lock statements, and see whether they are really necessary or can be refactored away. Also check for concurrent threads accessing dictionaries or other collections that may not be thread safe.
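To make the collection problem concrete: a static Dictionary<TKey, TValue> written to from several request threads at once can have its internal state corrupted, and lookups may then spin forever, which shows up as a pegged CPU. The sketch below shows one possible fix, replacing the shared dictionary with ConcurrentDictionary. The HitCounter class and its method names are hypothetical and exist only for this example; they are not part of Umbraco.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical shared counter, standing in for any static cache in your own code.
public static class HitCounter
{
    // ConcurrentDictionary is safe for simultaneous readers and writers,
    // unlike a plain Dictionary<string, int>.
    private static readonly ConcurrentDictionary<string, int> Hits =
        new ConcurrentDictionary<string, int>();

    // Adds the key with a count of 1, or increments the existing count.
    public static int Record(string url)
    {
        return Hits.AddOrUpdate(url, 1, (key, current) => current + 1);
    }

    // Returns the current count, or 0 if the URL has never been recorded.
    public static int Get(string url)
    {
        int value;
        return Hits.TryGetValue(url, out value) ? value : 0;
    }
}

public static class Program
{
    public static void Main()
    {
        // Simulate many concurrent requests hitting the same key.
        Parallel.For(0, 10000, i => HitCounter.Record("/home"));
        Console.WriteLine(HitCounter.Get("/home"));
    }
}
```

No lock statement is needed around the calls, which also removes one source of blocked threads. If the code you find needs several operations to happen together, a lock around a plain Dictionary may still be the simpler choice.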
If the above doesn’t help then you need to get into the very technical world of analysing crash dumps.
Diagnose in production
The most important advice that we can give is to try and troubleshoot your Umbraco 100% CPU spikes on your production server. Almost all of these issues will be caused by several concurrent users performing actions at the same time – and will be incredibly difficult to replicate on a developer PC – without the use of complicated load testing tools.
To get started – we recommend this excellent blog post by Mark S. Rasmussen.
Beyond the advice given by Mark, we’d suggest that the Debug Diagnostic Tool (DebugDiag) from Microsoft may be a more user-friendly place to start analysing crash dumps.
We’d also recommend that if you do resort to using WinDbg to get a CLR stack from a thread, you install it on the production server, as it can be tricky to get dumps to load on your own PC.
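For orientation, a typical WinDbg session for finding the busy thread in a dump looks something like the listing below. It assumes the site runs on .NET 4 or later (hence loading SOS next to clr), and the thread number 12 is purely a placeholder for whichever thread tops the !runaway output in your dump.

```text
.loadby sos clr
!runaway
~12s
!clrstack
```

The first command loads the SOS debugging extension that matches the runtime in the dump, !runaway lists threads by the CPU time they have consumed, ~12s switches to the chosen thread, and !clrstack prints its managed call stack. If several threads are busy, running ~*e !clrstack prints the managed stack for every thread in one go.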
This is all too Techie!
Tools like LeanSentry and NewRelic can be a quick and easy way to diagnose performance issues in your code, though we often find that they are just a sugar-coated user interface over the top of the information that you can get with the Microsoft tools.
I need this fixed now!
If you don’t have a techie on hand to help out with all of the advice given above, please give us a call. We’re happy to provide a free assessment of your site and a fixed cost solution.