skip to main content

TAMU Webmaster's Blog


Information and insight from the A&M Webmasters

“Today” Cache Troubleshooting (Part 1)

February 5th, 2015 by Charlie Harris

As I’ve mentioned before we’ve been using Varnish to improve our performance on WordPress sites for a while now. Though we are on our third iteration of the cache server, there always seems to be a few kinks to work out when we launch something new.  The “Today” news site is no exception.

Our first release on Today was not as smooth as we wanted. The performance was way below what we were expecting, and below what we saw on Tamutimes. We discovered that something was circumventing the cache server and directly accessing the backend web and database servers. Upon further inspection it was discovered that the culprit was some redirects from the old Tamutimes site. After fixing those, things were much smoother.

My point in sharing this is that, I thought I would take this post and the next going over two of the built-in Varnish tools for monitoring that helped us identify what was happening.

Varnishhist

Varnishhist is a neat tool that basically just shows a graphic of what is going on with your cache server. It is invoked just by running the following on the command line:

varnishhist

 

And here is an example of what our cache server looks like at a normal time:

Screenshot from 2015-02-02 16:03:11

 

So reading this is fairly simple. It really is just showing the cache “hits” and “misses” and the time it took to serve the request. The “|” symbols are the hits and “#” represents a miss, or a call to the backend server. Along the bottom is the time to serve the request. Basically the more to the left the hits are the better. 1e0 is equal to one second, so 1e-1 is equal to 1/10th and 1e-6 is equal to 0.000001 or around one microsecond. Anything in the -4 to -5 range is pretty good. 

Also from this graph you can roughly see that almost all of the requests are hitting the cache. None of these requests are even making it to the Apache server. This is massively helpful with WordPress. You can see how much slower the misses are when they have to go all the way through Apache and likely make a database call and return that information.

When we launched the Today site, we were seeing almost the exact opposite of this graphic. The peak was all on the misses side at around 1/10th to 1 second. We knew something was wrong immediately. While Varnishhist doesn’t show a lot of detail, it can be very helpful to have running to give you a very quick look at what is going on.

Next post I will go over the other main tool – varnishstat.

Thursday, February 5th, 2015 Systems
Share this article

No comments yet.

Leave a comment

Categories

Archives