Quick Summary: The website is now running 20% faster across all pages (load times under 2 seconds), especially book detail pages. This was accomplished by prioritizing page load speed and fresh information for real visitors, rather than for all traffic (which included a LOT of web crawlers and bots).
Hey everyone, it’s been a while since we’ve talked on this blog, so I wanted to give you an update on the work that has happened over the past few weeks. While nobody has complained (the site was still loading pages in less than 3 seconds), I realized there was still room for efficiency in the code that runs NovelRank. Why? Because at any given second there are 40-50 real people accessing the site, while many multiples of that number of pages are being requested by robots: web crawlers, bots, RSS readers, widgets, etc. That’s a lot of information to process, and while NovelRank’s servers (~$100/month in costs) have been up to the task, their load was starting to get high.
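To give a sense of what "real visitors vs. everyone" means in practice, here is a minimal sketch (in Python, not NovelRank's actual code) of how a request could be flagged as bot traffic from its User-Agent header; the signature list is purely illustrative.

```python
# Illustrative only: classify a request as bot vs. real visitor by User-Agent.
# The signature list is an assumption, not an exhaustive or official list.

BOT_SIGNATURES = (
    "bot", "crawler", "spider", "slurp", "feedfetcher", "curl", "wget",
)

def is_probably_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string looks like an automated client."""
    ua = (user_agent or "").lower()
    return any(sig in ua for sig in BOT_SIGNATURES)
```

Once you can tell the two apart cheaply, you can decide per request whether it deserves fresh data or a cached answer.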
So I went to work identifying the sources of the database server overload and came up with this list (highest to lowest):
- Charting requests, especially for long-term data (more than 30 days); see the caching sketch after this list
- Detail stats for salesrank: minimum, maximum, average, etc (refreshed automatically every 30 days)
- CSV exports by bad bots that do not obey NOINDEX and NOFOLLOW
- Activity tracking for individual books (how I can tell someone is still looking at a book)
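Since long-term charting requests were the biggest offender, one way to take pressure off the database is to let automated clients see slightly stale chart data while real visitors still trigger a fresh query. The sketch below is only an assumption of how that could look, not NovelRank's implementation; `fetch_chart_data`, the `asin` cache key, and the 6-hour TTL are all hypothetical.

```python
import time

# Hypothetical sketch: serve cached (slightly stale) long-term chart data to
# bots, while real visitors still trigger a fresh database query.
# fetch_chart_data(), the asin key, and BOT_TTL are placeholders.

CACHE = {}            # asin -> (timestamp, chart data)
BOT_TTL = 6 * 3600    # bots may see data up to 6 hours old (assumed value)

def get_chart_data(asin, from_bot, fetch_chart_data):
    now = time.time()
    cached = CACHE.get(asin)
    if from_bot and cached and now - cached[0] < BOT_TTL:
        return cached[1]               # cheap, slightly stale answer for crawlers
    data = fetch_chart_data(asin)      # expensive long-term query against the database
    CACHE[asin] = (now, data)
    return data
```

Paired with something like the `is_probably_bot` check above, the expensive queries only run when a real person is actually looking at the chart.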