Monday, 17 April 2017

Step-by-step Guide to Optimizing your Apache site with mod_pagespeed

If you host your own website running Apache and are not already checking out mod_pagespeed, you really should.  Created by Google, mod_pagespeed is “an open-source Apache module which automatically applies web performance best practices to pages, and associated assets (CSS, JavaScript, images) without requiring that you modify your existing content or workflow.”
While there are never silver bullets in life, you can consider mod_pagespeed close to a “bronze bullet” for many common performance best practices. Although it’s a bit lengthy, here is a great video to learn more:


In this post, I wanted to walk you through a hands-on demonstration showcasing the capabilities of mod_pagespeed. I think you’ll be quite impressed with both the capabilities and ease of installation.

Getting Started

Launching a Test Instance
First, I assume you have an Amazon AWS account. If you don’t, Amazon has a free tier available to get you started. After logging in, head over to your EC2 dashboard and launch a “micro” instance of the 64-bit Amazon Linux AMI (make sure both SSH port 22 and HTTP port 80 are open). Configuring and connecting to this instance is beyond the scope of this post, but there are many great resources available to help, for example this help document from Amazon. If you primarily live in the Windows world, here’s a great guide on how to connect via the PuTTY tool.
Once your instance is launched, connect via your SSH tool using the keypair file used when launching. You will be connecting as the "ec2-user" user, so your connection will have an address like this (replacing with your own ec2 instance DNS address):
ec2-user@ec2-54-226-251-246.compute-1.amazonaws.com
You will need Linux root access for much of this exercise, so I recommend you start by entering this:
sudo su
Install Apache
Next up, you need to install and start Apache. This is quite easy:
yum install httpd
service httpd start
Whenever you need to recycle Apache (whenever config file changes are made below), you can easily do so via this:
service httpd restart
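If you’d like a quick sanity check that Apache is actually answering requests, here is a minimal sketch. It assumes curl is available on the instance, and the http_status helper is a name I made up for illustration, not something Apache ships. A fresh install may answer with a status other than 200 until you put content in place; any status at all means Apache is listening.

```shell
# Print the HTTP status code from response headers supplied on stdin.
# (http_status is a hypothetical helper name used only in this post.)
http_status() {
  awk 'NR == 1 { print $2 }'
}

# Usage on the instance (assumes curl is installed):
#   curl -sI http://localhost/ | http_status
```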
Download Sample
Now go to the base web directory:
cd /var/www/html
and download this sample:
wget http://zoompf.com/downloads/modpagespeed_sample.tar.gz
and uncompress it:
gunzip modpagespeed_sample.tar.gz
tar -xvf modpagespeed_sample.tar
6 sample files should have been extracted: sample.html, sample1.css, sample2.css, sample1.js, sample2.js, and nature.jpg.
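To double check the extraction, something like the sketch below lists any of the six files that did not make it. The missing_samples function name is my own invention for this example; the file names and the /var/www/html path are the ones used in this post.

```shell
# List any of the six expected sample files missing from a directory.
# Prints one missing file name per line; prints nothing if all are present.
# (missing_samples is a hypothetical helper name.)
missing_samples() {
  dir=${1:-.}
  for f in sample.html sample1.css sample2.css sample1.js sample2.js nature.jpg; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Usage: check the web root used in this post.
#   missing_samples /var/www/html
```

No output means all six files are in place.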
To verify everything is working, load the sample.html file in a browser using your EC2 instance address, something like this:
http://ec2-54-226-251-246.compute-1.amazonaws.com/sample.html
You should see a very basic page like this:
modpagespeed1

Review the Unoptimized State

Before we install mod_pagespeed, let’s first take a look at our current state. To do that, we are going to use an amazing HTTP tool, RedBot (we discussed RedBot at length in a previous blog post). To start, let’s pull up our sample page in RedBot to look at the returned HTTP headers. If you examine your HTML file (http://(your ec2 instance)/sample.html) you’ll see something like this:
modpagespeed_redbot_before

Note the lack of this line:
Content-Encoding: gzip
In other words, no HTTP compression. This is one of the biggest performance wins you can make, so missing this is a big no-no. We also have an exhaustive blog post discussing HTTP Compression.
Also check out the URLs to some of the dependent resources, for example the sample1.js file:
modpagespeed_redbot_before2b

Note the lack of an Expires or Cache-Control header to allow for browser caching, another big performance best practice being missed.
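If you prefer the command line to RedBot, the same header checks can be approximated with curl. This is only a sketch: it assumes curl is installed, the has_header helper is a name I made up, and the host name in the usage comment is a placeholder for your own EC2 address.

```shell
# Succeed (exit 0) if the response headers on stdin include the named header.
# Header names are matched case-insensitively.
# (has_header is a hypothetical helper name.)
has_header() {
  grep -qi "^$1:"
}

# Usage: fetch the page, dump only the response headers, and test them.
#   url=http://your-ec2-instance/sample.html
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$url" > headers.txt
#   has_header Content-Encoding < headers.txt || echo "no compression"
#   has_header Cache-Control    < headers.txt || echo "no Cache-Control header"
#   has_header Expires          < headers.txt || echo "no Expires header"
```

The Accept-Encoding request header matters here, because a server only compresses a response for clients that say they can handle it.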
Looking at the sample.html file, there are (intentionally) a number of other performance bad practices at play here:
modpagespeed2
Of particular note:
  • CSS files are included at the bottom (See Put Style Sheets at the Top)
  • CSS files are not combined or minified
  • Javascript files are not combined or minified
  • HTML comments and whitespace are extra bloat for production code
  • The image “nature.jpg” is needlessly large (543 kb!)
For a quick primer on these top performance best practices, check out our earlier blog How to Improve Your Conversion Rates With a Faster Website.

Install mod_pagespeed

Okay, now let’s install mod_pagespeed to see what improvements we can make.
To begin, in your terminal window go back to your home directory:
cd /home/ec2-user
And download mod_pagespeed:
wget https://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-stable_current_x86_64.rpm
Now install it:
yum install at
rpm -U mod-pagespeed-*.rpm
Restart Apache:
service httpd restart
Just like that, you’re done! That was easy.

Results

Let’s see what just happened. First of all, load up your http://(your ec2 instance)/sample.html URL in a browser again. This is important to do for a somewhat obscure reason: mod_pagespeed does not optimize many of your resources until first load, so you’ll need to “prime” the page with an initial load to kick off the optimization process. Also, there may be a slight delay in observing these optimizations, as the priming process runs in the background and may take several seconds to complete. I’ve found waiting 15 seconds is usually more than enough for this sample. Of course individual results may vary on larger sites, but the good news is this only needs to be done once.
After waiting 15 seconds, pull that URL up in RedBot again. You should see something like this:

modpagespeed_after1
Notice (highlighted in yellow) 2 significant additions:
  1. The X-Mod-Pagespeed: (version) header, verifying this module is working
  2. The Content-Encoding: gzip header, verifying that compression is now being used.
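To read those two headers without RedBot, a small helper can pull a single header value out of a curl header dump. Treat this as a sketch under assumptions: curl must be installed, header_value is a made-up helper name, and the host name in the usage comment is a placeholder.

```shell
# Print the value of the named header from response headers on stdin.
# Matching is case-insensitive; prints nothing if the header is absent.
# (header_value is a hypothetical helper name.)
header_value() {
  tr -d '\r' | awk -v name="$1" '
    BEGIN { name = tolower(name) ":" }
    tolower(substr($0, 1, length(name))) == name {
      value = substr($0, length(name) + 1)
      sub(/^[ \t]+/, "", value)
      print value
    }'
}

# Usage, after priming the page and waiting a few seconds:
#   url=http://your-ec2-instance/sample.html
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$url" > headers.txt
#   header_value X-Mod-Pagespeed  < headers.txt   # module version, if active
#   header_value Content-Encoding < headers.txt   # should now report gzip
```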
Good! Now, let’s do a view-source on that sample.html file:
modpagespeed_after2
You’ll notice a few things changed.
First, the URLs for the Javascript files were suffixed with a “pagespeed” string, in this case .pagespeed.jm.Gk61Cef300.js. This allows pagespeed to supply versioning to the file (for cache busting), while also allowing a rewrite to a controlled resource. If you click that link, you’ll see an automatically minified version of the javascript. Nice!
Secondly, the CSS includes also changed. But in this case, now there is only 1 link! mod_pagespeed did 2 things here: it combined sample1.css and sample2.css into one file, and then minified the result!
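To make the rewritten URL naming a little more concrete, here is a sketch that takes one apart. It assumes the rewritten name is the original file name followed by .pagespeed.(filter code).(hash).(extension), which matches the suffix shown above; the sample1.js prefix in the usage comment is my guess at the full name, and both function names are made up for illustration.

```shell
# Recover the original file name from a rewritten pagespeed name by
# dropping everything from ".pagespeed." onward.
# (original_name and pagespeed_parts are hypothetical helper names.)
original_name() {
  printf '%s\n' "${1%%.pagespeed.*}"
}

# Print the short filter code and the hash that follow ".pagespeed.".
pagespeed_parts() {
  suffix=${1#*.pagespeed.}   # e.g. jm.Gk61Cef300.js
  code=${suffix%%.*}         # e.g. jm
  rest=${suffix#*.}          # e.g. Gk61Cef300.js
  printf '%s %s\n' "$code" "${rest%.*}"
}

# Usage (the sample1.js prefix is an assumed example):
#   original_name   sample1.js.pagespeed.jm.Gk61Cef300.js
#   pagespeed_parts sample1.js.pagespeed.jm.Gk61Cef300.js
```

Because the hash is part of the file name, any change to the file produces a new URL, which is what makes the long cache lifetimes safe.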
So in one fell swoop, we already got the following:
  1. Compression
  2. Minified CSS and Javascript for smaller file sizes
  3. Combined CSS files for fewer downloads
If you plug one of those CSS or JS URLs into RedBot, you’ll also notice those links have Expires and Cache-Control headers (for browser caching) and gzip encoding.
modpagespeed_after3

Just like that we already knocked off a number of performance best practices! But we can go a lot further still…

Customizing mod_pagespeed

mod_pagespeed is incredibly configurable. By default, this module enables a set of conservative “filters” that apply various performance best practices in a manner generally considered “safest” for the vast majority of sites (where “safe” means no visible difference in the page rendering).
Depending on how your site is coded, you can go a lot further by enabling “custom filters”. To see a full list of what’s possible (and what’s turned on by default), check out the list here.
For this exercise, we’re going to enable the following: move_css_to_head, move_css_above_scripts, convert_jpeg_to_webp, remove_comments, combine_javascript, and collapse_whitespace.
To get started, let’s edit the config file:
cd /etc/httpd/conf.d
vi pagespeed.conf
Scroll down to the section with commented out lines like this:
ModPagespeedEnableFilters (some filters...)
And add these lines:
ModPagespeedEnableFilters move_css_to_head,move_css_above_scripts
ModPagespeedEnableFilters remove_comments,collapse_whitespace
ModPagespeedEnableFilters combine_javascript,convert_jpeg_to_webp
If you don’t remember your vi shortcuts (as I didn’t), here’s a need-to-know primer:
  • Scroll with the arrow keys
  • hit “i” to begin editing
  • hit escape to stop editing
  • enter “:wq” to save and quit
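Once the file is saved, it can be handy to confirm which filters are actually switched on. The sketch below lists every filter named on an uncommented ModPagespeedEnableFilters line. The enabled_filters function is a name I made up; the config path in the usage comment is the one used in this post.

```shell
# Print every filter enabled in a pagespeed config file, one per line, sorted.
# Commented-out lines are skipped.
# (enabled_filters is a hypothetical helper name.)
enabled_filters() {
  grep -E '^[[:space:]]*ModPagespeedEnableFilters[[:space:]]' "$1" |
    sed -E 's/^[[:space:]]*ModPagespeedEnableFilters[[:space:]]+//' |
    tr ',' '\n' | tr -d ' \t\r' | grep -v '^$' | sort -u
}

# Usage:
#   enabled_filters /etc/httpd/conf.d/pagespeed.conf
# Apache can also syntax-check its config files before you restart:
#   apachectl configtest
```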
Now restart Apache:
service httpd restart
And prime mod_pagespeed by loading the sample URL once:
http://(your ec2 instance)/sample.html
Wait a little bit for mod_pagespeed to do its thing in the background, then load the page again. You should see something like this:
 modpagespeed_after4
A number of new interesting things just happened:
  1. No more comments or extra spaces in the HTML file
  2. The CSS link is now in the head tag, where it should be
  3. The nature.jpg file now shows a .webp extension! (if you’re using a supported browser, like Chrome)
  4. The locally served javascript files were combined into one
The WebP conversion is particularly cool, as the .webp version of that .jpg file is significantly smaller (296 kb vs. 530 kb, or 44% smaller!). WebP is not supported by every browser, though, so mod_pagespeed decides for you which format to serve!
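For the curious, the 44% figure comes straight from the two sizes quoted above. Here is the arithmetic as a small sketch; percent_smaller is a made-up helper name, and it assumes awk is available (it is on practically every Linux install).

```shell
# Percentage saved when a file shrinks from $1 to $2 (same units), rounded
# to the nearest whole percent. (percent_smaller is a hypothetical name.)
percent_smaller() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (1 - after / before) * 100 }'
}

percent_smaller 530 296   # prints 44
```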

In Closing

While the example above was simplified for purposes of demonstration, I think you can see the true power of this module by walking through these steps. And this is just the tip of the iceberg! Check out the filter page for many more interesting customizations, including lazy loading images, deferring javascript execution, statistics for A/B testing and more!
Of course, as with anything, you should test these changes well when applying them to your (more complex) site. The default filters were conservatively chosen by the Google engineers to “do no harm”, but individual results may vary. What’s great is that mod_pagespeed is highly customizable, allowing you to turn filters on and off at will based on the needs of your site.
I hope you found this walkthrough beneficial. In future posts I will dive into more advanced customization scenarios.
And of course if you want to validate the success of turning on mod_pagespeed, as well as identify over 400 common causes of slow performance, check out our free website audit report at http://zoompf.com/free.
