pak-web-site: speed up your web site
what's this then?
Techniques for speeding up a web site include merging background images and using CSS to present parts (tiles) of an image where needed, making dynamic pages static by inserting the dynamic content where possible, and compressing the single web page with gzip. This page details the work I've done recently to shrink this website by about 60%, or make it able to send much more content for the same bandwidth.
These techniques are worth a look as they will reduce page loading times for your end-users, and also reduce your server bandwidth requirements.
I did this work because this website is hosted from home on an ADSL link with only 128kbps (that's k bits per second) uplink speed. The web site here is written with SSI (Server Side Includes) to provide some dynamic content as well as static content like the left side panel and footer information.
Another benefit of the html file compression technique described here is that pages made static and compressed also become cacheable. Before this processing, browsers had to reload the whole page on every refresh, rather than simply checking for a more recent file datestamp.
techniques
The following techniques are used to shrink the web site files and improve the web content delivery speed.
- background images
This web page used
to have six images for background effects. The page title background
is made of three images: the left and right end images, plus the
expanding middle image. Then there's the bottom bar background,
another expanding (repeat-x) image. These
four images were merged into one image, and the CSS altered to
display tiles from that one image where required.
- Here are the matching CSS directives for the header and footer backgrounds:
#head {background:url(/image/abc.png) 0 0 repeat-x;height:48px}
#head div {background:url(/image/abc.png) 0 -48px no-repeat;height:48px}
#head div div {background:url(/image/abc.png) 100% -96px no-repeat;height:48px}
#foot {background:url(/image/abc.png) 0 -144px repeat-x;height:22px}
- Here are the file sizes for the four individual images and the merged image:
-rw-r--r-- 1 grant wheel 2045 2008-08-21 08:50 abc.png
-rw-r--r-- 1 grant wheel  152 2008-08-21 08:50 bottom.png
-rw-r--r-- 1 grant wheel  946 2008-08-21 08:49 topleft.png
-rw-r--r-- 1 grant wheel  194 2008-08-21 08:49 topmid.png
-rw-r--r-- 1 grant wheel  970 2008-08-21 08:49 topright.png
- Notice there's only a small saving in file size (2262 bytes for the four separate images versus 2045 bytes merged), so why bother? It's about reducing the number of files requested from the web server: we replaced four file requests with only one for the background effect images. The fewer files to fetch, the faster a web page loads, as the cached image file is reused for the other image tiles.
- Likewise for the left panel menu: the two image files were merged into one image. It is only one pixel high, and is displayed here 15 pixels high to make it visible.
Use repeat-y to fill the vertical area as required.
- html files
- Most of the .html pages on this site use SSI (Server Side
Includes) to get the left panel menu and the footer text. These
includes exist for editing convenience and rarely change. Some
pages need SSI processing at delivery time; these pages cannot
be made static, and so are not compressed.
- So the process to compress the .html pages is to first insert those semi-static SSI pages so there's only one file, then compress that file to *.html.gz.
- text files
- Lots of them on this site, as it offers source code for many
scripts and so on. I used to symlink a 'normal' filename to the
gzipped text file to trick the server into sending compressed text,
however this trick may confuse some browsers.
- Doing the text file compression properly is a case of compressing the text file and letting the server decide when it can send the compressed version.
- gotchas: side effects
- One needs to remember to delete a compressed file.html.gz page before editing the uncompressed file so that the edit results can be seen immediately in the test browsers. Otherwise edit changes do not appear as the server is happily sending the old compressed version of the file instead of the freshly edited one.
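The compression step and the stale-file gotcha above can be sketched in a few lines of shell. This is a minimal sketch and not the pak-web script itself: the function names pack_file and drop_stale are made up for illustration, and it assumes a shell whose test command supports -nt (bash and dash both do).

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of pak-web.

# Compress one file next to the original. The original stays in place so
# the server can still send it to clients that do not accept gzip.
pack_file() {
    gzip -9 -c "$1" > "$1.gz"
}

# Remove every .html.gz or .txt.gz under directory $1 whose uncompressed
# source is newer than the compressed copy, printing each name removed.
drop_stale() {
    find "$1" -type f \( -name '*.html.gz' -o -name '*.txt.gz' \) |
    while IFS= read -r gz; do
        src="${gz%.gz}"
        if [ -f "$src" ] && [ "$src" -nt "$gz" ]; then
            rm -f "$gz"
            echo "$gz"
        fi
    done
}
```

Running drop_stale over the document root before a round of testing avoids the "my edit does not show up" surprise; the removed files can be rebuilt afterwards with pack_file.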
adjusting the web server
Okay, now that you've seen how to compress and optimise your web pages, we turn to how to tell the web server when to deliver those compressed files to the user agent (browser).
This web server information is based on Apache 1.3.41, and the settings are in the /etc/apache/httpd.conf file. You may adapt these techniques to .htaccess control files, but I have done no testing for that method here.
- ssi: server side includes
- The 'XBitHack Full' option uses a file's execute flag to decide
whether to process a web page for SSI directives. This is why the
pages here are all .html instead of .shtml.
# SSI support
AddType text/html .shtml
AddHandler server-parsed .shtml
XBitHack Full
- add the new compressed file types
- This is what tells the browser about the compressed content,
otherwise the browser may offer a download dialog for the content
or ignore the compressed files.
# for compressed html, txt
AddEncoding x-gzip .gz
AddType text/plain;charset=us-ascii .txt.gz
AddType text/html;charset=utf-8 .html.gz
- tell the server when to send compressed files
- The server may only send .gz compressed files when the browser
indicates it can handle the compression, and if the compressed file
exists.
# perhaps send text files compressed
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME}.gz -f
RewriteRule (.*)\.html$ $1.html.gz [L]
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME}.gz -f
RewriteRule (.*)\.txt$ $1.txt.gz [L]
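The two conditions in the rewrite rules can be mimicked in shell, which may help when reasoning about what the server ought to send for a given request. This is only a sketch of the decision logic: pick_variant is a hypothetical helper, it never talks to Apache, and it assumes the request path maps directly onto a file under the document root.

```shell
#!/bin/sh
# pick_variant DOCROOT URI ACCEPT_ENCODING
# Hypothetical helper: prints the URI the rules above would end up serving.
pick_variant() {
    docroot=$1
    uri=$2
    accept=$3
    case $uri in
        *.html|*.txt) ;;                # only these suffixes have a RewriteRule
        *) echo "$uri"; return ;;
    esac
    case $accept in
        *gzip*) ;;                      # RewriteCond %{HTTP:Accept-Encoding} gzip
        *) echo "$uri"; return ;;
    esac
    if [ -f "$docroot$uri.gz" ]; then   # RewriteCond ...%{REQUEST_FILENAME}.gz -f
        echo "$uri.gz"
    else
        echo "$uri"
    fi
}
```

Against a live server, a request such as curl -sI -H 'Accept-Encoding: gzip' http://localhost/index.html (the URL is a placeholder) should show a Content-Encoding header whenever the compressed file exists and the rule fires.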
pak-web-site
The script presented here does the conversion from SSI (Server Side Includes) dynamic pages to static pages plus the compression. Some pages cannot be made static as they require SSI information available only at page delivery time.
- pak-web script
- A bash script to remove compressed files and restore *.html.src files when called as pak-web restore, and to merge / compress web files when called as pak-web. There's also a special call used for semi-dynamic pages; an example is at the end of the junkshow script.
- pak-web-scan
- Bash + awk script to scan for SSI include files and collect their full pathnames for the pak-web-site script.
- pak-web-site script
- Awk script that reads the SSI files to memory, then checks and replaces the SSI directives with their contents. If a target file is found to have unrecognised SSI directives that file is not compressed.
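The core idea of replacing SSI directives with their contents can be sketched as below. This is an illustrative sketch under stated assumptions, not the pak-web-site script: the function name inline_ssi is made up, only the include virtual="..." form is handled, the whole line holding a directive is replaced, and any other SSI directive makes the function return a non-zero status so the caller can leave that page alone.

```shell
#!/bin/sh
# inline_ssi FILE DOCROOT
# Hypothetical sketch: print FILE with each include directive line replaced
# by the contents of DOCROOT plus the virtual path named in the directive.
inline_ssi() {
    awk -v root="$2" '
    /<!--#include virtual="[^"]*" *-->/ {
        path = $0
        sub(/.*<!--#include virtual="/, "", path)   # strip up to the path
        sub(/".*/, "", path)                        # strip after the path
        file = root path
        while ((getline line < file) > 0) print line
        close(file)
        next
    }
    /<!--#/ { bad = 1 }     # unrecognised directive: page must stay dynamic
    { print }
    END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

A caller would typically write the output to a temporary file and only compress it when the exit status is zero, which matches the rule that pages with unrecognised directives are not compressed.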
sitewide edits on the command line
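$ find . -name "*.html" -type f | while IFS= read -r x; do
    mv "$x" "$x~"
    # insert a space before the closing "-->" where a quote runs straight into it
    awk '/"-->/{sub(/"-->/, "\" -->")} {print}' "$x~" > "$x"
    [ -x "$x~" ] && chmod +x "$x"   # carry the execute bit over to the new file
  done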