what's this then?

Techniques for speeding up a web site include merging background images and using CSS to present parts (tiles) of the merged image where needed, making dynamic pages static by inserting the dynamic content ahead of time where possible, and compressing the resulting single-file web page with gzip. This page details the work I've done recently to shrink this website by about 60%, which lets it send much more content for the same bandwidth.

These techniques are worth a look as they will reduce page loading times for your end-users, and also reduce your server bandwidth requirements.

I did this work because this website is hosted from home on an ADSL link with only 128kbps (that's k bits per second) uplink speed. The web site here is written with SSI (Server Side Includes) to provide some dynamic content as well as static content like the left side panel and footer information.

Another benefit of the html file compression technique described here is that pages made static and compressed also become cacheable. Before this processing, browsers had to reload the whole page on each refresh, rather than simply checking for a more recent file datestamp.

techniques

The following techniques are used to shrink the web site files and improve the web content delivery speed.

background images
(image: tiled background image for header and footer)
This web page used to have six images for background effects. The page title background was made of three images: left and right end images plus the expanding middle image. Then there's the bottom bar background, another expanding (repeat-x) image. These four images were merged into one image, and the CSS altered to display tiles from that one image where required. The other two images belong to the left panel menu, covered further down.
Here are the matching CSS directives for the header and footer backgrounds:
#head		{background:url(/image/abc.png) 0 0 repeat-x;height:48px}
#head div	{background:url(/image/abc.png) 0 -48px no-repeat;height:48px}
#head div div	{background:url(/image/abc.png) 100% -96px no-repeat;height:48px}
#foot		{background:url(/image/abc.png) 0 -144px repeat-x;height:22px}

Here are the file sizes for the four individual images and the merged image:
-rw-r--r-- 1 grant wheel 2045 2008-08-21 08:50 abc.png
-rw-r--r-- 1 grant wheel  152 2008-08-21 08:50 bottom.png
-rw-r--r-- 1 grant wheel  946 2008-08-21 08:49 topleft.png
-rw-r--r-- 1 grant wheel  194 2008-08-21 08:49 topmid.png
-rw-r--r-- 1 grant wheel  970 2008-08-21 08:49 topright.png

Notice there's only a small saving in file size, so why bother? It's about reducing the number of files requested from the web server: four file requests were replaced with only one for the background effect images. The fewer files to fetch, the faster a web page loads, as the cached image file is reused for the other image tiles.
Likewise for the left panel menu, the two image files were merged into one image. It's only one pixel high, displayed here 15 pixels high to make it visible.
(image: left panel menu background combined image)
Use repeat-y to fill the vertical area as required.
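To put numbers on the header and footer merge, here is a small sketch that adds up the byte counts from the file listing above. The variable names are mine, chosen only for illustration.

```shell
# Sizes in bytes, copied from the ls listing above.
separate=$((152 + 946 + 194 + 970))   # bottom, topleft, topmid, topright
merged=2045                           # abc.png
saved=$((separate - merged))

echo "separate: $separate bytes, merged: $merged bytes, saved: $saved bytes"
echo "requests: 4 before, 1 after"
```

The byte saving is roughly a tenth of the original total, so the drop from four requests to one is the real gain.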
html files
Most of the .html pages on this site use SSI (Server Side Includes) to get the left panel menu and the footer text. These includes are for editing convenience and rarely change. Some pages need SSI processing at delivery time; these pages cannot be compressed.
So the process to compress the .html pages is to first insert those semi-static SSI pages so there's only one file, then compress that file to *.html.gz.
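The compression step on its own can be sketched as below. This is only an illustration, not the pak-web script: the scratch directory and `page.html` are made-up names, and the page stands in for one whose SSI includes have already been merged in.

```shell
# Hypothetical demo page in a scratch directory; substitute a real merged page.
workdir=$(mktemp -d)
page="$workdir/page.html"
printf '<html><body>%s</body></html>\n' "hello hello hello hello hello" > "$page"

# -9: best compression, -c: write to stdout so the original file is kept.
gzip -9 -c "$page" > "$page.gz"

# Sanity check: the compressed copy must decompress to the original bytes.
gzip -dc "$page.gz" | cmp -s - "$page" && echo "ok: $page.gz matches $page"
```

Keeping the uncompressed file next to the .gz matters, because the server still needs it for browsers that do not accept gzip.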
text files
Lots of them on this site, as it offers source code for many scripts and so on. I used to symlink a 'normal' filename to the gzipped text file to trick the server into sending compressed text; however, this trick may confuse some browsers.
Doing the text file compression properly is a case of compressing the text file and letting the server decide when it can send the compressed version.
gotchas: side effects
One needs to remember to delete a compressed file.html.gz page before editing the uncompressed file so that the edit results can be seen immediately in the test browsers. Otherwise edit changes do not appear as the server is happily sending the old compressed version of the file instead of the freshly edited one.
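One way to catch this gotcha is to look for compressed files that are older than their source. The function below is a sketch under that assumption; `find_stale_gz` is a hypothetical name and is not part of the scripts presented on this page.

```shell
# List compressed pages that are older than their uncompressed source.
# Pass the web root as the first argument (defaults to the current directory).
find_stale_gz() {
    find "${1:-.}" -type f \( -name '*.html.gz' -o -name '*.txt.gz' \) |
    while IFS= read -r gz; do
        src=${gz%.gz}
        # -nt: true when the source was modified more recently than the .gz
        if [ -f "$src" ] && [ "$src" -nt "$gz" ]; then
            printf '%s\n' "$gz"
        fi
    done
}
```

Running `find_stale_gz` on the web root prints the stale .gz files, which can then be deleted or regenerated.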

adjusting the web server

Okay, now that you've seen how to compress and optimise your web pages, we turn to how to tell the web server when to deliver those compressed files to the user agent (browser).

This web server information is based on Apache 1.3.41 and the settings are in the /etc/apache/httpd.conf file. You may adapt these techniques to .htaccess control files, but I have done no testing for that method here.

ssi: server side includes
The 'XBitHack Full' option uses the execute flag to decide whether to process a web page for SSI directives, which is why the pages here are all .html instead of .shtml.
	# SSI support
	AddType text/html .shtml
	AddHandler server-parsed	.shtml
	XBitHack Full
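
On the file system side, marking a page for SSI parsing comes down to setting its user-execute bit. The sketch below uses a scratch directory and two made-up page names to show this, along with a way to list the pages so marked. As I understand the mod_include documentation, the Full setting additionally checks the group-execute bit to decide whether to send a Last-Modified header.

```shell
# Scratch directory with two hypothetical pages.
webroot=$(mktemp -d)
echo '<!--#include virtual="/menu.html" -->' > "$webroot/dynamic.html"
echo '<p>plain page</p>' > "$webroot/static.html"
chmod 644 "$webroot/dynamic.html" "$webroot/static.html"

# Set the user-execute bit so Apache parses this page for SSI directives.
chmod u+x "$webroot/dynamic.html"

# List the pages the server would treat as SSI pages (user-execute bit set).
ssi_pages=$(find "$webroot" -name '*.html' -type f -perm -100)
echo "$ssi_pages"
```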

add the new compressed file types
This is what tells the browser about the compressed content, otherwise the browser may offer a download dialog for the content or ignore the compressed files.
	# for compressed html, txt
	AddEncoding x-gzip .gz
	AddType text/plain;charset=us-ascii		.txt.gz
	AddType text/html;charset=utf-8		.html.gz

tell the server when to send compressed files
The server should only send .gz compressed files when the browser indicates it can handle the compression and the compressed file exists.
        # perhaps send text files compressed
        RewriteCond %{HTTP:Accept-Encoding} gzip
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME}.gz -f
        RewriteRule (.*)\.html$         $1.html.gz              [L]

        RewriteCond %{HTTP:Accept-Encoding} gzip
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME}.gz -f
        RewriteRule (.*)\.txt$          $1.txt.gz               [L]
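
The decision these rules make can be mimicked in a few lines of shell, which may help when reasoning about which file a given request ends up with. This is only a model of the logic, not something Apache runs; `pick_file` is a hypothetical helper name.

```shell
# Echo the file that would be served for a request, mimicking the two
# RewriteCond tests: the client accepts gzip AND the .gz file exists.
# Arguments: $1 = Accept-Encoding header value, $2 = path of requested file.
pick_file() {
    accept=$1
    file=$2
    case "$file" in
        *.html|*.txt)
            case "$accept" in
                *gzip*)
                    if [ -f "$file.gz" ]; then
                        file="$file.gz"
                    fi
                    ;;
            esac
            ;;
    esac
    printf '%s\n' "$file"
}
```

Against a live server, a request such as `curl -sI -H 'Accept-Encoding: gzip'` followed by the page URL should show a Content-Encoding response header when the compressed file is sent.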

pak-web-site

The script presented here does the conversion from SSI (Server Side Includes) dynamic pages to static pages plus the compression. Some pages cannot be made static as they require SSI information available only at page delivery time.

pak-web script
A bash script to remove compressed files and restore *.html.src files when called as pak-web restore, and to merge / compress web files when called as pak-web. There's also a special call used for semi-dynamic pages; an example is at the end of the junkshow script.
pak-web-scan
Bash + awk script to scan for SSI include files and collect their full pathnames for the pak-web-site script.
pak-web-site script
Awk script that reads the SSI files to memory, then checks and replaces the SSI directives with their contents. If a target file is found to have unrecognised SSI directives that file is not compressed.
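
The general idea of replacing SSI directives with file contents can be sketched as follows. This is not the pak-web-site script, just a minimal illustration: `inline_ssi` is a hypothetical name, it only understands `include virtual` directives with paths relative to the document root, and it reads each include file on demand rather than holding them all in memory.

```shell
# Usage: inline_ssi DOCROOT PAGE > merged.html
# Exit status 1 means an unrecognised or unreadable directive was found,
# so the page should be left dynamic and not compressed.
inline_ssi() {
    awk -v root="$1" '
    {
        replaced = 0
        # Replace every include-virtual directive on the line with the file contents.
        while (match($0, /<!--#include virtual="[^"]*" *-->/)) {
            directive = substr($0, RSTART, RLENGTH)
            printf "%s", substr($0, 1, RSTART - 1)
            rest = substr($0, RSTART + RLENGTH)
            # Extract the path between the double quotes.
            split(directive, parts, "\"")
            file = root parts[2]
            while ((r = (getline line < file)) > 0) print line
            if (r < 0) bad = 1          # include file could not be read
            close(file)
            $0 = rest
            replaced = 1
        }
        # Any SSI directive still on the line is one this sketch cannot handle.
        if ($0 ~ /<!--#/) bad = 1
        if (!replaced || $0 != "") print
    }
    END { exit (bad ? 1 : 0) }
    ' "$2"
}
```

A caller would write the output to a temporary file and only go on to gzip it when the exit status is zero.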

sitewide edits on the command line

$ find . -name "*.html" -type f | while IFS= read -r x; do mv "$x" "$x~"; awk '
	# put a space before the closing --> of each SSI directive
	/"-->/{sub(/"-->/, "\" -->")}
	{print}' "$x~" > "$x"; [ -x "$x~" ] && chmod +x "$x"; done