You have your business engine all set up. Your processing algorithms have been optimized to the hilt. Your data model is as scalable as any. And now you are publishing your data to the WWW. Well, the good news is you can still do more - all with plain old HTTP.
Herein I list 3 simple ways you can leverage HTTP to help you better serve your content:
Cache-Control headers
ETag + If-None-Match
gzip
1. Cache-Control headers
Cache-Control is one of the most widely used HTTP caching headers, and for good reason. You generate your response and set a Cache-Control response header with a validity period. The clients, the intermediaries and the web infrastructure at large then all work overtime for you, caching your content for as long as you have marked it valid.
This of course works best for static resources, or for dynamic resources whose validity you can reasonably predict beforehand.
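As a rough illustration of setting a validity period, here is a minimal Python sketch. The function name `cache_control_headers` and its parameters are made up for this example; only the `Cache-Control` header and its `public`, `private` and `max-age` directives come from HTTP itself.

```python
from typing import Dict


def cache_control_headers(max_age_seconds: int, shared: bool = True) -> Dict[str, str]:
    """Build a Cache-Control response header with a validity period.

    Hypothetical helper: `shared=True` lets intermediaries (proxies, CDNs)
    cache the response; `shared=False` restricts caching to the end client.
    """
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must be non-negative")
    scope = "public" if shared else "private"
    # max-age is the number of seconds the response may be reused.
    return {"Cache-Control": f"{scope}, max-age={max_age_seconds}"}


# A static stylesheet that is safe to cache anywhere for one hour.
headers = cache_control_headers(3600)
```

How you attach the resulting header to a response depends on your server framework, which this sketch deliberately leaves out.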
2. ETag + If-None-Match
This is one of the most powerful but, unfortunately, less frequently used techniques. Suppose your content is such that you can't reasonably predict its validity period, which means #1 doesn't work for you. Your next best buddy is the ETag.
This is how it works: You generate your content and set the HTTP response header ETag (entity tag). The ETag represents the state of your resource: if even one bit of the resource content changes, so does its ETag. You can think of the ETag as a simple hash of your content.
OK, so you have set the ETag and sent the response. The next time the client requests the same URL, it sends an If-None-Match request header set to the value of the ETag it received. You can then either regenerate the content or look it up in a cache on your server. If the ETag of your current content matches the If-None-Match value, the content has not changed; the client has told you through If-None-Match that it already has this content. So what do you do? Send nothing! Yes - simply set the response status to HTTP 304 (Not Modified) and send no response body at all. At a minimum, you gain in saved bandwidth (read: performance), but if you have cached the content on your server, you also gain from saved computation.
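The exchange above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_etag`, `etag_matches` and `respond` are hypothetical names, the ETag is taken to be a SHA-256 hash of the body, and the comma-splitting of If-None-Match is a simplification that is good enough for hex-digest tags like the ones generated here.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content: a quoted hex digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _strip_weak(tag: str) -> str:
    """Drop the W/ prefix so weak and strong tags compare by their opaque value."""
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: str, etag: str) -> bool:
    """Return True if the client's If-None-Match value covers our current ETag."""
    if if_none_match.strip() == "*":
        return True
    # Simplification: assumes the tags themselves contain no commas.
    candidates = [c.strip() for c in if_none_match.split(",")]
    return any(_strip_weak(c) == _strip_weak(etag) for c in candidates)


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET of a resource whose content is `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and etag_matches(if_none_match, etag):
        # The client already holds this exact content: 304 with an empty body.
        return 304, headers, b""
    return 200, headers, body


# First visit: no If-None-Match, so the full content is sent along with its ETag.
status, headers, payload = respond(b"hello", None)
# Repeat visit: the client echoes the ETag back and gets a 304 with no body.
repeat_status, _, repeat_payload = respond(b"hello", headers["ETag"])
```

Note that this sketch still hashes the full body on every request, so it saves bandwidth only; to save computation as well, the ETag would have to come from somewhere cheaper, such as a stored version number or a server-side cache.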
3. gzip
With #1 and #2 you benefit by not having to resend your content in certain situations. But even when you can't get away from having to send content, you can still gain in bandwidth by simply compressing it. Of course, you want to compress content only for clients that you know can decompress it, and even then you need to tell the client that the content you sent is compressed and must be decompressed.
No hassle - HTTP makes it fairly straightforward. If the client understands gzip, it sends an Accept-Encoding request header with the string gzip in it. If you (the server) read this header and find the string gzip, you gzip your content and set the Content-Encoding response header to gzip. This tells the client that the content is gzipped and that it must decompress it before providing it to the user.
It's normal for gzip to shrink text content by upwards of 70%, and given how easy it is to compress content, you should be compressing your content right about now.
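The Accept-Encoding / Content-Encoding handshake described above might look like the following Python sketch. The helper names `client_accepts_gzip` and `encode_body` are invented for this example. Two details go slightly beyond the description above and are my own additions: the check parses the header tokens rather than doing a plain substring search, so that a client sending `gzip;q=0` (an explicit refusal) is respected, and the response carries `Vary: Accept-Encoding` so that caches keep compressed and uncompressed variants apart.

```python
import gzip
from typing import Dict, Tuple


def client_accepts_gzip(accept_encoding: str) -> bool:
    """Check whether the Accept-Encoding request header allows gzip."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower().replace(" ", "")
        if params.startswith("q="):
            # A quality value of 0 means "do not send me gzip".
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def encode_body(body: bytes, accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body), gzipping the body only if the client accepts it."""
    # Addition not in the text above: tell caches the response varies by encoding.
    headers = {"Vary": "Accept-Encoding"}
    if client_accepts_gzip(accept_encoding):
        headers["Content-Encoding"] = "gzip"
        return headers, gzip.compress(body)
    return headers, body


# A client that advertises gzip support gets a compressed body.
text = b"hello world " * 100
gz_headers, gz_body = encode_body(text, "gzip, deflate")
# A client that does not mention gzip gets the content untouched.
plain_headers, plain_body = encode_body(text, "identity")
```

The sample text here is highly repetitive, so it compresses far better than typical content would; measure on your own responses before quoting a ratio.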
Note that #2 + #3 put together have problems in IE 6, and you might have to take that into account.
But all in all, optimizing the delivery of your content with HTTP is simple yet powerful, and your web application can only benefit from it.