File expiration

Keeping cached objects consistent with the original object on the content server is known as maintaining cache freshness. For each document or other object that it caches, Caching Proxy computes a time at which the object expires.

For HTTP pages, the header of the document, generated by the content server, contains the expiration information.

Because the FTP protocol does not include equivalent expiration information, Caching Proxy generates its own Last-Modified: header for FTP files, based on the FTP directory information for each file, and uses this header to compute expiry times. If the proxy server cannot obtain directory information for the file from the FTP server, it uses the Cache Default Expiry value that matches the FTP URL. In addition, because there is no standard date format for FTP servers, Caching Proxy might be unable to interpret the date and time sent by some FTP servers. In that case, it also uses the matching Cache Default Expiry value. This procedure allows the proxy to manage the caching of HTTP pages and FTP files in a similar manner.

Expiration can be specified by a content server in one of several ways (in order of preference):

  1. The content server sends a Cache-Control: s-maxage=n header. This tells the proxy that the object is fresh for n seconds after it is received. The s-maxage directive applies only to shared caches such as proxies, which is why it takes precedence over max-age.
  2. The content server sends a Cache-Control: max-age=n header. This tells the proxy that the object is fresh for n seconds after it is received.
  3. The content server sends an Expires: n header, where n is a date and time. This tells the proxy that the object is fresh until the time specified by n.
  4. The content server indicates when the document was last modified, using a Last-Modified: n header. The proxy server computes the length of time since the document was last modified, multiplies this by the Cache Last Modified factor set in the proxy configuration file, and assumes that the document is valid for that length of time. For example, if the content server indicates that the document was last modified one week (seven days) ago, and the Cache Last Modified factor is 0.14, then the proxy server assumes that the document is valid for about one day. See Configuring cache freshness for instructions on setting the Cache Last Modified factor.
  5. If none of the above information is specified by the content server, Caching Proxy looks for the Cache Default Expiry setting that matches the current URL and uses that for the expiry time. See Configuring cache freshness for instructions on setting the Cache Default Expiry values.
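The order of preference above can be illustrated with a short Python sketch. It is not Caching Proxy's implementation. The ResponseInfo container, the function name, and the assumption that header values have already been parsed are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ResponseInfo:
    """Hypothetical container for header values that are already parsed."""
    received_at: datetime
    s_maxage: Optional[int] = None            # seconds, Cache-Control: s-maxage=n
    max_age: Optional[int] = None             # seconds, Cache-Control: max-age=n
    expires: Optional[datetime] = None        # Expires header
    last_modified: Optional[datetime] = None  # Last-Modified header


def freshness_lifetime(info, last_modified_factor, default_expiry):
    """Return how long the object is treated as fresh, as a timedelta."""
    if info.s_maxage is not None:              # 1. s-maxage
        return timedelta(seconds=info.s_maxage)
    if info.max_age is not None:               # 2. max-age
        return timedelta(seconds=info.max_age)
    if info.expires is not None:               # 3. Expires (absolute time)
        # Clamping to zero is an assumption of this sketch.
        return max(info.expires - info.received_at, timedelta(0))
    if info.last_modified is not None:         # 4. Last-Modified heuristic
        age = max(info.received_at - info.last_modified, timedelta(0))
        return age * last_modified_factor
    return default_expiry                      # 5. Cache Default Expiry


# The example from the text: modified seven days ago, factor 0.14.
now = datetime(2024, 1, 8, tzinfo=timezone.utc)
info = ResponseInfo(received_at=now, last_modified=now - timedelta(days=7))
lifetime = freshness_lifetime(info, 0.14, default_expiry=timedelta(hours=1))
print(lifetime)  # roughly 23.5 hours, that is, about one day
```

Each branch returns as soon as a higher-priority source of expiration information is present, so a lower-priority header is never consulted when a higher one exists.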

After the expiry time is computed as just described, Caching Proxy checks to see whether there is a Minimum Hold value that applies for this URL. If there is, and the time it specifies is longer than the computed expiry time, then the time specified by the Minimum Hold value is used as the object's expiry time. This is true even if Caching Proxy computes an expiry time of 0 minutes for a document. Therefore, to avoid serving stale content, be cautious about using the Minimum Hold setting. (To set the Minimum Hold value, use the CacheMinHold directive or the Cache Configuration -> Cache Expiry Settings: URL Expiration setting. Refer to Configuring cache freshness for additional information.)

The final expiry time value is checked against the time specified in the Time Margin setting. If the expiry time is greater than the Time Margin value, the document is cached; otherwise, it is not added to the cache. (To set the Time Margin value, use the CacheTimeMargin directive or see the instructions in Configuring cache freshness.)
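As a rough illustration of how the Minimum Hold and Time Margin checks combine, the following Python sketch applies them to an already computed expiry time. The function and parameter names are hypothetical, and the sketch represents all values as durations, which is an assumption about how the settings are compared.

```python
from datetime import timedelta
from typing import Optional, Tuple


def final_expiry(
    computed: timedelta,
    min_hold: Optional[timedelta],
    time_margin: timedelta,
) -> Tuple[timedelta, bool]:
    """Return (expiry, cacheable) after applying Minimum Hold and Time Margin."""
    expiry = computed
    # A Minimum Hold value that is longer than the computed expiry replaces it,
    # even when the computed expiry is zero.
    if min_hold is not None and min_hold > expiry:
        expiry = min_hold
    # The object is cached only if the final expiry exceeds the Time Margin.
    return expiry, expiry > time_margin


# An object that would expire immediately is still held for an hour.
print(final_expiry(timedelta(0), timedelta(hours=1), timedelta(minutes=10)))
# Without a Minimum Hold, a five-minute expiry does not clear a ten-minute margin.
print(final_expiry(timedelta(minutes=5), None, timedelta(minutes=10)))
```

The first call shows why the Minimum Hold setting calls for caution: an object that the content server marked as immediately stale is served from the cache for the full hold period.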

If the document is found in the cache, but it has expired, Caching Proxy issues a special request known as an if-modified-since request to the content server. This request causes the content server to send the document only if it has been modified since it was last received by the proxy. If the document has not been modified, the content server sends a message indicating that, and does not resend the page. In that case, the proxy serves the cached document. For FTP files, the proxy server simulates this if-modified-since process. If it determines that the file has not been changed at the FTP server, it serves the file from the cache. Otherwise, it gets the newer version from the FTP server.
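The revalidation flow for an expired HTTP document can be sketched as follows. This is a simplified model, not the proxy's code: the cache entry layout and the conditional_get callable are hypothetical stand-ins. The one standard detail it relies on is that an HTTP server answers an If-Modified-Since request with status 304 (Not Modified) when the document is unchanged.

```python
from datetime import datetime, timedelta, timezone


def serve_from_cache(entry, now, conditional_get):
    """Serve a cached document, revalidating it first if it has expired.

    entry: dict with 'body', 'fetched_at', and 'expires_at' (hypothetical layout).
    conditional_get: hypothetical callable that sends a request carrying
        If-Modified-Since: <fetched_at> and returns (status, body).
    """
    if now < entry["expires_at"]:
        return entry["body"]              # still fresh, no request needed
    status, body = conditional_get(entry["fetched_at"])
    if status == 304:
        return entry["body"]              # not modified: serve the cached copy
    # Modified: replace the cached copy. Recomputing the new expiry time
    # is omitted from this sketch.
    entry["body"] = body
    entry["fetched_at"] = now
    return body


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
entry = {"body": b"old", "fetched_at": t0, "expires_at": t0 + timedelta(hours=1)}
later = t0 + timedelta(hours=2)
print(serve_from_cache(entry, later, lambda since: (304, None)))    # cached copy
print(serve_from_cache(entry, later, lambda since: (200, b"new")))  # new copy
```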

Additional information about cache freshness

About dates in FTP

This applies to forward proxy configurations only.

Because the FTP protocol does not define dates and times as strictly as the HTTP protocol does, several factors can cause the Last-Modified header generated by the proxy for FTP files to be slightly different from the actual file date. These factors include the following:

When an FTP file expires from the cache, the proxy simulates the HTTP if-modified-since revalidation process for the FTP file. It does this by reissuing the FTP LIST command for the requested file, parsing the file date from the response returned by the FTP server, and comparing this date with the date that the proxy server generated for the Last-Modified header when the file was initially retrieved. If the file date has not changed, then the proxy server marks the cached FTP file as revalidated, sets a new expiration time for the file, and serves the file from the cache rather than retrieving it again from the FTP server. If the two file dates do not match, then the proxy retrieves the file from the FTP server again and caches the new copy with the new file date.

It is not always possible to obtain the directory information for the file from the FTP server. If the proxy is unable to determine the file date for the FTP file, it does not generate a Last-Modified header for the file. Instead, it uses the value specified for the CacheDefaultExpiry directive that matches the URL to determine the length of time to keep the file in the cache. When this time period expires, the proxy always retrieves the file from the FTP server again. If specific FTP files in your cache seem to be using the CacheDefaultExpiry directive very often and are frequently being retrieved (generating a high volume of network traffic), consider specifying a more granular CacheDefaultExpiry value for those specific files. Doing this holds them in the cache for a longer period of time.
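The simulated revalidation for FTP files described in the two preceding paragraphs can be modeled with a small Python sketch. The entry layout, the function name, and the fetch callable are hypothetical. Treating a missing date in the new LIST response as a mismatch is an assumption of the sketch, not documented behavior.

```python
def revalidate_ftp(entry, listed_date, fetch):
    """Decide whether an expired FTP cache entry can be reused.

    entry: dict with 'body' and 'file_date', where 'file_date' is the date the
        proxy recorded at first retrieval, or None if it could not be determined.
    listed_date: date parsed from a fresh FTP LIST response, or None.
    fetch: hypothetical callable that retrieves the file from the FTP server.
    Returns (body, refetched).
    """
    # Reuse the cached copy only when a date was recorded and it still matches.
    if entry["file_date"] is not None and listed_date == entry["file_date"]:
        return entry["body"], False
    # No recorded date, or the dates differ: retrieve and cache the file again.
    entry["body"] = fetch()
    entry["file_date"] = listed_date
    return entry["body"], True


entry = {"body": b"v1", "file_date": "Jan 05 2024"}
print(revalidate_ftp(entry, "Jan 05 2024", lambda: b"v2"))  # dates match, reuse
print(revalidate_ftp(entry, "Feb 01 2024", lambda: b"v2"))  # dates differ, refetch
```

An entry whose file_date is None is always retrieved again, which mirrors the behavior described for files that fall back to the CacheDefaultExpiry value.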

To specify cache expiration settings in the Configuration and Administration forms, use the Cache Configuration -> Cache Expiry Settings -> Time Limit for Cached Files form. For more details on setting cached file expiration dates, see File expiration.

Configuring cache freshness

To specify the expiration times for cached files, in the Configuration and Administration forms, select Cache Configuration -> Cache Expiry Settings. The following forms are useful.

URL-based expiration

Use this form to set the minimum length of time that files are held in the cache, based on their URLs. You can specify different caching behavior for different URL request templates.

To set URL-based file expiration by editing the proxy configuration file, see the reference sections in Appendix B. Configuration file directives for the following directives:

Default expiration settings

Use the Cache Expiration Settings form to specify default expiration settings. You can set different values for HTTP, FTP, and Gopher files, and for used and unused files.

This form also contains additional file-expiration options:

To set default expiration settings by editing the proxy configuration file, see the reference pages for the following directives:

Last Modified Factor settings

Use the Last Modified Factor form to set the value that the proxy uses to calculate an expiration date for cached files with no expiration dates in their headers. You can set different values for files matching different request templates. The first matching template is used to calculate the expiration date.
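The first-match rule for request templates can be illustrated with a short Python sketch. The templates and factors shown are invented examples, and Python's fnmatch wildcard matching is used only as an approximation of request-template matching. The product's own matching rules might differ.

```python
from fnmatch import fnmatchcase


def last_modified_factor(url, rules, fallback=None):
    """Return the factor of the first template that matches the URL.

    rules: ordered list of (template, factor) pairs (hypothetical values).
    """
    for template, factor in rules:
        if fnmatchcase(url, template):
            return factor          # the first match wins, later rules are ignored
    return fallback


# Hypothetical rules, listed from most specific to most general.
rules = [
    ("*/", 0.10),
    ("*.htm*", 0.20),
    ("*", 0.14),
]

print(last_modified_factor("http://example.com/docs/", rules))
print(last_modified_factor("http://example.com/index.html", rules))
print(last_modified_factor("http://example.com/logo.gif", rules))
```

Because the first matching template is used, a catch-all template such as * is only useful as the last entry; placed first, it would hide every template after it.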

To set the Last Modified factor by directly editing the proxy configuration file, see CacheLastModifiedFactor -- Specify the value for determining expiration dates.

Cache time limit

Use the Time Limit for Cached Files configuration form to set the maximum time that a file can remain in the cache. Time limits are set based on request templates, and you can specify that files are discarded or revalidated when the time limit expires. These settings can be used to maintain files whose expiration dates are invalid or files with extremely long expiration times.

To set the maximum expiration time limit for cached files by editing the proxy configuration file, see the following: