Caching at HTTP-Level

This is more of a proof of concept: I wanted to implement caching at the HTTP level in an effort to save network bandwidth. It works, but I feel it is nowhere near ready for production use, so any comments are welcome.


<?php
class HttpCacheFilter extends CFilter
{
	/**
	 * Timestamp for the last modification date. Must be a string parsable by {@link http://php.net/DateTime DateTime}.
	 * @var string
	 */
	public $lastModified;

	/**
	 * Seed for the ETag. Can be anything that passes through {@link http://php.net/serialize serialize()}.
	 * @var mixed
	 */
	public $etagSeed;

	public function preFilter($filterChain)
	{
		if($this->lastModified && $lastModified=new DateTime($this->lastModified))
		{
			// Answer with 304 if the client's copy is at least as new as the content.
			// DateTime objects can be compared directly.
			if(key_exists('HTTP_IF_MODIFIED_SINCE', $_SERVER) && new DateTime($_SERVER['HTTP_IF_MODIFIED_SINCE'])>=$lastModified)
			{
				$this->send304();
				return false;
			}
			// HTTP dates have to be expressed in GMT
			header('Last-Modified: '.gmdate('D, d M Y H:i:s', $lastModified->getTimestamp()).' GMT');
		}
		else if($this->etagSeed)
		{
			$etag='"'.base64_encode(hash('ripemd160', serialize($this->etagSeed), true)).'"';
			if(key_exists('HTTP_IF_NONE_MATCH', $_SERVER) && $_SERVER['HTTP_IF_NONE_MATCH']==$etag)
			{
				$this->send304();
				return false;
			}
			header('ETag: '.$etag);
		}
		header('Cache-Control: must-revalidate, proxy-revalidate, private');
		return true;
	}

	/**
	 * Send the 304 HTTP status code to the client
	 */
	private function send304()
	{
		header($_SERVER['SERVER_PROTOCOL'].' 304 Not Modified');
	}
}

This filter can be used much like COutputCache (see the guide on filters for more). Last-Modified is favoured over ETags for SEO reasons. I decided not to use both at once since that would add unnecessary overhead: according to the RFCs, a client would have to check both.

Actually, I don’t understand the idea of checking the cache headers in this code…

The browser doesn’t query the web page if it is cached, so there is no request to the server…

Maybe I’m missing something…

The idea is to cache entire pages on the client side. Caching on the server side will spare you from rendering the same thing over and over again. Caching on the client side will also eliminate the need to transmit that page ;)
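To make the round trip concrete: the browser still sends a request for a page it has cached, but marks it as conditional, and the server may then answer with a bodiless 304 instead of the full page. Here is a minimal sketch of that decision, independent of Yii. The function name `conditionalStatus()` and the sample dates are made up for illustration.

```php
<?php
/**
 * Decide which status code a conditional GET should receive.
 * Hypothetical helper, not part of the filter above.
 *
 * @param array $server       request data in the shape of $_SERVER
 * @param int   $lastModified Unix timestamp of the last content change
 * @return int 304 if the client's copy is still fresh, 200 otherwise
 */
function conditionalStatus(array $server, $lastModified)
{
	if (!isset($server['HTTP_IF_MODIFIED_SINCE'])) {
		return 200; // first visit: nothing cached yet
	}
	$since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
	if ($since === false) {
		return 200; // unparsable date: play it safe and send the page
	}
	return $since >= $lastModified ? 304 : 200;
}

$changed = strtotime('2011-03-01 12:00:00 GMT');

// First request: no conditional header, so the full page is sent.
echo conditionalStatus(array(), $changed), "\n";
// Repeat request with a copy fetched after the last change: headers only.
echo conditionalStatus(array('HTTP_IF_MODIFIED_SINCE' => 'Tue, 01 Mar 2011 13:00:00 GMT'), $changed), "\n";
// Repeat request with a copy older than the last change: full page again.
echo conditionalStatus(array('HTTP_IF_MODIFIED_SINCE' => 'Mon, 28 Feb 2011 09:00:00 GMT'), $changed), "\n";
```

So a request does reach the server each time, but in the second case only a handful of header bytes travel back.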

Out of curiosity: Has anyone experienced any problems with this so far?

Thank you for a good filter. I’ve modified some of the code according to my own ideas.

I think lastModified and eTag should be calculated lazily. Why?

We do not need to apply this filter to all actions of a controller. For example, with a controller ‘Article’, we should apply this filter to the action ‘view’ only; the others (edit, comment, vote) are outside this filter. So if we always calculate the ETag (or LastModified) whenever the Controller::filters() function is called, resources are wasted.




<?php

/**
 * @property string $lastModified
 * @property string $eTag
 */
class HttpCacheFilter extends CFilter
{
	public $cacheControl = 'max-age=3600, public';

	/**
	 * Timestamp for the last modification date. Must be a string parsable by {@link http://php.net/DateTime DateTime}.
	 * If its value is a callback function, the function's return value will be used as lastModified
	 * @var string|callback
	 */
	private $_lastModified;

	function setLastModified($v) { $this->_lastModified = $v; }

	/**
	 * @return string
	 */
	function getLastModified()
	{
		if (!isset($this->_lastModified)) return null;

		// Resolve the callback once and remember its result.
		// The private field is used here: reading $this->lastModified would call this getter again.
		if (is_callable($this->_lastModified)) $this->_lastModified = call_user_func($this->_lastModified);
		return $this->_lastModified;
	}

	/**
	 * Value for ETag. It should be a string
	 * If its value is a callback function, the function's return value will be used as ETag
	 * @var string|callback
	 */
	private $_eTag;

	function setETag($v) { $this->_eTag = $v; }

	/**
	 * @return string
	 */
	function getETag()
	{
		if (!isset($this->_eTag)) return null;

		if (is_callable($this->_eTag)) $this->_eTag = call_user_func($this->_eTag);
		return $this->_eTag;
	}

	/**
	 * check for caching by ETag header
	 *
	 * @return boolean true if cached, false otherwise
	 */
	protected function checkETag()
	{
		if (empty($this->eTag)) return false;

		header('ETag: '.$this->eTag);
		header('Cache-Control: ' . $this->cacheControl);

		if (!array_key_exists('HTTP_IF_NONE_MATCH', $_SERVER)) return false;
		if ($_SERVER['HTTP_IF_NONE_MATCH'] != $this->eTag) return false;

		$this->send304();
		return true;
	}

	/**
	 * check for caching by If-Modified-Since header
	 *
	 * @return boolean true if cached, false otherwise
	 */
	protected function checkLastModified()
	{
		if (empty($this->lastModified)) return false;

		// DateTime throws an exception on unparsable input instead of returning false
		$lastModified = new DateTime($this->lastModified);

		// HTTP dates have to be expressed in GMT
		header('Last-Modified: '.gmdate('D, d M Y H:i:s', $lastModified->getTimestamp()).' GMT');
		header('Cache-Control: ' . $this->cacheControl);

		if (!array_key_exists('HTTP_IF_MODIFIED_SINCE', $_SERVER)) return false;
		// Content changed after the client fetched its copy: send the full response
		if ($lastModified > new DateTime($_SERVER['HTTP_IF_MODIFIED_SINCE'])) return false;

		$this->send304();
		return true;
	}

	public function preFilter($filterChain)
	{
		if ($this->checkETag()) return false;
		if ($this->checkLastModified()) return false;

		return true;
	}

	/**
	 * Send the 304 HTTP status code to the client
	 */
	private function send304()
	{
		header($_SERVER['SERVER_PROTOCOL'].' 304 Not Modified');
	}
}



Thanks for your input. The lazy calculation approach is a very valid point. However, I think a set of expressions like COutputCache.varyByExpression would be a better way of implementing this, so we would have an eTagExpression and a lastModifiedExpression. IMHO, that’s a cleaner way of doing this. evaluateExpression() would still allow callbacks to be used.
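For readers unfamiliar with the expression approach, the sketch below imitates what an evaluator of this kind does: a string is evaluated as PHP code, anything else is treated as a callback, and in both cases nothing is computed until the filter actually runs. The function `evaluateLazily()` and the configuration array are made-up stand-ins for illustration, not Yii’s actual evaluateExpression() implementation.

```php
<?php
/**
 * Evaluate either a PHP expression string or a callback on demand.
 * Hypothetical stand-in written for illustration only.
 *
 * @param string|callable $expression
 * @return mixed
 */
function evaluateLazily($expression)
{
	if (is_string($expression)) {
		// Strings are run as PHP code, so never pass user input here.
		return eval('return ' . $expression . ';');
	}
	return call_user_func($expression);
}

// Nothing is computed while the configuration is being declared ...
$config = array(
	'lastModifiedExpression' => 'gmdate("Y-m-d", 86400)',
	'etagSeedExpression'     => function () { return array('post', 42); },
);

// ... only when the filter would run for the matching action.
echo evaluateLazily($config['lastModifiedExpression']), "\n";
echo serialize(evaluateLazily($config['etagSeedExpression'])), "\n";
```

Either way the expensive part (a database query, say) is deferred until an action that uses the filter is actually requested.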

I’ll be using this in a new project soon, so stay tuned for updates B)

You may have already researched this, but I was evaluating Symfony 2 which uses HTTP Cache.

Maybe this could help: http://symfony.com/doc/2.0/book/http_cache.html

That’s an interesting read, thanks. Especially the section concerning the Cache-Control header has been very valuable.

So, here’s an updated version:




<?php
class HttpCacheFilter extends CFilter
{
	/**
	 * Timestamp for the last modification date. Must be a string parsable by {@link http://php.net/strtotime strtotime()}.
	 * @var string
	 */
	public $lastModified;

	/**
	 * Expression for the last modification date. If set, this takes precedence over {@link lastModified}.
	 * @var string|callback
	 */
	public $lastModifiedExpression;

	/**
	 * Seed for the ETag. Can be anything that passes through {@link http://php.net/serialize serialize()}.
	 * @var mixed
	 */
	public $etagSeed;

	/**
	 * Expression for the ETag seed. If set, this takes precedence over {@link etagSeed}.
	 * @var string|callback
	 */
	public $etagSeedExpression;

	/**
	 * Http cache control headers
	 * @var string
	 */
	public $cacheControl = 'max-age=3600, public';

	public function preFilter($filterChain)
	{
		if($this->lastModified || $this->lastModifiedExpression)
		{
			if($this->lastModifiedExpression)
			{
				$value=$this->evaluateExpression($this->lastModifiedExpression);
				if(($lastModified=strtotime($value))===false)
					throw new CException("HttpCacheFilter.lastModifiedExpression evaluated to '{$value}' which could not be understood by strtotime()");
			}
			else
			{
				if(($lastModified=strtotime($this->lastModified))===false)
					throw new CException("HttpCacheFilter.lastModified contained '{$this->lastModified}' which could not be understood by strtotime()");
			}

			if(key_exists('HTTP_IF_MODIFIED_SINCE', $_SERVER) && strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'])>=$lastModified)
			{
				$this->send304();
				return false;
			}

			// HTTP dates have to be expressed in GMT
			header('Last-Modified: '.gmdate('D, d M Y H:i:s', $lastModified).' GMT');
		}
		elseif($this->etagSeed || $this->etagSeedExpression)
		{
			if($this->etagSeedExpression)
				$etag=$this->generateEtag($this->evaluateExpression($this->etagSeedExpression));
			else
				$etag=$this->generateEtag($this->etagSeed);

			if(key_exists('HTTP_IF_NONE_MATCH', $_SERVER) && $_SERVER['HTTP_IF_NONE_MATCH']==$etag)
			{
				$this->send304();
				return false;
			}

			header('ETag: '.$etag);
		}

		header('Cache-Control: ' . $this->cacheControl);
		return true;
	}

	/**
	 * Send the 304 HTTP status code to the client
	 */
	private function send304()
	{
		header($_SERVER['SERVER_PROTOCOL'].' 304 Not Modified');
	}

	/**
	 * Build a quoted ETag from an arbitrary serializable seed
	 */
	private function generateEtag($seed)
	{
		return '"'.base64_encode(hash('ripemd160', serialize($seed), true)).'"';
	}
}

For fun and profit I put this into the Yii blog demo (which is a superb playground, btw!) under components/HttpCacheFilter.php and registered it in the PostController like this:




	/**
	 * @return array action filters
	 */
	public function filters()
	{
		return array(
			'accessControl', // perform access control for CRUD operations
			array(
				'HttpCacheFilter + index',
				'lastModifiedExpression'=>'Yii::app()->db->createCommand("SELECT FROM_UNIXTIME(MAX(`update_time`)) FROM {{post}} WHERE `status`=:status")->queryScalar(array(":status"=>Post::STATUS_PUBLISHED))',
			),
		);
	}



Result: When pulling the site index a second time with the right headers, response times drop from ~41ms to 9ms.

Update: I plan to release this as an extension to Yii. I think the chances of this landing in upstream Yii are rather slim. I tried to tie the existing cache dependency system into the filter, but that would greatly increase complexity while bringing little benefit over simply putting a CCacheDependency into HttpCacheFilter.etagSeed.

good to know thanks.

I branched my yii fork on github to contain this: https://github.com/DaSourcerer/yii/tree/http-caching

Excellent idea. Could be useful for api responses. Thanks for sharing!

Indeed. jQuery can make use of ETags (cf. jQuery.ajax(): ifModified). But I was really thinking of pages that get polled often but hardly ever change, such as RSS/Atom feeds.

Hm, after consulting the RFC again, it looks like this filter doesn’t comply with HTTP/1.1 … If a server is able to create both a last-modified time and an ETag, it should actually send both. I’ll rework this when I find the time.
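A rough sketch of what checking both validators could look like, based on one reading of RFC 2616, section 13.3.4: a 304 may only be sent if that is consistent with every conditional header the client included. The helper name `isNotModified()` is hypothetical, and the ETag list parsing is deliberately simplified (no weak validators, no commas inside quoted tags).

```php
<?php
/**
 * Decide whether a 304 may be sent when both validators are available.
 * Hypothetical helper, not the code that ended up in the filter.
 *
 * @param array  $server       request data in the shape of $_SERVER
 * @param int    $lastModified Unix timestamp of the last content change
 * @param string $etag         current ETag, including the double quotes
 * @return bool true if a 304 may be sent
 */
function isNotModified(array $server, $lastModified, $etag)
{
	$hasEtag = isset($server['HTTP_IF_NONE_MATCH']);
	$hasDate = isset($server['HTTP_IF_MODIFIED_SINCE']);
	if (!$hasEtag && !$hasDate) {
		return false; // unconditional request
	}
	if ($hasEtag) {
		// If-None-Match may carry a comma-separated list of ETags or "*"
		$candidates = array_map('trim', explode(',', $server['HTTP_IF_NONE_MATCH']));
		if (!in_array('*', $candidates, true) && !in_array($etag, $candidates, true)) {
			return false;
		}
	}
	if ($hasDate) {
		$since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
		if ($since === false || $since < $lastModified) {
			return false;
		}
	}
	return true; // every condition the client sent is satisfied
}

$changed = strtotime('2011-03-01 12:00:00 GMT');
$fresh   = 'Tue, 01 Mar 2011 13:00:00 GMT';

// Both validators agree that the cached copy is current.
var_dump(isNotModified(array('HTTP_IF_NONE_MATCH' => '"abc"', 'HTTP_IF_MODIFIED_SINCE' => $fresh), $changed, '"abc"'));
// The date would allow a 304, but the ETag differs, so the page is sent.
var_dump(isNotModified(array('HTTP_IF_NONE_MATCH' => '"old"', 'HTTP_IF_MODIFIED_SINCE' => $fresh), $changed, '"abc"'));
```

With a helper like this, the filter could emit both Last-Modified and ETag headers and still answer conditional requests consistently.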

Exciting news: This filter just made it into Yii \o/

Cool, good work and news! Thanks

nice one!

Da:Sourcerer

Don’t forget about docs ;)

I haven’t forgotten about that ;)