PHP Caching to Speed up Dynamically Generated Sites

Why would you want to cache?

PHP is a dynamic scripting language, so every time there is a request for a page, the server must first parse the code in your PHP script in order to generate the resulting HTML code seen by a visitor's web-browser.

PHP is ideal for web pages whose content is constantly updated, since each visitor gets a fresh copy of the page. For instance, if your PHP script pulls data from a database, new data shows up in the generated HTML for the next visitor requesting that page as soon as it is in the database. In most cases, though, re-running PHP scripts for data that might not have changed in the first place is taxing on the server. An opcode cache such as APC cuts down on this repeated work by skipping the parsing and compiling steps, and caching the generated output avoids re-running the script at all.

When to implement cache?

Although this component is very fast, implementing it where it is not needed could lead to a loss of performance rather than a gain. We recommend you check for these cases before using a cache:

  • You are making complex calculations that return the same result every time (or whose result changes infrequently)
  • You are using a lot of helpers and the output generated is almost always the same
  • You are accessing database data constantly and this data rarely changes

NOTE Even after implementing the cache, you should check the hit ratio of your cache over a period of time. This can easily be done, especially in the case of Memcache or APC, with the relevant tools that the backends provide.
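
As a rough illustration of the hit ratio mentioned in the note, the sketch below computes it from two counters. The function name and the sample numbers are made up for this example; in practice the counters would be sampled from the backend's own statistics (APCu and Memcached, for instance, both report hit and miss counts).

```php
<?php
declare(strict_types=1);

/**
 * Return the cache hit ratio as a value between 0.0 and 1.0.
 * Returns 0.0 when there have been no lookups at all.
 */
function cacheHitRatio(int $hits, int $misses): float
{
    $lookups = $hits + $misses;
    return $lookups === 0 ? 0.0 : $hits / $lookups;
}

// Hypothetical counters, e.g. sampled from the backend's statistics.
$hits   = 900;
$misses = 100;

printf("Hit ratio: %.1f%%\n", cacheHitRatio($hits, $misses) * 100);
```

A ratio that stays low over time suggests the cached items expire or change too quickly for the cache to pay for itself.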

Caching Behavior

The caching process is divided into 2 parts:

  • Frontend: This part is responsible for checking whether a key has expired and for performing additional transformations on the data before storing it in, and after retrieving it from, the backend.
  • Backend: This part is responsible for communicating with the storage and for writing/reading the data required by the frontend.
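
The split described above can be sketched in a few lines. Both classes below are hypothetical stand-ins written for this example (they are not the classes of any particular library) and assume PHP 8.0 or later: the frontend owns the lifetime and the serialization, while the backend only stores raw strings.

```php
<?php
declare(strict_types=1);

// Hypothetical frontend: knows the lifetime and how to transform data.
final class SerializingFrontend
{
    public function __construct(private int $lifetime) {}

    public function isExpired(int $storedAt, int $now): bool
    {
        return ($now - $storedAt) >= $this->lifetime;
    }

    public function beforeStore(mixed $data): string
    {
        return serialize($data);
    }

    public function afterRetrieve(string $raw): mixed
    {
        return unserialize($raw);
    }
}

// Hypothetical backend: only reads and writes raw strings.
final class ArrayBackend
{
    /** @var array<string, array{raw: string, storedAt: int}> */
    private array $items = [];

    public function write(string $key, string $raw, int $now): void
    {
        $this->items[$key] = ['raw' => $raw, 'storedAt' => $now];
    }

    public function read(string $key): ?array
    {
        return $this->items[$key] ?? null;
    }
}

$frontend = new SerializingFrontend(3600);
$backend  = new ArrayBackend();

// Store at t=1000, read back at t=2000 (timestamps are arbitrary).
$backend->write('news', $frontend->beforeStore(['a', 'b']), 1000);

$entry = $backend->read('news');
$fresh = $entry !== null && !$frontend->isExpired($entry['storedAt'], 2000);
$news  = $fresh ? $frontend->afterRetrieve($entry['raw']) : null;
```

Keeping the two roles apart is what lets the same frontend be paired with a file, Memcached or APC backend without changing how the data is prepared.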

Caching Output Fragments

An output fragment is a piece of HTML or text that is cached as is and returned as is. The output is automatically captured from the ob_* functions or the PHP output so that it can be saved in the cache. A typical use receives the output generated by PHP and stores it in a file, whose contents are then refreshed every 172800 seconds (2 days).
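
A minimal sketch of such a fragment cache, using only plain PHP functions, might look like the following. The function name cachedFragment, the demo file location and the sample markup are assumptions made for this illustration.

```php
<?php
declare(strict_types=1);

/**
 * Return a cached fragment, regenerating it when the cache file is
 * missing or older than $lifetime seconds.
 */
function cachedFragment(string $file, int $lifetime, callable $render): string
{
    if (is_file($file) && (time() - filemtime($file)) < $lifetime) {
        return (string) file_get_contents($file);
    }

    ob_start();                      // capture everything $render echoes
    $render();
    $html = (string) ob_get_clean();

    file_put_contents($file, $html, LOCK_EX);
    return $html;
}

// Hypothetical cache location and fragment, for illustration only.
$file = sys_get_temp_dir() . '/fragment-demo.html';
@unlink($file); // start the demo from an empty cache

$renderCount = 0;
$render = function () use (&$renderCount): void {
    $renderCount++;
    echo '<ul><li>Item</li></ul>';
};

$first  = cachedFragment($file, 172800, $render); // generated and stored
$second = cachedFragment($file, 172800, $render); // served from the file
```

The second call returns the stored copy without running the render callback again, which is the whole point of caching the fragment.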

Simple Introduction of Cache

This entire site, like many, is built in PHP. PHP provides the power to simply 'pull' content from an external source. In the case of my site this is flat files, but it could just as easily be a MySQL database, an XML file, etc.

The downside to this is processing time: each request for one page can trigger multiple database queries, processing of the output, and formatting it for display. This can be quite slow on complex sites (or slower servers).

Ironically, these so-called 'dynamic' sites probably have very little changing content. This page will almost never be updated after the day it is written, yet each time someone requests it the script goes and fetches the content, applies various functions and filters to it, then outputs it to you.

Enter Caching

This is where caching can help us out. Instead of regenerating the page every time, the scripts running this site generate it the first time they're asked to, then store a copy of what they send back to your browser. The next time a visitor requests the same page, the script will know it has already generated one recently, and simply send that to the browser without all the hassle of re-running database queries or searches.

An Illustration

This example shows a request for a "News" page on a website. The news changes daily, so it makes sense to keep it in a database rather than in a static file, so that it can be easily updated and searched. The News page is a PHP script which does the following:

  • Connect to a MySQL database
  • Request 5 most recent news items
  • Sort news items from most recent to oldest
  • Read a template file and substitute variables for content
  • Output the finished page to the user

This takes a considerable amount of time. It's negligible if you get one or two visitors an hour, but if you get 500 visitors an hour it makes a big difference.

Consider the difference between this and a straightforward request for a normal .html file. The web server doesn't have to do any hard work to serve up a .html file; it just finds the file and dumps its contents to the browser. Using caching allows you to experience this speed gain even with dynamic sites.

Continuing the same example, but with caching in place, the first user to request the News page would cause the script to do exactly as above, and in addition actually increase the load by making it write the result to a file as well as to the browser. Subsequent requests, however, would skip all of that work.

On those requests the MySQL database and templates aren't touched; the web server just sends back the contents of a plain .html file to the browser. The request is completed in a fraction of the time, the user gets their page faster, and your server has less load on it - everyone's happy.

Implementing a Cache in PHP

There are various ways of implementing a cache to do this, but the easiest to implement (if maybe not the most efficient) is to use a bit of extra PHP code in your scripts. Most of this example is based on this site, but could easily be applied to any site.

For the purposes of this example it helps to have a small understanding of my website. Basically each page location (e.g. "site/caching") has each / replaced by a . and that file (which contains all the content) is included into the template (so includes/design.caching in this case). The actual filename ends up in a variable called $reqfilename.
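
The mapping just described can be sketched as follows. The function name and the character whitelist are assumptions added for this illustration (the original site's code is not shown); rejecting unexpected characters keeps a crafted location from pointing outside the includes directory.

```php
<?php
declare(strict_types=1);

/**
 * Map a page location such as "site/caching" to a content filename
 * such as "site.caching". Locations containing anything other than
 * letters, digits, "_", "-" and "/" are rejected.
 */
function pageLocationToFilename(string $location): ?string
{
    $location = trim($location, '/');
    if ($location === '' || !preg_match('#^[a-z0-9_\-/]+$#iD', $location)) {
        return null;
    }
    return str_replace('/', '.', $location);
}

$reqfilename = pageLocationToFilename('site/caching');
echo $reqfilename, "\n"; // prints "site.caching"
```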

The Output Buffer

The output buffer, available since PHP 4, is ideal for this. Basically, if you call ob_start() at the start of your program, it suppresses all output until you specifically flush the output buffer. You can therefore easily get at the output of any PHP script.

A Simple Cache

Let's look at the most basic, and rather useless, cache. This little snippet of code will save the output of a call for the "home" page into a file called home.html:

<?php
// start the output buffer
ob_start();
?>

.. Your usual PHP script and HTML here ...

<?php
$cachefile = "cache/home.html";
// open the cache file "cache/home.html" for writing
$fp = fopen($cachefile, 'w');
// save the contents of the output buffer to the file
fwrite($fp, ob_get_contents());
// close the file
fclose($fp);
// send the output to the browser
ob_end_flush();
?>



Not tremendously useful, because now all we have is a script that generates a file called "cache/home.html" each time it is run. But it's a good basis for a cache: it saves the content generated by the PHP script to a file. If you were to visit cache/home.html in a web browser you would see exactly the same page as if you visited the script that generated it, but that's no use unless the user knows where to look for it.

Using the cache files

Now that we have our code to generate a cache file, we need to find a way of using these files constructively. There are two types of request: a 'MISS' and a 'HIT'.

If a user requests a page that has not been requested before, or that was requested long enough ago that it might be out of date, that is considered a MISS. In this situation the script should regenerate the page from its database (or whatever) sources, and save a new cache file.

If a user requests a page that has been requested recently, and is in the cache, the script just needs to pass that file to the user and doesn't need to do anything else. This is known as a HIT.

Checking to see if a page has already been cached is easy:




<?php
$cachefile = "cache/home.html";

if (file_exists($cachefile)) {
    // the page has been cached from an earlier request;
    // output the contents of the cache file as-is
    // (readfile() sends the file without parsing it as PHP)
    readfile($cachefile);

    // exit the script, so that the rest isn't executed
    exit;
}
?>



Placing that code at the start of your script will cause it to use the cached file if it exists, and then exit from the script (so the rest of it will never run). If you have a site that never changes then that's enough, but very few sites never change. The other time when this snippet alone would be enough is if you had a site that only changed every day or so; then you could use cron to empty the cache directory each day. This wouldn't be suitable for many sites, so we need a way of expiring content in the cache so that it isn't used indefinitely.

Expiring Cache Data

There are numerous ways to check if a cache file should be updated. We will look at the two most common here:

Simple Time Expiry

This is probably the best option for most sites: you give the cache files a lifetime (e.g. 5 minutes, 20 minutes, 1 hour), after which they expire and the page is regenerated. Here is how this would work, and when changes would be visible to the user, if a 2 hour expiry time was used. The first visit of the day was at 12:00; there was no valid cache, so the page was generated, and this copy is valid until 14:00. So although the database (and therefore the content of the generated page) was updated at 13:20, any requests received between then and 14:00, when the cache expires, would contain the out-of-date information. The next request at 14:00 will finally call on the database sources again, and the user will see the information added at 13:20.

The database is then updated again at 15:00, but these changes won't be visible until after 16:00, one hour after they were made.

While this approach is suitable for most sites, it's obviously not appropriate for up-to-the-minute news sites, or sites with regularly changing content.
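
The timeline above can be checked with a small sketch. The helper names are made up for this example, and times are expressed as seconds since midnight purely to keep the arithmetic visible.

```php
<?php
declare(strict_types=1);

/** A cached copy is fresh while it is younger than $lifetime seconds. */
function isFresh(int $generatedAt, int $now, int $lifetime): bool
{
    return ($now - $generatedAt) < $lifetime;
}

/** Convert a wall-clock time to seconds since midnight. */
function at(int $hour, int $minute): int
{
    return $hour * 3600 + $minute * 60;
}

$lifetime    = 2 * 60 * 60; // 2 hour expiry
$generatedAt = at(12, 0);   // first visit of the day

// 13:30, after the 13:20 database update: the stale copy is still served.
$servedFromCacheAt1330 = isFresh($generatedAt, at(13, 30), $lifetime);

// 14:00: the copy has expired, so the page is regenerated.
$servedFromCacheAt1400 = isFresh($generatedAt, at(14, 0), $lifetime);

var_dump($servedFromCacheAt1330, $servedFromCacheAt1400);
```

The comparison is the same one used in the file-based check, with the generation time standing in for the cache file's modification time.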

To implement this we simply have to expand the if statement above to include a check of the cache file's modification time:

 

 

 

 

<?php
$cachefile = "cache/home.html";

// 5 minutes
$cachetime = 5 * 60;

// Serve from the cache if it is younger than $cachetime
if (file_exists($cachefile) &&
    (time() - $cachetime < filemtime($cachefile)))
{
    readfile($cachefile);
    echo "<!-- From cache generated ".date('H:i', filemtime($cachefile))." -->\n";
    exit;
}
?>

Putting this together with the previous code we get a basic structure that will cache the output of a page for 5 minutes:

<?php
$cachefile = "cache/".$reqfilename.".html";
$cachetime = 5 * 60; // 5 minutes

// Serve from the cache if it is younger than $cachetime
if (file_exists($cachefile) &&
    (time() - $cachetime < filemtime($cachefile)))
{
    readfile($cachefile);
    echo "<!-- Cached ".date('jS F Y H:i', filemtime($cachefile))." -->\n";
    exit;
}

ob_start(); // start the output buffer
?>

.. Your usual PHP script and HTML here ...

<?php
// open the cache file for writing
$fp = fopen($cachefile, 'w');
// save the contents of the output buffer to the file
fwrite($fp, ob_get_contents());
// close the file
fclose($fp);
// send the output to the browser
ob_end_flush();
?>
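
One refinement worth considering, which is not part of the structure above, is to write the cache through a temporary file and rename it into place, so that a request arriving mid-write does not read a half-written page. The sketch below assumes the temporary file and the cache file live in the same directory; the function name and demo paths are made up for this example.

```php
<?php
declare(strict_types=1);

/**
 * Write $contents to $cachefile by way of a temporary file in the same
 * directory, then rename it into place.
 */
function writeCacheFile(string $cachefile, string $contents): bool
{
    $tmp = tempnam(dirname($cachefile), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false || !rename($tmp, $cachefile)) {
        @unlink($tmp); // clean up the leftover temporary file
        return false;
    }
    return true;
}

// Hypothetical demo directory, for illustration only.
$dir = sys_get_temp_dir() . '/cache-demo';
if (!is_dir($dir)) {
    mkdir($dir, 0777, true);
}

$ok = writeCacheFile($dir . '/home.html', '<p>Hello</p>');
```

In the page script, the fopen/fwrite/fclose sequence could then be replaced by a single call passing ob_get_contents() as the contents.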

Regenerate only When Necessary

An alternative method involves checking to see if the data sources have been modified. This increases the load of each request slightly, because it requires a database connection in the case of DB-based sites, or a query of the file modification time of potentially a few files. It also makes the script slightly more complicated. However, this method prevents unnecessary LARGE queries, such as those required to retrieve data for inclusion in a page, and prevents regenerating pages regularly even when nothing has changed. This is the approach used on this site.

All that is involved here is changing the if clause, for example:

<?php
$cachefile = "cache/".$reqfilename.".html";

// Serve from the cache if it is newer than the last modification
// time of the included file (includes/$reqfilename)
if (file_exists($cachefile) &&
    (filemtime("includes/".$reqfilename) < filemtime($cachefile)))
{
    readfile($cachefile);
    echo "<!-- Cached ".date('H:i', filemtime($cachefile))." -->\n";
    exit;
}

// start the output buffer
ob_start();
?>

.. Your usual PHP script and HTML here ...

<?php
// open the cache file for writing
$fp = fopen($cachefile, 'w');
// save the contents of the output buffer to the file
fwrite($fp, ob_get_contents());
// close the file
fclose($fp);
// send the output to the browser
ob_end_flush();
?>

This could be easily adapted to query a database containing a column for 'datemodified' or something similar.
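
A sketch of that adaptation is shown below. The function name is made up, and the table and column in the comment are hypothetical; the modification timestamp is hard-coded here so the example runs without a database.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a cache file can be served, given the time the
 * underlying data was last modified (as a Unix timestamp).
 */
function cacheIsCurrent(string $cachefile, int $dataModifiedAt): bool
{
    return is_file($cachefile) && $dataModifiedAt < filemtime($cachefile);
}

// In a real script $dataModifiedAt might come from a cheap query such as
//   SELECT MAX(datemodified) FROM news
// (table and column names are hypothetical). Here it is hard-coded.
$cachefile = sys_get_temp_dir() . '/news-demo.html';
file_put_contents($cachefile, '<p>cached news</p>');
touch($cachefile, 2000); // pretend the cache was written at t=2000
clearstatcache();        // make sure filemtime() sees the new time

$olderData = cacheIsCurrent($cachefile, 1500); // data older than the cache
$newerData = cacheIsCurrent($cachefile, 2500); // data changed after caching

var_dump($olderData, $newerData);
```

The query for the modification time has to be much cheaper than the queries that build the page, otherwise the check costs as much as it saves.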

APC, or Alternative PHP Cache, is a free, open-source opcode (operation code) caching extension for PHP. With APC, the compiled form of each PHP script is kept in memory, so repeated requests skip the parsing and compiling steps and run more efficiently.

XCache is an open-source opcode cacher, which means that it accelerates the performance of PHP on servers. It removes the compilation time of PHP scripts by caching their compiled state in shared memory (RAM) and running the compiled version straight from there. This is reported to speed up page generation by up to 5 times, as it also optimizes other aspects of PHP scripts and reduces server load.
The XCache project is led by mOo, who is also a developer of Lighttpd. Lighttpd is designed as a fast, lightweight web server, and the same focus on performance is applied to XCache.

Redis is an open source (BSD licensed), in-memory data structure store, used as database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
You can run atomic operations on these types, like appending to a string; incrementing the value in a hash; pushing an element to a list; computing set intersection, union and difference; or getting the member with highest ranking in a sorted set.
Redis is written in ANSI C and works in most POSIX systems like Linux, *BSD, OS X without external dependencies. Linux and OS X are the two operating systems where Redis is developed and more tested, and we recommend using Linux for deploying. Redis may work in Solaris-derived systems like SmartOS, but the support is best effort. There is no official support for Windows builds, but Microsoft develops and maintains a Win-64 port of Redis.

1) Used for indexing the cache content across the cluster. I have more than a billion keys spread over Redis clusters, and Redis response times are low and stable.
2) Basically, it's a key/value store, so wherever in your application you have something similar, you can use Redis without bothering much.
3) Redis persistence, failover and backup (AOF) will make your job easier.

Memcached is a free and open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.       

1) Yes, it is an optimized memory store that can be used as a cache. I used it for storing cache content that was accessed very frequently (with 50 hits/hour), with a size of less than 1 MB.
2) I allocated only 2 GB out of 16 GB for Memcached, and that was when my single content size was >1 MB.
3) As the content grows near the limits, I have occasionally observed higher response times in the stats (not the case with Redis).

What's the difference between APC and other types of caching?

APC for PHP is one of the most widely used PHP opcode caching solutions in use today. You can utilize APC on a VPS or dedicated server that is running PHP as either DSO or FastCGI. You might want to read about choosing the best PHP handler for your specific needs, and also a more in-depth explanation on what DSO and FastCGI are.

Another common caching module is Memcached. The main difference between it and APC is that Memcached is a distributed and more robust generic caching platform, while APC is specific to PHP. APC is great when you need local caching of objects for your PHP scripts that are relatively small and frequently accessed.

A=> APC (Alternative PHP Cache): Opcode and object cache for PHP.

B=> Memcache: Free and open source, high-performance, distributed memory object caching system.

    1. Opcode cache: APC is an opcode cache and caches PHP bytecode, so it can lead to faster PHP execution. Opcode caching does not apply to Memcache, as it is an out-of-process, in-memory object cache.
    2. Object cache: Both APC and Memcache can be used as an object cache. APC, being an in-process cache, can be a little faster. But Memcache is better if the data is large, as it can be distributed across multiple servers.
    3. Apache restart: An Apache restart resets the APC cache but does not reset Memcache. This is a good thing, as cache warming won't be needed again, so Memcache is the better option from this perspective.
    4. Administration: Since Memcache is managed by an external process, it can also be accessed by non-PHP processes (e.g. Python, or shell utilities like memdump and memccat).
    5.  How to configure manually.

 

Lifetime

A “lifetime” is the time in seconds that a cached item can live before it expires. By default, all created caches use the lifetime set when the frontend was created. You can also set a specific lifetime when saving the data to, or retrieving it from, the cache:

Setting the lifetime when retrieving:

Setting the lifetime when saving:
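
Both cases can be illustrated with a small stand-in class. TinyCache below is hypothetical and written only for this example (it is not the cache component's own API), and it assumes PHP 8.0 or later; the optional $now arguments exist only so the demo does not have to wait for real time to pass.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory cache, used only to illustrate lifetimes.
final class TinyCache
{
    /** @var array<string, array{value: mixed, savedAt: int, lifetime: int}> */
    private array $items = [];

    public function __construct(private int $defaultLifetime) {}

    /** Store a value; $lifetime overrides the default for this entry. */
    public function save(string $key, mixed $value, ?int $lifetime = null, ?int $now = null): void
    {
        $this->items[$key] = [
            'value'    => $value,
            'savedAt'  => $now ?? time(),
            'lifetime' => $lifetime ?? $this->defaultLifetime,
        ];
    }

    /** Fetch a value; $lifetime overrides the stored one for this read. */
    public function get(string $key, ?int $lifetime = null, ?int $now = null): mixed
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        $item = $this->items[$key];
        $age  = ($now ?? time()) - $item['savedAt'];
        return $age < ($lifetime ?? $item['lifetime']) ? $item['value'] : null;
    }
}

$cache = new TinyCache(3600);

// Setting the lifetime when saving: this entry lives for 60 seconds.
$cache->save('robots', ['R2D2', 'C3PO'], 60, 1000);

// Setting the lifetime when retrieving: accept the entry only if it
// is younger than 30 seconds at read time (it is 45 seconds old here).
$strict  = $cache->get('robots', 30, 1045);   // null
$relaxed = $cache->get('robots', null, 1045); // the stored array
```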

Multi-Level Cache

This feature of the cache component allows the developer to implement a multi-level cache. It is very useful because you can save the same data in several cache locations with different lifetimes, reading first from the one with the fastest adapter and ending with the slowest one until the data expires:

<?php

use Libray\Cache\Multiple;
use Libray\Cache\Backend\Apc as ApcCache;
use Libray\Cache\Backend\File as FileCache;
use Libray\Cache\Frontend\Data as DataFrontend;
use Libray\Cache\Backend\Memcache as MemcacheCache;

$ultraFastFrontend = new DataFrontend(
    array(
        "lifetime" => 3600
    )
);

$fastFrontend = new DataFrontend(
    array(
        "lifetime" => 86400
    )
);

$slowFrontend = new DataFrontend(
    array(
        "lifetime" => 604800
    )
);

// Backends are registered from the fastest to the slowest
$cache = new Multiple(
    array(
        new ApcCache(
            $ultraFastFrontend,
            array(
                "prefix" => 'cache',
            )
        ),
        new MemcacheCache(
            $fastFrontend,
            array(
                "prefix" => 'cache',
                "host"   => "localhost",
                "port"   => "11211"
            )
        ),
        new FileCache(
            $slowFrontend,
            array(
                "prefix"   => 'cache',
                "cacheDir" => "../app/cache/"
            )
        )
    )
);

// save() writes the data to every backend
$cache->save('my-key', $data);
?>
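
The read side of a multi-level cache can be sketched with plain arrays standing in for the backends. LevelledCache is a hypothetical class written for this illustration and assumes PHP 8.0 or later; it is not the Multiple class used above, and expire() merely simulates an entry's lifetime running out in one level.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a multi-level cache: each level is a plain
// array, ordered from the fastest backend to the slowest.
final class LevelledCache
{
    public ?int $lastHitLevel = null;

    /** @param array<int, array<string, mixed>> $levels */
    public function __construct(private array $levels) {}

    /** Write the value to every level. */
    public function save(string $key, mixed $value): void
    {
        foreach (array_keys($this->levels) as $i) {
            $this->levels[$i][$key] = $value;
        }
    }

    /** Simulate an entry expiring in one level only. */
    public function expire(int $level, string $key): void
    {
        unset($this->levels[$level][$key]);
    }

    /** Read from the fastest level that still holds the key. */
    public function get(string $key): mixed
    {
        foreach ($this->levels as $i => $level) {
            if (array_key_exists($key, $level)) {
                $this->lastHitLevel = $i;
                return $level[$key];
            }
        }
        $this->lastHitLevel = null;
        return null;
    }
}

$cache = new LevelledCache([[], [], []]); // e.g. APC, Memcache, file
$cache->save('my-key', 'value');
$cache->expire(0, 'my-key');              // the shortest lifetime ran out

$value = $cache->get('my-key');           // found in level 1
```

Because the fastest level has the shortest lifetime, a miss there falls through to the next level instead of going straight back to the original data source.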

Frontend Adapters

The available frontend adapters that are used as interfaces or input sources to the cache are:

Implementing your own Frontend adapters

The \Cache\FrontendInterface interface must be implemented in order to create your own frontend adapters or extend the existing ones.

Backend Adapters

The backend adapters available to store cache data are:

Implementing your own Backend adapters

The \Cache\BackendInterface interface must be implemented in order to create your own backend adapters or extend the existing ones.

File Backend Options

This backend will store cached content into files in the local server. The available options for this backend are:

Memcached Backend Options

This backend will store cached content on a memcached server. The available options for this backend are:

APC Backend Options

This backend will store cached content on Alternative PHP Cache (APC). The available options for this backend are:

Mongo Backend Options

This backend will store cached content on a MongoDB server. The available options for this backend are:

XCache Backend Options

This backend will store cached content on XCache. The available options for this backend are:

Redis Backend Options

This backend will store cached content on a Redis server. The available options for this backend are:

 

 

 

Where not to use Caching

Caching should not be used for some things, the most obvious being search results, forums, etc., where the content has to be up-to-the-minute and changes depending on the user's input. It's also advisable to avoid using this method for things like a "Latest News" page. In general, don't use it on any page that you wouldn't want the end user's browser or proxy to cache. McKillop
