Getting Started with Icicle

Write asynchronous code in PHP with synchronous coding techniques using promises and coroutines

October 21, 2015

PHP is normally used to write synchronous code that is run on a per-request basis within a web server. However, PHP can also be used to create stand-alone, long-running programs. These programs often need to handle many clients or tasks at once without blocking on a single task. Asynchronous operations allow many tasks to be performed cooperatively without blocking, but PHP does not immediately lend itself to asynchronous programming. Icicle is a library that lets you write asynchronous programs entirely in PHP using familiar synchronous coding techniques.

Asynchronous programming has been popularized in the last few years, particularly by node.js, a server-side interpreter for JavaScript. Asynchronous programs use non-blocking I/O to create a single thread of execution that is able to continuously run available tasks without waiting for an external operation to complete. Asynchronous code can be difficult to write and debug due to its reliance on callback functions that generally cannot return values or throw exceptions.

Icicle is a library for writing asynchronous code in PHP that does more than simply enable asynchronous programming. Icicle uses promises to create cooperative coroutines that allow programmers to use synchronous coding techniques to write asynchronous code.

What is Asynchronous Programming?

A synchronous program defines a set of sequential instructions (statements, function calls, etc.) that are executed in order, from top to bottom. If data is needed from a resource outside of that program, such as accessing a file or making a network request, the program waits until the external operation has completed before continuing execution. This is called a blocking request, since the execution of the program is blocked until the external operation has completed.

PHP scripts are generally written using blocking requests. For example, calling the function file_get_contents() to fetch the contents of a file will block execution of the script until the operation has completed. While the process is blocked, no other code can be run within that PHP process.

Asynchronous programs rely on non-blocking code to continuously process available tasks within a single thread of execution. Operations are performed only on data that is immediately available. Since not all data a program requires can be immediately available, requests must be made for data outside the program, and asynchronous programs need a different strategy for external operations that would normally block. To avoid blocking, asynchronous programs use functions that accept a callback function to be executed once the external request has completed. Instead of blocking until the request is completed, the program continues execution even though the result of the request is not yet available. Below is an example of such a function from node.js that resolves the IP address of a domain.

dns.resolve('example.com', 'A', function (err, addresses) {
    // Callback invoked when operation completes or fails.
});
// Code below is executed immediately.

Execution does not block on the call to dns.resolve(); rather, the function only initiates the operation and returns immediately. Any code after the call to dns.resolve() is executed immediately, before the DNS query has completed. The result of the DNS query is not available until the query completes and the callback passed to the function is invoked.

Asynchronous programs rely on an event loop that schedules tasks and invokes callbacks when an event occurs (known as the reactor pattern). An event might be the completion of an external task, available data on a network socket, or expiration of a timer. Event callbacks are executed in an unpredictable order because they often depend on the timing of external operations. The call to dns.resolve() in the code block above is really scheduling a series of tasks in the event loop that will resolve the DNS query. The final step in this series of tasks will be to invoke the callback function with a list of IP addresses (or with an error).
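To make the reactor pattern more concrete, the sketch below is a deliberately tiny event loop written in plain PHP. It is not Icicle's implementation: the class name TinyLoop and its methods are hypothetical, and it handles only deferred callbacks and one-shot timers (no sockets). It does show how callbacks are invoked by the loop, in an order decided by events rather than by the order in which they were registered.

```php
<?php
// Hypothetical, minimal event loop for illustration; NOT Icicle's Loop component.
final class TinyLoop
{
    /** @var callable[] Callbacks to run on the next tick. */
    private $queue = [];

    /** @var array[] Pending timers as [deadline, callback] pairs. */
    private $timers = [];

    /** Schedule a callback to run on the next tick of the loop. */
    public function queue(callable $callback)
    {
        $this->queue[] = $callback;
    }

    /** Schedule a callback to run once after $seconds have elapsed. */
    public function timer($seconds, callable $callback)
    {
        $this->timers[] = [microtime(true) + $seconds, $callback];
    }

    /** Run until no queued callbacks or timers remain. */
    public function run()
    {
        while ($this->queue || $this->timers) {
            // Run the callbacks queued before this tick; callbacks queued
            // while these run wait for the next tick.
            $callbacks = $this->queue;
            $this->queue = [];
            foreach ($callbacks as $callback) {
                $callback();
            }

            // Invoke expired timers, earliest deadline first.
            usort($this->timers, function ($a, $b) {
                return $a[0] < $b[0] ? -1 : ($a[0] > $b[0] ? 1 : 0);
            });
            $now = microtime(true);
            while ($this->timers && $this->timers[0][0] <= $now) {
                $timer = array_shift($this->timers);
                $timer[1]();
            }

            // Nothing is runnable: sleep until the next timer is due.
            if (!$this->queue && $this->timers) {
                $wait = $this->timers[0][0] - microtime(true);
                if ($wait > 0) {
                    usleep((int) ceil($wait * 1e6));
                }
            }
        }
    }
}

$log = [];
$loop = new TinyLoop();
$loop->timer(0.02, function () use (&$log) { $log[] = 'timer 20ms'; });
$loop->timer(0.01, function () use (&$log) { $log[] = 'timer 10ms'; });
$loop->queue(function () use (&$log) { $log[] = 'queued'; });
$log[] = 'before run';
$loop->run();

echo implode(', ', $log), "\n";
// before run, queued, timer 10ms, timer 20ms
```

Note that the 10 ms timer fires before the 20 ms timer even though it was registered second, and that nothing runs until run() is called.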

Creating an Event Loop

PHP does not include an event loop implementation, so to build an asynchronous framework like Icicle, an event loop must be created from pieces available in the language. Fortunately, the PHP core includes all the components necessary to create an event loop, no extensions required! (Some extensions are available that provide more performant event loops; more on this later.)

An event loop needs to provide some essential functionality to build an asynchronous program. This includes, but is not limited to, polling network sockets for available data and executing timers, as well as scheduling and invoking callback functions. The stream_select() function included in PHP uses the select() system call to poll stream sockets for available data or the ability to write to the stream socket. This function accepts arrays of stream sockets and a timeout, then blocks until either one of the streams can be read from or written to without blocking or until the timeout has expired. The timeout parameter given to stream_select() is based on other conditions in the event loop. If there are other events pending in the loop, the timeout can be 0 to quickly poll for stream socket data, returning immediately from stream_select(). If there are timers in the event loop, the timeout parameter can be set to the remaining time on the next pending timer. stream_select() may return before the timeout expires if there is data available on a stream socket, but it will not block longer than the timeout given. If there are no timers, the timeout may be null, causing stream_select() to block indefinitely.
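The timeout rules described above can be summarized in a small function. This is only a sketch of the decision logic: the function name selectTimeout() and its parameters are hypothetical and are not part of Icicle or of PHP.

```php
<?php
/**
 * Hypothetical helper: choose the timeout to pass to stream_select().
 *
 * @param bool       $hasPendingEvents True if other events are ready to run now.
 * @param float|null $nextTimerAt      Timestamp of the next pending timer, or null.
 * @param float      $now              Current timestamp, e.g. microtime(true).
 *
 * @return float|null Seconds to block, or null to block indefinitely.
 */
function selectTimeout($hasPendingEvents, $nextTimerAt, $now)
{
    if ($hasPendingEvents) {
        return 0.0; // Poll only; other work is waiting.
    }
    if (null === $nextTimerAt) {
        return null; // No timers: block until a stream is ready.
    }
    // Block no longer than the time remaining on the next timer.
    return max(0.0, $nextTimerAt - $now);
}

var_dump(selectTimeout(true, 105.0, 100.0));  // float(0)
var_dump(selectTimeout(false, 102.5, 100.0)); // float(2.5)
var_dump(selectTimeout(false, 99.0, 100.0));  // float(0)
var_dump(selectTimeout(false, null, 100.0));  // NULL
```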

One scenario cannot be covered using stream_select() alone: if there are no stream sockets in the event loop waiting to read or write, but there are pending timers. In this case, the usleep() function is used to sleep the process until the next pending timer expires.

Icicle combines stream_select() and usleep() to create an event loop that will work on any installation of PHP. The code below contains pseudo-code based on the event loop implementation in Icicle that uses these two functions.

/*
 * $poll and $await are arrays of stream sockets to poll for
 * data or space to write. $timeout is null or the maximum
 * number of seconds to block.
 */
if ($poll || $await) {
    $seconds = (int) $timeout;
    // stream_select() expects an integer number of microseconds.
    $microseconds = (int) (($timeout - $seconds) * 1e6);
    
    $read   = $poll;
    $write  = $await;
    $except = null;
    
    $count = stream_select(
        $read,
        $write,
        $except,
        null === $timeout ? null : $seconds,
        $microseconds
    );
    
    if ($count) {
        // $read and $write modified to contain only stream
        // sockets with pending data or space to write.
        // Invoke callbacks associated with stream sockets.
    }
} elseif (0 < $timeout) {
    usleep((int) ($timeout * 1e6)); // usleep() expects an integer.
}

This code provides some insight into how the core components of PHP can be used to create an event loop, but is only a small portion of an entire event loop implementation. The code above provides no details on how callbacks are associated with stream sockets or how timers and scheduled functions are invoked. If this interests you, please take a look at the source of the Loop component of Icicle.
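To see stream_select() report readiness without any network setup, the following self-contained sketch uses a connected socket pair in place of a remote peer. The variable names are illustrative only, and the choice of socket domain is an assumption made so the script can run on both Unix-like systems and Windows.

```php
<?php
// Create a connected pair of stream sockets to stand in for a network peer.
// STREAM_PF_UNIX is unavailable on Windows, so fall back to STREAM_PF_INET.
$domain = DIRECTORY_SEPARATOR === '\\' ? STREAM_PF_INET : STREAM_PF_UNIX;
list($local, $remote) = stream_socket_pair(
    $domain,
    STREAM_SOCK_STREAM,
    STREAM_IPPROTO_IP
);
stream_set_blocking($local, false);

// Poll with a zero timeout: nothing has been written yet.
$read   = [$local];
$write  = null;
$except = null;
$before = stream_select($read, $write, $except, 0, 0);

fwrite($remote, "ping\n");

// Wait up to one second; returns as soon as data is readable.
$read   = [$local]; // Must be reset: stream_select() modifies the arrays.
$write  = null;
$except = null;
$after = stream_select($read, $write, $except, 1, 0);
$data  = fread($local, 8192);

printf("ready before write: %d, after write: %d, data: %s", $before, $after, $data);

fclose($local);
fclose($remote);
```

The first call returns 0 because no stream is ready, while the second returns 1 once the peer has written data. An event loop would then invoke the callback associated with $local.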

There are four PHP extensions currently supported by Icicle that can provide a more performant event loop: ev, event, and libevent for PHP 5.x and uv for PHP 7. These extensions move much of the event loop logic from PHP code to C code, improving performance. These extensions also use a faster internal mechanism to poll sockets for data compared to select(), such as kqueue() or poll(), further improving performance.

Getting Started

Icicle can be installed using Composer by adding the icicleio/icicle package to your project requirements. This package contains all the basic components necessary to write an asynchronous program in PHP, including an event loop and additional tools to make writing asynchronous code easier, which will be examined in the following sections.

The code snippet below shows how an executable PHP script can be created with Icicle that can be run from the command line or as a daemon.

#!/usr/bin/env php
<?php
require 'vendor/autoload.php';

use Icicle\Loop;

// Create server or initial tasks.

Loop\run(); // Run the event loop.

A script using Icicle should first create a server or an initial set of tasks, then call Icicle\Loop\run() to run the event loop. This function does not return until the event loop is stopped or there are no pending tasks in the event loop.

The active event loop should be accessed using functions defined in the Icicle\Loop namespace, such as Icicle\Loop\timer() to create a timer or Icicle\Loop\stop() to stop the event loop. Often the only event loop function that a program will need to call is Icicle\Loop\run() since other library components abstract tasks and mitigate direct interaction with the event loop. An event loop instance is automatically created based on available extensions, but automatic creation can be overridden using Icicle\Loop\loop() if your application requires a specific or custom event loop implementation.

For more on installation and using Icicle, please see the documentation.

Promises

Asynchronous programs can be difficult to write and debug since callback functions that cannot return values or throw exceptions must rely on side-effects to control program flow. A callback function for a single operation can be easy to write, but what happens when another asynchronous operation is initiated in a callback function that then invokes another callback function when it completes? And then that operation initiates another operation that invokes yet another callback function. This results in a set of nested callback functions, often referred to as "callback hell."

Promises offer not only a way to avoid "callback hell," but also a means to model problems using interdependencies between values, analogous to functional composition in synchronous programming. Instead of accepting a callback function as a parameter, asynchronous operations in components designed for Icicle return promises.

Promises are objects that act as placeholders for the future value of an asynchronous operation. A promise may be in one of three states: pending, fulfilled, or rejected. Pending promises may either be fulfilled or rejected with any value (note that in Icicle, if a promise is rejected with a non-exception, it is encapsulated in an exception). Once a promise is fulfilled or rejected (resolved), it cannot become pending again and the resolution value cannot change. A promise may also be resolved with another promise, adopting the state of that promise, fulfilling or rejecting with the same value as the resolving promise.

Callback functions are the primary way of accessing the resolution value of promises. Unlike other APIs that use callbacks, promises provide an execution context for callback functions, allowing them to return values and throw exceptions. Callback functions are registered to a promise using the then() method (PromiseInterface refers to Icicle\Promise\PromiseInterface):

PromiseInterface::then(
    callable $onFulfilled = null,
    callable $onRejected = null
): PromiseInterface;

This method accepts two callback functions: the first is executed if the promise is fulfilled, the second if the promise is rejected. Each callback is given a single parameter, either the fulfillment value or rejection reason (exception). then() returns a new promise that is fulfilled with the return value of the invoked callback or rejected with the exception thrown from the invoked callback. The code below shows an example of a call to then() on a promise.

$promise2 = $promise1->then(
    function ($value) {
        // Executed if $promise1 is fulfilled.
        // Fulfills or rejects $promise2.
    },
    function (Exception $exception) {
        // Executed if $promise1 is rejected.
        // Fulfills or rejects $promise2.
    }
);

If the on-fulfilled callback is omitted, the promise returned from then() will be fulfilled with the same value as the parent promise. If the on-rejected callback is omitted, the promise returned from then() will be rejected with the same exception as the parent promise.
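The rules for then() can be illustrated with a small teaching sketch. The SketchPromise class below is hypothetical and is not Icicle\Promise\Promise: it can only be created already resolved, and it invokes callbacks immediately instead of scheduling them in an event loop. It is meant only to show how return values, thrown exceptions, and omitted callbacks determine the state of the promise returned from then().

```php
<?php
// Hypothetical teaching sketch; NOT Icicle's promise implementation.
final class SketchPromise
{
    private $fulfilled; // True if fulfilled, false if rejected.
    private $result;    // Fulfillment value or rejection Exception.

    private function __construct($fulfilled, $result)
    {
        $this->fulfilled = $fulfilled;
        $this->result = $result;
    }

    public static function fulfill($value)
    {
        return new self(true, $value);
    }

    public static function reject(Exception $reason)
    {
        return new self(false, $reason);
    }

    /**
     * @param callable|null $onFulfilled
     * @param callable|null $onRejected
     *
     * @return SketchPromise
     */
    public function then($onFulfilled = null, $onRejected = null)
    {
        $callback = $this->fulfilled ? $onFulfilled : $onRejected;
        if (null === $callback) {
            // Omitted callback: the same value or reason passes through.
            return new self($this->fulfilled, $this->result);
        }
        try {
            $value = $callback($this->result);
        } catch (Exception $exception) {
            // A thrown exception rejects the returned promise.
            return self::reject($exception);
        }
        // A returned promise is adopted; any other value fulfills.
        return $value instanceof self ? $value : self::fulfill($value);
    }
}

$results = [];
$record = function ($label) use (&$results) {
    return function ($value) use (&$results, $label) {
        $text = $value instanceof Exception ? $value->getMessage() : $value;
        $results[] = $label . ': ' . $text;
    };
};

SketchPromise::fulfill(2)
    ->then(function ($value) { return $value * 21; })
    ->then(null, $record('skipped')) // No on-fulfilled callback: 42 passes through.
    ->then($record('fulfilled'));

SketchPromise::reject(new Exception('failed'))
    ->then(function ($value) { return $value * 21; }) // Not invoked.
    ->then(null, $record('rejected'));

echo implode("\n", $results), "\n";
// fulfilled: 42
// rejected: failed
```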

Calls to then() can be chained together to create a sequence of interdependent operations in a time-independent way, as registered callback functions are only invoked once a promise is resolved. If a promise has already been resolved when a callback is registered with then(), the callback will still be invoked with the resolution value of the promise.

Icicle promises also include several other methods for registering callbacks with different behaviors. A few of the more important and useful methods are listed below.

  • done(callable $onFulfilled = null, callable $onRejected = null) - Similar to then(), but returns nothing instead of another promise. Callbacks registered with done() should consume the fulfillment value or handle rejection, as return values are ignored and any exceptions thrown from a callback registered with done() cannot be caught.
  • capture(callable $onRejected) - Registers a callback function to handle rejection. If a type-hint is given for the exception parameter, the callback function will only be invoked if the rejection exception type matches the type-hint. Acts like the catch portion of a try/catch block.
  • cleanup(callable $onResolved) - Called when the promise is resolved (either fulfilled or rejected). Acts like the finally portion of a try/catch/finally block.

The code below shows a simple example of how calls to methods on promises can be chained together to transform a value and handle errors.

$promise
    ->then(function ($value) {
        if (0 === $value) {
            throw new Exception('Value cannot be 0.');
        }
        return 100 / $value;
    })
    ->then(function ($value) {
        return $value * $value;
    })
    ->then(function ($value) {
        return log($value);
    })
    ->capture(function (Exception $exception) {
        return 0;
    })
    ->done(function ($value) {
        printf("Value: %f\n", $value);
    });

Note that omitting the on-rejected callback from the calls to then() in the code above allows errors to propagate down the chain to the callback defined in the call to capture(). Structuring a promise chain in this way is analogous to a try/catch block in synchronous code.
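For comparison, here is a synchronous sketch of the same transformation written with an ordinary try/catch block. The function name transform() is made up for this illustration. Each statement in the try block corresponds to one then() callback, and the catch block plays the role of capture().

```php
<?php
// Synchronous counterpart of the promise chain (a sketch for comparison).
function transform($value)
{
    try {
        if (0 === $value) {
            throw new Exception('Value cannot be 0.');
        }
        $value = 100 / $value;    // First then() callback.
        $value = $value * $value; // Second then() callback.
        $value = log($value);     // Third then() callback.
    } catch (Exception $exception) {
        $value = 0;               // capture() callback.
    }
    return $value;
}

printf("Value: %f\n", transform(10)); // Value: 4.605170
printf("Value: %f\n", transform(0));  // Value: 0.000000
```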

The callback functions in the code above are simple and only meant to demonstrate how method calls on promises may be chained together. Each callback could initiate another asynchronous operation, returning a promise that would resolve the promise originally returned from then() or capture(). This allows multiple asynchronous operations to be interdependent without creating a tree of nested, imperative callbacks. Registering callback functions with then() simply defines dependencies between operations; it does not imply anything about when a value will be available. The order in which operations are executed is determined from these dependencies, similar to functional composition in a synchronous program.

The code below demonstrates a more practical use of promises. This example uses the promise-based DNS and socket components of Icicle to asynchronously resolve the IP addresses for a domain name, then connect to the first of the resolved IP addresses.

use Icicle\Coroutine\Coroutine;
use Icicle\Dns\Executor\Executor;
use Icicle\Dns\Resolver\Resolver;
use Icicle\Socket\Client\Connector;

$connect = function ($domain, $port) {
    $resolver = new Resolver(new Executor('8.8.8.8'));
    
    // A Coroutine is a type of promise created from a generator (more below).
    $promise1 = new Coroutine($resolver->resolve($domain));
    
    $promise2 = $promise1->then(
        function (array $ips) use ($port) {
            $connector = new Connector();
            
            // Return a new promise. $promise2 adopts state of returned promise.
            return new Coroutine($connector->connect($ips[0], $port));
        }
    );

    return $promise2;
};

$promise = $connect('example.com', 80);

$promise1 will either be fulfilled with an array of IP addresses or rejected if resolving the domain fails. When $promise1 fulfills, the on-fulfilled callback function registered to $promise1 will be invoked, fulfilling $promise2 with the connected client socket if connecting to the IP address succeeds, otherwise rejecting $promise2 if connecting fails. If $promise1 is rejected, $promise2 will be immediately rejected without invoking the callback function, so no connection attempt will be made.

Coroutines

Promises provide an execution context for callback functions and a means to define interdependencies between values, but they do not eliminate the need to create callback functions. To make using promises simpler and avoid registering callback functions to promises, Icicle combines promises and generators to create interruptible functions called coroutines.

Generators usually use the yield keyword to yield a value from a set to implement an iterator. Generators written to be coroutines use the yield keyword to define interruption points, temporarily interrupting execution of the coroutine. The local scope of the coroutine is preserved between interruptions, so local variables maintain their value and execution resumes at the point at which it was interrupted. Coroutines are also cooperative, allowing tasks such as I/O, timers, and other coroutines to run whenever a value is yielded from a coroutine.

When a coroutine yields a promise, execution of the coroutine is interrupted and does not resume until the promise is resolved. Once the promise resolves, the fulfillment value will be sent to the generator or the exception used to reject the promise will be thrown into the generator. This means if a yielded promise is fulfilled, the statement that yielded the promise will evaluate to the fulfillment value of the promise. For example, $value = (yield $promise); will set $value to the fulfillment value of $promise when the coroutine resumes. If a yielded promise is rejected, the statement that yielded the promise would behave identically to a throw statement that threw the exception used to reject the promise. No callbacks need to be registered on the yielded promise. The fulfillment value of the promise can be accessed through a simple variable assignment to the yield statement. Exceptions rejecting the promise are thrown into the coroutine and can be caught using try/catch blocks.
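The mechanism underneath this behavior is PHP's own generator API. The sketch below drives a plain generator by hand using Generator::send() and Generator::throw(), which is roughly what a coroutine runner does when a yielded promise is fulfilled or rejected. No Icicle classes are used, the yielded strings merely stand in for promises, and the function name task() is hypothetical.

```php
<?php
// A plain generator standing in for a coroutine. Each yield is an
// interruption point whose result is supplied by whoever drives the generator.
function task()
{
    $value = (yield 'first request');
    echo "Received: {$value}\n";

    try {
        yield 'second request';
    } catch (Exception $exception) {
        echo "Caught: {$exception->getMessage()}\n";
    }
}

$generator = task();
echo "Yielded: {$generator->current()}\n"; // Runs to the first yield.

// Resume as if the first "promise" had been fulfilled with 42.
$yielded = $generator->send(42);
echo "Yielded: {$yielded}\n";

// Resume as if the second "promise" had been rejected.
$generator->throw(new Exception('request failed'));
var_dump($generator->valid());

// Output:
// Yielded: first request
// Received: 42
// Yielded: second request
// Caught: request failed
// bool(false)
```

In Icicle the resuming is done for you when the yielded promise resolves; here it is done manually so that each step is visible.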

Coroutines in Icicle are also promises. A coroutine is fulfilled with the last value yielded (or fulfillment value of the last yielded promise) and rejected if an exception is thrown from the coroutine’s generator. (Note that generators in PHP 7 will be able to explicitly return values and will be used in the future to return values from coroutines.) A coroutine may then yield to other coroutines, interrupting execution of the calling coroutine until the yielded coroutine has completed execution. If the coroutine throws an exception (is rejected), the exception will be thrown into the calling coroutine. This allows coroutines to be composed of other coroutines, allowing coroutines to be built using functional composition. Calling a coroutine within another coroutine is similar to synchronously calling a function that can return a value or throw an exception. Coroutines may also yield generators directly to create another coroutine and automatically yield to that coroutine, removing the need to explicitly create a coroutine from a generator within another coroutine.
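The PHP 7 generator features mentioned above can be tried with plain generators. The sketch below assumes PHP 7.0 or later and uses the language's native return, yield from, and Generator::getReturn(). It illustrates the language feature only, not how Icicle composes coroutines, and the function names and IP address are placeholders.

```php
<?php
// Requires PHP 7.0 or later.
function inner()
{
    $ips = yield 'resolve'; // Interruption point.
    return $ips[0];         // Explicit return value (PHP 7+).
}

function outer()
{
    // Delegate to the inner generator; evaluates to its return value.
    $ip = yield from inner();
    return "connect to {$ip}";
}

$generator = outer();
$generator->current();              // Run to the first yield ('resolve').
$generator->send(['192.0.2.1']);    // Resume with a stand-in result.
echo $generator->getReturn(), "\n"; // connect to 192.0.2.1
```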

The code below demonstrates how the code using promises in the previous code block can be re-written into a coroutine to avoid registering callbacks on promises.

use Icicle\Coroutine\Coroutine;
use Icicle\Dns\Executor\Executor;
use Icicle\Dns\Resolver\Resolver;
use Icicle\Socket\Client\Connector;

$connect = function ($domain, $port) {
    $resolver = new Resolver(new Executor('8.8.8.8'));
    $ips = (yield $resolver->resolve($domain));
    
    $connector = new Connector();
    yield $connector->connect($ips[0], $port);
};

$coroutine = new Coroutine($connect('example.com', 80));

Instead of registering a callback to access the fulfillment value of the promise returned by $resolver->resolve(), the array of IP addresses is simply assigned to $ips when the promise is fulfilled. If the promise is rejected, the exception will be thrown into the coroutine, bypassing the remaining code and immediately rejecting the coroutine.

Since coroutines make promises much easier to use, the previous code block can be quickly improved with a loop to attempt to connect to other IP addresses resolved for the domain if the first does not succeed. Creating loops based on promise fulfillment values or rejection reasons is considerably simpler in a coroutine versus using promises alone. The code below shows how a simple foreach loop can be used in a coroutine with loop termination determined by the resolution of a promise.

use Icicle\Coroutine\Coroutine;
use Icicle\Dns\Executor\Executor;
use Icicle\Dns\Resolver\Resolver;
use Icicle\Socket\Client\Connector;

$connect = function ($domain, $port) {
    $resolver = new Resolver(new Executor('8.8.8.8'));
    $ips = (yield $resolver->resolve($domain));
    
    $connector = new Connector();
    foreach ($ips as $ip) {
        try {
            yield $connector->connect($ip, $port);
            return; // Halts coroutine execution.
        } catch (Exception $exception) {
            // Ignore connection failure and try next IP.
        }
    }
    
    // Could not connect to any IP, so reject coroutine.
    throw new Exception(
        sprintf('Error connecting to %s:%d', $domain, $port)
    );
};

$coroutine = new Coroutine($connect('example.com', 80));

Example: RESTful DNS Service

Icicle is designed to make creating web services quick and easy. Below is a complete PHP script that implements a simple RESTful DNS service. The service accepts GET requests for URIs of the form /{domain-name}/{record-type} (e.g., http://localhost:8053/example.com/a), performs the corresponding DNS query, and responds with the results in JSON format.

#!/usr/bin/env php
<?php
require __DIR__ . '/vendor/autoload.php';

use Icicle\Dns\Exception\FailureException;
use Icicle\Dns\Exception\InvalidTypeException;
use Icicle\Dns\Exception\MessageException;
use Icicle\Dns\Executor\Executor;
use Icicle\Http\Message\RequestInterface;
use Icicle\Http\Message\Response;
use Icicle\Http\Server\Server;
use Icicle\Loop;

$executor = new Executor('8.8.8.8');

$server = new Server(function (RequestInterface $request)
    use ($executor)
{
    $response = new Response();
    $response = $response->withHeader(
        'Content-Type',
        'application/json'
    );

    if ($request->getMethod() !== 'GET') {
        yield $response->getBody()->end(
            json_encode(['error' => 'Only GET allowed.'])
        );
        yield $response->withStatus(405);
        return;
    }

    if (!preg_match(
        '/^\/((?:[a-z0-9\-]+\.)*[a-z]{2,})\/([a-z0-9]+)$/i',
        $request->getRequestTarget(),
        $matches
    )) {
        yield $response->getBody()->end(
            json_encode(['error' => 'Invalid uri format.'])
        );
        yield $response->withStatus(404);
        return;
    }

    list( , $domain, $type) = $matches;

    try {
        $message = (
            yield $executor->execute($domain, $type)
        );
    } catch (InvalidTypeException $e) {
        yield $response->getBody()->end(
            json_encode(['error' => 'Invalid record type.'])
        );
        yield $response->withStatus(404);
        return;
    } catch (FailureException $e) {
        yield $response->getBody()->end(
            json_encode(['error' => 'Invalid domain name.'])
        );
        yield $response->withStatus(404);
        return;
    } catch (MessageException $e) {
        yield $response->getBody()->end(
            json_encode(['error' => 'DNS lookup failed.'])
        );
        yield $response->withStatus(503);
        return;
    }

    $json = [];
    
    foreach ($message->getAnswerRecords() as $record) {
        $json[] = [
            'type'  => $record->getType(),
            'ttl'   => $record->getTtl(),
            'rdata' => (string) $record->getData()
        ];
    }

    yield $response->getBody()->end(json_encode($json));
    
    yield $response->withStatus(200);
});

$server->listen(8053);

Loop\run();

While it would be better to delegate the tasks in the code above to separate functions or methods of a class, this example is meant to demonstrate how a simple, yet powerful server can be created with very little code. The server above is capable of handling many clients simultaneously because all the tasks necessary to process a client request are performed asynchronously and cooperatively.
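The request-target pattern used in the server above can be tried in isolation. The sketch below wraps the same regular expression in a helper named parseTarget(), a hypothetical function introduced only for this illustration, to show which URIs are accepted and which are rejected.

```php
<?php
// Same pattern as in the server above; parseTarget() is a hypothetical helper.
function parseTarget($target)
{
    $pattern = '/^\/((?:[a-z0-9\-]+\.)*[a-z]{2,})\/([a-z0-9]+)$/i';
    if (!preg_match($pattern, $target, $matches)) {
        return null; // The server responds with 404 in this case.
    }
    list( , $domain, $type) = $matches;
    return [$domain, $type];
}

echo json_encode(parseTarget('/example.com/a')), "\n";      // ["example.com","a"]
echo json_encode(parseTarget('/www.example.com/MX')), "\n"; // ["www.example.com","MX"]
echo json_encode(parseTarget('/example.com')), "\n";        // null
echo json_encode(parseTarget('/example.com/a?x=1')), "\n";  // null
```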

Going Forward

Icicle includes all the basic components needed to create an asynchronous network server or client written in only PHP. Writing truly asynchronous code means never using any code that will result in a blocking call. Unfortunately, most of the functions available in PHP that access an external data source will block because they were designed to be used in sequential, synchronous code. Currently this represents the biggest hurdle to writing asynchronous code in PHP. However, there are many fantastic libraries available for PHP that do not make blocking calls and can be used within asynchronous code, such as dependency injection containers, collection libraries, validators, routers, and many others.

To get the most out of Icicle, asynchronous compatible components are needed to replace any operations that would block. There are several packages already available for Icicle that provide asynchronous, non-blocking implementations of key components:

  • Stream - Asynchronous stream interfaces and implementations of memory streams and pipes.
  • Socket - Asynchronous socket server, datagram, and client connector.
  • Concurrent - Provides easy concurrent execution of code (blocking or non-blocking) without blocking the main thread.
  • DNS - Asynchronous DNS query executor and client connector.
  • Filesystem - Asynchronous file access.
  • HTTP - Creates an asynchronous HTTP server or performs asynchronous HTTP requests.

Additional components listed below are currently planned. If you are interested in contributing to any of these components or would like to propose another component, please contact the project on Twitter @icicleio, on GitHub, or send an email to hello@icicle.io.

  • WebSocket - Implements the web socket protocol to create asynchronous web socket servers and clients.
  • Memcached - Asynchronous client for Memcached.
  • Redis - Asynchronous client for Redis.
  • MySQL - Asynchronous client for MySQL.
  • PostgreSQL - Asynchronous client for PostgreSQL.
  • MongoDB - Asynchronous client for MongoDB.