What are "futures", and why should you care?

Motivation

Last week we were discussing, among many other things, ways to speed up Firefox during startup.  One obvious option was to move more of our I/O off of the main thread.  This in turn involves making more code asynchronous, and asynchronous code is simply harder to manage.  Mike Shaver mentioned something about "futures" as a possible way to handle the extra complexity, and then the discussion moved on to something else.

I'm not exactly an expert, but I've not just used futures, I've written my own implementations in JavaScript and even Object Pascal (in hindsight I'm not sure the latter was a good idea, but it was certainly an interesting exercise).  Futures seem esoteric, but they really shouldn't be -- the idea is really quite simple.  In this post I'll try to explain what futures are and how they can be used to make asynchronous programming easier.


What exactly is a future anyway?

In the simplest form, a future works like an IOU.  I can't give you the money you've asked for right now, but I can give you this IOU.  At some point in the future, you can give me the IOU and I'll give you the money -- if I have it.  If I don't have it yet, then you can wait until I do.  I get paid on Friday.

Alternatively there's the dry cleaner metaphor.  You drop your clothes off on Monday and the clerk gives you a ticket that you can use later to reclaim your clothes after they've been cleaned.  The clothes will be ready on Tuesday morning, but if you show up too early, you'll have to wait.  On the other hand, if there's no hurry, you can just do other stuff on Tuesday and show up on Wednesday with a reasonable expectation that they'll be ready when you arrive.  You'll just hand your ticket over, collect your clothes, and be on your way.

A future is similar to the IOU (or the dry cleaning ticket).  It gives you a way to represent the result of a computation that has not yet completed, and it allows you to access that result once it becomes available.  So you can call a function which starts some asynchronous process but doesn't wait for it to finish.  Nevertheless the function can return you a useful result: a future which can be used to claim the real result later.

Of course if you ask for the result too soon, you'll have to wait.  On the other hand, if the result becomes available before you want it, then it will wait for you.
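
To make the IOU idea concrete, here is a minimal sketch of a future in JavaScript. The names Future, resolve, and whenReady are hypothetical, chosen only for illustration. Since ordinary single-threaded JavaScript cannot block, "waiting" is modeled here by registering a callback that runs once the result arrives.

```javascript
// A minimal future: a placeholder for a value that arrives later.
// All names here (Future, resolve, whenReady) are illustrative assumptions.
function Future() {
  this.ready = false;      // has the result arrived yet?
  this.value = undefined;  // the result, once it has arrived
  this.listeners = [];     // callbacks registered before the result arrived
}

// Deliver the result and notify everyone who asked too early.
Future.prototype.resolve = function (value) {
  if (this.ready) { return; }  // a future resolves only once
  this.ready = true;
  this.value = value;
  var waiting = this.listeners;
  this.listeners = [];
  waiting.forEach(function (listener) { listener(value); });
};

// Claim the result: immediately if it is ready, otherwise once it arrives.
Future.prototype.whenReady = function (listener) {
  if (this.ready) {
    listener(this.value);
  } else {
    this.listeners.push(listener);
  }
};

// Usage: the dry cleaner hands back a ticket right away.
var log = [];
var ticket = new Future();
ticket.whenReady(function (clothes) { log.push("early: " + clothes); });
ticket.resolve("clean clothes");  // the early caller is served now
ticket.whenReady(function (clothes) { log.push("late: " + clothes); });
console.log(log);  // ["early: clean clothes", "late: clean clothes"]
```

The early caller had to wait for resolve(), while the late caller found the result already waiting.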


A simple example

Here's an example of what this might look like in pseudo-JavaScript:

function doStuff() {
  var cleanClothesFuture = dryCleaner.dropOff(dirtyClothes);
  runErrands();
  work();
  eat();
  watchTv();
  sleep();
  var cleanClothes = cleanClothesFuture.get();  // block if the result is not ready yet
}

Compare this to the traditional way we'd handle this in JavaScript, using a callback:

var cleanClothes = null;

function doStuff() {
  dryCleaner.dropOff(dirtyClothes, function (clothes) { cleanClothes = clothes; });
  runErrands();
  work();
  eat();
  watchTv();
  sleep();
}

These examples are not one hundred percent semantically identical, but they should be close enough to illustrate the point.  I contend that the first function is easier to write, easier to read, and easier to reason about.  I also contend that the difference isn't enough to get excited about.  It's when things get more complicated that futures become really useful.


A more complicated example

Imagine that I have a web page that sends an AJAX request to a server and then displays the results in an IFRAME -- and furthermore does it automatically on page load.  I have to wait for both the AJAX request to return data and for the IFRAME to finish loading -- only then can I display the results.  This can be done fairly simply using callbacks:

function showData(dataUrl, iframeUrl) {
  var data = null;
  var iframeBody = null;
 
  // Show the data only once both asynchronous results have arrived.
  function tryToShowData() {
    if (data && iframeBody) { showDataInIframe(data, iframeBody); }
  }
  requestDataFromServer(dataUrl, function (response) { data = response.data; tryToShowData(); });
  insertIframeBody(iframeUrl, function (iframeDoc) { iframeBody = iframeDoc.body; tryToShowData(); });
}

Now, imagine the same thing done with futures:

function showData(dataUrl, iframeUrl) {
  var dataFuture = requestDataFromServer(dataUrl);
  var iframeBodyFuture = insertIframeBody(iframeUrl);
  showDataInIframe(dataFuture.get(), iframeBodyFuture.get());
}

Again, these two examples are not semantically equivalent -- notably there's no blocking in the first example.  Now let's imagine that we had a way to turn an ordinary function into a new function which takes futures as arguments and which returns a future in turn.  As soon as all the future arguments became available, the base function would be called automatically -- and once the base function completed, its result would be accessible through the future returned earlier.  I'll call this capability "callAsap": call a function as soon as possible after all of its future arguments become available.  Using callAsap(), the previous example might be rewritten as:

function showData(dataUrl, iframeUrl) {
  var dataFuture = requestDataFromServer(dataUrl);
  var iframeBodyFuture = insertIframeBody(iframeUrl);
  showDataInIframe.callAsap(dataFuture, iframeBodyFuture);
}

In this case we don't care about the return value of showDataInIframe, so the future returned by callAsap() is simply ignored.  This example is much closer in behavior to the earlier callback-based example.  In fact, the callAsap() method would be implemented with callbacks underneath, but they would all be nicely abstracted away under the hood.
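
As a rough sketch of how callAsap() could sit on top of callbacks, the code below pairs it with a compact listenable future. Everything here is an assumption made for illustration: the Future, resolve, and whenReady names are hypothetical, and callAsap is written as a standalone function (taking the base function as its first argument) rather than as a method on functions, to avoid modifying Function.prototype. Arguments that are not futures are passed through unchanged.

```javascript
// Compact listenable future (hypothetical names, for illustration only).
function Future() { this.ready = false; this.listeners = []; }
Future.prototype.resolve = function (value) {
  if (this.ready) { return; }  // resolve only once
  this.ready = true;
  this.value = value;
  this.listeners.splice(0).forEach(function (fn) { fn(value); });
};
Future.prototype.whenReady = function (fn) {
  if (this.ready) { fn(this.value); } else { this.listeners.push(fn); }
};

// Call fn as soon as every future among its arguments has resolved.
// Returns a future for fn's return value.
function callAsap(fn /*, arg1, arg2, ... */) {
  var args = Array.prototype.slice.call(arguments, 1);
  var resultFuture = new Future();
  var futureCount = args.filter(function (arg) {
    return arg instanceof Future;
  }).length;
  var pending = futureCount;

  function callBaseFunction() {
    resultFuture.resolve(fn.apply(null, args));
  }

  args.forEach(function (arg, i) {
    if (!(arg instanceof Future)) { return; }  // concrete value: keep as is
    arg.whenReady(function (value) {
      args[i] = value;  // swap the future for its concrete value
      pending -= 1;
      if (pending === 0) { callBaseFunction(); }
    });
  });

  if (futureCount === 0) { callBaseFunction(); }  // nothing to wait for
  return resultFuture;
}

// Usage, with a stand-in for showDataInIframe.
var log = [];
function showDataInIframe(data, iframeBody) {
  log.push(data + " in " + iframeBody);
  return "shown";
}
var dataFuture = new Future();
var iframeBodyFuture = new Future();
var shownFuture = callAsap(showDataInIframe, dataFuture, iframeBodyFuture);
dataFuture.resolve("data");        // still waiting on the iframe
var callsBeforeIframe = log.length;
iframeBodyFuture.resolve("body");  // now showDataInIframe runs
```

The client code never writes a callback itself; the bookkeeping of "has everything arrived yet?" lives inside callAsap.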

One of the nice things about callAsap() is that it copes gracefully with waiting on several futures at once.  Imagine that you've asynchronously requested data from two different servers:

function showData(dataUrl1, dataUrl2, iframeUrl) {
  var dataFuture1 = requestDataFromServer(dataUrl1);
  var dataFuture2 = requestDataFromServer(dataUrl2);
  var iframeBodyFuture = insertIframeBody(iframeUrl);
  showDataInIframe.callAsap(dataFuture1, dataFuture2, iframeBodyFuture);
}

This segues nicely into the next topic: Arrays of futures.


Arrays of futures

Imagine that you have not one, or two, or three futures, but rather an arbitrary number of them.  What we'd really like is a way to take an array of futures and produce from it a single future for an array of concrete values.  Something like:

function showData(dataUrlArray, iframeUrl) {

  // The "dataFutureArray" is a concrete array of futures.
  var dataFutureArray = requestDataFromServers(dataUrlArray);

  // The "dataArrayFuture" is a future for a concrete array of concrete values.
  var dataArrayFuture = Future.createArrayFuture(dataFutureArray);

  var iframeBodyFuture = insertIframeBody(iframeUrl);
  showDataInIframe.callAsap(dataArrayFuture, iframeBodyFuture);
}

What this example might look like rewritten in callback style is left as an exercise to the reader.
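
For the futures side, one possible shape for Future.createArrayFuture() is sketched below. It assumes a compact listenable future with the hypothetical methods resolve and whenReady; none of these names come from an existing library. The resulting array keeps the order of the input array no matter which future resolves first.

```javascript
// Compact listenable future (hypothetical names, for illustration only).
function Future() { this.ready = false; this.listeners = []; }
Future.prototype.resolve = function (value) {
  if (this.ready) { return; }  // resolve only once
  this.ready = true;
  this.value = value;
  this.listeners.splice(0).forEach(function (fn) { fn(value); });
};
Future.prototype.whenReady = function (fn) {
  if (this.ready) { fn(this.value); } else { this.listeners.push(fn); }
};

// Turn an array of futures into a single future for an array of values.
Future.createArrayFuture = function (futureArray) {
  var arrayFuture = new Future();
  var values = new Array(futureArray.length);
  var pending = futureArray.length;

  if (pending === 0) { arrayFuture.resolve(values); }  // nothing to wait for

  futureArray.forEach(function (future, i) {
    future.whenReady(function (value) {
      values[i] = value;  // store by position, not by arrival order
      pending -= 1;
      if (pending === 0) { arrayFuture.resolve(values); }
    });
  });
  return arrayFuture;
};

// Usage: the futures resolve out of order, the result stays in order.
var first = new Future();
var second = new Future();
var third = new Future();
var result = null;
Future.createArrayFuture([first, second, third]).whenReady(function (values) {
  result = values;
});
third.resolve("c");
first.resolve("a");
var resultBeforeLast = result;  // still null: one future is outstanding
second.resolve("b");            // result is now ["a", "b", "c"]
```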


An advanced example

OK, now for a more elaborate example.  Imagine a function which retrieves the first page of Google search results for a particular query, and then goes through and re-orders the results based on its own ranking system.  Furthermore, imagine that this ranking is computed based on the contents of each web page.  We'll need to make requests to many different servers for many different web pages.  This will be fastest if we issue all the requests at once.

function search(query) {

  // Take a concrete search result and return a future to a
  // [webPage, ranking] pair.

  function requestWebPageAndRanking(searchResult) {
    var webPageFuture = requestWebPage(searchResult.url);
    var rankingFuture = computeRankingFromContent.callAsap(webPageFuture);
    return Future.createArrayFuture([webPageFuture, rankingFuture]);
  }

  // Take a concrete array of search results and return a future to
  // an array of [webPage, ranking] pairs, sorted by ranking.

  function requestSearchResultsSortedByRanking(searchResultArray) {
    var rankingArrayFuture = Future.createArrayFuture(
      searchResultArray.map(requestWebPageAndRanking)
    );
    // Sort on index 1 of each pair, i.e. the ranking.
    return sortArraysByKeyIndex.callAsap(rankingArrayFuture, 1);
  }

  // Request search results, re-rank them, and then display them.
  // This assumes callAsap() unwraps a future returned by the base
  // function rather than nesting it inside another future.
  var searchResultArrayFuture = requestGoogleResults(query);
  var sortedRankingArrayFuture =
    requestSearchResultsSortedByRanking.callAsap(searchResultArrayFuture);
  showSearchResults.callAsap(sortedRankingArrayFuture);
}

In all fairness, this is not as simple as a synchronous blocking implementation.  Keeping your arrays of futures and futures of arrays straight is a little bit taxing.  Imagine what a callback model might look like, however, with callbacks inside callbacks.  One advantage of using futures is that you can often write traditional blocking code first and then translate it, in a fairly straightforward fashion, into asynchronous code using futures.

 

Notes, in no particular order

  • The examples may look like JavaScript, but they are, in fact, pseudo-code.  The implementation of some of the helper methods, pseudo-code or otherwise, is left to the imagination.
  • I have completely glossed over error handling, including such interesting topics as exception tunneling, fallback values (nulls, empty arrays, NullObjects), not to mention timeouts and retries.  If this sounds scary, it's because error handling in any kind of async code is a difficult topic.  Futures don't make the situation any worse, and might make it better.
  • The name "callAsap" is my invention, although I'm certain the underlying idea has been invented independently many times.  Also note that callAsap() and Future.createArrayFuture() are fundamentally quite similar.
  • Java futures (java.util.concurrent.Future) use a blocking get() method like the one in the first example.  I don't actually know how you could do a blocking get in conventional single-threaded JavaScript, which is the whole genesis of callAsap().  Practical JavaScript futures need to be "listenable futures" which raise events when they resolve.  The methods callAsap() and Future.createArrayFuture() can then be implemented using this capability.  Client code can then use these methods to avoid writing explicit callbacks.
  • The re-ranking Google search results example is contrived, but it's based on a similar project I did a few years ago.  In that project I used callbacks, and it was quite painful.