What are "futures", and why should you care?

Motivation

Last week we were discussing, among many other things, ways to speed up Firefox during startup.  One obvious option was to move more of our I/O off of the main thread.  This in turn involves making more code asynchronous, and asynchronous code is simply harder to manage.  Mike Shaver mentioned something about "futures" as a possible way to handle the extra complexity, and then the discussion moved on to something else.

I'm not exactly an expert, but I've not just used futures, I've written my own implementations in JavaScript and even Object Pascal (in hindsight I'm not sure the latter was a good idea, but it was certainly an interesting exercise).  Futures seem esoteric, but they really shouldn't be -- the idea is really quite simple.  In this post I'll try to explain what futures are and how they can be used to make asynchronous programming easier.


What exactly is a future anyway?

In the simplest form, a future works like an IOU.  I can't give you the money you've asked for right now, but I can give you this IOU.  At some point in the future, you can give me the IOU and I'll give you the money -- if I have it.  If I don't have it yet, then you can wait until I do.  I get paid on Friday.

Alternatively there's the dry cleaner metaphor.  You drop your clothes off on Monday and the clerk gives you a ticket that you can use later to reclaim your clothes after they've been cleaned.  The clothes will be ready on Tuesday morning, but if you show up too early, you'll have to wait.  On the other hand, if there's no hurry, you can just do other stuff on Tuesday and show up on Wednesday with a reasonable expectation that they'll be ready when you arrive.  You'll just hand your ticket over, collect your clothes, and be on your way.

A future is similar to the IOU (or the dry cleaning ticket).  It gives you a way to represent the result of a computation that has not yet completed, and it allows you to access that result once it becomes available.  So you can call a function which starts some asynchronous process but doesn't wait for it to finish.  Nevertheless the function can return you a useful result: a future which can be used to claim the real result later.

Of course if you ask for the result too soon, you'll have to wait.  On the other hand, if the result becomes available before you want it, then it will wait for you.


A simple example

Here's an example of what this might look like in pseudo-JavaScript:

function doStuff() {
  var cleanClothesFuture = dryCleaner.dropOff(dirtyClothes);
  runErrands();
  work();
  eat();
  watchTv();
  sleep();
  var cleanClothes = cleanClothesFuture.get();  // block if the result is not ready yet
}

Compare this to the traditional way we'd handled this in JavaScript, using a callback:

var cleanClothes = null;

function doStuff() {
  dryCleaner.dropOff(dirtyClothes, function (clothes) { cleanClothes = clothes; });
  runErrands();
  work();
  eat();
  watchTv();
  sleep();
}

These examples are not one hundred percent semantically identical, but they should be close enough to illustrate the point.  I contend that the first function is easier to write, easier to read, and easier to reason about.  I also contend that the difference isn't enough to get excited about.  It's when things get more complicated that futures become really useful.


A more complicated example

Imagine that I have a web page that sends an AJAX request to a server and then displays the results in an IFRAME -- and furthermore does it automatically on page load.  I have to wait for both the AJAX request to return data and for the IFRAME to finish loading -- only then can I display the results.  This can be done fairly simply using callbacks:

function showData(dataUrl, iframeUrl) {
  var data = null;
  var iframeBody = null;
 
tryToShowData { if (data && iframeBody) { showDataInIframe(data, iframeBody); } }
  requestDataFromServer(dataUrl, function (response) {data = response.data;  tryToShowData() });
  insertIframeBody(iframeUrl, function (iframeDoc) {iframeBody = iframeDoc.body; tryToShowData() });
}

Now, imagine the same thing done with futures:

function showData(dataUrl, iframeUrl) {
  var dataFuture = requestDataFromServer(dataUrl);
  var iframeBodyFuture = insertIframeBody(iframeUrl);
  showDataInIframe(dataFuture.get(), iframeBodyFuture.get());
}

Again, these two examples are not semantically equivalent -- notably there's no blocking in the first example.  Now let's imagine that we had a way to turn an ordinary function into a new function which takes futures as arguments and which returns a future in turn.  As soon as all the future arguments became available, the base function would be called automatically -- and once the base function completed, its result would be accessible through the future returned earlier.  I'll call this capability "callAsap": call a function as soon as possible after all of its future arguments become available.  Using callAsap(), the previous example might be rewritten as:

function showData(dataUrl, iframeUrl) {
  var dataFuture = requestDataFromServer(dataUrl);
  var iframeBodyFuture = insertIframeBody(iframeUrl);
  showDataInIframe.callAsap(dataFuture, iframeBodyFuture);
}

In this case we don't care about the return value for showDataInFrame.  This example is much closer in behavior to the earlier callback-based example.  In fact, the callAsap() method would be implemented with callbacks underneath, but they would all be nicely abstracted away under the hood.

One of the nice things about callAsap() is that it can nicely handle cases where you are waiting on more than one future.  Imagine that you've asynchronously requested data from two different servers:

function showData(dataUrl1, dataUrl2, iframeUrl) {
    var dataFuture1 = requestDataFromServer(dataUrl1);
    var dataFuture2 = requestDataFromServer(dataUrl2);
    showDataInIframe.callAsap(dataFuture1, dataFuture2, iframeBodyFuture);
}

This segues nicely into the next topic: Arrays of futures.


Arrays of futures

Imagine if you have not one, or two, or three futures, but rather an arbitrary number of futures.  What we'd really like to have is a way to take an array of futures and produce from it a single future for an array of concrete values.  Something like:

function showData(dataUrlArray, iframeUrls) {

  // The "dataFutureArray" is a concrete array of futures.
  var dataFutureArray = requestDataFromServers(dataUrlArray);

  // The "dataArrayFuture" is a future for a concrete array of concrete values.
  var dataArrayFuture = Future.createArrayFuture(dataFutureArray);

  showDataInIframe.callAsap(dataArrayFuture, iframeBodyFuture);
}

What this example might look like rewritten in callback style is left as an exercise to the reader.


An advanced example

OK, now for a more elaborate example.  Imagine a function which retrieves the first page of Google search results for a particular query, and then goes through and re-orders the results based on its own ranking system.  Furthermore, imagine that this ranking is computed based on the contents of each web page.  We'll need to to make requests to many different servers for many different web pages.  This will be fastest if we issue all the requests at once.

function search(query) {

  // Take a concrete search result and return a future to a
  // [searchResult, ranking] pair.

  function requestWebPageAndRanking(searchResult) {
    var webPageFuture = requestWebPage(searchResult.url);
    var rankingFuture = computeRankingFromContent.callAsap(webPageFuture);
    return Future.createArrayFuture([webPageFuture, relevanceFuture]);

  }

  // Take a concrete array of search results and return a future to
  // an array of [searchResult, ranking] pairs, sorted by ranking.

  function requestSearchResultsSortedByRanking(searchResultArray) {

    var
rankingArrayFuture = Future.createArrayFuture(
      [requestWebPageAndRanking(searchResult) for (searchResult in searchResultArray)]
    );
    return sortArraysByKeyIndex.callAsap(
rankingArrayFuture, 1);
  }

  // Request search results, re-rank them, and then display them.
  var searchResultArrayFuture = requestGoogleResults(query);
  var sortedRankingArrayFuture =
     
requestSearchResultsSortedByRanking.callAsap(searchResultArrayFuture);
  showSearchResults
.callAsap(sortedRankingArrayFuture);
}

In all fairness, this is not as simple as a synchronous blocking implementation.  Keeping your arrays of futures and futures of arrays straight is a little bit taxing.  Imagine what a callback model might look like, however, with callbacks inside callbacks.  One advantage of using futures is that you can often write traditional blocking code and then in straightforward fashion translate that code into asynchronous code using futures.

 

Notes, in no particular order

  • The examples may look like JavaScript, but they are, in fact, pseudo-code.  The implementation of some of the helper methods, psuedo-code or otherwise, are left to the imagination.
  • I have completely glossed over error handling, including such interesting topics as exception tunneling, fallback values (nulls, empty arrays, NullObjects), not to mention timeouts and retries.  If this sound scary, it's because error handling in any kind of async code is a difficult topic.  Futures don't make the situation any worse, and might make it better.
  • The name "callAsap" is my invention, although I'm certain the underlying idea has been invented independently many times.  Also note that callAsap() and Future.createArrayFuture() are fundamentally quite similar.
  • Java futures (java.util.concurrent.future) use a blocking get() method like the one in the first example.  I don't actually know how you could do a blocking get in conventional single-threaded JavaScript, which is the whole genesis of callAsap().  Practical JavaScript futures need to be "listenable futures" which raise events when they resolve.  The methods callAsap() and Future.createArrayFuture() can then be implemented using this capability.  Client code can then use these methods to avoid writing explicit callbacks.
  • The re-ranking Google search results example is contrived, but it's based on a similar project I did a few years ago.  In that project I used callbacks, and it was quite painful.
16 responses
This concept exists in many other languages and libraries, usually by another name. "Thunk" seems the most common. "Lazy" describes a language that does this by default. Several languages, notably variants of Lisp, call these operations "delay" and "force". The XCB library uses this technique for calls into the X Window System.
@Anonymous:

These ideas are related, but they're not exactly the same. A lazy value's value is not available until you ask for it because computation does not start until you ask for the value the first time. A future's value may not be available when you ask for it because the computation has not *completed* yet.

Another way to look at this: when you get a lazy "value" back from a function, what you really get is a "thunk", which is actually a function itself. When you try to use the lazy value (as opposed to just passing it around), the thunk gets invoked, and the value computed by the thunk is returned.

For a future, the computation is generally happening on another thread, in another process, or even on another computer, and typically that computation will get started independently of someone asking for the future's value. As typically implemented, the future doesn't have any way to start the computation even if it wanted to, all it can do is wait.

Keep in mind that the terminology here is a little slippery. "Thunk" in particular is heavily overloaded, although the meaning I'm using here goes back to it's invention in the Algol standard, if I'm not mistaken.

This looks rather like Twisted's Deferreds. It's a jolly useful technique, although wrapping your head around it can be a challenge...
@sil: It looks like Twisted Deferreds are "listenable futures". It should be pretty easy to build callAsap() on top of them, and then use that instead of writing explicit callbacks. You'd still be writing asynchronous code, so you'd still need to wrap your head around that. On the other hand, you wouldn't have to structure your code "backwards" (or maybe "inside-out") like you do when writing explicit callbacks.
The .NET Async Pattern is another implementation of futures. In C#:
IAsyncResult ar = dude.BeginSomething(...);
// ar is your future
var value = dude.EndSomething(); // retrieve value, blocking if the operation hasn't completed yet

Of course the callback mechanisms available are even nicer. Google F# async workflows for some good stuff ...

I've done a lot of work with Twisted Python recently, and I've fallen in love with Deferreds - they're a very, very (very) well-thought-out system for dealing with asynchronous computation, including error-handling and propagation. You might be interested to know that MochiKit includes a pure-JavaScript implementation (not surprising, since MochiKit was written by a Twisted developer):

http://mochikit.com/doc/html/MochiKit/Async.html

The c# have been talking a lot about implementing real futures.

But it can become dificult to implement them in a lock free language as javascript. I think the idea of futures in javascript leads to something like http://www.croczilla.com/oni

Screwtape: You might be interested to hear that dojo framework also has a nice Deferred implementation.
I recently wrote how waiting for multiple futures at once could be implemented efficiently: http://beza1e1.tuxen.de/articles/futures.html
I know way too little about all those async concepts, but I guess something like an AreWeThereYet function or property might help there as well, so I can determine the future is already available when I would have time to handle it or still wait and do other stuff.
@Curtis Bartley: While laziness and thunks don't always represent the same concept as a future, you can use them to implement futures. You just need to carefully control which part of the computation you put into the thunk and which part you go ahead and start doing.

The XCB library I mentioned comes the closest to the concept of futures. Make an X request, get a cookie back. Do other stuff. Force the cookie to get the reply from the server, blocking at that point if you don't have it yet. This allows you to hide the latency of a round-trip to the X server by doing multiple requests in parallel before checking for the replies.

Neil Mix implemented futures in JavaScript using coroutines and code generation:
http://www.neilmix.com/narrativejs/doc/overview.html

I've been playing with Oni (linked above) recently. In my opinion, it's more powerful than futures for many types of concurrent programs, but futures still seem more expressive for the specific case where you want to launch some requests immediately, do some other work in the meantime, and then use the results of the requests.

Last year I wrote a four-part series dissecting a different solution to the same problem:

http://hyperstruct.net/2008/5/17/synchronous-invocation-in-javascript-part-1-problem-and-basic-solution

I'm using it as part of http://www.sameplace.cc and actual implementation is less than 60 lines of code: http://github.com/bard/xmpp4moz/blob/4cb392d69ad889337f871d6bfb77d657ccb87076/modules/task.jsm

I've created a JavaScript library for Futures, Promises, Subscriptions, Joins, etc at http://github.com/coolaj86/futures
2 visitors upvoted this post.