We need a new word for "Open"

When I say I'm developing an "open" API, what does that mean to you?

To Facebook, it means something where you can get at some of your own data, as long as you abide by their restrictions on what you can do with it.

To Twitter, it means something where you get back what you put in.

To Flickr, it means something where you can get every single piece of data that you've ever put in the system, enriched by any value that Flickr can provide.

To Google, it means something where you can get back what you put in, you can host your own version of the API on your own server, and your server has exactly the same status as any Google server has, and that Google's servers and your server will interact and work as a system.

We need new words. Words that have more specific meaning--which encompass the broad idea of "open", yet remain accessible and recognizable by everyone. Federated is a good word, but it's possible to have a closed, federated system.

What words should we use?

How do we kick our synchronous addiction?

Asynchronous programming is superior both in memory usage and in overall throughput when compared to synchronous programming . We've known this fact for years. If we look at Django or Ruby on Rails, arguably the two most promising new web application frameworks to emerge in the past few years, both of them are written in such a way that synchronous programming is assumed. Why is it that even in 2010 we're still writing programs that rely on synchronous programming ?

The reason that we're stuck on synchronous programming is twofold. Firstly, the programming model required for straightforward asynchronous implementations is inconvenient. Secondly, popular and/or mainstream languages lack the built-in language constructs that are needed to implement a less-straightforward approach to asynchronous programming.

Asynchronous programming is too hard

Let's first examine the straightforward implementation: an event loop. In this programming model, we have a single process with a single loop that runs continuously. Functionality is achieved by writing functions to execute small tasks quickly, and inserting those functions into that event loop. One of those functions might read some bytes from a socket, while another function might write a few bytes to a file, and yet another function might do something computational like calculating an XOR on the data that's been buffered from that first socket.

The most important part about this event loop is that only one thing is ever happening at a time. That means that you really have to break your logic up into small chunks that can be performed incrementally. If any one of our functions blocks, it hogs the event loop and nothing else can execute during that time.

We have some really great frameworks geared towards making this event loop model easier to work with. In Python, there's Twisted and, more recently, Tornado. In Ruby there's EventMachine. In PERL there's POE. What these frameworks do is twofold: provide constructs for more easily working with an event loop (e.g. Deferreds or Promises), and provide asynchronous implementations of common tasks (e.g. HTTP clients and DNS resolution).

But these frameworks stop very short of making asynchronous programming easy for two reasons. The first reason is that we really do have to completely change our coding style. Consider what it would take to render a simple blog web page with comments. Here's some JavaScript code to demonstrate how this might work in a synchronous framework:

function handleBlogPostRequest(request, response, postSlug) {
    var db = new DBClient();
    var post = db.getBlogPost(postSlug);
    var comments = db.getComments(post.id);
    var html = template.render('blog/post.html',
        {'post': post, 'comments': comments});
    response.write(html);
    response.close();
}

Now here's some JavaScript code to demonstrate how this might look in an asynchronous framework. Note several things here: We've specifically written this in such a way that it doesn't become nested four levels deep. We've also written these callback functions inside of the handleBlogPostRequest function to take advantage of closure so as to retain access to the request and response objects, the template context, and the database client. Both the desire to avoid nesting and the closure are things that we need to think about as we write this code, that were not even considerations in the synchronous version:

function handleBlogPostRequest(request, response, postSlug) {
    var context = {};
    var db = new DBClient();
    function pageRendered(html) {
        response.write(html);
        response.close();
    }
    function gotComments(comments) {
        context['comments'] = comments;
        template.render('blog/post.html', context).addCallback(pageRendered);
    }
    function gotBlogPost(post) {
        context['post'] = post;
        db.getComments(post.id).addCallback(gotComments);
    }
    db.getBlogPost(postSlug).addCallback(gotBlogPost);
}

I've chosen JavaScript here to prove a point, by the way. People are very excited about node.js right now, and it's a very cool framework, but it doesn't hide all of the complexities involved in doing things asynchronously. It only hides some of the implementation details of the event loop.

The second reason why these frameworks fall short is because not all IO can be handled properly by a framework, and in these cases we have to resort to bad hacks. For example, MySQL does not offer an asynchronous database driver, so most of the major frameworks end up using threads to ensure that this communication happens out of band.

Given the inconvenient API, the added complexity, and the simple fact that most developers haven't switched to using this style of programming, leads us to the conclusion that this type of framework is not a desirable final solution to the problem (even though I do concede that you can get Real Work done today using these techniques, and many people do). That being the case, what other options do we have for asynchronous programming? Coroutines and lightweight processes, which brings us to our next major problem.

Languages don't support easier asynchronous paradigms

There are a few language constructs that, if implemented properly in modern programming languages, could pave the way for alternative methods of doing asynchronous programming that don't have the drawbacks of the event loop. These constructs are coroutines and lightweight processes.

A coroutine is a function that can suspend and resume its execution at certain, programmatically specified, locations. This simple concept can serve to transform blocking-looking code to be non-blocking. At certain critical points in your IO library code, the low-level functions that are doing IO can choose to "cooperate". That is, it can choose to suspend execution in order for another function to resume execution and continue on.

Here's an example (it's Python, but fairly understandable for all I hope):

def download_pages():
    google = urlopen('http://www.google.com/').read()
    yahoo = urlopen('http://www.yahoo.com/').read()

Normally the way this would work is that a socket would be opened, connected to Google, an HTTP request sent, and the full response would be read, buffered, and assigned to the google variable, and then in turn the same series of steps would be taken for the yahoo variable.

Ok, now imagine that the underlying socket implementation were built using coroutines that cooperated with each other. This time, just like before, the socket would be opened and a connection would be made to Google, and then a request would be fired off. This time, however, after sending the request, the socket implementation suspends its own execution.

Having suspended its execution (but not yet having returned a value), execution continues on to the next line. The same thing happens on the Yahoo line: once its request has been fired off, the Yahoo line suspends its execution. But now there's something else to cooperate with--there's actually some data ready to be read on the Google socket--so it resumes execution at that point. It reads some data from the Gooogle socket, and then suspends its execution again.

It jumps back and forth between the two coroutines until one has finished. Let's say that the Yahoo socket has finished, but the Google one has not. In this case, the Google socket just continues to read from its socket until it has completed, because there are no other coroutines to cooperate with. Once the Google socket is finally finished, the function returns with all of the buffered data.

Then the Yahoo line returns with all of its buffered data.

We've preserved the style of our blocking code, but we've used asynchronous programming to do it. Best of all, we've preserved our original program flow--the google variable is assigned first, and then the yahoo variable is assigned. In truth, we've got a smart event loop going on underneath the covers to control who gets to execute, but it's hidden from us due to the fact that coroutines are in play.

Languages like PHP, Python, Ruby, and Perl simply don't have built-in coroutines that are robust enough to implement this kind of behind-the-scenes transformation. So what about these lightweight processes?

Lightweight processes are what Erlang uses as its main concurrency primitive. Essentially these are processes that are mostly implemented in the Erlang VM itself. Each process has approximately 300 words of overhead and its execution is scheduled primarily by the Erlang VM, sharing no state at all amongst processes. Essentially, we don't have to think twice about spawning a process, as it's essentially free. The catch is that these processes can only communicate via message passing.

Implementing these lightweight processes at the VM level gets rid of the memory overhead, the context switching, and the relative sluggishness of interprocess communication provided by the operating system. Since the VM also has insight into the memory stack of each process, it can freely move or resize those processes and their stacks. That's something that the OS simply cannot do.

With this model of lightweight processes, it's possible to again revert back to the convenient model of using a separate process for all of our asynchronous programming needs. The question becomes this: can this notion of lightweight processes be implemented in languages other than Erlang? The answer to that is "I don't know." To my knowledge, Erlang takes advantage of some features of the language itself (such as having no mutable data structures) in its lightweight process implementation.

Where do we go from here?

The key to moving forward is to drop the notion that developers need to learn to think about all of their code in terms of callbacks and asynchrony, as the asynchronous event loop frameworks require them to do. Over the past ten years, we can see that most developers, when faced with that decision, simply choose to ignore it. They continue to use the inferior blocking methodologies of yesteryear.

We need to look at these alternative implementations like coroutines and lightweight processes, so that we can make asynchronous programming as easy as synchronous programming. Only then will we be able to kick this synchronous addiction.

You care about Facebook, you just might not know it yet

I recently had a conversation with a few friends that I know through the Python community. These are people who I respect a great deal, and look to for advice and insight when it comes to web development. During the course of this conversation, the subject of Facebook came up. What I wasn't expecting at all was that a large number of them said that they have no interest in Facebook, and quite a few of them proudly didn't even have accounts.

To put it bluntly, I believe that ignorance of Facebook is a major handicap today, both for developers and for entrepreneurs, and people who are not paying attention to it are burying their head in the sand.

Facebook has gotten to a point where it's the destination where the most time is spent online. It gets an estimated 260 billion (that's a B on that number) pageviews per month. In a single week, new properties are able to garner 8.6 million new active users. Sites which implement Facebook Connect typically see a 1.3x-2x boost in registrations. Companies building on Facebook's platform are being sold for 400 million dollars and are making 100 million dollars yearly. There are over 300 million active registered users. In other words, if you talk to a person that spends any time on the internet at all, they likely have a Facebook account. These numbers, by the way, are trending up and to the rightâ„¢.

Take the word Facebook out of the above paragraph, and replace it with icanhazcheeseburger. Replace it with 4chan, or somethingawful, or barbie, or anything. No matter what you replace as the company name, it doesn't change the fact that those numbers are compelling enough to be irresponsible to ignore.

Let me be perfectly clear: I don't particularly enjoy using Facebook. I find its UI cluttered, its privacy controls confusing, and its content fairly trivial. From the development side, Facebook's APIs are clunky at best. I'm definitely not advocating that you should log in every day and love it. I don't. The mass populous, however, does. I'm simply advocating that you should at least have an account, log in every once in a while, and keep tabs on the announcements that they make for developers.

As people spend more and more time consuming and producing content inside of the Facebook ecosystem, it's going to be those who change and embrace it who succeed, and those who fail to adapt that stand to lose. Note that I said producing content "inside of the Facebook ecosystem", and not "inside Facebook", as Facebook has been making very strong pushes to extend the reach of its platform well outside of its destination site. You can augment your site to have people pushing content into Facebook from your own destination sites.

Facebook is willing to give you access to a person's information, to their entire social graph, and to allow your user to become an advocate of your site, all while making registration on your site as simple as one click. Sure, you may evaluate all of that as an option and decide that it's not important enough to implement, but surely it's an important enough option to warrant your cognizance.

Do I think that Facebook will be relevant in 10 years? Probably. I'm not willing to bet much on it. One thing you can be absolutely sure about though, is that if another service with this much power comes along, I'll have an account and be acutely aware of its developer resources.

My Thoughts on NoSQL

Over the past few years, relational databases have fallen out of favor for a number of influential people in our industry. I'd like to weigh in on that, but before doing so, I'd like to give my executive summary of the events leading up to this movement:

In the late nineties and early thousands, websites were mostly read-only--a publisher would create some content and users would consume that content. The data access patterns for these types of applications became very well-understood, and as a result many tools were created and much research and development was done to further develop these technologies.

As the web has grown more social, however, more and more it's the people themselves who have become the publishers. And with that fundamental shift away from read-heavy architectures to read/write and write-heavy architectures, a lot of the way that we think about storing and retrieving data needed to change.

Most people have done this by relying less on the features provided by traditional relational databases and engineering more database logic in their application code. Essentially, they stop using relational databases the way they were intended to be used, and they instead use them as dumb data stores.

Other people have engineered new database systems from the ground up, each with a different set of tradeoffs and differences from their relational database brethren. It's these new databases that have some in our industry excited, and it's these databases that I'm going to focus on primarily in this post.

(By the way, there's a whole lot more theory behind the movement away from SQL. Primarily of interest is the CAP theorem and the Dynamo paper. Both of these illustrate the necessary tradeoffs of between different approaches to designing databases.)

Let's get this out of the way

I love SQL. More than even that, I love my precious ORM and being able to query for whatever information I want whenever I want it. For the vast majority of sites out there (we're talking 99.9% of the sites out there, folks) it suits their needs very well, providing a good balance of ease of use and performance.

There's no reason for them to switch away from SQL, and there's no way they will. If there's one thing I don't like about this whole NoSQL movement, it's the presumption that everyone who's interested in alternative databases hates the status quo. That's simply not true.

But we're not talking about most sites out there, we're not talking about the status quo, we're talking about the few applications that need something totally different.

Tokyo Cabinet / Tokyo Tyrant

Tokyo Cabinet (and its network interface, Tokyo Tyrant) is the logical successor to Berkeley DB--a blazing fast, open-source, embeddable key-value store that does just about what you would expect from its description. It supports 3 modes of operation: hashtable mode, b-tree mode, and table mode.

(Table mode still pretty much sucks, and I'm not convinced it's a good idea for the project since it's added bloat and other systems like RDBMs are probably better for storing tabular data, so I'm going to skip it.)

Essentially, the API into Tokyo Cabinet is that of a gigantic associative array. You give it a key and a value, and then later, given a key, it will give you back the value you put in. Its largest assets are that it's fast and straightforward.

If your problem is such that you have a small to medium-sized amount of data, which needs to be updated rapidly, and can be easily modeled in terms of keys and values (almost all scenarios can be rewritten in terms of keys and values, but some problems are easier to convert than others), then Tokyo Cabinet and Tokyo Tyrant are the way to go.

CouchDB

CouchDB is similar to Tokyo Cabinet in that it essentially maps keys to data, but CouchDB's philosophy is completely different. Instead of arbitrary data, its data has structure--it's a JSON object. Instead of only being able to query by keys, you can upload functions that index your data for you and then you can call those functions. All of this is done over a very simple REST interface.

But none of this really matters. None of these really set CouchDB apart, because you could just encode JSON data and store it in Tokyo Cabinet, you can maintain your own indexes of data fairly easily, and you can build a simple REST API in a matter of days, if not hours.

What really sets CouchDB apart from the pack is it's innovative replication strategy. It was written in such a way that nodes which are disconnected for long periods of time can reconnect, sync with each other, and reconcile their differences in a way that no other database (since Lotus Notes?) could do.

It's functionality that allows for interesting and new distributed types of applications and data that I think could possibly change the way we take our applications offline. I imagine that some day every computer will come with CouchDB pre-installed and it'll be a data store that we use without even knowing that we're using it.

However, I wouldn't choose it for a super high scalability site with lots of data and sharding and replication and high availability and all those buzzwords, because I'm not convinced it's the right tool for that job, but I am convinced that its replication strategy will keep it relevant for years to come.

Redis

Wow, looking at the bullet points this database seems to do just about everything, perfectly! Yeah, it's a bit prone to hyperbole and there are some great things about it, but a lot of it is hot air. For example, it claims to support sharding but really all it does is have the client run a hash function on its key and use that to determine which server to send its value to. This is something that any database can do.

When you get down to it, Redis is a key-value store which provides a richer API than something like Tokyo Cabinet. It does more operations in memory, only periodically flushing to disk, so there's more of a risk that you could lose data on a crash. The tradeoff is that it's extremely fast, and it does some neat things like allow you to append a value to the end of a list of items already stored for a given key.

It also has atomic operations. This is honestly the only reason I find this project interesting, because the atomic operation support that it has means that it can be turned into a best-of-breed tally server. If you are building a server to keep real-time counts of various things, you would be remiss to overlook Redis as a very viable option.

Cassandra

It's good to save the best for last, and that's exactly what I've done as I find Cassandra to be easily the most interesting non-relational database out there today. Originally developed by Facebook, it was developed by some of the key engineers behind Amazon's famous Dynamo database.

Cassandra can be thought of as a huge 4-or-5-level associative array, where each dimension of the array gets a free index based on the keys in that level. The real power comes from that optional 5th level in the associative array, which can turn a simple key-value architecture into an architecture where you can now deal with sorted lists, based on an index of your own specification. That 5th level is called a SuperColumn, and it's one of the reasons that Cassandra stands out from the crowd.

Cassandra has no single points of failure, and can scale from one machine to several thousands of machines clustered in different data centers. It has no central master, so any data can be written to any of the nodes in the cluster, and can be read likewise from any other node in the cluster.

It provides knobs that can be tweaked to slide the scale between consistency and availability, depending on your particular application and problem domain. And it provides a high availability guarantee, that if one node goes down, another node will step in to replace it smoothly.

Writing about all the features of Cassandra is a whole different post, but I am convinced that its data model is rich enough to support a wide variety of applications while providing the kind of extreme scalability and high availability features that few other databases can achieve--all while maintaining a lower latency than other solutions out there.

Conclusion

There are many other non-relational databases out there: HBase and Hypertable, which are replicating Google's BigTable despite its complexity and problems with single points of failure. MongoDB is another database that has been getting some traction, but it seems to be a jack of all trades, master of none. In short, the above databases are the ones that I find interesting right now, and I would use each of them for different use cases.

What do you all think about this whole non-relational database thing? Do you agree with my thoughts or do you think I'm full of it?

Flojax: A unobtrusive and easy strategy for creating AJAX-style web applications

Writing AJAX-style web applications can be very tedious. If you're using XML as your transport layer, you have to parse the XML before you can work with it. It's a bit easier if you're using JSON, but once you have parsed the data, the data still needs to be turned into HTML markup that matches the current markup on the page. Finally, the newly created markup needs to be inserted into the correct place in the DOM, and any event handlers need to be attached to the appropriate newly-inserted markup.

So there's the parsing, the markup assembly, the DOM insertion, and finally the event handler attachment. Most of the time, people tend to write custom code for each element that needs asynchronous updating. There are several drawbacks with this scenario, but the most frustrating part is probably that the presentation logic is implemented twice--once in a templating language on the server which is designed specifically for outputting markup, and again on the client with inline Javascript. This leads to problems both in the agility and in the maintainability of this type of application.

With flojax, this can all be accomplished with one generalized implementation. The same server-side logic that generates the data for the first synchronous request can be used to respond to subsequent asynchronous requests, and unobtrusive attributes specify what to do for the rest.

The Basics

The first component for creating an application using the flojax strategy is to break up the content that you would like to reload asynchronously into smaller fragments. As a basic example of this, let's examine the case where there is a panel of buttons that you would like to turn into asynchronous requests instead of full page reloads.

The rendered markup for a fragment of buttons could look something like this:

<div class="buttons">
    <a href="/vote/up/item1/">Vote up</a>
    <a href="/vote/down/item1/">Vote down</a>
    <a href="/favorite/item1/">Add to your favorites</a>
</div>

In a templating language, the logic might look something like this:

<div class="buttons">
    {% if voted %}
        <a href="/vote/clear/{{ item.id }}/">Clear your vote</a>
    {% else %}
        <a href="/vote/up/{{ item.id }}/">Vote up</a>
        <a href="/vote/down/{{ item.id }}/">Vote down</a>
    {% endif %}
    {% if favorited %}
        <a href="/favorite/{{ item.id }}/">Add to your favorites</a>
    {% else %}
        <a href="/unfavorite/{{ item.id }}/">Remove from your favorites</a>
    {% endif %}
</div>

(Typically you wouldn't use anchors to do operations that can change state on the server, so you can imagine this would be accomplished using forms. However, for demonstration and clarity purposes I'm going to leave these as links.)

Now that we have written a fragment, we can start using it in our larger templates by way of an include, which might look something like this:

...
<p>If you like this item, consider favoriting or voting on it:</p>
{% include "fragments/buttons.html" %}
...

To change this from being standard links to being asynchronously updated, we just need to annotate a small amount of data onto the relevant links in the fragment.

<div class="buttons">
    {% if voted %}
        <a href="/vote/clear/{{ item.id }}/" class="flojax" rel="buttons">Clear your vote</a>
    {% else %}
        <a href="/vote/up/{{ item.id }}/" class="flojax" rel="buttons">Vote up</a>
        <a href="/vote/down/{{ item.id }}/" class="flojax" rel="buttons">Vote down</a>
    {% endif %}
    {% if favorited %}
        <a href="/favorite/{{ item.id }}/" class="flojax" rel="buttons">Add to your favorites</a>
    {% else %}
        <a href="/unfavorite/{{ item.id }}/" class="flojax" rel="buttons">Remove from your favorites</a>
    {% endif %}
</div>

That's it! At this point, all of the click events that happen on these links will be changed into POST requests, and the response from the server will be inserted into the DOM in place of this div with the class of "buttons". If you didn't catch it, all that was done was to add the "flojax" class onto each of the links, and add a rel attribute that refers to the class of the parent node in the DOM to be replaced--in this case, "buttons".

Of course, there needs to be a server side component to this strategy, so that instead of rendering the whole page, the server just renders the fragment. Most modern Javascript frameworks add a header to the request to let the server know that the request was made asynchronously from Javascript. Here's how the code on the server to handle the flojax-style request might look (in a kind of non-web-framework-specific Python code):

def vote(request, direction, item_id):
    item = get_item(item_id)

    if direction == 'clear':
        clear_vote(request.user, item)
    elif direction == 'up':
        vote_up(request.user, item)
    elif direction == 'down':
        vote_down(request.user, item)

    context = {'voted': direction != 'clear', 'item': item}

    if request.is_ajax():
        return render_to_response('fragments/buttons.html', context)

    # ... the non-ajax implementation details go here

    return render_to_response('items/item_detail.html', context)

There are several advantages to writing your request handlers in this way. First, note that we were able to totally reuse the same templating logic from before--we just render out the fragment instead of including it in a larger template. Second, we have provided a graceful degradation path where users without javascript are able to interact with the site as well, albeit with a worse user experience.

That's really all there is to writing web applications using the flojax strategy.

Implementation Details

I don't believe that the Javascript code for this method can be easily reused, because each web application tends to have a different way of showing errors and other such things to the user. In this post, I'm going to provide a reference implementation (using jQuery) that can be used as a starting point for writing your own versions. The bulk of the work is done in a function that is called on every page load, called flojax_init.

function flojax_clicked() {
    var link = $(this);
    var parent = link.parents('.' + link.attr('rel'));

    function successCallback(data, textStatus) {
        parent.replaceWith(data);
        flojax_init();
    }
    function errorCallback(request, textStatus, errorThrown) {
        alert('There was an error in performing the requested operation');
    }

    $.ajax({
        'url': link.attr('href'),
        'type': 'POST',
        'data': '',
        'success': successCallback,
        'error': errorCallback
    });

    return false;
}

function flojax_init() {
    $('a.flojax').live('click', flojax_clicked);
}

There's really not a lot of code there. It POSTS to the given URL and replaces the specified parent class with the content of the response, and then re-initializes the flojax handler. The re-initialization could even be done in a smarter way, as well, by targeting only the newly inserted content. Also, you might imagine that an alert message probably wouldn't be such a great user experience, so you could integrate error messages into some sort of Javascript messaging or growl-style system.

Extending Flojax

Often times you'll want to do other things on the page when the asynchronous request happens. For our example, maybe there is some kind of vote counter that needs to be updated or some other messages that need to be displayed.

In these cases, I have found that using hidden input elements in the fragments can be useful for transferring that information from the server to the client. As long as the value in the hidden elements adheres to some predefined structure that your client knows about (it could even be something like JSON if you need to go that route).

If what you want can't be done by extending the fragments in this way, then flojax isn't the right strategy for that particular feature.

Limitations

This technique cannot solve all of the world's problems. It can't even solve all of the problems involved in writing an AJAX-style web application. It can, however, handle a fair amount of simple cases where all you want to do is quickly set up a way for a user's action to replace content on a page.

Some specific examples of things that flojax can't help with are if a user action can possibly update many items on a page, or if something needs to happen without a user clicking on a link. In these situations, you are better off coding a custom solution instead of trying to shoehorn it into the flojax workflow.

Conclusion

Writing AJAX-style web applications is usually tedious, but using the techniques that I've described, a large majority of the tedious work can be reduced. By using the same template code for rendering the page initially as with subsequent asynchronous requests, you ensure that code is not duplicated. By rendering HTML fragments, the client doesn't have to go through the effort of parsing the output and converting the result into correct DOM objects. Finally, by using a few unobtrusive conventions (like the rel attribute and the flojax class), the Javascript code that a web application developer writes is able to be reused again and again.

I don't believe that any of the details that I'm describing are new. In fact, people have been doing most of these things for years. What I think may in fact be new is the generalization of the sum of these techniques in this way. It's still very much a work in progress, though. As I use flojax more and more, I hope to find not only places where it can be extended to cover more use cases, but also its limitations and places where it makes more sense to use another approach.

What do you think about this technique? Are you using any techniques like this for your web applications? If so, how do they differ from what I've described?

Search

Badges

  • django badge
  • apache badge
  • GeoURL
  • XFN Friendly
  • Valid HTML 4.01 Transitional