CouchApps at OSCON 2011

Here are the slides from today’s OSCON presentation on CouchApps with CouchDB, JavaScript & HTML5:

Related links:

Learning CouchDB at OSCON 2011

Here are the slides from today’s OSCON Data workshop on Learning CouchDB:

Related links:

Exploring RabbitMQ and PHP

I’m exploring the possibility of using RabbitMQ for an upcoming project. RabbitMQ is a free/open source message broker platform. It uses the open Advanced Message Queuing Protocol (AMQP) standard and is written in Erlang using the Open Telecom Platform (OTP). It promises a high level of availability, throughput, scalability, and portability. Since it is built using open standards, it is interoperable with other messaging systems and can be accessed from any platform.

I’ll be using RabbitMQ first from PHP, but I plan on using it to send and receive messages to and from other systems. Following are the steps I used to get RabbitMQ and PHP’s AMQP library setup on my development machine.

First, I installed RabbitMQ using MacPorts:

$ sudo port install rabbitmq-server

Then, I started RabbitMQ:

$ sudo rabbitmq-server -detached

Next, I installed the librabbitmq library using a slight variation of the instructions on PHP’s AMQP Installation page (you may need to install Mercurial first):

$ hg clone http://hg.rabbitmq.com/rabbitmq-c/rev/3c549bb09c16 rabbitmq-c
$ cd rabbitmq-c
$ hg clone http://hg.rabbitmq.com/rabbitmq-codegen/rev/f8b34141e6cb codegen
$ autoreconf -i && ./configure && make && sudo make install

Then, I installed the AMQP extension using PECL:

$ sudo pecl install amqp-beta

To test that everything works, I opened up two interactive PHP shells using php -a. I ran the following code in the first PHP shell:

$exchangeName = 'messages';
$routeKey = 'routeA';
$message = 'Hello, world.';
$connection = new AMQPConnection();
$connection->connect();
$exchange = new AMQPExchange($connection);
$exchange->declare($exchangeName);

I then ran the following code in the second PHP shell:

$exchangeName = 'messages';
$routeKey = 'routeA';
$connection = new AMQPConnection();
$connection->connect();
$queue = new AMQPQueue($connection);
$queue->declare($exchangeName);
$queue->bind($exchangeName, $routeKey);

Back in the first PHP shell:

$exchange->publish($message, $routeKey);

Back in the second PHP shell:

$message = $queue->get();
print_r($message);

Here is the output I got from the print_r statement:

Array
(
    [routing_key] => routeA
    [exchange] => messages
    [delivery_tag] => 1
    [Content-type] => text/plain
    [count] => 0
    [msg] => Hello, world.
)

There are several other options that can be set, and a lot more to learn about RabbitMQ and AMP. Check out the documentation for PHP’s AMQP extension for details about working with AMQP servers from PHP.

CouchDB jQuery Plugin Reference

I’ve had a difficult time finding documentation on the CouchDB jQuery plugin that ships with CouchDB. So, I’ve decided to create my own reference and share it with you. This should cover almost the entire CouchDB API that is available through the version of the plugin that ships with CouchDB 1.1.0.

Setup

The same-origin policy effectively requires that the HTML from which the JavaScript is loaded must be served up from CouchDB (you could instead use a reverse proxy server). This can be done by attaching an HTML document to a CouchDB document. You can do this manually, or through the use of CouchApps.

CouchDB ships with several useful JavaScript libraries, including jQuery. Here is an example of an HTML document that takes advantage of CouchDB’s included JavaScript libraries:

<!DOCTYPE html>
<html>
  <head>
    <title>CouchDB jQuery Examples</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <script src="/_utils/script/json2.js"></script>
    <script src="/_utils/script/sha1.js"></script>
    <script src="/_utils/script/jquery.js?1.4.2"></script>
    <script src="/_utils/script/jquery.couch.js?0.11.0"></script>
    <script src="/_utils/script/jquery.dialog.js?0.11.0"></script>
  </head>
  <body>
  </body>
</html>

By default, CouchDB will use an empty string as its URL prefix. However, you can configure this before your first API call. For example:

$.couch.urlPrefix = "http://localhost:5984";

You will typically not want to change this value since the same-origin policy prevents you from accessing CouchDB databases hosted at a different origin.

Server API

Server Information

JavaScript example:

$.couch.info({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET http://localhost:5984/ 200 OK

Console output:

{
    "couchdb"="Welcome",
    "version"="1.1.0"
}

All DBs

JavaScript example:

$.couch.allDbs({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET http://localhost:5984/_all_dbs 200 OK

Console output:

[
    "_replicator",
    "_users"
]

User Signup

JavaScript example:

var userDoc = {
    _id: "org.couchdb.user:bob",
    name: "bob"
};
$.couch.signup(userDoc, "supersecurepassword", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

PUT http://localhost:5984/_users/org.couchdb.user%3Abob 201 Created

Console output:

{
    "ok"=true,
    "id"="org.couchdb.user:bob",
    "rev"="1-230dc0625bd3c4aac735846cc152c296"
}

Login

JavaScript example:

$.couch.login({
    name: "bob",
    password: "supersecurepassword",
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

POST http://localhost:5984/_session 200 OK

Console output:

{
    "ok":true,
    "name":"bob",
    "roles":[]
}

This will result in an AuthSession cookie being set and then sent back to the server on subsequent requests, authenticating the user on each request. The session length is 10 minutes by default, but can be set in CouchDB’s couch_httpd_auth.timeout configuration option.

Session

JavaScript example:

$.couch.session({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET http://localhost:5984/_session 200 OK

Console output:

{
    "ok":true,
    "userCtx":{
        "name":"bob",
         "roles":[]
    },
    "info":{
        "authentication_db":"_users",
        "authentication_handlers":[
            "oauth",
            "cookie",
            "default"
        ],
        "authenticated":"cookie"
    }
}

Logout

JavaScript example:

$.couch.logout({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

DELETE http://_:_@localhost:5984/_session 200 OK

Console output:

{
    "ok":true
}

This will set the AuthSession cookie to an empty string, effectively logging the client out.

Active Tasks

JavaScript example:

$.couch.activeTasks({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET http://localhost:5984/_active_tasks 200 OK

Console output:

[]

Server Configuration

JavaScript example:

$.couch.config({
    success: function(data) {
        console.log(data);
    }
}, "uuids", "algorithm");

Resulting HTTP request:

GET http://localhost:5984/_config/uuids/algorithm 200 OK

Console output:

"sequential"

The second and third parameters are the section and option parameters, respectively. Omit the option parameter if you’d like to retrieve the entire section. Omit both the section and option parameters if you’d like to see the entire server configuration. Add an optional fourth parameter to set the configuration option’s value, which will result in a PUT request with the value.

User DB

JavaScript example:

$.couch.userDb(function(data) {
    console.log(data);
});

Resulting HTTP request:

GET http://localhost:5984/_session 200 OK

Console output:

{
    "name"="_users",
    "uri"="../_users/"
}

The URI will be relative to the HTML executing the script.

New UUID

JavaScript example:

var uuid = $.couch.newUUID();
console.log(uuid);

Resulting HTTP request:

GET http://localhost:5984/_uuids?count=1 200 OK

Console output:

d12ee5ea1df6baa2b06451f44a0156fa

This function takes a count parameter as an optional first parameter. If the count parameter is greater than one, then it will keep a cache of UUIDs that will be returned on subsequent calls, rather than making another request to the server.

Replicate

JavaScript example:

$.couch.replicate("mydb", "otherdb", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
}, {
    create_target: true
});

Resulting HTTP request:

POST http://localhost:5984/_replicate 200 OK

Console output:

{
    "ok":true,
    "session_id":"705797bae87456c52f5b76f44fe5f245",
    "source_last_seq":27,
    "replication_id_version":2,
    "history":[
        {
            "session_id":"705797bae87456c52f5b76f44fe5f245",
            "start_time":"Wed, 13 Jul 2011 22:40:09 GMT",
            "end_time":"Wed, 13 Jul 2011 22:40:09 GMT",
            "start_last_seq":0,
            "end_last_seq":27,
            "recorded_seq":27,
            "missing_checked":0,
            "missing_found":14,
            "docs_read":14,
            "docs_written":14,
            "doc_write_failures":0
        }
    ]
}

Database API

Create Database

JavaScript example:

$.couch.db("mydb").create({
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

PUT http://localhost:5984/mydb/ 201 Created

Console output:

{
    "ok":true
}

Database Info

JavaScript example:

$.couch.db("mydb").info({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET http://localhost:5984/mydb/ 200 OK

Console output:

{
    "db_name":"mydb",
    "doc_count":0,
    "doc_del_count":0,
    "update_seq":0,
    "purge_seq":0,
    "compact_running":false,
    "disk_size":79,
    "instance_start_time":"1310597000825187",
    "disk_format_version":5,
    "committed_update_seq":0
}

Save New Document

JavaScript example:

var doc = {};
$.couch.db("mydb").saveDoc(doc, {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

POST http://localhost:5984/mydb/ 201 Created

Console output:

{
    "ok":true,
    "id":"d12ee5ea1df6baa2b06451f44a019ab9",
    "rev":"1-967a00dff5e02add41819138abb3284d"
}

Open Document

JavaScript example:

$.couch.db("mydb").openDoc("d12ee5ea1df6baa2b06451f44a019ab9", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

GET http://localhost:5984/mydb/d12ee5ea1df6baa2b06451f44a019ab9 200 OK

Console output:

{
    "_id":"d12ee5ea1df6baa2b06451f44a019ab9",
    "_rev":"1-967a00dff5e02add41819138abb3284d"
}

Note that the HTTP response includes an Etag HTTP header. If you do a subsequent request for the same document, and your client supports caching and conditional HTTP requests, then an If-None-Match header should be sent using the value from the previous response’s Etag header. If the document has not been modified, then CouchDB will send a 304 Not Modified response and no response body, saving bandwidth and speeding up the response.

Save Updated Document

JavaScript example:

var doc = {
    _id: "d12ee5ea1df6baa2b06451f44a019ab9",
    _rev: "1-967a00dff5e02add41819138abb3284d",
    foo: "bar"
};
$.couch.db("mydb").saveDoc(doc, {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

PUT http://localhost:5984/mydb/d12ee5ea1df6baa2b06451f44a019ab9 201 Created

Console output:

{
    "ok":true,
    "id":"d12ee5ea1df6baa2b06451f44a019ab9",
    "rev":"2-13839535feb250d3d8290998b8af17c3"
}

Remove Document

JavaScript example:

var doc = {
    _id: "d12ee5ea1df6baa2b06451f44a019ab9",
    _rev: "2-13839535feb250d3d8290998b8af17c3"
};
$.couch.db("mydb").removeDoc(doc, {
     success: function(data) {
         console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

DELETE http://localhost:5984/mydb/d12ee5ea1df6baa2b06451f44a019ab9?rev=2-13839535… 200 OK

Console output:

{
    "id":d12ee5ea1df6baa2b06451f44a019ab9
    "ok":true,
    "rev":"3-1f04f977685e1108b4664f70b09c6f65"
}

Bulk Save

JavaScript example:

$.couch.db("mydb").bulkSave({"docs": [{}, {}]}, {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

POST http://localhost:5984/mydb/_bulk_docs?successStatus=201 201 Created

Console output:

[
    {
        "id":"d12ee5ea1df6baa2b06451f44a01a0d8",
        "rev":"1-967a00dff5e02add41819138abb3284d"
    },
    {
        "id":"d12ee5ea1df6baa2b06451f44a01a75a",
        "rev":"1-967a00dff5e02add41819138abb3284d"
    }
]

Bulk Remove

JavaScript example:

var docs = [
    {
        _id: "d12ee5ea1df6baa2b06451f44a01a0d8",
        _rev: "1-967a00dff5e02add41819138abb3284d"
    },
    {
        _id: "d12ee5ea1df6baa2b06451f44a01a75a",
        _rev: "1-967a00dff5e02add41819138abb3284d"
    }
];
$.couch.db("mydb").bulkRemove({"docs": docs}, {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

POST http://localhost:5984/mydb/_bulk_docs?successStatus=201 201 Created

Console output:

[
    {
        "id":"d12ee5ea1df6baa2b06451f44a01a0d8",
        "rev":"2-eec205a9d413992850a6e32678485900"
    },
    {
        "id":"d12ee5ea1df6baa2b06451f44a01a75a",
        "rev":"2-eec205a9d413992850a6e32678485900"
    }
]

This operates almost exactly like the bulk save, but instead sets the deleted flag to true on the documents.

All Documents

JavaScript example:

$.couch.db("mydb").allDocs({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET http://localhost:5984/mydb/_all_docs 200 OK

Console output:

{
    "total_rows":11,
    "offset":0,
    "rows":[
        {
            "id":"_design/default",
            "key":"_design/default",
            "value":{
                "rev":"9-5212dde9da06f1933dbe29811fc380d4"
            }
        },
        {
            "id":"d12ee5ea1df6baa2b06451f44a002cef",
            "key":"d12ee5ea1df6baa2b06451f44a002cef",
            "value":{
                "rev":"1-967a00dff5e02add41819138abb3284d"
            }
        },
        …
    ]
}

All Design Documents

JavaScript example:

$.couch.db("mydb").allDesignDocs({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

GET 
http://localhost:5984/mydb/_all_docs?startkey=%22_design%22&endkey=%22_design0%22 200 OK

Console output:

{
    "total_rows":11,
    "offset":0,
    "rows":[
        {
            "id":"_design/default",
            "key":"_design/default",
            "value":{
                "rev":"9-5212dde9da06f1933dbe29811fc380d4"
            }
        }
    ]
}

All Apps

JavaScript example:

$.couch.db("tutorial").allApps({
    success: function(data) {
        console.log(data);
    },
    eachApp: function(appName, appPath, ddoc) {
        console.log(appName);
        console.log(appPath);
        console.log(ddoc);
    }
});

Resulting HTTP requests:

GET http://localhost:5984/tutorial/_all_docs?startkey=%22_design%22&endkey=%22_design0%22 200 OK
GET http://localhost:5984/tutorial/_design/tutorial 200 OK

Console output:

tutorial
/tutorial/_design/tutorial/index.html
{
    "_id":"_design/tutorial",
    "_rev":"5-276897ecff6dc272c0b9143aa23e8a2e",
    …
}

This is intended as a convenience to find metadata about CouchApps.

Query

JavaScript example:

var mapFunction = function(doc) {
    emit();
};
$.couch.db("mydb").query(mapFunction, "_count", "javascript", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    },
    reduce: false
});

Resulting HTTP request:

POST http://localhost:5984/mydb/_temp_view?reduce=false 200 OK

Console output:

{
    "total_rows":10,
    "offset":0,
    "rows":[
        {
            "id":"d12ee5ea1df6baa2b06451f44a002cef",
            "key":null,
            "value":null
        },
        {
            "id":"d12ee5ea1df6baa2b06451f44a0037b5",
            "key":null,
            "value":null
        },
        {
            "id":"d12ee5ea1df6baa2b06451f44a003b06",
            "key":null,
            "value":null
        },
        …
    ]
}

This creates a temporary view as defined by your map and reduce functions. Temporary views are useful in development, but should be replaced with views permanently saved to design documents for production. Temporary views are slow, especially with a large number of documents.

View

JavaScript example:

$.couch.db("mydb").view("default/all", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    },
    reduce: false
});

Resulting HTTP request:

GET http://localhost:5984/mydb/_design/default/_view/all?reduce=false 200 OK

Console output:

{
    "total_rows":10,
    "offset":0,
    "rows":[
        {
            "id":"d12ee5ea1df6baa2b06451f44a002cef",
            "key":null,
            "value":null
        },
        {
            "id":"d12ee5ea1df6baa2b06451f44a0037b5",
            "key":null,
            "value":null
        },
        {
            "id":"d12ee5ea1df6baa2b06451f44a003b06",
            "key":null,
            "value":null
        },
        …
    ]
}

List

JavaScript example:

$.couch.db("mydb").list("default/all", "all", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    },
    reduce: false
});

Resulting HTTP request:

GET http://localhost:5984/mydb/_design/default/_list/all/all?reduce=false 200 OK

The console output will depend on your list’s behavior.

Compact

JavaScript example:

$.couch.db("mydb").compact({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

POST http://localhost:5984/mydb/_compact 202 Accepted

Console output:

{
    "ok":true
}

View Cleanup

JavaScript example:

$.couch.db("mydb").viewCleanup({
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

POST http://localhost:5984/mydb/_view_cleanup 202 Accepted

Console output:

{
    "ok":true
}

Compact View

JavaScript example:

$.couch.db("mydb").compactView("default", {
    success: function(data) {
        console.log(data);
    }
});

Resulting HTTP request:

POST http://localhost:5984/mydb/_compact/default 202 Accepted

Console output:

{
    "ok":true
}

Changes

JavaScript example:

$.couch.db("mydb").changes().onChange(function(data) {
    console.log(data);
});

Resulting HTTP requests:

GET http://localhost:5984/mydb/ 200 OK
GET http://localhost:5984/mydb/_changes?heartbeat=10000&feed=longpoll&since=34 200 OK

This will keep a connection open until changes are detected, and then open up another connection to listen for subsequent changes starting from the next sequence.

Console output when a new document is created:

{
    "results":[
        {
            "seq":35,
            "id":"d12ee5ea1df6baa2b06451f44a01b7b5",
            "changes":[
                {
                    "rev":"1-967a00dff5e02add41819138abb3284d"
                }
            ]
        }
    ],
    "last_seq":35
}

This is very powerful feature. It allows you to listen for any changes to documents in the database and immediately respond to those changes.

Copy Document

JavaScript example:

$.couch.db("mydb").copyDoc("d12ee5ea1df6baa2b06451f44a01b7b5", {
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
}, {
    beforeSend: function(xhr) {
        xhr.setRequestHeader("Destination", "aNewDocId");
    }
});

Resulting HTTP request:

COPY http://localhost:5984/mydb/d12ee5ea1df6baa2b06451f44a01b7b5 201 Created

Console output:

{
    "id":"aNewDocId",
    "rev":"1-967a00dff5e02add41819138abb3284d"
}

Drop Database

JavaScript example:

$.couch.db("mydb").drop({
    success: function(data) {
        console.log(data);
    },
    error: function(status) {
        console.log(status);
    }
});

Resulting HTTP request:

DELETE http://localhost:5984/mydb/ 200 OK

Console output:

{
    "ok":true
}

O'Reilly OSCONI’ll be giving a presentation on CouchApps at OSCON if you want to learn more about building CouchApps. This presentation will be a part of the JavaScript and HTML5 track. You can save 20% on registration with the code OS11RAD. I’ll also be presenting a tutorial on Learning CouchDB at OSCON Data.

You can also check out my books Writing and Querying MapReduce Views in CouchDB and Scaling CouchDB if you’re interested in learning more about CouchDB in general.

Update (7/15/2011): Max Ogden has pointed me to Dale Harvey‘s documentation generator for jquery.couch.js and a copy of the generated jQuery CouchDB documentation. Max and Dale are also working on a more flexible jquery.couch2.js.

Iterative vs. Incremental

I’ve found that people often conflate the terms “iterative” and “incremental” when it comes to software and/or product development—they often use “iterative” when they really mean “incremental”. I’ve been guilty of this in the past, but feel like I have a good handle on the differences now.

Iterative

Iterative development involves a cyclical process. While one may still have a general product road map, learning from one iteration informs the next iteration. This learning can come from end-users, testers, or the developers themselves. An iterative process embraces the fact that it is very difficult to know upfront exactly what the final product should look like (no matter how smart or well-informed you might be) and builds in as many learning opportunities as possible.

Incremental

Incremental development involves breaking a large chunk of work into smaller portions. This is typically preferable to a monolithic approach where all development work happens in one huge chunk. Unlike those using an iterative approach, those taking an incremental approach will stick as closely as possible to the original road map. Few, if any, opportunities for feedback exist until the final product launch.

Conclusion

No one uses a purely iterative or incremental process. Those using an incremental process are unlikely to completely ignore any feedback or learning that happens to seep into the process. Those using an iterative process are unlikely to completely let go of their original vision and preconceived notions and rely purely on outside learning.

Both processes have their places. If the problem and solution are well known, then an incremental process can work well. However, many software and product development initiatives are a search for an unknown solution to a known problem. If the solution is unknown, then an iterative process is critical. If both the problem and the solution are unknown, then an iterative process alone isn’t enough—but that’s a topic for another day.

Testing PHP 5.4

Rasmus Lerdorf today posted instructions for testing the upcoming PHP 5.4 release. Running the PHP tests and submitting the associated report will help the PHP team get PHP 5.4 ready faster as it gives them reports of failed tests from a variety of platforms. I just did this today for the first time and can tell you that it is very easy. Following are the steps that Rasmus outlined, in a bit more detail.

First, make sure that you have subversion, autoconf, automake, gcc, bison, flex, and re2c installed. I’m using a Mac, so I was able to install most of these using MacPorts (Homebrew would have worked, too). Next, do a checkout of the PHP 5.4 branch:

$ svn co https://svn.php.net/repository/php/php-src/branches/PHP_5_4 php54

Then:

$ cd php54
$ ./buildconf
$ ./configure
$ make
$ make test

Both the make and make test steps may take awhile. Assuming you get some failed tests like I did, then you will see the following message and prompt:

You may have found a problem in PHP.
This report can be automatically sent to the PHP QA team at
http://qa.php.net/reports and http://news.php.net/php.qa.reports
This gives us a better understanding of PHP's behavior.
If you don't want to send the report immediately you can choose
option "s" to save it.    You can then email it to qa-reports@lists.php.net later.
Do you want to send this report now? [Yns]:

I opted to send the report now, so entered Y. Next, I was prompted for my email address:

Please enter your email address.
(Your address will be mangled so that it will not go out on any
mailinglist in plain text):

I entered my email address and then:

Posting to http://qa.php.net/buildtest-process.php

Thank you for helping to make PHP better.

Pretty straightforward. At this point, your report should be aggregated on the PHP Test Reports Summary page. If you want to run more tests, then configure as many extensions as possible during the configure step. Take a look at the shell script in Rasmus’ instructions for details. If you’d like, you can examine the failed tests and see if you can come up with a patch to fix the problem. Again, see Rasmus’ instructions for details.

O’Reilly Radar on CouchDB and HTML5

I was recently interviewed by Audrey Watters of O’Reilly Radar on what CouchDB can do for HTML5, web apps and mobile. We discussed CouchDB, CouchApps, JavaScript, HTML5, web applications, mobile application development (Android/iOS), Web Storage, IndexedDB, replication, and “ground computing”. Read the full interview for the details of what we talked about.

O'Reilly OSCONI’ll be giving a presentation on CouchApps at OSCON if you want to learn more about CouchApps, JavaScript, and HTML5. This presentation will be a part of the JavaScript and HTML5 track. You can save 20% on registration with the code OS11RAD. I’ll also be presenting a tutorial on Learning CouchDB at OSCON Data.

You can also check out my books Writing and Querying MapReduce Views in CouchDB and Scaling CouchDB if you’re interested in learning more about CouchDB in general.

Speaking at ZendCon 2011

Zend PHP ConferenceI’m happy to say that I will be speaking at this year’s ZendCon—the Zend PHP Conference for Developers. I’ll be presenting a tutorial on Learning CouchDB during the first day of the conference. From the session’s description:

CouchDB is a document-oriented database that uses JSON documents, has a RESTful HTTP API, and is queried using map/reduce views. Each of these properties alone, especially MapReduce views, may seem foreign to developers more familiar with relational databases. This tutorial will teach web developers the concepts they need to get started using CouchDB in their projects. Several CouchDB libraries are available for PHP and we will take a look at the more popular ones.

This will be the 7th annual ZendCon and will take place October 17-20, 2011 in Santa Clara, California. Registration is open—hope to see you there!

CouchDB 1.1 Released

CouchDB 1.1 was released this past Friday. CouchBase has a CouchDB Release 1.1 Feature Guide that provides details on the new features available in CouchDB 1.1. Some of the key new features include:

See the CouchDB Release 1.1 Feature Guide for the complete list and more detail.

Database Indexes As First-Class Citizens

I’ve found that working with CouchDB has increased my proficiency with MySQL. These are two completely different types of databases, so what could I possibly learn about one from using the other? One thing that both CouchDB and MySQL (or any other relational database) have in common is that they both use B-tree (technically B+ tree in CouchDB) indexes to make data retrieval more efficient. In CouchDB, it’s not possible to query against anything other than an index, whereas in MySQL you can create arbitrary SQL queries without having an index. In CouchDB, you manually build your indexes—called views—using MapReduce functions that are run incrementally against your documents. This gives you an incredible amount of flexibility, but does require some thought when designing your indexes/views. If you want to create an index in MySQL, you simply tell MySQL what column or combination of columns you want indexed and what type of index you want.

Working with CouchDB has taught me to treat database indexes as first-class citizens. A common approach to working with MySQL involves simply adding indexes where needed when your queries start to slow down. A better approach is to carefully consider the design of your indexes as part of your overall database design—not as an afterthought when your database starts to get slow.

The other day, I ran into a simple problem in MySQL that I solved differently then I would have before I started working with CouchDB. This was not a very complicated problem by any stretch of the imagination, but it does illustrate how CouchDB has influenced my approach to designing databases in MySQL. I wanted to track successful and failed login attempts by IP address. Here is the first table design that came to mind:

CREATE TABLE `login_attempt` (
 `ip_address` varchar(39) NOT NULL,
 `success` tinyint(3) unsigned NOT NULL default '0',
 `fail` tinyint(3) unsigned NOT NULL default '0',
 PRIMARY KEY  (`ip_address`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Notes:

  • The IP address field is varchar(39) to account for IPv6 addresses.
  • It would be more efficient to store IP addresses as integers. I could have used MySQL’s INET_ATON and INET_NTOA functions to convert IP addresses to and from their integer representations. However, these functions don’t support IPv6 yet so I didn’t bother.

This design is workable, but has a few potential problems:

  • Whenever I need to increment the `success` or `fail` fields, I would need to first check if a row exists yet for the IP address. This complicates the logic and creates unnecessary work for the database. I would have to do a SELECT and then either an INSERT or an UPDATE (or just try an INSERT and, if it fails, then go ahead and do an UPDATE—just as much work for the database).
  • Each update would temporarily lock the row that is being updated (since I’m using the InnoDB storage engine). In this scenario, it’s unlikely that multiple login attempts would be coming from the same IP address concurrently. However, it’s still a potential bottleneck and could be a problem in other use cases where rows might see more concurrent updates.

Here is the alternative table design:

CREATE TABLE `login_attempt` (
 `ip_address` varchar(39) NOT NULL,
 `success` tinyint(1) NOT NULL,
 `timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
 KEY `login_attempt_ip_address_success` (`ip_address`,`success`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

With this design, a new row is inserted for each login attempt (astute readers will notice that this is similar, conceptually, to Event Sourcing). Rows are treated as immutable and are never updated. The `success` column can have one of two values: 0 for false or 1 for true. An example of a query to find the number of successful and failed login attempts:

SELECT `success`, COUNT(*) AS `count`
FROM `login_attempt`
WHERE `ip_address`='127.0.0.1'
GROUP BY `success`;

An example of the results:

+---------+-------+
| success | count |
+---------+-------+
|       0 |    13 |
|       1 |    19 |
+---------+-------+

This tells us that there are 13 failed login attempts and 19 successful login attempts from the given IP address. Without an index on the combination of `ip_address` column (for the  SELECT clause) and `success` column (for the GROUP BY clause), the above query would be extremely inefficient. However, with the correct indexing it's almost as efficient as the original approach with the added benefit of avoiding row-level locking on updates. This design has the added benefit of being able to "prune" login attempts older than a certain date (although I should index the `timestamp` field if I were to do that).

What do you think of this approach? The `login_attempt` table will certainly grow larger with this alternative approach (storing IP addresses as integers could help with this). Even with indexing, this alternative approach will probably be slightly less efficient than the original for reads (but more efficient for writes). Regardless, the point is that you should carefully design your indexes as part of your upfront database design as it could allow for alternatives that you might not have otherwise considered.