The Road to API 0.7

It's been years since the last OpenStreetMap API update. Partly this is a good thing: the API seems to be powerful enough for people to map the things they want to, editor support is pretty good, and there has been time for documentation and tutorials to be written.

However, there are some known problems as well.

  1. Areas aren't understood by the API. They can be modelled using ways and relations, but figuring this out isn't easy and sometimes results in weirdness, differences between implementations and confusion for users.
  2. Atomic Changesets (or, rather, lack thereof). Because hysterical raisins, changesets aren't atomic. Uploads are atomic, but there's no way to get that information later on. This information is critical to better caching and transparent proxying, which would make the API faster for everyone.
  3. XML is the only format the API understands, but JSON or PBF would make more sense nowadays. JSON for browsers, PBF for compactness. There's no good reason why, like a good REST API, it can't support multiple representations.
  4. Incorrect use of HTTP - the API steals some status codes for its own use - makes it difficult to write clients and editors.
  5. The map call isn't cacheable; it's like WMS. This means slower downloads, and makes transparent caching almost impossible. This has a huge impact on people at mapping parties.
  6. Uploads are a single, transactional API call. When, inevitably, a connection times out on a particularly large upload, there's no easy way to tell if it got committed or not. This can lead to confusion, frustration and duplication.
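To make item 3 a little more concrete, here is a minimal sketch of how a server could pick a representation from the HTTP Accept header. It is an illustration only, not how the current API or cgimap behaves: the node record, the JSON layout, and the names negotiate and render are all assumptions made up for this example, and the header parsing is deliberately simplified.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical node record; the field names are illustrative only.
NODE = {"id": 1, "lat": 51.5, "lon": -0.1, "tags": {"amenity": "pub"}}


def to_xml(node):
    """Render the node as XML in an OSM-like shape."""
    root = ET.Element("osm")
    el = ET.SubElement(root, "node", id=str(node["id"]),
                       lat=str(node["lat"]), lon=str(node["lon"]))
    for key, value in node["tags"].items():
        ET.SubElement(el, "tag", k=key, v=value)
    return ET.tostring(root, encoding="unicode")


def to_json(node):
    """Render the same node as JSON (the layout is an assumption)."""
    return json.dumps({"elements": [dict(type="node", **node)]})


RENDERERS = {"application/xml": to_xml, "application/json": to_json}
DEFAULT_TYPE = "application/xml"


def negotiate(accept):
    """Return the best supported media type, or None if nothing matches."""
    if not accept.strip():
        return DEFAULT_TYPE
    candidates = []
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((quality, fields[0]))
    # Highest q-value first; the sort is stable, so ties keep header order.
    candidates.sort(key=lambda c: c[0], reverse=True)
    for quality, media_type in candidates:
        if quality <= 0:
            continue
        if media_type in RENDERERS:
            return media_type
        if media_type in ("*/*", "application/*"):
            return DEFAULT_TYPE
    return None


def render(node, accept):
    """Return (status, content_type, body) for the given Accept header."""
    media_type = negotiate(accept)
    if media_type is None:
        return 406, None, ""  # 406 Not Acceptable
    return 200, media_type, RENDERERS[media_type](node)
```

The point of the sketch is that the resource stays the same and only the renderer changes, so adding PBF later would mean adding one more entry to the table rather than a new set of API calls.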

Additionally, there are some issues lurking beneath the surface which make it more difficult to maintain and operate the API:

  1. The API is monolithic, embedded within the website. This means that when the website is deployed, the API is deployed. It means a bug in the website can affect the API and vice-versa. It means that the API can't scale or evolve separately from the website. All of this slows down development and makes it riskier and harder to operate either.
  2. Being part of the same code, the API has become too tied to Rails. This means much of the API is either undoing things that Rails has done or suffering because Rails is doing too much. Don't get me wrong: Rails is awesome, and can really help. Just not in this instance.

What to do?

We can make something better, but it will take time. No one is supporting this work, so it will go slowly. Whatever the plan, it must support gradual replacement of parts and incremental upgrades so that each bit of work is a manageable size.

This is just a plan; it's not official. In the absence of any better plan - if you have one, let's talk! - it's the one I'm following.

  1. Add OAuth support to cgimap.
  2. Get full read-only API coverage in cgimap.
  3. Get full write API coverage in cgimap.
  4. Finish cgimap-ruby, and replace the API controllers with calls to it.
  5. Make the website use the API.
  6. Create a user service, and use it in cgimap & website.
  7. Split the database into website bits and API bits.

NOTE: cgimap is an "acceleration layer" that's been used alongside Rails for many years to speed up some API calls. As such it's currently a partial re-implementation of the API, but not in Ruby.

By the end of this process, the fates of the website and API are only loosely coupled. This means they can both evolve, relatively independently. It means they can scale relatively independently. It means they can move and adapt more quickly.

The way I've described each step is quite terse, and might not make a lot of sense, so here's a bit more detail on each of those steps.

OAuth support

I wrote about this in a previous post. The short version is that not only is OAuth the basis for supporting more than read-only calls in cgimap, but also it allows us to do more appropriate rate limiting for people at mapping parties and HOT events who are all editing from behind a single firewall.
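To show why knowing the user matters for rate limiting, here is a minimal token-bucket sketch keyed by user rather than by IP address. It is not cgimap's actual limiter; the class, the limiter_key helper, and the capacity and refill numbers are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float   # tokens currently available
    updated: float  # time of the last refill, in seconds


class RateLimiter:
    """Token bucket per key; the caller supplies the current time."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._buckets = {}

    def allow(self, key, now, cost=1.0):
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = Bucket(tokens=self.capacity, updated=now)
            self._buckets[key] = bucket
        # Refill for the time elapsed since the last request, up to capacity.
        elapsed = max(0.0, now - bucket.updated)
        bucket.tokens = min(self.capacity,
                            bucket.tokens + elapsed * self.refill_per_second)
        bucket.updated = now
        if bucket.tokens >= cost:
            bucket.tokens -= cost
            return True
        return False


def limiter_key(client_ip, oauth_user_id=None):
    """Prefer the authenticated user; fall back to the IP for anonymous calls."""
    if oauth_user_id is not None:
        return f"user:{oauth_user_id}"
    return f"ip:{client_ip}"
```

With a key like this, two mappers sharing one firewall address each get their own bucket, so one person's heavy download no longer uses up the allowance of everyone else in the room.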

Full read-only support

At the moment, cgimap supports only a few of the available API calls, and none of the ones which aren't 'geographic' such as user details. The support needs to be added here, so that we can start routing all read-only API calls to cgimap and get some confidence that they're all implemented correctly.

There's nothing terribly complicated to do here: It's a matter of looking at the Rails code to see what it's doing, writing tests for cgimap to try and cover all the different conditions, then writing the code.
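One way such tests could check that the two implementations agree is to compare their responses structurally rather than byte for byte. The sketch below is an assumption about how that might be done, not an existing cgimap test helper: it treats attribute order and indentation as irrelevant, but keeps child order significant.

```python
import xml.etree.ElementTree as ET


def canonical(element):
    """Reduce an element to a comparable tuple, ignoring attribute order
    and surrounding whitespace in text."""
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        (element.text or "").strip(),
        tuple(canonical(child) for child in element),
    )


def same_response(xml_a, xml_b):
    """True if both XML documents describe the same tree."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))


# Made-up example responses from two implementations of the same call.
rails_body = '<osm version="0.6"><node id="1" lat="1.0" lon="2.0"/></osm>'
cgimap_body = '<osm version="0.6">\n  <node lon="2.0" id="1" lat="1.0" />\n</osm>'
print(same_response(rails_body, cgimap_body))
```

Child order is kept significant because it carries meaning in places such as the node list of a way; a real comparison might want to relax it for tags.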

If you want to get involved, this is a great place to start. Please get in touch - I'd be very happy to help you with whatever info you need to get started.

Full write support

A read-only API is all very good for taking the load off Rails and spreading it over database replicas, but it's only half of the answer. The other half is being able to support writing new data to the API.

This is going to require some major changes to how cgimap works internally, but could bring some major improvements to upload latency which, at the moment, can sometimes feel quite slow.

Finish cgimap-ruby

Until this point, cgimap and the website will be completely separate projects. But this means we have two different implementations of the API, which makes it twice as much work to change anything.

Cgimap-ruby is a project to embed cgimap inside the website. At first this would seem like a strange thing to do, but what it means is that anyone can continue to download and work on the website on their local computer without needing to also run cgimap separately. It's already difficult to get started developing the website, and we don't want to make it any harder.

Make the website use the API

At the moment, the website directly accesses the same bits of data as the API, via Rails, from the database. This means the website and API are tightly coupled and it's impossible to change one without changing the other.

It also means it's not simple to implement any sort of caching or replication to make the website faster. This is because the read-write connections that the API uses are indistinguishable from the read-only connections to Rails' internals.
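Once the website reads and writes through HTTP API calls, the difference between the two kinds of access becomes visible in the request itself. As a rough sketch, assuming a single primary database and a list of read-only replicas (the backend names and the Router class are invented for this example), routing could look like this:

```python
import itertools

# HTTP methods that must not change server state.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


class Router:
    """Send safe requests to replicas in turn; everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the replicas, if there are any.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def backend_for(self, method):
        if method.upper() in SAFE_METHODS and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = Router("primary", ["replica-1", "replica-2"])
print(router.backend_for("GET"))
print(router.backend_for("PUT"))
```

The same split is what makes an HTTP cache in front of the read-only calls possible. This sketch ignores replication lag, which a real deployment would have to account for when a user reads back something they have just written.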

This step might be quite difficult, as there are a number of places in the website where it directly interacts with database objects, for example in the "browse" pages. In some cases, the website is using functionality or data which isn't currently available in any API, for example the next and previous objects for pagination.

Create and use a user service

Even after the website stops using API functions internally, there's still an overlap between the data needed by the website and the API: user information. The common subset between the two is quite small, and it's cacheable data which changes infrequently. This points to extracting the common set as its own "microservice".

Extracting the existing API into a microservice would probably be quite easy. The tough part might be getting it to work in a way that keeps it easy to develop against locally. It might be necessary to keep a stub implementation in the website, or figure out how to run it as a parallel "helper" process, managed by Rails.
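To illustrate the stub idea, here is a minimal sketch of a user service interface with an in-memory stub for local development and a small caching wrapper. None of these names exist in the website or cgimap; the User fields, the class names, and the time-to-live value are assumptions for the example.

```python
import time
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:
    id: int
    display_name: str


class UserService(Protocol):
    """What the website and API would need from any user service."""

    def get_user(self, user_id: int) -> Optional[User]: ...


class StubUserService:
    """In-memory stand-in, so local development needs no extra process."""

    def __init__(self, users):
        self._users = {user.id: user for user in users}
        self.calls = 0  # counts lookups, handy when testing the cache

    def get_user(self, user_id):
        self.calls += 1
        return self._users.get(user_id)


class CachingUserService:
    """Wraps any UserService and remembers answers for ttl_seconds."""

    def __init__(self, backend, ttl_seconds, clock=time.monotonic):
        self._backend = backend
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # user_id -> (time fetched, result)

    def get_user(self, user_id):
        now = self._clock()
        hit = self._cache.get(user_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        user = self._backend.get_user(user_id)
        self._cache[user_id] = (now, user)
        return user
```

Because callers only depend on get_user, the stub could later be swapped for a client that talks to a separate service without touching the code that uses it.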

Split the database

This is the final step in the process, and the only one which would require significant downtime. At the end of this step, we would have three independent databases, each able to be scaled, managed and evolved separately: one for the website (diary entries, messages, and so forth), one for users, and the largest for the API. I have avoided talking about GPS traces, as they're a candidate for yet another separate database.

That seems really complicated

In an ideal world, we'd just write a new system from scratch. However, in the real world there are very good reasons for making this a gradual change, done in many small, low-risk steps.

  1. Rewrites always miss something. There will be some "legacy" part of the old system which is overlooked or deliberately left out which turns out to be important. Often important enough to set the rewrite back significantly.
  2. The server is less than half of the system. By making a huge change in the server in one step, it would place the burden of catching up onto everyone who develops editors, QA tools and other 3rd party software which talks to the API.
  3. All software projects are at risk of "feature creep": new features, which seem like a good idea at the time, get added during the project and drastically slow down or divert effort away from getting the original features finished. Structuring the API changes as a series of smaller projects, each with smaller scope, helps manage feature creep.
  4. Doing something incrementally means it's possible to do it slowly. As each stage is finished and brings tangible benefits, it is "saved" and locked in. This means that even if the development stalls or slows, forward progress is still possible. In large "rewrite all the things" projects, the whole monolithic project can be at risk if it stalls or people move away to other things.

It's a bit like renovating a house: If you do it all in one go, it might get done quicker, but you can't live in it while every room is torn apart and rebuilt. If you live somewhere else, and the renovation stalls, it's at risk of dragging on and on, or even never finishing.

Renovating room by room is less fun, more frustrating, and it can drag on too. But each room locks in forward progress towards getting it all done.

If you are interested in renovating the API then please take a look at the cgimap and cgimap-ruby repositories. Come talk to me ('zere') on IRC #osm-dev or leave a message in the comments below. Let's get this renovation underway!
