OAuth support in cgimap

Recently, I've been hacking on adding OAuth support to cgimap. This seems like a fairly technical detail, but it's a key part of making important improvements that'll really help people editing, especially at mapping parties.

To explain why adding OAuth support to cgimap might help people at mapping parties, first we have to take a little detour through what OAuth actually is and one of the main mechanisms that's used to protect the OSM APIs, rate-limiting.

What's OAuth, then?

OAuth is a standard protocol used by the OpenStreetMap API so that you can authorise your editor to make edits on your behalf, or delegate capabilities to a third-party tool or website, without having to tell it your password. This makes it more secure for everyone, as you don't have to worry about the tool remembering your password or passing it on to someone you don't trust; it gets only the privileges you grant it, and you can revoke those at any time through the OSM website.

Generally, the parts of the OSM API which require the use of the OAuth protocol are the bits which write to the database, for example when you're uploading your edits or GPS traces to OSM. And, maybe, one day cgimap will be able to do those things. But in the near future we can use it to do other very useful things too.

What's rate-limiting?

Whenever there's a large, public API, such as OpenStreetMap's, there will be some people who use more than their fair share of the resources:

  • They scrape the live API instead of using an offline dump or extract,
  • They hammer the API for the same elements again and again instead of using the diff stream for updates,
  • In extreme cases they sell apps or commercial services which just re-package OSM's tiles or APIs.

If these resources weren't protected and kept available for editors, the APIs and tiles which editors rely on would quickly become unusably slow. Some might say that ship has already sailed.

The most effective technique we have at the moment for allocating the editing API's resources fairly is rate-limiting clients: returning an error saying "You've downloaded too much!" if they've downloaded too much in the last few minutes.

Rate-limiting means we have to be able to tell one client from another, and the way we do this at the moment is by looking at their IP address. We need some attribute that's hard to fake, otherwise it's not effective as a way of protecting resources for editors' use. Unfortunately this has a downside: if many people share the same IP address, all of their usage is counted together, and they get blocked much sooner than any one of them would on their own.
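To make that concrete, here's a rough sketch of per-address rate-limiting in Python. It is not cgimap's actual code: the ByteRateLimiter class, the byte budget and the drain rate are all made up for illustration, and it assumes a simple leaky-bucket scheme in which each client's downloaded bytes drain away over time.

```python
import time


class ByteRateLimiter:
    """Leaky-bucket sketch: each key accumulates downloaded bytes,
    which drain away again at a fixed rate. Hypothetical, not cgimap's code."""

    def __init__(self, max_bytes, drain_per_second, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.drain_per_second = drain_per_second
        self.clock = clock
        self.buckets = {}  # key -> (bytes in bucket, time of last update)

    def _level(self, key, now):
        # Current bucket level after draining since the last update.
        level, last = self.buckets.get(key, (0.0, now))
        return max(0.0, level - (now - last) * self.drain_per_second)

    def allow(self, key):
        """True if the client identified by `key` is still under its budget."""
        return self._level(key, self.clock()) < self.max_bytes

    def record(self, key, num_bytes):
        """Add the size of a served response to the client's bucket."""
        now = self.clock()
        self.buckets[key] = (self._level(key, now) + num_bytes, now)
```

Because the bucket is keyed on the address alone, two mappers behind the same connection fill the same bucket, and the second one can be refused for downloads the first one made.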

It's generally quite rare for many people to share the same IP address, but it can happen inside large corporate networks or when people share a common internet connection. The latter, sadly, happens more often than we'd like at mapping parties and HOT activations.

How we can use OAuth for more intelligent rate-limiting

The good news is that so many people have gathered together to edit the map! This gives us an alternative way to do rate-limiting: because each of these people is editing, they have an OSM user account, and it's simple for their editor to prove to the API which user this is by using OAuth. The API can then count usage per user rather than per IP address, which means that editors could transparently switch to authenticated requests when they detect that they would be rate-limited. And everyone can do the editing they wanted to, without dealing with spurious errors just because they gathered together in one place.
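As a sketch of the idea (again, an illustration rather than the real implementation), the only thing that has to change is the key that usage is counted against. The rate_limit_key function and the "user:" / "addr:" prefixes are hypothetical names, and the user ID is assumed to come from a request whose OAuth signature has already been checked.

```python
from collections import Counter


def rate_limit_key(ip_address, user_id=None):
    """Pick the identity that usage is counted against.

    user_id is assumed to be the OSM user ID recovered from a request whose
    OAuth signature has already been verified; None means anonymous.
    """
    if user_id is not None:
        return f"user:{user_id}"
    return f"addr:{ip_address}"


# Four downloads from one shared mapping-party connection:
# (IP address, verified user ID or None, bytes downloaded)
party_requests = [
    ("203.0.113.7", None, 500),
    ("203.0.113.7", 101, 400),
    ("203.0.113.7", 102, 300),
    ("203.0.113.7", None, 200),
]

usage = Counter()
for ip, user_id, num_bytes in party_requests:
    usage[rate_limit_key(ip, user_id)] += num_bytes
```

Only the two anonymous downloads end up counted against the shared address; each authenticated mapper gets a budget of their own.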

The state of the code

The code on the OAuth branch is currently passing some very basic unit tests covering how to build up the various internal parameters that OAuth needs. These have to be tested really, really well or we'll end up finding strange corner-case failures while people are using the code, which wouldn't be good...
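To give a flavour of what those internal parameters look like, here's a Python sketch of one of them, the signature base string, plus the signature computed from it. It assumes OAuth 1.0 with HMAC-SHA1 signatures as described in RFC 5849, which this post doesn't spell out, and it is not the code on the branch; the function names are made up, and the URL is assumed to be already normalised with its query string removed.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(text):
    # RFC 5849 leaves only letters, digits, '-', '.', '_' and '~' unescaped.
    return quote(text, safe="")


def signature_base_string(method, url, params):
    """Build the string that gets signed.

    `params` is a list of (name, value) pairs from the query string and the
    OAuth header, with oauth_signature itself left out.
    """
    # Encode first, then sort by name and value.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params)
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(base_string, consumer_secret, token_secret):
    """HMAC-SHA1 over the base string, keyed with both secrets, base64-encoded."""
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(key.encode("ascii"), base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def signature_is_valid(supplied, base_string, consumer_secret, token_secret):
    expected = sign(base_string, consumer_secret, token_secret)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

Every step here (the encoding, the sort order, the double encoding of the parameter string) is a place where a small slip gives a signature that almost always works and occasionally doesn't, which is why the tests matter so much.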

The following stuff still needs doing:

  • More test cases! It's important to get crypto stuff right, and try to cover all the corner cases. Otherwise someone will discover one of those cases when they're trying to upload.
  • Finish off the signature-checking code.
  • A way to check and store nonces, an essential part of the algorithm used to prevent replay attacks.
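For the nonce item, the idea can be sketched in a few lines of Python. This is only an in-memory illustration under assumed names (NonceStore, check_and_store) and an assumed five-minute window; the real thing would need shared, persistent storage. A request is accepted only if its timestamp is recent and the same combination of client, token, timestamp and nonce hasn't been seen before.

```python
import time


class NonceStore:
    """In-memory sketch of replay protection; hypothetical, not cgimap's code."""

    def __init__(self, max_age_seconds=300, clock=time.time):
        self.max_age = max_age_seconds
        self.clock = clock
        self.seen = {}  # (consumer_key, token, timestamp, nonce) -> timestamp

    def check_and_store(self, consumer_key, token, timestamp, nonce):
        """Return True and remember the nonce if the request is fresh,
        False if it is too old, too far in the future, or a replay."""
        now = self.clock()
        # Forget nonces whose timestamps would now be rejected anyway.
        self.seen = {k: ts for k, ts in self.seen.items() if now - ts <= self.max_age}
        if abs(now - timestamp) > self.max_age:
            return False
        key = (consumer_key, token, timestamp, nonce)
        if key in self.seen:
            return False
        self.seen[key] = timestamp
        return True
```

Rejecting stale timestamps is what lets old nonces be thrown away: once a timestamp is outside the window, a replay of that request fails the age check, so the nonce no longer needs to be remembered.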

Once all that's done, we can put this up on the development server and test it, and hopefully convince editor developers that it's a great addition to their editor!
