Routing Benchmarking: Part II

Last time, I ran a simple benchmark on OSRM and GraphHopper and presented some of the results. I'd like to expand on those results in this post in more detail.


Some results on the scaling of responses were presented before in the table of response time distribution by percentile. However, this does not tell the whole story and it's interesting to take a look at a more detailed break-down of the response times.

I took the reponse data and binned them by route length, which is acting as a proxy for number of route hops only to compare between the two bits of software. Each bin is then analysed to find the maximum, minimum, median, 5% and 95% response times.

The median response times can be found in the figure below, with a best-fit line through the points (it doesn't look like a line, but that's because of the log-log scale).

What this shows is that OSRM is paying a higher fixed cost per route, 12.3 ms versus 4 ms for GraphHopper. OSRM's author, Dennis Luxen, explains that it's likely because OSRM looks up the closest point to the input latitude and longitude in an external memory (on-disk) index and, because the benchmark machine used spinning disks rather than SSDs and the lookups are to essentially random locations on disk, there's a cost associated with seeks. In a machine with SSDs this would be drastically reduced, and if you have such a machine then you should conduct your own benchmarking.

GraphHopper's lower fixed overhead is indicative that either it does not have such an external memory index (i.e: it has one in RAM), that the lower memory usage of GraphHopper allows the OS to effectively cache the index, or that GraphHopper's lookups are approximate.

The scaling, the rate at which query time increases with extra distance, is also interesting. I was expecting the result to scale with the logarithm of the distance, due to the use of contraction hierarchies. However, a linear fit was better, and scaled as 9.4 μs/km for OSRM and 11.9 μs/km for GraphHopper.

That both scaled linearly with route length suggests that the most expensive part of the query may not be the route finding. If we assume that the cost of finding the route is logarithmic in the number of hops, and that route length is a proxy for that, then we might need to look for other expenses which are linear in the number of hops or route length. One possibility is the expansion of the route result into geometry or its encoding into JSON. Both routing engines were run with the same option to return the geometry of the route, and it could be that, at large route lengths, this comes to dominate.

Even looking at the median response time does not tell the whole story, as within each result bin there is a large degree of variation. The figure below shows this as a box-and-whisker plot where the box section covers the 5% -- 95% range and the whiskers cover the full min -- max range.

Note that the x-direction offset between the two sets of results is artificial, to increase the readability of the graph. The results were binned into the same ranges for both routing engines.

The first thing which I noticed about this result is the very large maximum response times coming from GraphHopper along the middle of the graph. I don't know for sure where this is coming from, but because the 5% -- 95% box is so much lower and GraphHopper is written in Java, my first suspect would be garbage collection. This is worth keeping in mind if your application has a hard real time requirement.

Otherwise, the graph is telling much the same story as the previous one; OSRM pays a fixed overhead due to disk lookups, and the gap between the two closes as the routes get longer. The first place that the 5% -- 95% boxes overlap is above one million km.


Comments loading...