How RIPE Atlas Helped Wikipedia Users

Translate this post

This post by Emile Aben is cross-posted from RIPE Labs, a blog maintained by the Réseaux IP Européens Network Coordination Centre (RIPE NCC). In addition to being the Regional Internet Registry for Europe, the Middle East and parts of Central Asia, the RIPE NCC also operates RIPE Atlas, a global measurement network that collects data on Internet connectivity and reachability to assess the state of the Internet in real time. Wikimedia engineer Faidon Liambotis recently collaborated with the RIPE NCC on a project to measure the delivery of Wikimedia sites to users in Asia and elsewhere using our current infrastructure. Together, they identified ways to decrease latency and improve performance for users around the world. 

During RIPE 67, Faidon Liambotis (Principal Operations Engineer at the Wikimedia Foundation) and I got into a hallway conversation. Long story short: We figured we could do something with RIPE Atlas to decrease latency for users visiting Wikipedia and other Wikimedia sites.
At that time, Wikimedia had two locations active (Ashburn and Amsterdam), and was preparing a third (San Francisco), to better serve users in Oceania, South Asia, and US/Canada west coast regions. We were wondering about the effects on network latency for users world-wide for this third location and Wikimedia wanted to quantify the effect turning up this location would have.
Wikimedia runs their own Content Delivery Network (CDN), mostly for privacy & cost reasons. Like most CDNs, to geographically balance the traffic to their various points of presence (PoPs), they employ a technique called GeoDNS: a user will, based on the DNS request that is made on their behalf from their DNS resolver, be specifically directed to one of the data centers based on their or their resolver’s IP address. This requires the authoritative DNS servers for Wikimedia sites to know where to best direct the user to. Wikimedia uses gdnsd for authoritative DNS to dynamically respond to those queries based on a region-to-datacenter map.
Some call this ‘stupid DNS tricks‘, others find it useful to decrease latency towards websites. Wikimedia is in the latter group, and we used RIPE Atlas to see how this method performs.
One specific question we wanted answered is where to “split Asia” between the San Francisco and the Amsterdam Wikimedia location. Latency is obviously a function of physical distance, but also the choice of upstream networks. As an example, these choices determine if packets to “other side of the world” destinations tend to be routed clockwise or counter-clockwise.
We scheduled latency measurements from all RIPE Atlas probes towards the three Wikimedia locations we wanted to look at, and visualised what datacenter showed the lowest latency for each probe. You can see the results in Figure 1 below.

Screenshot of latency map. Probes are colored based on the datacenter that shows the lowest measured latency for this particular probe
Figure 1: Screenshot of latency map. Probes are colored based on the datacenter that shows the lowest measured latency for this particular probe.

This latency map shows the locations of RIPE Atlas probes, coloured by what Wikimedia data center has the lowest latency measured from that probe:

  • Orange: the Amsterdam PoP has the lowest latency
  • Green: the Ashburn PoP has the lowest latency
  • Blue: the San Francisco PoP has the lowest latency.

Probes where the lowest latency is over 150ms have a red outline. An interactive version of this map is available here. Note that this is a prototype to show the potential of this approach, so it is a little rough around the edges.
Probes located in India clearly have lower latency towards Amsterdam. Probes in China, South Korea, the Philippines, Malaysia and Singapore showed lower latency towards San Francisco. For other locations in South-East Asia the situation was less clear, but that is also useful information to have, because it shows that directing users to either the Amsterdam or the San Francisco data center seems equally good (or bad). It is also interesting to note that all of Russia, including the two most eastern probes in Vladivostok have lowest latency towards Amsterdam. For the Vladivostok probes Amsterdam and San Francisco are almost the same distance, give or take 100 km. Nearby probes in China, South Korea and Japan have lowest latency towards San Francisco.
There is always the question of drawing conclusions based on a low number of samples, and how representative RIPE Atlas probe hosts are for a larger population. Having some data is better then no data in these cases though, and if a region has a low number of probes that can always be fixed by deploying more probes there. If you live in an underrepresented region you can apply for a probe and make this better.
With this measurement data to back it, Wikimedia has gradually turned up Oceania, South Asian countries and US/Canada states where RIPE Atlas measurements showed minimal latency to, to be served by their San Francisco caching PoP. The geo-config that Wikimedia is running on, is publicly available here.
As for the code that created the measurements and created the latency map: This was all prototype-quality code at best, so I originally planned to find a second site where we could do this, so to see if we could generalise scripts and visualisation and then share.
At RIPE 68 there was interest in even this raw prototype code for doing things with data centers, latency and RIPE Atlas, so we ended up sharing this code privately, and have heard of progress made on that already. In the meantime we’ve put up the code that created the latency map on github. Again: it’s a prototype, but if you can’t wait for a better version, please feel free to use and improve it.

Conclusion

If you have an interesting idea, and have no time, or other things are stopping you from implementing it, please let us know! You can always chat with us at a RIPE meetingregional meeting or any other channels. We don’t have infinite time, but we can definitely try out things, especially ideas that will improve the Internet and/or improve the life of network operators.
Emile Aben

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

Can you help us translate this article?

In order for this article to reach as many people as possible we would like your help. Can you translate this article to get the message out?

5 Comments
Inline Feedbacks
View all comments

Wonderful that we’ll help this useful RIPE Atlas with anchors, and fantastic to hear we might at last have some real worldwide status check for Wikimedia sites.
Also funny to see that most of the world, from Cape Town to Iceland to Vladivostok, is closer to Europe than to USA. 😉

Hoi,
Does this map not make it abundantly clear that READERS are best served with PoP’s outside the USA. Particularly Africa will benefit a lot because the data going into Africa is throttled due to a lack of capacity. How does this affect EDITORS ??
Thanks,
GerardM

I can not speaking english I speaking french
je suis tres content de me retrouver avec wikipedia

Excellent stuff! Glad to see my Atlas probe is doing useful things 🙂

By using this map, using knowledge about population density and know about the WMF’s Global South initiative it’s clearly visible that a PoP in Asia would makes more sense than an additional data-center in Dallas. I hope that WMF can find a solution for legal issues.
To reduce latency for the most users in the world, we should also have WMFLabs infrastucture in Amsterdam. I have also a name for this new system, call it Toolserver. 😉