Anti-vandalism with masked IPs: the steps forward

Johan Jönsson works for the Wikimedia Foundation. His main task is to include the Wikimedia communities in the product development process.

The Wikimedia wikis can be edited by registered and unregistered users alike. When someone isn’t logged in to an account, instead of their user name, the history – and the recent changes feed, your watchlist and so on – will show their IP address. This is mainly for attribution: when you write on the Wikimedia wikis, the copyright still belongs to you. You just give permission for the text to be spread and changed. So we need to attribute authorship to someone: a name, a pseudonym or at least an IP address. But knowing the IP behind an edit is also a tool we use to fight the edits we don’t want to see: vandalism and harassment, spam, and those that push a specific point of view at the cost of neutrality.

Roughly a year ago, a team within the Wikimedia Foundation’s Product department started a process on IP masking – hiding the IP addresses we today show in public. Our goal was to address the problems we knew it was going to bring, and hopefully to do it without creating more work for vandal fighters than they had before we started. Recently the Wikimedia Foundation’s Legal department clarified their guidance: for legal reasons – which they can’t explain in detail due to legal privilege, the legal professional rules that control what lawyers can say about their work – this is something we have to do. We’re flexible on the how and the when, but not on the if. So that’s the reality we must deal with, and we are telling the communities about it as soon as we can.

There are other reasons for bringing up the subject, of course. The longer I work on the project, the stranger I personally find it that we publicly publish the IPs of people who are trying to help make the wiki better. I used to find it completely natural, not least since I mainly contributed without being logged in for years in the earlier days of Wikipedia. For about as long as we’ve been publishing IPs, we as a movement have had occasional debates on whether this really is what we should be doing. But these are reasons for starting a conversation. Our legal experts telling us that this is something that has to be done is reason to do it.

I think one main communications issue is that we’ve tried to let the Wikimedia contributors in as early as possible and it’s not apparent to everyone where we are in the process. OK, we say, so we have to do this: Please let us know your fears and issues and everything you want us to take into account. This is something we need to solve with the wikis and vandal fighters, so that we can mitigate as much as possible. We try to ask questions as early as possible instead of doing internal planning based on our assumptions. The Wikimedia wikis have very different cultures and needs. They don’t see the same patterns around problems like undisclosed paid editing, harassment and returning vandals. The fact that I’m intimately familiar with this work on one wiki doesn’t mean there aren’t many things we need to learn from the communities, and no single wiki is a good model for all. What works for you or me will not work everywhere else.

We try to take the conversation that normally happens in Phabricator – open, but not easily accessible for most Wikimedians – and put it on the wiki. This means that we’re a couple of steps earlier in the process than people expect us to be. Some see that we plan to mask IPs, try to figure out how this is going to work and come away with the impression oh no, they have no idea what they’re doing. They have no plan. We do have a plan. It’s just that collecting information from the communities before we plan solutions is part of it. There’s time to work this out together. We’re not throwing the switch next week. Whether we know what we’re doing remains to be seen, of course, and I’m not the one to judge.

How do we plan to mitigate problems? Partly by giving more people access to the information that we’ll now be hiding from the public. We’ve been toying with the idea of a system with three tiers. First, we’d either build a new user right or maybe even just make access to the information opt-in, as long as the user meets certain criteria. The threshold for this user right would be lower than adminship on some wikis. Admins would still need access on Wikimedia wikis where the criteria for adminship are less stringent, such as five or so users saying “sure, why not, this new person seems serious and sincere”, even though this right will give access to user data. Second, others could have access to part of the IP, to be able to see which range it belongs to. Third, the public and those with no interest in the tasks where this information is relevant would see a masked IP. Those who are involved in cross-wiki vandal fighting would need global access. We don’t intend to break the system by putting this burden on the checkusers and stewards. The details need to be hashed out with the communities.
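To make the three tiers concrete, here is a minimal sketch of what each kind of viewer might see for the same edit. Everything in it is an assumption made for illustration and not a description of what will be built: the tier names, the `display_ip` function, the /24 and /64 range sizes, and the use of a salted hash to produce the masked label are all hypothetical.

```python
import hashlib
import ipaddress
from enum import Enum


class Tier(Enum):
    # Hypothetical tier names, not actual MediaWiki user rights.
    FULL = "full"        # users with the new right or opt-in access
    PARTIAL = "partial"  # users who may only see which range an IP belongs to
    PUBLIC = "public"    # everyone else


def display_ip(ip: str, tier: Tier, salt: str = "example-salt") -> str:
    """Return what a viewer in the given tier would see for an edit's IP."""
    addr = ipaddress.ip_address(ip)
    if tier is Tier.FULL:
        return str(addr)
    if tier is Tier.PARTIAL:
        # Show only the enclosing range. The sizes are assumptions:
        # /24 for IPv4 and /64 for IPv6.
        prefix = 24 if addr.version == 4 else 64
        return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))
    # Public view: a stable label derived from a salted hash (an assumption).
    digest = hashlib.sha256((salt + str(addr)).encode("utf-8")).hexdigest()
    return "masked-" + digest[:8]


if __name__ == "__main__":
    for tier in Tier:
        print(tier.value, display_ip("192.0.2.17", tier))
```

The masked label is the same every time for the same address, so edits from one IP can still be grouped together without the address itself being shown. A plain salted hash is only for illustration here: the IPv4 address space is small enough that such a hash can be reversed by anyone who learns the salt, so a real system would need a stronger scheme.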

Partly we’re aiming to solve it by building new tools. We’re trying to make the checkusers’ and stewards’ lives easier by updating the checkuser tool and working on a tool to find potential undetected sockpuppets. We’re working on surfacing the information about what the IP address means in a way that’ll be accessible to more vandal fighters than used to be the case. We want to hear more needs and suggestions.

So we talk to people, in various places and languages, to figure out how it would affect them. It varies: a number of English Wikipedia vandal fighters have expressed concerns, while Swedish Wikipedia hasn’t, when explicitly asked. The Arabic Wikipedia discussion did not raise the same problems as the Chinese one.

Why do IP masking at all, some ask. Why not disable IP editing instead? We’re investing significant time and resources in trying to solve this because we’re convinced that turning off unregistered editing would severely harm the wikis. Benjamin Mako Hill has collected research on the subject. Another researcher told us that if we turn IP editing off, we’ll doom the wikis to a slow death: not because of the content added by the IP edits, but because of the increased threshold to start editing. We can’t do it without harming long-term recruitment. The role unregistered editing plays also varies a lot from wiki to wiki. Compare English and Japanese Wikipedia, for example. The latter wiki has a far higher percentage of IP edits, yet the revert rate for IP edits is about a third of what it is on English Wikipedia: 9.5% compared to 27.4%, defined as reverted within 48 hours. And some smaller wikis might suffer greatly even in the shorter term.

And that’s the heart of the problem: There is no available strategy without risk. Legal risk. Risk of vandalism. Risk of hurting long-term editor recruitment. So we hope to be able to work together, listen to suggestions and problems, and build around potential obstacles and mitigate concerns. Give the communities the tools they need.

This text was originally written for the English Wikipedia Signpost, at their request. It’s been slightly edited for a less wiki-specific audience.