
The Site Reliability Engineering (SRE) Team is pleased to announce that we’ve launched a new status page that we’ll update during major outages affecting Wikipedia and other Wikimedia projects. You can find it at

By “major outages” we mean problems so severe that the general public or the media might notice – issues like wikis being very slow or unreachable for many users.  The status page will definitely be useful for the editor community and others directly involved in the projects, but it won’t be replacing forums for in-depth discussion like Technical Village Pumps or Phabricator – rather, it will supplement them, particularly as a place to check when the wikis are unreachable for you.

Of course, timeliness of information is really important during any large disruption to our services.  A key feature of the new status page is a set of five high-level metrics that give a look into the overall health and performance of the wiki environment.  We wanted a set of indicators that would show widespread issues as obvious deviations from normal, so that non-technical people could look at the graphs and say “ah, yes, something is wrong”.  Automatically publishing these metrics means that users can have some idea that something is wrong for everyone, not just themselves, even before SRE has had a chance to post an update.

The rate of errors served by Wikimedia, during and then just after an outage.

Wikimedia previously offered a status page, but it was difficult to read and sometimes inaccurate. The SRE team officially sunset it in 2018. We’re pleased to re-launch a status page that we think is easy for both technical and non-technical folks to interpret, and that we’re committing to keep accurate and up-to-date.

Since we didn’t want to use any of our existing hosting infrastructure for the status page – as the entire point is that it must remain accessible when our servers or network connections are broken – we’re using an externally-hosted commercial product.  Do note that the Atlassian privacy policy applies when visiting the status page.

If you’re seeking more background on the project, or are curious about technical decisions and implementation details, the project page on Wikitech is a good place to start. There’s also the Phabricator task for the project, which is a good place not only to learn more but also to offer feedback. We’ll also be checking in on our team’s Talk page from time to time, and of course we’re reachable at our usual team hangout on IRC, #wikimedia-sre on


Good to see some work towards fixing , almost ten years since giving up ( )!

However I’m surprised that WMF uses proprietary software and sends personal information of visitors to Atlassian, Google, Amazon and Fastly. How did it happen? Are there plans to fix it?

Will these figures also be used as indicators in regular reports, to see whether things get better or worse over time (KPIs, if you wish)?

It saddens me to see that WMF has chosen a proprietary solution for this. It could have been a great opportunity to outsource something like this to a chapter that could have deployed and maintained one of the many open-source status pages available.

I find it concerning (and somewhat offensive) to see the label “toy projects” being used, considering that Wikimedia is so dependent on open source and volunteers, while there is no lack of open-source solutions for status pages maintained by actual companies.

Why was this decision made behind closed doors and not in public on wiki or in public Phabricator tickets?