Major news in January includes:
- the successful migration of our main services to our data center in Ashburn, Virginia;
- new features available in our mobile beta;
- progress on input methods and our upcoming translation interface;
- the announcement of GeoData, a feature to attach geo-coordinates to Wikipedia and Wikivoyage articles;
- a testing event to assess how VisualEditor handles non-Latin characters.
Note: We’re also proposing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
- 112 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits remained stable at around 650.
- About 45 shell requests were processed.
- Wikimedia Labs now hosts 155 projects and 931 users; to date 1473 instances have been created.
Work with us
Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.
- Software Engineer – Editor Engagement
- Technical Writer – (Contract)
- Software Developer – Fundraising
- Software Engineer (Partners)
- Software Engineer (Apps)
- Software Developer General (Mobile)
- Software Engineer – Multimedia
- Software Engineer (Search)
- Product Manager (Mobile)
- Director of User Experience
- Visual Designer
- Operations Engineer
- Operations Engineer/Database Administrator
- Site Reliability Engineer
- Tools Lab Operations Engineer (Contractor)
- Yuvaraj Pandian re-joined the Mobile engineering team as Software developer (announcement). He joined the newly created Mobile App team with Brion Vibber and Shankar Narayan.
- Munagala Ramanath (Ram) joined the MediaWiki core team of the Platform engineering group as Senior Software Engineer (announcement).
- Runa Bhattacharjee joined the Language Engineering team as Outreach and QA coordinator (announcement).
- The Wikimedia Foundation switched its primary data center from Tampa, Florida to Ashburn, Virginia on January 22. Given the scale and complexity of the migration, we scheduled three 8-hour windows to perform it, but we were able to complete it on the first attempt. Because the switchover involved, among other things, moving the master databases from Tampa to Ashburn, the site was set to ‘read-only’ mode for about 32 minutes. During that period, the site was available but no content could be created, edited or uploaded. As expected, the migration caused some minor fallout, mostly due to configuration changes, but the issues were quickly contained by the Engineering and Operations teams.
- With this migration, the Tampa data center becomes our fail-over site, and we plan to perform site fail-over tests every few months. A few small non-core applications, such as RT, etherpad and Bugzilla, still use Tampa as their primary site. They too will be migrated in the coming months.
- One of the main concerns of the migration was serving traffic from the new data center using empty memcached servers: the spike in load on the Apache and database servers could have been disastrous to the site. To address it, Tim Starling expanded the persistent ‘Parser Cache’ store from the single instance used in Tampa to 3 sharded instances, and Asher Feldman built and replicated the databases across the 2 data centers.
- Another improvement, done by Asher and Peter Youngmeister, was the implementation of MHA (Master High Availability) on our MySQL clusters. Its primary objective is to automate the promotion of a slave database in a master database fail-over scenario and to reduce downtime, without suffering from replication integrity problems, without prolonging database latency, and without changing existing deployments.
- Faidon Liambotis and Mark Bergsma continued to work on the Ceph file object store. With Domas Mituzas’ help, they identified a performance issue with the RAID card that caused severe read/write latency on the Ceph cluster. Faidon confirmed with the vendor that it is a known problem for which no fix is available yet. We ordered replacement RAID cards and swapped them in, and test results seem to indicate that the performance issue is solved.
- Fundraising bastion hosts were deployed in the Ashburn and Tampa data centers. We also tweaked and tuned central logging and monitoring, and converted the remaining fundraising MyISAM tables to InnoDB, which should fix dump-induced replication lag.
- This month, we had a look at the process of using the XML dumps to create a local copy of a Wikimedia site: it turned out to be painful and cumbersome at best, and unfathomable for the end-user at worst. As part of an attempt to improve this situation, there is now a new experimental tool available for *nix platforms, for generating MySQL tables from the XML stub and page content files. It is intended to read input files from various versions of MediaWiki and generate output for the version the user wants. Testing and feedback are encouraged.
- In January, we made a number of performance and usability improvements. Three compute nodes were added to the pmtpa zone. Alex Monk added Echo notification support to labsconsole, passwordless sudo is now the default for projects, and shell requests are created automatically on account creation. The sysadmin and netadmin roles have been combined into a single projectadmin role. GlusterFS was upgraded to fix a memory leak, but the upgrade unfortunately introduced a new bug that caused some instability in project storage. Work is ongoing to improve the project storage situation.
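The idea of splitting the parser cache store across 3 sharded instances, mentioned in the migration notes above, can be illustrated with a minimal sketch. This is not the MediaWiki implementation: the shard names, the `shard_for_key` helper and the choice of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib

# Hypothetical shard names; the report only says there are 3 sharded instances.
SHARDS = ("pc1", "pc2", "pc3")


def shard_for_key(key: str, shards=SHARDS) -> str:
    """Pick a shard for a cache key by hashing the key.

    The same key always maps to the same shard, so a lookup goes to the
    instance that stored the entry, and load is spread across all shards.
    """
    checksum = zlib.crc32(key.encode("utf-8"))
    return shards[checksum % len(shards)]


if __name__ == "__main__":
    # Illustrative keys only; real parser cache keys are not given in the report.
    for key in ("page:1", "page:2", "page:3", "page:4"):
        print(key, "->", shard_for_key(key))
```

A simple modulo scheme like this remaps most keys when the number of shards changes; schemes such as consistent hashing reduce that effect, at the cost of more bookkeeping.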
Editor retention: Editing tools
Editor engagement features
Editor engagement experiments
First up, we launched guided tours on the English Wikipedia, including a test tour to demonstrate the capabilities of the extension, and a tour associated with the “onboarding new Wikipedians” (aka GettingStarted) project. In addition to tours created by the team, the extension supports community-created tours. Note that unlike many other projects by the E3 team, guided tours are planned as a permanent addition to Wikipedia, with each tour implementation considered to be experimental. (For example: the “getting started” tour will be delivered via a split A/B test.)
While building guided tours, the team also A/B tested the Getting Started landing page and task list, measuring their effect on new contributions. Several rounds of analysis were completed and published on Meta (round 1, round 2), with the conclusion that the onboarding experience leads to small but statistically significant increases in the number of new English Wikipedians who attempt to edit and who save their first edit. In addition to measuring the effects of the guided tour associated with this project, the immediate plans are to redesign the landing page and add more task types, to entice more new contributors.
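A split A/B test of this kind needs each new user to land in the same group on every visit. One common way to do that, sketched below, is to hash the user ID together with an experiment name. This is an illustration under stated assumptions, not the E3 team's code: the `assign_bucket` function, the experiment name and the bucket labels are hypothetical.

```python
import hashlib


def assign_bucket(user_id: int, experiment: str,
                  buckets=("control", "test")) -> str:
    """Deterministically assign a user to one bucket of an experiment.

    Hashing the experiment name together with the user ID gives a stable,
    roughly even split, and keeps assignments in different experiments
    independent of each other.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(buckets)
    return buckets[index]


if __name__ == "__main__":
    # Hypothetical experiment name and user IDs.
    for uid in (101, 102, 103, 104):
        print(uid, assign_bucket(uid, "gettingstarted-tour"))
```

Because the assignment depends only on the user ID and the experiment name, no per-user state has to be stored to keep the groups stable across sessions.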
Work also continued on refining the reliability and precision of the data collected from EventLogging. In particular, we migrated EventLogging to a dedicated database, and began collecting server-side events in addition to client-side, to support work such as measuring account creations on desktop and mobile. January also saw the heavy use of the new User Metrics API, in order to complete cohort analysis of onboarding users and for metrics reported at the Board presentation on the Foundation’s year-to-date progress. Development of the API continues, and a public announcement is expected for early March. Last but not least, a call was put out for a part-time Technical Writer to work on documenting both of these pieces of infrastructure.
Incremental architectural improvements
Security auditing and response
Engineering community team
Volunteer coordination and outreach
January 17th. He prepared an intro to MediaWiki & Wikimedia tech contributions, designed to be reused by other presenters, which he tested at FOSDEM. Last, we confirmed that technical projects are eligible for Individual Engagement Grants.
The Kiwix project is funded and executed by Wikimedia CH.
- We have adapted the kiwix-plug script to Tonidoplug2, a device cheaper than the Dreamplug. Kiwix was elected by SourceForge users as February’s Project of the Month, and an interview with Emmanuel Engelhart was published. In January, Kiwix reached 100,000 downloads in a month for the first time.
- Besides Kiwix, the openZIM website was revamped and simplified for better readability. The openZIM bug tracker and source code management were migrated to the Wikimedia infrastructure (Bugzilla and Git).
The Wikidata project is funded and executed by Wikimedia Deutschland.
- January has been an exciting month for Wikidata. The deployment on the first Wikipedia sites (Hungarian, Hebrew and Italian) was completed. At the same time, work has continued on the user interface and back-end for statements, the core part of Wikidata’s second phase. This will enable users to enter information like the children of a given person or a link to their portrait on Wikimedia Commons. These features can already be tested on the demo system. We’ve also worked on making AbuseFilter work with Wikidata, and wrote a new mechanism to distribute changes to the clients (Wikipedia) so they can show Wikidata changes in their RecentChanges. We made progress on using Solr for search and rewrote the draft for the inclusion syntax to be much simpler. This is the syntax that editors will use to include data from Wikidata in Wikipedia. A manual for using Pywikipedia on Wikidata was written as well.
- If you want to code on Wikibase, the software powering Wikidata, have a look at the outstanding bugs and tasks.
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.
This article was written collaboratively by Wikimedia engineers and managers. See revision history and associated status pages. A wiki version is also available.