Wikimedia engineering report, May 2014

Major news in May include:

Note: We’re also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.

Engineering metrics in May:

  • 154 unique committers contributed patchsets of code to MediaWiki.
  • The total number of unresolved commits went from around 1305 to about 1440.
  • About 15 shell requests were processed.

Personnel

Work with us

Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.

Announcements

  • Filippo Giunchedi joined the Operations team as Operations Engineer (announcement).
  • Rob Moen moved from the VisualEditor team to the Growth team (announcement).
  • Bernd Sitzmann joined the Mobile App Team as Software Developer (announcement).
  • Mukunda Modell joined the new Platform Engineering team as Release Engineer (announcement).
  • Alex Monk joined the Features team as a contractor (announcement).
  • Abigail Ripstra joined the User Experience team as User Researcher Lead (announcement).
  • Rachel diCerbo joined the Wikimedia Foundation as Director of Community Engagement (Product) (announcement).
  • Dan Duvall joined the Release and QA team as Automation Engineer (announcement).

Technical Operations

New Dallas data center

Planning for the new Dallas data center continued in May, and basic data infrastructure components such as racks, PDUs, network equipment and various supplies have been ordered. About 140 servers have been sent from our Tampa data center, to be installed in Dallas. Racking and configuration work in our Dallas data center will commence in June.

Labs metrics in May:

  • Number of projects: 162
  • Number of instances: 392
  • Amount of RAM in use (in MBs): 1,618,432
  • Amount of allocated storage (in GBs): 17,860
  • Number of virtual CPUs in use: 795
  • Number of users: 3,259

Wikimedia Labs

Labs has been upgraded to use Puppet version 3; Ubuntu Trusty (14.04) is now available for instances, and Tool Labs now features 787 tools from 508 maintainers.

Features Engineering

Editor retention: Editing tools

VisualEditor

In May, the VisualEditor team worked on the performance and stability of the editor, rolled out a major new feature to help users better edit articles, and made other features easier to use and understand, fixing 75 bugs and tickets. The new citation editor is now available to all VisualEditor users on the English, Polish, and Czech Wikipedias, with instructions on how to enable it on other wikis. The citation and template dialogs were simplified to avoid technical language and some outcomes that were unexpected for users. As part of this, the citation icons were replaced with a new, clearer set, and the template hinting system now lets wikis mark template parameters as “suggested”, as a step below the existing “required” state. The formula editor is now available to all VisualEditor users, and a new Beta Feature, a tool that lets you set the language of content, was made available for testing and feedback. Following a new round of user testing, the toolbar was tweaked: the list and indent buttons were moved to a drop-down to make them less prominent, and the gallery button, which was rarely used and confused users, was removed. The mobile version of VisualEditor, currently available for alpha testers, gained the new citation editor and received significant performance improvements, especially for long or complex pages. Work continued on making VisualEditor more performant and reliable, and key tasks like keyboard accessibility have progressed. The deployed version of the code was updated five times in the regular release cycle (1.24-wmf3, 1.24-wmf4, 1.24-wmf5, 1.24-wmf6 and 1.24-wmf7).
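
The “suggested” state described above is a per-parameter flag in a template's TemplateData description, alongside the existing “required” flag. The following is a minimal Python sketch of how a dialog might group parameters by those flags. The citation template, its parameter names, and the `group_parameters` helper are hypothetical and made up for illustration; only the `required` and `suggested` keys come from TemplateData, and this is not VisualEditor's actual code.

```python
# Simplified, hypothetical TemplateData-style description of a citation
# template. Only the "required" and "suggested" keys mirror TemplateData;
# the template and its parameters are invented for this example.
template_data = {
    "description": "Cite a web page (illustrative only)",
    "params": {
        "url": {"label": "URL", "required": True},
        "title": {"label": "Title", "required": True},
        "author": {"label": "Author", "suggested": True},
        "accessdate": {"label": "Access date", "suggested": True},
        "archiveurl": {"label": "Archive URL"},
    },
}


def group_parameters(data):
    """Split parameter names into required, suggested, and optional groups."""
    groups = {"required": [], "suggested": [], "optional": []}
    for name, spec in data["params"].items():
        if spec.get("required"):
            groups["required"].append(name)
        elif spec.get("suggested"):
            groups["suggested"].append(name)
        else:
            groups["optional"].append(name)
    return groups


groups = group_parameters(template_data)
print(groups["required"])   # parameters an editor must fill in
print(groups["suggested"])  # parameters a dialog might offer up front
print(groups["optional"])   # everything else
```

Treating “suggested” as a middle tier lets a wiki nudge editors toward useful fields without making them mandatory.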

Parsoid

In May, the Parsoid team continued with ongoing bug fixes and bi-weekly deployments. Besides the user-facing bug fixes, we also improved our tracing support (to aid debugging), and made some performance improvements. We also finished implementing support for HTML/visual editing of transclusion parameters. This is not yet enabled in production while we finish up any additional performance tweaks on it.
GSoC 2014 also kicked off in May; we have one student working on a wikilint project to detect broken/bad wikitext in wiki pages.
We also started planning and charting goals for 2014/2015.

Core Features

Flow

In May, the Flow team prepared the new front-end redesign for expected release in mid-June. We completed work on sorting topics on a board by most recent activity, also for mid-June release. We changed hidden post handling so that everyone can see hidden posts, including anonymous users.
Back-end improvements include optimizations on UUID handling and standardized URL generation. We also merged Special:Flow for release; it’s a community-created improvement that makes it easier to create redirects to Flow boards. We also made no-JS fixes for topic submission and replies.
Bug fixes include: Firefox errors, WhatLinksHere fixes, special characters in topic titles, topic creation on empty boards, curr and prev links in board history for topic summaries, and cross-wiki issues with user name lookup.

Growth

Growth

Growth team presentation slides from the monthly Metrics meeting

This month the Growth team launched its A/B test of two methods for asking anonymous editors to sign up on English, German, French, and Italian Wikipedias. Full analysis of the test results is expected in June, though preliminary data strongly suggests a positive impact on new registrations. We finished the mw.cookie module, assisted by Timo Tijhof. Matt and Aaron participated in the Zürich hackathon. Last but not least, Growth released two smaller enhancements to our data collection regarding article creation, including adding page identifiers to MediaWiki core deletion logs and tracking page restorations across all wikis.

Support

Wikipedia Education Program

This month we fixed bugs and made some improvements to the Education Program extension. The biggest change was Sage Ross’s addition of an API for listing students enrolled in courses. Also, students from Facebook Open Academy worked on a new notification and a new activity feed.

Mobile

Wikimedia Apps

This month, the Apps team worked on a series of navigation improvements to the iOS and Android alpha apps, focusing on the UI for searching, saving and sharing pages, and navigating to the table of contents. We also worked on restyling the global navigation menu and article content—typography, color, and spacing—to create a standardized experience across the mobile web and apps. In preparation for the launch of the Android app in June, we tackled a number of user-reported crashing bugs to ensure a more stable and reliable experience for our users.

Mobile web projects

This month the Mobile Web team continued to build out the basic features of VisualEditor for tablet users, providing the ability to add references via VisualEditor. We hope to finish refining the add and modify references workflow in preparation for graduating VE for tablets to the stable mobile site sometime in July. On the reader features side, we’ve pushed a number of tablet-related styling improvements (typography, spacing, and Table of Contents) to the stable mobile site. This should greatly improve the reading experience for tablet users who are already accessing the mobile version of our projects, and it is one of the last pieces of work we planned to get done before we begin redirecting all tablet users to the mobile site mid-June.

Wikipedia Zero

During the last month, the team restored IP address dynamic updates for Wikipedia Zero partner configurations, advanced refactoring of ZeroRatedMobileAccess into multiple extensions, added support for graceful image quality reduction (roll-out for Wikipedia Zero will be carefully approached), fixed an HTTPS-to-HTTP redirection bug, and worked on an RFC for GIFification of banners instead of ESI. We also added MCC/MNC sampling and necessary library support to the reboots of the Wikipedia apps, cut an alpha Android build, performed limited app code review, added support for Nokia (now MS Mobile) proxies (next step is to add zero-rating with operators who have a Nokia proxy arrangement), diagnosed configuration retrieval issues, simplified automation tests down to just banner presence checks, ran SMS usage analysis and a one-off operator pageview analysis, and started work with the Design team on the final polish for the Wikipedia Zero experience in the forthcoming apps. Additionally, the team started its annual engineering goal setting for the upcoming fiscal year. Routine pre- and post-launch configuration changes were made to support operator zero-rating, with routine technical assistance provided to operators and the partner management team to help add zero-rating and address anomalies.

Wikipedia Zero (partnerships)

In May we launched Wikipedia Zero with Ncell in Nepal, Sky Mobile (Beeline) in Kyrgyzstan and Airtel in Nigeria. We also added Opera Mini zero-rating with Umniah in Jordan. We served roughly 67 million free page views in May across 30 partners in 28 countries. Adele Vrana attended the Wikipedia Education Hackathon in Jordan, where she collaborated with community members from Egypt, Saudi Arabia, Yemen and Jordan. While there, she visited Umniah, our local operator partner. Adele also went to Brazil, where she met with prospective partners. We kicked off the carrier portal UX design with Noble Studios.

Language Engineering

Milkshake

The Sanskrit keyboard was updated according to user requests. CLDRPluralParser was relicensed under the MIT license for possible reuse in upstream jQuery libraries.

Language Engineering Communications and Outreach

Niklas Laxström participated in the Zürich hackathon.

Content translation

Most of the team met in Valencia to complete the ContentTranslation architecture and roadmap. The dictionary feature is now up for limited testing.

Platform Engineering

MediaWiki Core

Site performance and architecture

Aaron Schulz has been reviewing the Petition extension for deployment to the cluster, working with Peter Coombe to improve its performance. In addition, the reliability and speed of media uploads was increased by removing many failure cases on Commons. There were many other minor fixes over the course of the month.

HHVM

HHVM is running on a test machine (“osmium”) in our production cluster. Most of Tim Starling’s work on the Zend compatibility layer has landed in HHVM 3.1. Most jobs are working, but bugfixing continues on osmium.

Release & QA

The Release and QA team expanded this past month with the addition of Mukunda Modell as a new Release Engineer. His first task is to address the remaining technical issues blocking our adoption of Phabricator as a WMF-wide tool. Antoine Musso has begun drafting an RFC to outline how the WMF would support isolated test environments for automatic builds. The WMF kicked off the second RFP for the release management of MediaWiki. Chris McMahon fleshed out tooling for creating test data at test time more widely, and built on Antoine’s work to use the appropriate version of the browser tests when testing specific MediaWiki installs (browser tests are versioned along with the other code).

Admin tools development

Minor improvements were made to admin tools projects including having global blocks shown on Special:Contributions (bug 52673). Code review also continued on the global rename user tool.

Search

In May, we deployed changes to improve snippets generated by Cirrus to a handful of wikis, spent some time improving its analysis for Hebrew, and added more backwards compatibility with lsearchd’s syntax to Cirrus.

Auth systems

We worked on the SOA Authentication RFC to support the Services team. We also created a MediaWiki-vagrant role for CentralAuth, including significant work to support multiple wikis on a single Vagrant instance. We continued work on the Phabricator-MediaWiki OAuth integration, and the patch was upstreamed. Last, we held an OAuth training session at the Zürich Hackathon, resulting in several new apps using OAuth.

Wikimania Scholarships app

The Wikimania Scholarships project for the 2014 cycle wrapped up with the final awarding of scholarships for Wikimania 2014. The current hope is that the 2015 application and review cycle will be managed mainly by the Wikimania 2015 organizing group with limited technical support from Platform Engineering.

Deployment tooling

Bryan Davis continued the Pythonification of our deployment tooling with the conversion of the remaining bits, namely all sync-* scripts (sync-common, sync-file, sync-dir, and sync-db). Work continues on modifying our deployment tooling to provide easier and more robust automatic access to our version information (available at Special:Version).

Security auditing and response

MediaWiki (1.22.7) was released to fix an XSS vulnerability. A separate DOM XSS issue was fixed in MobileFrontend. We also finished a review of Hadoop’s Camus.

Quality assurance

Quality Assurance

The expertise of our new Automation Engineer, Dan Duvall, will allow us to update, maintain and modernize our development environments using Vagrant virtual machines, keep our Puppet site configuration running properly, and contribute to the whole delivery pipeline as we continue to improve our ability to deploy software features to Wikimedia sites quickly and safely.

Browser testing

Now that we have the ability to address the MediaWiki API from the browser testing framework, we have changed the existing test suites to use this powerful tool. This not only gives us a test of the API itself, but also makes the browser tests faster and more reliable. Furthermore, it allows us to easily create a set of acceptance tests that will pass on any MediaWiki installation regardless of what extensions exist or what language the wiki is. We plan for these acceptance tests to eventually become part of MediaWiki core, and our existing tests continue to expose important issues in Wikimedia software development projects.
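
As a rough illustration of what driving test setup through the API can look like, here is a minimal Python sketch that builds (but does not send) a MediaWiki Action API request to create a page. The `action=edit`, `title`, `text`, `token` and `format` parameters are part of the Action API. The wiki URL, the placeholder token, and the `build_edit_request` helper are assumptions made for this example, and the sketch does not reflect the team's actual testing framework.

```python
from urllib.parse import urlencode, parse_qs


def build_edit_request(api_url, title, text, token):
    """Return (url, body) for an Action API page edit, to be sent as a POST.

    Nothing is sent here; a real test would first fetch a valid edit token
    from the API and then POST the body to the URL.
    """
    params = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "token": token,
    }
    return api_url, urlencode(params)


# Hypothetical local test wiki and placeholder token, for illustration only.
url, body = build_edit_request(
    "http://localhost:8080/w/api.php",
    "Browser test page",
    "Content created before the browser test runs.",
    "PLACEHOLDER_TOKEN",
)
print(url)
print(body)
```

Creating fixtures this way means a browser test only needs to load the page it checks, rather than clicking through the editing interface first, which is one reason API-backed setup tends to be faster and less fragile.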

Multimedia

Multimedia

Metrics meeting presentation slides about UploadWizard

In May, the multimedia team released Media Viewer v0.2 on more large wikis (Dutch, French, Italian, Japanese, Portuguese, Spanish and Russian Wikipedias), with over 10 million image views daily. This multimedia browser has been well received: 70% of survey respondents find this tool useful; based on this favorable feedback, we plan to deploy Media Viewer on all wikis in June. Gilles Dubuc, Mark Holmquist, and Gergő Tisza fixed more bugs and improved features during this development cycle, with design help from Pau Giner.
The team has switched its focus to new projects, starting with the UploadWizard, our main user-facing feature this year: this month, we collected metrics, reviewed user feedback, created new designs, fixed bugs and refactored code as part of a major upgrade of this important contribution tool. We also allocated a third of our time to technical debt and bug fixes for other multimedia tools, with an initial focus on improving image scalers, GWToolset and TimedMediaHandler.
Fabrice Florin managed product development, hosting an annual planning meeting to define our goals for 2014–15: this year, we aim to engage more users to contribute media to our sites through tools like UploadWizard, while implementing structured data on Commons and continuing to address our technical debt and fix critical bugs. Keegan Peterzell and Fabrice also continued to engage our community partners throughout the release of Media Viewer. We are planning new discussions in coming weeks to improve our plans together. To join these conversations and keep up with our work, we invite you to subscribe to the multimedia mailing list.

Engineering Community Team

Bug management

Mark Holmquist and Chad Horohoe changed Bugzilla to automatically hand out “editbugs” permissions to new accounts and to be able to hand out “editbugs” recursively (bug 40497). Apart from the usual bug triage and a major focus on Phabricator migration work, Andre Klapper retriaged some open major/high Multimedia tickets, created some requested components and investigated Gerrit bot notification breakage with Daniel Zahn and Christian Aistleitner. He also made small changes to the Annoying little bugs page based on feedback, and reorganized the Wikipedia App and the Analytics products in Bugzilla. Andre and Mark Hershberger updated some open Bugzilla tickets with Target Milestone 1.23.0 at the Zürich Hackathon 2014.

Phabricator migration

Mukunda Modell is currently addressing authentication and access restrictions for security tickets with upstream. Chase Pettet and Daniel Zahn of Wikimedia Operations are spearheading the Phabricator installation. Andre Klapper has upstreamed some issues, commented on numerous tickets, and identified further tasks related to migration. An overview board of tasks to solve for the first day of Phabricator in production is available. Furthermore, once Wikimedia SUL authentication is sorted out, we are investigating launching the Phabricator production instance first with very limited functionality, to provide a Trusted User Tool.

Mentorship programs

We hosted a Q&A session on IRC with a high participation of Google Summer of Code and FOSS Outreach Program for Women interns and mentors right at the beginning of the development phase. Below are the first project reports, including the lessons learned during the community bonding period:

Technical communications

In addition to ongoing communications support for the engineering staff, Guillaume Paumier mostly focused on improvements to the system of scripts and templates used to document Wikimedia engineering activities on mediawiki.org. Active and inactive projects can now be queried separately, which means that the list of projects that appears in the drop-down of the status helper gadget is much shorter, now only listing active projects. Guillaume also wrote a dedicated Lua module to manipulate engineering activities automatically, in particular for the Wikimedia Engineering portal. It is now possible to query activities from a given team, and display them on any page in various formats. Using the module, the ever-outdated list of “current” activities on the portal was replaced by an automatically generated list based on projects listed on team hubs. The module also makes it possible to feature a random engineering activity on the portal. Other additions to the portal include the latest issue of Tech News (transcluded and automatically updated every week), as well as the first paragraph of the latest monthly engineering report (manually updated for now). Future improvements of the portal are expected to be mostly aesthetic.

Volunteer coordination and outreach

The Wikimedia Hackathon in Zürich was a success according to ad hoc feedback from the participants. A deeper review is expected to be published in July, after compiling the results of the survey. The main merit goes to Wikimedia CH for an efficient, warm, and flexible organization. We also announced a process to request the organization of Hackathons. We had an intense calendar of events in May, including a Tech Talk about Elasticsearch and a meetup in San Francisco on Making Wikipedia Fast, organized successfully together with the Web Performance SF meetup.

Architecture and Requests for comment process

Engineers discussed the Performance guidelines at the Zürich Hackathon 2014. They also had IRC discussions of several requests for comment and documents:

Analytics

Wikimetrics

The focus for this month was on extending Wikimetrics to support the Editor Engagement Vital Signs (EEVS) project. The team also fixed several bugs around Unicode support (particularly non-Latin character sets) and implemented delete cohort functionality.

Data Processing

Work this month covered capacity, deployment, and the CDH 5 (new Hadoop) version. These initiatives should be resolved in June. A permissions issue caused the page view dumps to stall for a weekend. The system was fixed promptly and no data was lost.

Editor Engagement Vital Signs

We created a headless user to run recurrent reports and laid the groundwork to support creating new metrics. The team also discussed potential changes to the first round of metrics implemented in the system to support a broader view of participation, and reviewed mockups provided by the design team with stakeholders.

EventLogging

The team has adopted event logging and has identified some monitoring updates that need to be made.

Kiwix

The Kiwix project is funded and executed by Wikimedia CH.

The Hackathon in Zürich was our highlight for May. There, most of the team met, and a few new people joined the development effort during the 3 days in common with the Wikimedia hackathon. Most of the work done focused on preparing the final Kiwix 0.9 release, to be released in June. It was also necessary to release a new minor release (1.2) of libzim. Work on Kiwix for Android continues with the integration of a content/download manager. On the offline content side, 6 ZIM files with all TED talks were also released; it’s the first time we provide files with so much multimedia content. 50 USB flash drives with Kiwix and the Wikipedia for schools were prepared and sent to WikiIndaba; 15 Kiwix-plug are also going to be prepared for Afripedia. Finally, in the scope of the Malebooks project, work to prepare an offline version of the Gutenberg project, with its 45,000 public domain books, has started.

Wikidata

The Wikidata project is funded and executed by Wikimedia Deutschland.

It has been a busy month for Wikidata. The development team made significant progress on a number of important features. These include simple queries, the mono-lingual text datatype and redirects. We’ve also done a lot of research for the upcoming user interface change and started making mockups for it. Lydia Pintscher held an office hour on IRC and answered a lot of questions. Two interns (Anjali Sharma and Helen Halbert) started working with the team. They will improve the user documentation and social media outreach around Wikidata. Wikidata was also well represented at the MediaWiki hackathon in Zürich. Last but not least, Magnus Manske developed a number of games around Wikidata that help improve the data in it and add more. It’s a resounding success so far.

Future

The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.

This article was written collaboratively by Wikimedia engineers and managers. See revision history and associated status pages. A wiki version is also available.

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.
