We’re adding an off-site archive for Commons and the XML snapshots


Thanks are due to eBart consulting and User:Milosh for providing a backup server and storage array at their colocation facility in Europe. This server will store archives of our publicly available data from Wikimedia Commons and the XML snapshots.
Everyone knows that this has been a long time coming, as having an off-site location for our data is extremely important for disaster recovery. With this archive in place, we’ll have another external archive space for Commons image data to complement the one living at MIT.
Given the 10 TB donated, we’re likely to also store yearly archives of the XML snapshots.
This won’t stop us from continuing to be rigorous about our internal backups of the same data, along with keeping all of our users’ private data within our own data centers. It will simply be another physical space for us to archive our publicly available content.
While this offline mirror will only be used internally, we have some other leads on sponsors who might be able to offer a publicly available mirror. Over the next few weeks we’ll be streamlining the offline archiving process and seeding the initial Commons upload, which currently comes in at just under 4 TB! Once we make some sense of how best to manage the archiving process we’ll see who else is able to host our data.

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.


Music to my ears.
Now how to bribe Murphy for the next couple of weeks 😉

great news!
This is exciting.
Now all we need is some grid/p2p architecture for distributing and processing it.

“Once we make some sense of how best to manage the archiving process we’ll see who else is able to host our data”.
I’d better give you the heads up. You might call it making “some sense of how best to manage the archives”, but in the world of archivists, they’re called curators.
Watch this space. http://www.wikimedia.org.au/wiki/GLAM

I don’t know where to write this, so I’m putting it as a comment on the most recent news. A lot of users in Russia and Belarus currently cannot access their local Wikipedias. The confirmed reports are from Minsk, Yekaterinburg, Zlatoust, Yuruzan, Izhevsk. The typical tracert is the following: 1 9 ms 9 ms 10 ms 2 10 ms 12 ms 11 ms 3 10 ms 9 ms 11 ms 4 11 ms 10 ms 11 ms 5 68 ms 195 ms 19 ms 6 20 ms 13 ms 12 ms 7 42 ms 43…

Truly so! I’m from Yekaterinburg, Russia, and Wikipedia has been very slow for two days already, with no pictures loading at all. It takes up to 2 minutes to open any single page.

Here’s the discussion at the local portal; users of several Yekaterinburg providers have had almost no access to Wikipedia for two days:
Tracerts are included.

I hope there will be some reaction; the third day has started with many people having crippled access to Wikipedia.


Looks like Mark already got to this, and we have reports on the Village Pump that everything is working fine again.

Can anyone tell me how to set up Wikipedia’s URL Rewrite feature?