ESAMS Servers not reachable, some EU traffic affected. (Fixed!)

Starting at approximately 03:20 GMT, servers in our ESAMS facility began to roll offline one after another. After some investigation, it appears power is not being supplied to all the servers. This has resulted in some slowdowns for EU users.
We have temporarily migrated all traffic to our primary FL datacenter. Once the servers are back online in ESAMS, we will be pushing service back to it as well.
Update: The problem has been identified and finally fixed. Traffic has been returned to normal.
The best guess so far is that there was a cooling failure in the datacenter which caused the Sun boxes to shut themselves down.
An update from Leaseweb/Evoswitch is here:  http://noc.leaseweb.com/status.php?i=389
Rob Halsell, Operations Engineer

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

6 Comments

The new servers in Haarlem are down a lot :/

Toolserver seems also to be down (http://commons.wikimedia.org/wiki/File:ToolserverErrormessage.jpg), making impossible a lot of admin work on Commons.

Yeah, is this the second or the third problem in as many weeks at Evoswitch?

Please make this site completely available over HTTPS, and not as a mixed version (the HTTPS pages currently include some HTTP files).

Nice to hear that it’s “Fixed!” But then why is the urgently needed Toolserver still down?

Please note that while the toolservers are indeed housed in the racks with Wikimedia servers, they are a separate management issue. They have root users and admins that the Wikimedia cluster does not.
So yes, they should be online, but they are also not directly administered by all of the Wikimedia sysadmins.
Any toolserver related issues should be addressed in the #wikimedia-toolserver IRC channel.
http://meta.wikimedia.org/wiki/Toolserver has further information.