Archive for the 'Data Center Ops' Category
We must have angered the Internet gods because this Monday has been nothing short of tremendously disappointing. Pictured below is my staff working on the issues:

On to the specifics:
ExchangeDefender reports did not run last night and will likely remain offline until close of business today.
We have had two switch crashes on the load balancers in front of our shared mail1 and www1 hosting services.
Our offsite backup upgrade does not seem to be validating the certificate requests, so https:// requests are failing. http:// still works fine, and because data is encrypted on the client side, the transport mechanism is less of a concern. However, if you have configured https://, your backups are failing, so we are treating this as a very serious issue.
Somehow, the roof is still above us and we have power. For now.
My teams are working through all the outstanding issues, and service will be restored to 100% across the entire product portfolio by the end of business today.
Update: As of 5 PM EST the ExchangeDefender reporting is back online and all the network issues have been resolved. The Offsite Backup service is still available via http://, but we are still working with AhSay to get the certificate issue resolved. I will update further on this as soon as I have more information.
Update: As of 11 PM EST all offsite backup grids now respond with the valid SSL certificates on the SSL port.
Looks like the ugly Monday is finally behind us.
Sincerely,
Vlad Mazek, CEO
As you may be aware, we have two data centers in Los Angeles on Wilshire Blvd. Earlier today, this area suffered a 5.8 magnitude earthquake. No systems were affected, and there was no impact on any power feeds or network connections. Earthquakes tend to be followed by smaller “aftershocks”, and we will update this post with any relevant information as it becomes available.
Our Los Angeles data center carrier has suffered an HVAC failure, and connectivity to the network has been severed for the time being. The facilities team is in touch with the building owner, and service restoration is under way. All services provided by this data center are unfortunately affected and down at the moment.
Services affected: some ExchangeDefender, some SharePoint Hosting, some Virtual Servers.
We will update this ticket when all services have been restored. This ticket is ranked urgent. Our priority will be to restore services that are not redundant first: virtual servers, followed by SharePoint hosting.
Update (@ 3:00 AM PST (GMT-8), 6:00 AM EST (GMT-5)): We expect SharePoint and Virtual Server services to be restored around 6 AM PST (GMT-8). ExchangeDefender services are not impacted (however, please be patient with SPAM releases). We will update this ticket at 6 AM or when services start coming back online.
Update (@ 3:44 AM PST (GMT-8), 6:44 AM EST (GMT-5)): All services have been restored.
Total LA DC1 outage: 53 minutes.
For the past 10 hours or so we have been handling an 820% surge in reboot requests for Microsoft servers that hung after the latest security patches were applied. Our managed network of Windows 2003 servers has not been affected, but a large portion of the rest of our network apparently has. Please be advised.
If your Windows Server becomes inaccessible as a result of the latest patches, please open a support ticket and mark it as urgent. You will not be charged for the support request, and your reboot will be handled with the highest priority. We have an additional shift on hand in all data centers to help you through this network event.
At roughly noon Central time we completed the upgrade of our Dallas DC4 network. The bandwidth upgrade brings in another 100Mbit of connectivity from Level3 and 100Mbit of connectivity from Cogent, primarily for the offsite backup service, which has experienced tremendous growth over the year.
Our Los Angeles DC2 will undergo a similar upgrade by Thanksgiving, and we plan to open a third data center in the Los Angeles area by the start of 2008.
We have received a number of support tickets inquiring about the stability of our San Jose data center (MAE West) following the 5.6 magnitude earthquake last night. While a 5.6 magnitude earthquake is significant, it has posed no issues to our data center or any infrastructure located there. All equipment in our west coast data centers (Los Angeles, San Jose and Seattle) is rack mounted in four-post closed racks, and even a significant quake would not pose any immediate danger to any equipment inside the building.
Thank you for your concern and your well wishes to our staff. Everyone is safe and sound, and so is the network. Your sympathies are appreciated nonetheless.