I’m at a father-daughter dance and the theme is ’80s. One, none of the kids know what this means. Two, I think some of the grownups are perhaps a bit too earnest in their costumes. For example, the guy in the white jean jacket with a single cross earring.
Longpost tech ops story -- How I deleted an entire MMO
At one of my first jobs in ops, I did server wrangling and NOC duties for a Content Distribution Network and file host/storage.
We had a contract with a major MMO to provide CDN services as well as file hosting/storage. We claimed the storage was "in the cloud"; you know what they say about clouds.
They had a big, big patch drop and let us know to expect a traffic burst. I decided to reallocate where their files were stored to protect other customers.
What I didn't know was that our data admin had been doing some stuff "off the record" to the storage infrastructure. I didn't check things as well as I should have, and in the course of removing their data from servers that also had other high-traffic customers on them, I left the customer's data on only one server.
As luck would have it, the server's RAID died during the pre-load. I'd find out later it had been throwing degraded array alerts for weeks but no one was minding that monitoring system.
This was during the downtime the MMO was undergoing to migrate DBs and prepare for the patch; it was too late for them to roll back cleanly.
We reached out to them, and apparently they needed to recompile the client and patches and re-upload them. Tens of gigs of stuff. It took days, and that entire time the MMO was effectively down. Oh, it was up, but no one could get the patch needed to log in.
We also lost their patcher binary (which they couldn't trivially rebuild???), a bunch of their website assets, and chunks of their weird in-game UI. It was a mess that took weeks to fully sort out.
I went and looked at that game's forums after it happened. The hate was pretty spectacular. You know the MMO player entitlement rage? Yeah. Imagine knowing you caused it :)
I manually verify data integrity on sharded data stores now before doing anything to them.
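For anyone who wants the lesson in script form, here is a minimal sketch of that kind of pre-flight check, not what we actually ran. Everything in it is made up for illustration: the `placement` map of file to servers, the `healthy` set (servers with no degraded-array alerts), and the `check_drain` helper. The idea is just to ask, before pulling data off some servers, which files would be left with too few healthy copies.

```python
from typing import Dict, List, Set


def remaining_replicas(
    placement: Dict[str, Set[str]], healthy: Set[str], drain: Set[str]
) -> Dict[str, Set[str]]:
    """For each file, the healthy servers that still hold it after the drain."""
    return {path: (servers - drain) & healthy for path, servers in placement.items()}


def check_drain(
    placement: Dict[str, Set[str]],
    healthy: Set[str],
    drain: Set[str],
    min_replicas: int = 2,
) -> List[str]:
    """Return the files that would fall below min_replicas healthy copies."""
    remaining = remaining_replicas(placement, healthy, drain)
    return sorted(path for path, servers in remaining.items() if len(servers) < min_replicas)


# Hypothetical layout: s3 holds data but is excluded from `healthy`
# because its RAID is reporting a degraded array.
placement = {
    "patch.bin": {"s1", "s2", "s4"},
    "patcher.exe": {"s1", "s3"},
}
healthy = {"s1", "s2", "s4"}

at_risk = check_drain(placement, healthy, drain={"s1"})
if at_risk:
    print("Refusing to drain; files at risk:", at_risk)
```

The important design choice is that a server with a degraded array does not count as a copy at all, so the check fails loudly instead of trusting a box that is already on its way out.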