
"What was with the glitches on the Nest Atlas earlier? - A Post Mortem from the Silph Road Team"


#PokemonGO: Hey travelers,

If you attempted to use the Nest Atlas in the 12 hours preceding this post, you most likely encountered some technical difficulties. Unfortunately, the system had some issues earlier today. I want to explain what happened, what's been lost, what hasn't been lost, and what's been done about it.

(First of all, major thanks are due to /u/marcoceppi, who has spent the last 16 straight hours working on these issues. He's helped the Silph Road team scale beyond our old days of 15k subreddit subscribers in ways we couldn't have without him. He'll probably post a more technical post mortem on http://blog.silph.io soon.)

But our story begins a few days ago:

Knowing that the new Nest Atlas upgrade would be significantly more resource-heavy, we provisioned a lot more resources and made some significant infrastructure changes ahead of the launch.

After the launch went well, we figured word would spread to other sites, YouTube, etc., so we decided to provision even more resources ahead of that spike. Like, a lot more resources (like 4x more capacity).

When Trainer Tips launched a video about the new Atlas, we quickly blew well past the 4x increase in resource usage that we expected from past Trainer Tips spikes, reaching 8x yesterday's resource usage. It wasn't a quick spike. It was sustained for hours. This locked things up and the site slowed down. We were calling Google trying to get them to give us more CPUs.

Then disaster struck!

Somehow, in the intense resource overload, a user's registration and nest report went awry. One user managed to accidentally overwrite the usernames of every other user while joining the web app. So your Silph Road account suddenly had this user's username. Not only that, but every nest field report was suddenly re-assigned to this specific user and moved to one specific nest.

So just like that, all the field reports were useless.

But you have backups, right?

/u/marcoceppi had wisely helped set up a periodic backup of the Nest Atlas' data that runs every 6 hours. Well, it runs every 6 hours if the database server isn't under intense load. Unfortunately, that setting was a bad, bad thing to accidentally leave configured during a launch week.

Fortunately, we did have a backup; the latest was from 2016-10-05 03:27:05 UTC. The issue occurred 8 hours ago, at 2016-10-06 19:11:39 UTC. Within those roughly 40 hours were many of the newer reports post-Migration #4.

There was some good news, however.

Because of a recent optimization we decided to make before launch, nests are stored with their current status, Pokémon species, and spawn type in 2 places. One was not affected! So we are able to restore the last known info on nests that have been verified post-Migration #4! But we are not able to retrieve the 'notes' on those reports, nor their submitter. They will have to be submitted again. :/

What Was Originally Lost

* All usernames created before the incident 8 hours ago had been overwritten.
* All field reports created before the incident 8 hours ago had been moved to an incorrect nest and re-assigned to an incorrect traveler.

The Good

* We've been able to restore usernames for all travelers who joined before the time of the incident (7:11pm UTC).
* We've been able to completely restore all field reports from before 3:27am UTC Oct 5.
* We've been able to restore the status and current species of all Nests submitted/verified/refuted post-Migration #4, and we added a MissingNo. field report to the nest history explaining what happened.
* All reports submitted after the incident 8 hours ago were unaffected.

The Bad

* Travelers who joined the Road after 3am Oct 5 have had their usernames defaulted to their reddit usernames (which survived the wipe). If you would like to update yours, just shoot /u/dronpes a PM and we'll get you taken care of until we have a way to update it in the web app.
* Field reports submitted after the Oct 5 backup and before the incident 8 hours ago have had their meta-data lost (10.45% of field reports). Though their effect on the nest itself was preserved, the individual reports will need to be submitted again.
* Dronpes was freaking out at Comic Con instead of enjoying people watching with /u/Moots7.
* Marco accidentally drank a whole 2-liter of Mountain Dew.

The Takeaway

Obviously we'll be backing data up much more aggressively than every 6 hours moving forward, and we'll be tightening down the code that writes data to the DB. We'll also be enforcing uniqueness on a few more columns, which would have prevented much of this.

We are confident that this was not a SQL injection attack, for reasons we won't disclose (for obvious security reasons). Our analysis has also shown us that this was extremely unlikely to have been the result of malicious/illicit scripts attempting to steal/scrape data from the Atlas and breaking things. As far as we know at this point, this was a case of "the DB server and its slaves are on fire and something broke while attempting to save a record."

We were so heartbroken to see data loss, travelers. You're all working so hard out there getting boots on the ground and eyes on these nests. We failed the Silph Road community in a way tonight. But know that we're using this as a learning experience and sharpening our recovery procedures and infrastructure to better handle future issues.

Feel free to ask any questions you have, and /u/marcoceppi and I will be happy to answer (though Marco's falling asleep at the keyboard at this point, and I'm not far behind, so we might have to answer in greater depth tomorrow).

Thanks for your patience as we work out the kinks in the Nest Atlas, travelers.

- dronpes - via /r/TheSilphRoad http://ift.tt/2dJgE5M
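
For readers curious about the takeaway that unique columns "would have prevented much of this," here is a minimal sketch of the idea. It is only an illustration and not the Silph Road's actual schema or code: the `travelers` table, its columns, the sample usernames, and the `overwrite_all_usernames` helper are all hypothetical, and it uses an in-memory SQLite database rather than whatever the Atlas runs on. The post does not say what kind of write went wrong, so the unscoped `UPDATE` below is simply one way to reproduce "every username becomes the same value."

```python
import sqlite3

# Hypothetical schema: the real Nest Atlas tables are not described in the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE travelers ("
    " id INTEGER PRIMARY KEY,"
    " username TEXT NOT NULL UNIQUE)"  # the uniqueness rule under discussion
)
conn.executemany(
    "INSERT INTO travelers (username) VALUES (?)",
    [("ash",), ("misty",), ("brock",)],
)
conn.commit()


def overwrite_all_usernames(conn, new_name):
    """Attempt a write that sets every traveler's username to one value.

    Returns True if the write went through, False if the database refused it.
    """
    try:
        # `with conn` commits on success and rolls back on an exception.
        with conn:
            conn.execute("UPDATE travelers SET username = ?", (new_name,))
    except sqlite3.IntegrityError:
        # The second row to receive the same name violates UNIQUE.
        return False
    return True


blocked = not overwrite_all_usernames(conn, "newcomer")
usernames = [
    row[0] for row in conn.execute("SELECT username FROM travelers ORDER BY id")
]
print("write blocked:", blocked)
print("usernames afterwards:", usernames)
```

With the `UNIQUE` constraint in place, the database rejects the write as soon as a second row would receive the same username, and the original names stay intact. A constraint like this only helps for columns that really should be unique, which is presumably why the post says "much of this" rather than all of it.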
Reviewed by The Pokémonger on 12:02 Rating: 5


Hey Everybody!

Welcome to the space of Pokémonger! We're all grateful to Pokémon & Niantic for developing Pokémon GO. This site is made up of fan posts, updates, tips and memes curated from the web! This site is not affiliated with Pokémon GO or its makers, just a fan site collecting everything a fan would like. Drop a word if you want to feature anything! Cheers.