
"Can existing data or anecdotal data help the egg research to find a conclusion"


#PokemonGO: First of all, I read the article about the egg research with great interest. This is an area I have been interested in myself for a long while. I even collected a data set like the one the researchers collected, albeit from a set of 5 stops rather than a single stop. I gave up after around 40 eggs, so hats off to the researchers who kept collecting from only a single stop. This really impacts your own game play.

But let's get back to the title.

In my view, the egg research is less conclusive than it could be because of the design of the experiment. It does not pay enough attention to the large amount of anecdotal data reported here on the Silph Road in an uncontrolled way. Let me explain what I mean.

The research tried to tackle one recurring post: do different pokestops give out different ratios of 2K/5K/10K eggs? Travellers are mainly interested in 10K eggs. There are three other kinds of egg-related posts that you read here quite often. They might influence the research, or they might help you design an experiment that gives more accurate data and controls the factors you need to check.

A lot of people post here that they get all their 10K eggs from 3 or 4 stops, or that they collected (insert high number) eggs without ever getting a single 10K egg. What is common to these posts is that they seldom talk about a single stop. Gathering data from a single stop can be very, very difficult. But if all you want to prove, with a significant chi-square test, is that the egg distribution is not the same everywhere, then you can pool pokestops as well as users. All you need to do is define the pools ahead (!!) of the experiment. For example, there seems to be a difference in egg drops between stops at the local beach and stops in town. Define which stops are on the beach and which are in town, then collect data.
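
The pooled comparison described above could be sketched roughly like this in Python. This is only an illustration under my own assumptions: the function names are made up, the counts are placeholders and not collected data, and the p-value shortcut only works for an even number of degrees of freedom (two pools and three egg types give 2).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows (one per pool of stops); each row holds the
    egg counts per type, e.g. [2K, 5K, 10K]. Every column total must be > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if egg type were independent of the pool.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi_square_p_value(stat, dof):
    """Upper-tail p-value, using the closed form that holds for even dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only supports even degrees of freedom")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Placeholder counts only (NOT real data): rows are pools, columns 2K/5K/10K.
example_counts = [
    [40, 50, 10],  # hypothetical "beach" pool
    [55, 40, 5],   # hypothetical "town" pool
]
example_stat, example_dof = chi_square_statistic(example_counts)
print(example_stat, example_dof, chi_square_p_value(example_stat, example_dof))
```

The important part is not the code but the order of work: the assignment of stops to pools has to be fixed before any eggs are counted, otherwise the p-value means little.
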
Given the same list, you can even spread the work across several data gatherers.

Advantage: Data gathering impacts game play less, so more people can gather data for you. Data also comes in faster, especially if multiple people gather at the same stops. Disadvantage: You need a reason for how you split your pokestops and why some are category A and others are category B. If you get this wrong, you average out your signal. So you need local guidance and a strong signal to start with.

Another recurring post here is: has the 2K (10K) drop rate changed? Now this one is awkward. The whole egg research looked into differences between individual pokestops. If egg drops change with nest migrations, this could severely impact the signal. The best options here are to collect only during one single migration (the best guess for when changes might happen), to watch the Silph Road for posts about changes in drop rates and compare data before and after, or to look for data sets that are not affected by this. Does the recording still exist of jdero trying to reach 1 million XP in a single day with 9 incubators going all the way? How many eggs did he gather during that one day? This is data unlikely to be affected by time.

A possible way to check this in the Silph Road Research data: do a chi-square test on the largest data sets collected, and check whether the rate of 2K eggs in the first 50% collected differs from the rate in the last 50%. A negative result isn't proof that the rate doesn't change, but a positive could be an indication that it does.

Pro: none. Con: If the drop rate changes over time, there is one more variable that makes the research more complicated.

Can the different levels of the researchers impact the results? Again, certain posts come up again and again where you read anecdotally: "I got no 10K eggs between level x and y, then 10K eggs started to appear again."
Something you also read often is players who get their 10K egg at the beginning of play and none again for a long while. If level does have an effect, then checking starting characters is likely the best way to find out. A new player starts out with 9 empty egg slots. So all that is needed is to fill the 9 egg slots and gather the data for these 9 eggs. The level/CP of the hatches can be used as a check that the eggs are actually from the right level.

Pro: A potentially easy way to check level dependence is to use new starter accounts, as anecdotally they seem to show the largest effect; 9 open slots should make it easiest to gather this data. Con: Silph Road researchers are not starter players, and making accounts for this research is, strictly speaking, against the TOS.

Use of historical data. One other aspect that could help is the use of historical data. My very first comment ever contributed to the Silph Road, which trended for >24 hours on the front page here, was about random egg drops. I started with the assumption that they are random, but my belief was shaken by results from posters. One example: /u/Cha-La-Mao reports that he got 146 eggs in a row with zero 10K egg drops. What is the likelihood of this happening, given the 10K egg drop rate of the 1500 eggs? There is biased reporting, but is the chance 1 in 100K, or 1 in millions or billions? Another dataset is from /u/CrabHelmed (http://ift.tt/2fZ12sg), who reported on approx. 1000 eggs dropped, apparently from a bot. This one has a very high 10K drop rate of around 12%. What is the chance of this fitting with the data just gathered? I also already mentioned /u/jdero: does a copy of his live stream still exist? Can the egg drop rates be recovered, and do they differ from the rates of the SilphRoad experiment?
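
The two likelihood questions above can be put into numbers with a simple binomial model. Below is a minimal Python sketch, assuming every egg is an independent draw with a fixed 10K rate. The candidate rates are hypothetical placeholders, since the actual rate from the 1500 eggs is not quoted here, and "about 120 of 1000" is just my rounding of the reported 12%. The function names are my own.

```python
import math


def prob_no_hits(n_eggs, rate):
    """Chance of zero 10K eggs in n_eggs independent draws."""
    return (1.0 - rate) ** n_eggs


def binomial_upper_tail(n_eggs, min_hits, rate):
    """Chance of at least min_hits 10K eggs in n_eggs independent draws."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    total = 0.0
    for k in range(min_hits, n_eggs + 1):
        # Work in logs so large binomial coefficients do not overflow.
        log_pmf = (
            math.lgamma(n_eggs + 1)
            - math.lgamma(k + 1)
            - math.lgamma(n_eggs - k + 1)
            + k * math.log(rate)
            + (n_eggs - k) * math.log(1.0 - rate)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Hypothetical 10K drop rates (assumptions, NOT measured values).
for assumed_rate in (0.02, 0.05, 0.10):
    streak = prob_no_hits(146, assumed_rate)                # 146 eggs, zero 10K
    high_run = binomial_upper_tail(1000, 120, assumed_rate)  # ~12% of ~1000 eggs
    print(assumed_rate, streak, high_run)
```

Keep in mind that this ignores the biased reporting mentioned above: out of many players, the ones with extreme streaks are the ones most likely to post, so a small probability for one player is weaker evidence than it looks.
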
Pro: Using existing data means less work is needed to gather the data. Con: This data is not controlled, and likely only the jdero data (if it still exists) would fulfil SilphRoad Research's stringent quality control measures for their research.

I'm sure there is more that I missed, and I encourage readers to add their ideas in the comment section. But I felt that we need to discuss more than just the statistical approach of the egg research. This is still one of the great mysteries that, in my view, remains unsolved.

TL;DR: A different approach to the egg research might contribute as much to a significant finding as more data or better statistical handling. Use existing data / anecdotal data as a guide to design a good experiment. Some examples are given, and readers are encouraged to write their suggestions.

Edit1: I just accepted an invite to the TSR as a researcher. I don't have any access to their work yet, but once I do I might have to restrain myself if any answers I give here would risk leaking their research. This post was something I wanted to post for a long while but never got around to. So when I logged in I saw the invite, but still wanted to post my thoughts. I don't want to get thrown out right away after waiting for a while :) and not even noticing the invite, as I was reading SilphRoad but wasn't logged in.

via /r/TheSilphRoad http://ift.tt/2gUKxlg
"Can existing data or anecdotal data help the egg research to find a conclusion" "Can existing data or anecdotal data help the egg research to find a conclusion" Reviewed by The Pokémonger on 23:38 Rating: 5


Hey Everybody!

Welcome to the space of Pokémonger! We're all grateful to Pokémon & Niantic for developing Pokémon GO. This site is made up of fan posts, updates, tips and memes curated from the web! This site is not affiliated with Pokémon GO or its makers, just a fan site collecting everything a fan would like. Drop a word if you want to feature anything! Cheers.