"Verifying Gamepress' Grand Unified Catch Theory with Bayesian Markov Chain Monte Carlo Statistics"
#PokemonGO: TL;DR I re-analyzed the Gamepress data set to verify the Grand Unified Catch Theory. The resulting probability distributions (my analysis in black, Gamepress values in red) for most multipliers were generally consistent, but the Ultra Ball multiplier may have been slightly underestimated previously (I find 2.15 ± 0.06 instead of 2.0). The data set that was shared publicly does not allow testing the 2 - r throw_multiplier, the average "Nice" throw multiplier, or the medal multipliers.

As several of you probably know, Gamepress recently made a series of posts in an attempt to unveil the exact server-side calculations of PoGo catch rates.

Among other things, they used a set of ~56,600 bot throws collected by /u/CreativiTimothy (shared here, thanks to /u/homu/ !) to constrain the improvements in catch rate that arise from using different balls, throw bonuses, berries and/or curved throws. They found that the final catch probability (let's call it P_GUCT) can be obtained from the "base catch rate" (let's call it P_B) with this formula:

P_GUCT = 1 - (1 - P_B)^multiplier

where P_B depends only on the Pokémon species and its level. The "multiplier" is a function of the throw and ball types, whether a berry was used, and whether the user has medals, and goes like this:

multiplier = ball_multiplier * throw_multiplier * curve_multiplier * berry_multiplier * medal_multiplier

Here are the results that Gamepress obtained for these multipliers:

ball_multiplier (Poke ball) = 1.0
ball_multiplier (Great ball) = 1.5
ball_multiplier (Ultra ball) = 2.0
berry_multiplier = 1.5
curve_multiplier = 1.7
medal_multiplier (bronze) = 1.1
medal_multiplier (silver) = 1.2
medal_multiplier (gold) = 1.3
throw_multiplier = 2.0 - r

where "r" is the ratio of the radius of the color circle to the radius of the white circle, and the throw bonus only applies when the ball hits inside the color circle.
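As a minimal sketch, the Gamepress formula above can be written as a small Python function. The function name, the argument names, and the convention that a missing bonus (no medal, no berry, no curve, ball outside the color circle) counts as a multiplier of 1.0 are illustrative assumptions, not something taken from the Gamepress posts or from the game itself.

```python
# Multiplier values reported by Gamepress, as quoted above.
BALL = {"poke": 1.0, "great": 1.5, "ultra": 2.0}
MEDAL = {"none": 1.0, "bronze": 1.1, "silver": 1.2, "gold": 1.3}  # "none" = 1.0 is an assumption
BERRY = 1.5
CURVE = 1.7


def catch_probability(p_base, ball="poke", berry=False, curve=False,
                      medal="none", r=None):
    """Return P_GUCT = 1 - (1 - P_B)^multiplier.

    p_base is the base catch rate P_B. r is the ratio of the color circle
    radius to the white circle radius; pass r=None when the ball did not
    land inside the color circle (assumed to mean no throw bonus).
    """
    multiplier = BALL[ball] * MEDAL[medal]
    if berry:
        multiplier *= BERRY
    if curve:
        multiplier *= CURVE
    if r is not None:
        multiplier *= 2.0 - r  # throw_multiplier = 2.0 - r
    return 1.0 - (1.0 - p_base) ** multiplier


if __name__ == "__main__":
    for ball in ("poke", "great", "ultra"):
        print(ball, round(catch_probability(0.2, ball=ball), 4))
```

For example, with a base catch rate of 0.2, an Ultra ball alone gives 1 - 0.8^2 = 0.36 under these values.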
Gamepress also provided average values for the throw_multiplier depending on the message you obtain when hitting inside the color circle (either "Nice!", "Great!" or "Excellent!"):

throw_multiplier (average Nice) = 1.15
throw_multiplier (average Great) = 1.50
throw_multiplier (average Excellent) = 1.85

A very neat thing about this notation is that the final "multiplier" can be interpreted as a "number of balls" thrown at the monster at once. For example, throwing an Ultra ball gives you the same probability of catching the monster as throwing two regular Poke balls at once, except that the monster does not get as many opportunities to flee.

Since /u/CreativiTimothy conveniently shared the complete data set, I decided to re-analyze it using Bayesian statistics and a Markov Chain Monte Carlo sampler. The basic principle of this method is to derive the most probable values for all multipliers at once, using the complete data set instead of dividing it into many separate categories (e.g. "great throws" only, "berries" only, etc.).
This method also provides measurement errors on each multiplier, and makes it possible to see statistical correlations between their values.

For those who are a bit familiar with the concept, I used flat Bayesian priors on all multipliers (because they are simpler to use, and priors have an insignificant effect with so much data), and a binomial likelihood function that goes like:

L_i = (1 - P_GUCT)^(1 - success) * P_GUCT^success

where the value of "success" is 1 for a catch and 0 otherwise.

The Markov Chain Monte Carlo sampler then tries a lot of different values for all multipliers, calculates for each set the likelihoods L_i of all individual throws, and then takes the product of all of these likelihoods:

L = (product of L_i over all i)

where L is the probability that these ~56,600 throws produced all of the observed outcomes for a given set of multipliers.

The goal of the Markov Chain Monte Carlo sampler is to investigate the distribution of the likelihood L around its peak, which gives us information on the most probable values for each of the multipliers.

Now, the dataset of /u/CreativiTimothy was created pre-v0.41 and therefore it does not allow investigating anything about the medal bonuses. The radii of the color circles were also not recorded, which means that I cannot try to verify the "2 - r" value of the throw_multiplier, but I can instead try to reproduce the average value of the throw multiplier for the three types of bonuses (i.e., "Nice", "Great" and "Excellent" throws).
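The likelihood and sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis: it runs on synthetic throws instead of the real data set, it fits only two multipliers (Ultra ball and curve), it uses a hand-written Metropolis sampler, and the prior bounds, step size and all function names are made up for the example. It sums log(L_i) instead of multiplying the L_i, because a product of tens of thousands of probabilities underflows in floating point.

```python
import math
import random

LOW, HIGH = 0.1, 10.0  # assumed bounds of the flat prior on each multiplier


def log_likelihood(params, throws):
    """Sum of log(L_i) over all throws for params = (ultra_mult, curve_mult)."""
    ultra_mult, curve_mult = params
    total = 0.0
    for p_base, ultra, curve, success in throws:
        multiplier = (ultra_mult if ultra else 1.0) * (curve_mult if curve else 1.0)
        # log(1 - P_GUCT) = multiplier * log(1 - P_B)
        log_miss = multiplier * math.log1p(-p_base)
        # log(P_GUCT) = log(1 - exp(log_miss)), written with expm1 for accuracy
        total += math.log(-math.expm1(log_miss)) if success else log_miss
    return total


def simulate_throws(n, ultra_mult, curve_mult, rng):
    """Synthetic (p_base, ultra, curve, success) throws; NOT the real data set."""
    throws = []
    for _ in range(n):
        p_base = rng.uniform(0.05, 0.5)
        ultra = rng.random() < 0.5
        curve = rng.random() < 0.5
        multiplier = (ultra_mult if ultra else 1.0) * (curve_mult if curve else 1.0)
        p_catch = 1.0 - (1.0 - p_base) ** multiplier
        throws.append((p_base, ultra, curve, rng.random() < p_catch))
    return throws


def metropolis(throws, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler with a flat prior on [LOW, HIGH]."""
    current = list(start)
    current_ll = log_likelihood(current, throws)
    chain = []
    for _ in range(n_steps):
        proposal = [x + rng.gauss(0.0, step_size) for x in current]
        # Outside the prior bounds the posterior is zero: always reject.
        if all(LOW <= x <= HIGH for x in proposal):
            proposal_ll = log_likelihood(proposal, throws)
            accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
            if rng.random() < accept_prob:
                current, current_ll = proposal, proposal_ll
        chain.append(tuple(current))
    return chain


def summarize(chain, burn_in):
    """Posterior mean and standard deviation of each parameter after burn-in."""
    kept = chain[burn_in:]
    summary = []
    for i in range(len(kept[0])):
        values = [sample[i] for sample in kept]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        summary.append((mean, math.sqrt(var)))
    return summary


if __name__ == "__main__":
    rng = random.Random(42)
    # Fake data generated with the Gamepress values 2.0 and 1.7 as "truth".
    throws = simulate_throws(1500, 2.0, 1.7, rng)
    chain = metropolis(throws, (1.0, 1.0), 2000, 0.05, rng)
    for name, (mean, std) in zip(("ultra", "curve"), summarize(chain, 500)):
        print(f"{name}_multiplier = {mean:.2f} ± {std:.2f}")
```

The spread of the retained samples plays the role of the error bar on each multiplier, and plotting one parameter's samples against another's shows whether the two are correlated.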
Unfortunately, no "Nice" throws were recorded in the dataset, which means that I also cannot investigate the value of the average throw_multiplier for "Nice" throws.

Here are the values that I obtained, with their error bars:

ball_multiplier (Great ball) = 1.54 ± 0.04
ball_multiplier (Ultra ball) = 2.15 ± 0.06
throw_multiplier (average Great) = 1.6 ± 0.2
throw_multiplier (average Excellent) = 1.92 ± 0.08
berry_multiplier = 1.51 ± 0.04
curve_multiplier = 1.69 ± 0.01

Compared with the Gamepress results reported above:

ball_multiplier (Great ball) = 1.5
ball_multiplier (Ultra ball) = 2.0
throw_multiplier (average Great) = 1.50
throw_multiplier (average Excellent) = 1.85
berry_multiplier = 1.5
curve_multiplier = 1.7

We can see that there is generally good agreement between our values, but a few of them seem to be offset by a small amount. Using the error bars that I have derived, I can compute for each multiplier the statistical confidence at which we can say that our results differ:

ball_multiplier (Great ball) at 68%
ball_multiplier (Ultra ball) at 98.8%
throw_multiplier (average Great) at 38%
throw_multiplier (average Excellent) at 62%
berry_multiplier at 20%
curve_multiplier at 68%

For reference, in science we generally only consider statements as verified when they have a statistical significance between 95% and 99.99994%, depending on the field or research team (an example of the most restrictive 99.99994% threshold is CERN's reporting of the Higgs boson discovery).

In other words, the only multiplier that seems possibly different in my analysis is the Ultra Ball multiplier: it seems that it could have been slightly underestimated (2.15 instead of 2.0), with a ~98.8% statistical confidence. I must be clear that the multiplier I am obtaining here is 100% dependent on /u/CreativiTimothy's data set only, and not on metagaming considerations like how the client/server responses are encoded, or on previous research that was based on the RGB color of the circle.
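The confidence levels quoted above can be reproduced with a short calculation, sketched here under the assumption that each error bar is a Gaussian 1-sigma uncertainty and that the comparison is two-sided. The post does not state how the percentages were computed, so the function below is only one reading that happens to match them; its name is made up for the example.

```python
import math


def confidence_of_difference(value, error, reference):
    """Two-sided Gaussian confidence that `value` differs from `reference`.

    Assumes `error` is a 1-sigma Gaussian uncertainty on `value`.
    """
    z = abs(value - reference) / error  # offset in units of sigma
    return math.erf(z / math.sqrt(2.0))


# (fitted value, 1-sigma error, Gamepress value), all taken from the lists above
results = {
    "ball_multiplier (Great ball)": (1.54, 0.04, 1.5),
    "ball_multiplier (Ultra ball)": (2.15, 0.06, 2.0),
    "throw_multiplier (average Great)": (1.6, 0.2, 1.50),
    "throw_multiplier (average Excellent)": (1.92, 0.08, 1.85),
    "berry_multiplier": (1.51, 0.04, 1.5),
    "curve_multiplier": (1.69, 0.01, 1.7),
}

if __name__ == "__main__":
    for name, (value, error, reference) in results.items():
        confidence = confidence_of_difference(value, error, reference)
        print(f"{name}: {100 * confidence:.1f}%")
```

For the Ultra ball, the offset is (2.15 - 2.0) / 0.06 = 2.5 sigma, which corresponds to the ~98.8% quoted above.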
I would urge taking this potential result with care at the moment, but I think that it would be a good idea to double-check the previous analyses based on RGB circle color, and make sure that this Ultra Ball multiplier is really 2.0.

Finally, here is a triangle plot of my results. The 2D contours represent the 1, 2 and 3-sigma (respectively 68%, 95% and 99.7% statistical confidence) ranges for each pair of parameters, and the histograms on the diagonal represent the probability distribution for each individual multiplier. The red squares and red lines are centered on the Gamepress values.

Other than the possible slight discrepancies that I mentioned above, you can see that most of the 2D plots are pretty round-shaped. This means that the parameters in question are not correlated, and it is a good thing, as for example it would not make sense for the "curve_multiplier" to depend on whether an Ultra ball was used or not.

There is one exception to this where the two parameters seem slightly correlated: "Great ball" versus "berry". My interpretation is that the current data is not ideal for separating the effect of a Great ball from that of a berry. A way to improve this would be to add a lot more throws that use a berry without a Great ball (currently only 1,600 of the 56,600 throws fall in this category), or that use a Great ball without a berry (currently only 3,600 of the 56,600 throws fall in this category).

If anyone would like to redo this analysis or make sure that I didn't mess up, I am sharing the Python code that I used to do this, as well as the .csv ASCII version of /u/CreativiTimothy's data set here. If you want just the Python code, click here.

via /r/TheSilphRoad http://ift.tt/2fGKuZB
Reviewed by The Pokémonger
on
09:16