Diffstat (limited to 'stat/stat.py')
-rw-r--r-- stat/stat.py | 37
1 files changed, 37 insertions, 0 deletions
diff --git a/stat/stat.py b/stat/stat.py
new file mode 100644
index 0000000..540ade8
--- /dev/null
+++ b/stat/stat.py
@@ -0,0 +1,37 @@
+from math import sqrt
+
+"""
+Motivation: sorting items by rating (simple up/down votes)
+
+I first encountered this here:
+-> https://www.evanmiller.org/how-not-to-sort-by-average-rating.html
+
+Naively subtracting up/down is obviously incorrect (e.g. +1000/-900 is sorted above +50/-0).
+
+Comparing raw percentages is wrong in a subtler way. The problem is that the sample sizes
+differ: 100% positive out of 1 vote can't be meaningfully compared directly to 98% positive
+out of 100 votes.
+-> cf. Matt Parker's video on why you can't simply subtract percentages
+   (debunking a claim made about 2020 US election results)
+   https://www.youtube.com/watch?v=aokNwKx7gM8
+
+And so, the thing to do is statistics! From the votes we have, we can construct a confidence
+interval: a range that we're about 95% sure contains the "true" rating we'd get if everybody
+voted. To sort, though, we want a single value. The lower bound of the confidence interval is
+a good choice, since it goes up both when the average goes up and when the sample size grows
+(a larger sample gives a narrower interval).
+
+The math here comes from a 1927 paper by Edwin B. Wilson.
+-> https://www.jstor.org/stable/2276774 (public domain!)
+
+Two 1998 papers end up recommending the Wilson 'score' because it's easy to compute, while still
+being a good enough approximation of the exact confidence interval on average and not *too* pessimistic.
+-> https://doi.org/10.2307/2685469
+-> https://doi.org/10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E
+"""
+def score(up, down):
+    """Wilson 'score' with λ=2 - lower bound of a ~95.5% confidence interval"""
+    n = up + down
+    if n == 0: return 0.1  # arbitrary: ranks above +1/-1, but below +1/-0 and +2/-2
+    return (up+2)/(n+4) - (2*sqrt(1+(up*down)/n))/(n+4)
+
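The closed form in score() is the general Wilson lower bound with the multiplier fixed at 2. As a rough sketch (not part of this commit), the same bound can be written with the multiplier left as a parameter, which makes the link to the textbook formula easier to see. The name wilson_lower_bound, the parameter z, and the vote counts below are assumptions made up for illustration. The n == 0 fallback simply copies the arbitrary 0.1 from score().

```python
from math import sqrt


def wilson_lower_bound(up, down, z=2.0):
    """Lower bound of the Wilson score interval for `up` positives out of up + down votes.

    `z` is the number of standard deviations (called λ in the docstring of score()).
    With z=2 this should agree with score() up to floating-point rounding.
    """
    n = up + down
    if n == 0:
        return 0.1  # same arbitrary fallback as score()
    p = up / n  # observed fraction of positive votes
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half_width


# Hypothetical (up, down) vote counts, including the examples from the docstring.
items = {"a": (1000, 900), "b": (50, 0), "c": (1, 0), "d": (98, 2)}

# Highest lower bound first.
ranked = sorted(items, key=lambda name: wilson_lower_bound(*items[name]), reverse=True)
```

With these counts, +50/-0 ranks above +1000/-900, the reverse of what naive subtraction gives, and the single unanimous vote +1/-0 ranks last despite its 100% rating.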