@MartinJJ
Okay, dataset at https://www.kaggle.com/unanimad/us-election-2020?select=president_county_candidate.csv (you can get a login from bugmenot).
I have not calculated any p-values, so I can't say whether this is significant.
@MartinJJ
Illinois looks pretty sus for *both* big party candidates.
(This only indicates data irregularities; again, note that I haven't computed p-values for these.)
This data doesn't tell us whether anyone specific cheated, just that it looks less like what you'd expect from a real-world dataset (also, they have leading zeroes in their data, which makes my figures here inaccurate).
@MartinJJ
(Or maybe the zeroes are counties without any votes for that candidate)
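A minimal sketch of the first-digit tally discussed above, with zero counts skipped so they don't skew the figures, plus a Pearson chi-square statistic as a starting point for the p-values that haven't been calculated yet. The function names and the sample numbers are made up for illustration; this is not the code behind the figures in this thread.

```python
import math
from collections import Counter


def first_digit_counts(values):
    """Tally leading digits 1-9, skipping zero (and negative) counts."""
    counts = Counter()
    for v in values:
        if v > 0:
            counts[int(str(v)[0])] += 1
    return counts


def benford_chi_square(counts):
    """Pearson chi-square statistic against Benford's expected frequencies."""
    n = sum(counts.values())
    stat = 0.0
    for d in range(1, 10):
        # Benford's law: P(first digit = d) = log10(1 + 1/d)
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical per-county vote totals, including two zero rows.
sample = [0, 12, 105, 9, 0, 1834, 27]
tally = first_digit_counts(sample)
print(dict(tally))
print(benford_chi_square(tally))
```

With nine digit classes there are 8 degrees of freedom, so a statistic above roughly 15.51 would reject Benford's distribution at the 5% level. That would only flag an irregularity, not explain it, and a sample as small as the one above is far too small for the test to mean anything.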
@everlastingrocks Best to first check whether those datasets are the real deal. Are the official data already available?
@MartinJJ
https://www.electionreturns.pa.gov/General/CountyBreakDownResults?officeId=1&districtId=1&ElectionID=undefined&ElectionType=undefined&IsActive=undefined
Checked the `grep Joe` data in Pennsylvania. Bucks, Dauphin, Erie, Fulton, Lancaster, Montour, Northampton, Northumberland, and Philadelphia were out of date, but not by enough to change the first digit.
Should Benford's law apply here? Here's another dataset keyed by Pennsylvania counties: the number of precincts per county.
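On whether Benford's law should apply: a common rule of thumb is that it fits best when the values span several orders of magnitude. Here is a rough sketch of that check. The helper name and the sample counts are hypothetical, and this is a sanity check rather than a statistical test.

```python
import math


def orders_of_magnitude(values):
    """Return log10(max/min) over the positive values, 0.0 if there are none."""
    positive = [v for v in values if v > 0]
    if not positive:
        return 0.0
    return math.log10(max(positive) / min(positive))


# Hypothetical precinct counts per county, for illustration only.
precincts = [2, 15, 48, 130, 1700]
print(orders_of_magnitude(precincts))
```

If the result is well below 2 or so, the first digits are mostly pinned by the narrow range of the data, and a mismatch with Benford's distribution says little.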
@everlastingrocks Don't ask me. I'm no math expert on this. We probably have people around who are a lot better at this stuff.
@everlastingrocks If you found it, let us know.