Abstract
This methodological article presents the Newcomb-Benford law, with a worked example, to make it accessible to areas of psychological research that are unfamiliar with its use in other disciplines, including cognitive science. The law is applied mainly to detect fraud in databases and in vote tallies from popular elections. The article begins with a historical overview and presents the distributions of the first through fourth significant digits, as well as the two-digit distribution. It then reviews statistical-mathematical explanations of the law, followed by six goodness-of-fit tests and the calculation of simultaneous confidence intervals to assess conformity with the law. The example uses simulated data from two distributions, normal and lognormal. The former, common in psychology, does not conform to the law, while the latter offers a way to transform normally distributed data so that they conform to it. Finally, conclusions are drawn and suggestions are made for detecting manipulation of normally distributed data.
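As a rough illustration of the contrast described above, the following sketch computes the expected first-digit probabilities of the Newcomb-Benford law, P(d) = log10(1 + 1/d), and compares them with first-digit counts from simulated normal and lognormal samples using a Pearson chi-square statistic. It is not the article's own procedure: the sample size, the seed, the distribution parameters (mean 100 and standard deviation 15 for the normal; 0 and 2 on the log scale for the lognormal), and all function names are assumptions chosen for illustration.

```python
import math
import random


def benford_first_digit_probs():
    """Expected first-digit probabilities: P(d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """First significant digit of a nonzero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def chi_square_vs_benford(values):
    """Pearson chi-square statistic of observed first digits against the law (8 df)."""
    probs = benford_first_digit_probs()
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        if v != 0:  # zero has no significant digit
            counts[first_digit(v)] += 1
    n = sum(counts.values())
    return sum(
        (counts[d] - n * probs[d]) ** 2 / (n * probs[d]) for d in range(1, 10)
    )


# Illustrative settings only (assumptions, not values taken from the article).
rng = random.Random(12345)
n = 5000
normal_sample = [rng.gauss(100, 15) for _ in range(n)]
# Exponentiating a normal variate yields a lognormal variate.
lognormal_sample = [math.exp(rng.gauss(0, 2)) for _ in range(n)]

chi_normal = chi_square_vs_benford(normal_sample)
chi_lognormal = chi_square_vs_benford(lognormal_sample)
print(f"chi-square, normal sample:    {chi_normal:.1f}")
print(f"chi-square, lognormal sample: {chi_lognormal:.1f}")
```

Under these assumed settings, the statistic for the normal sample should be far larger than the one for the lognormal sample, since values near 100 with a standard deviation of 15 begin almost exclusively with 1, 7, 8, or 9. A lognormal with a much smaller log-scale spread would conform less closely, so the result depends on the parameters chosen.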
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Copyright (c) 2023