Controversial software is proving surprisingly accurate at spotting errors in psychology papers

Statcheck gets it right more than 95% of the time, according to a new study. Critics remain unconvinced. (Image credit: ERHUI1979/ISTOCKPHOTO)

When Dutch researchers developed an open-source algorithm designed to flag statistical errors in psychology papers, it received a mixed reaction from the research community, especially after the free tool was run on tens of thousands of papers and the results were posted online. Many questioned the accuracy of the algorithm, named statcheck, or said the exercise amounted to public shaming.

But statcheck actually gets it right in more than 95% of cases, its developers claim in a study posted on the preprint server PsyArXiv on 16 November. Some outsiders agree, and are calling for routine use. “The new paper convincingly shows that statcheck is indeed robust,” says Casper Albers, a psychometrician at the University of Groningen in the Netherlands. Others still aren’t convinced.

Statcheck was developed in 2015 by Michèle Nuijten, a methodologist at Tilburg University in the Netherlands, and Sacha Epskamp, a psychometrician at the University of Amsterdam. It scours papers for data reported in the standard format prescribed by the American Psychological Association (APA) and uses them to calculate the p-value, a controversial but widely used measure of statistical significance. If the calculated p-value differs from the one reported by the researchers, the tool flags it as an “inconsistency”; if the reported p-value is below the commonly used threshold of 0.05 and statcheck’s figure isn’t, or vice versa, it is labeled a “gross inconsistency” that may call into question the conclusions. (Erroneous p-values are increasingly recognized as a big problem in psychology; Nuijten believes most stem from human error, but statcheck cannot distinguish misconduct from honest mistakes.)

In a 2015 study, Nuijten and colleagues ran statcheck on more than 30,000 psychology papers and found that half contained at least one statistical inconsistency, and one in eight had a gross inconsistency. Last year, Nuijten’s Tilburg University colleague Chris Hartgerink used statcheck to analyze just under 700,000 results reported in more than 50,000 psychology studies, and had the results automatically posted on the postpublication peer-review site PubPeer, with email notifications to the authors. Some researchers welcomed the feedback, but the German Psychological Society (DGPs) said the postings were causing needless reputational damage. Susan Fiske, a psychologist at Princeton University and former head of the Association for Psychological Science, called the effort a “form of harassment.” (The study was a one-time exercise; the researchers haven’t publicly subjected papers to statcheck since.)

Whether statcheck is fair depends in part on its accuracy. “If it’s known that on 99% of occasions the automated check is accurate, then fine. If the accuracy is only 90%, I’d be really unhappy about the current process,” developmental neuropsychologist Dorothy Bishop of the University of Oxford in the United Kingdom told Retraction Watch at the time.

For the new paper, the team ran statcheck on 49 papers that Nuijten’s colleagues had checked for statistical inconsistencies by hand in a paper published in 2011. They found that the algorithm’s “true positive rate” lies between 85.3% and 100%, and its “true negative rate” lies between 96% and 100%. (The precise numbers depended on various settings in statcheck.) Combined, those numbers mean that statcheck gets the right answer from the extracted results between 96.2% and 99.9% of the time.

The researchers also tried to address another criticism: that statcheck often stumbles when researchers have applied legitimate statistical corrections to their data. By searching for specific keywords, the researchers found that such corrections are vastly more common than they had estimated in their previous paper. “Something went wrong there,” Nuijten says. But she and her colleagues found that corrected statistics weren’t a major source of inconsistencies.

Thomas Schmidt, an experimental psychologist at the University of Kaiserslautern in Germany, remains wary. Because it works only with APA-style reporting, statcheck can calculate p-values for only 61% of statistical tests, he noted in a comment posted on PsyArXiv on 22 November. By Schmidt’s calculations, statcheck has a “poor sensitivity” of only 52%. “It is generally unacceptable as a research tool, and certainly unacceptable for purely automatic scanning of multitudes of papers,” he wrote. Nuijten says the team never claimed that statcheck can handle all reported statistics; the point of the new study was to check how well it does with the stats that it does recognize, she says.

DGPs Secretary Mario Gollwitzer, a psychologist at the Philipps University of Marburg in Germany, is now convinced. Although papers should never be dismissed based on statcheck alone, “We believe that authors should use [it] to scan their paper” before submitting it to a journal, he says.

Some already do. Since the developers released statcheck as a web application in September 2016, it has been accessed by more than 18,000 visitors, Nuijten says. “Statcheck can examine many statistics very quickly, and identify a subset for me that may be problematic,” says Brian Nosek, executive director of the Center for Open Science in Charlottesville, Virginia. “This is a huge efficiency gain.”

A few psychology journals have made statcheck part of their peer-review process, and Nuijten envisions expanding to other disciplines, such as the biomedical sciences. “Statcheck is not perfect,” its proud developer says, “but it’s pretty close.”
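
To make the consistency check described above concrete (recompute the p-value from the reported test statistic, then compare it with the reported one), here is a minimal Python sketch. It is not statcheck's own code. For simplicity it handles only z tests written like "z = 2.05, p = .04", assumes two-tailed tests, and allows for rounding of the reported p-value. The pattern, function names, and rounding rule are all illustrative assumptions.

```python
import math
import re

# Hypothetical pattern for results written like "z = 2.05, p = .04".
# Real APA-style reporting covers many more tests and formats.
RESULT_PATTERN = re.compile(
    r"z\s*=\s*(-?\d*\.?\d+),\s*p\s*=\s*(\d*\.\d+)", re.IGNORECASE
)

ALPHA = 0.05  # the commonly used significance threshold


def two_tailed_p_from_z(z):
    """Two-tailed p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def check_result(text):
    """Classify one reported result, or return None if none is recognized."""
    match = RESULT_PATTERN.search(text)
    if match is None:
        return None

    z = float(match.group(1))
    reported_text = match.group(2)
    reported_p = float(reported_text)
    decimals = len(reported_text.split(".")[1])

    computed_p = two_tailed_p_from_z(z)

    # Assumption: the reported p-value is acceptable if the recomputed
    # value would round to it at the reported number of decimals.
    if abs(computed_p - reported_p) <= 0.5 * 10 ** (-decimals):
        return "consistent"

    # A mismatch that flips significance at ALPHA is the serious kind.
    if (reported_p < ALPHA) != (computed_p < ALPHA):
        return "gross inconsistency"
    return "inconsistency"


print(check_result("z = 2.05, p = .04"))  # consistent
print(check_result("z = 2.05, p = .01"))  # inconsistency
print(check_result("z = 1.50, p = .04"))  # gross inconsistency
```

A result the pattern does not recognize returns None. That mirrors the limitation critics point to: a tool of this kind can only check the statistics it manages to extract.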
By Dalmeet Singh Chawla, Nov. 28, 2017, 2:45 PM