diff --git a/Supplements.qmd b/Supplements.qmd index 51cfb6a..9f151bb 100644 --- a/Supplements.qmd +++ b/Supplements.qmd @@ -167,6 +167,9 @@ if (isTRUE(debug_mode)) { An overestimation the prevalence of each OSP in the population can lead to potential problems with all following steps. The true prevalences and confidence intervals along with performance diagnostics of trained models were assessed after all classification tasks were processed. An estimation of the prevalences per year was not suitable as no detailed information about those proportions was available. Instead, the established approach to stratify the sample proportionally to the population was used [@larsenProportionalAllocationStrata2008]. +# Full Text Retrieval + +As mentioned in the manuscript, full texts were retrieved using a self-developed web application that used both web scraping and publisher APIs. Legal aspects were carefully considered throughout the development. Within the EU, scraping is legal for scientific purposes [@urhg-60d-tdm], but institutional contracts can override this. Scraping was therefore limited to the university network and only to publishers that permit it while other publishers were scraped outside of the network. Technical details are available in the documents provided while the scraper might be made publicly available in the future. 
# Model Training diff --git a/img/app_screenshots/2025-08-04-174744_hyprshot.png b/img/app_screenshots/2025-08-04-174744_hyprshot.png deleted file mode 100644 index 4991a71..0000000 Binary files a/img/app_screenshots/2025-08-04-174744_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-04-174747_hyprshot.png b/img/app_screenshots/2025-08-04-174747_hyprshot.png deleted file mode 100644 index 55b1404..0000000 Binary files a/img/app_screenshots/2025-08-04-174747_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103449_hyprshot.png b/img/app_screenshots/2025-08-05-103449_hyprshot.png deleted file mode 100644 index ee96dad..0000000 Binary files a/img/app_screenshots/2025-08-05-103449_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103454_hyprshot.png b/img/app_screenshots/2025-08-05-103454_hyprshot.png deleted file mode 100644 index cf9720f..0000000 Binary files a/img/app_screenshots/2025-08-05-103454_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103456_hyprshot.png b/img/app_screenshots/2025-08-05-103456_hyprshot.png deleted file mode 100644 index 01494c4..0000000 Binary files a/img/app_screenshots/2025-08-05-103456_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103500_hyprshot.png b/img/app_screenshots/2025-08-05-103500_hyprshot.png deleted file mode 100644 index 30deb9d..0000000 Binary files a/img/app_screenshots/2025-08-05-103500_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103505_hyprshot.png b/img/app_screenshots/2025-08-05-103505_hyprshot.png deleted file mode 100644 index ce2806a..0000000 Binary files a/img/app_screenshots/2025-08-05-103505_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103511_hyprshot.png b/img/app_screenshots/2025-08-05-103511_hyprshot.png deleted file mode 100644 index 88f1558..0000000 Binary files 
a/img/app_screenshots/2025-08-05-103511_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103513_hyprshot.png b/img/app_screenshots/2025-08-05-103513_hyprshot.png deleted file mode 100644 index a33b18e..0000000 Binary files a/img/app_screenshots/2025-08-05-103513_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103518_hyprshot.png b/img/app_screenshots/2025-08-05-103518_hyprshot.png deleted file mode 100644 index ed3f775..0000000 Binary files a/img/app_screenshots/2025-08-05-103518_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103525_hyprshot.png b/img/app_screenshots/2025-08-05-103525_hyprshot.png deleted file mode 100644 index 6c89cad..0000000 Binary files a/img/app_screenshots/2025-08-05-103525_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103529_hyprshot.png b/img/app_screenshots/2025-08-05-103529_hyprshot.png deleted file mode 100644 index 45cbb06..0000000 Binary files a/img/app_screenshots/2025-08-05-103529_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103541_hyprshot.png b/img/app_screenshots/2025-08-05-103541_hyprshot.png deleted file mode 100644 index 96ec687..0000000 Binary files a/img/app_screenshots/2025-08-05-103541_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103546_hyprshot.png b/img/app_screenshots/2025-08-05-103546_hyprshot.png deleted file mode 100644 index 953918e..0000000 Binary files a/img/app_screenshots/2025-08-05-103546_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103553_hyprshot.png b/img/app_screenshots/2025-08-05-103553_hyprshot.png deleted file mode 100644 index e7c8faf..0000000 Binary files a/img/app_screenshots/2025-08-05-103553_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103630_hyprshot.png b/img/app_screenshots/2025-08-05-103630_hyprshot.png deleted file mode 100644 index 
4d808ae..0000000 Binary files a/img/app_screenshots/2025-08-05-103630_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-103640_hyprshot.png b/img/app_screenshots/2025-08-05-103640_hyprshot.png deleted file mode 100644 index 2c919cb..0000000 Binary files a/img/app_screenshots/2025-08-05-103640_hyprshot.png and /dev/null differ diff --git a/img/app_screenshots/2025-08-05-150104_hyprshot.png b/img/app_screenshots/2025-08-05-150104_hyprshot.png deleted file mode 100644 index 274bffd..0000000 Binary files a/img/app_screenshots/2025-08-05-150104_hyprshot.png and /dev/null differ diff --git a/index.qmd b/index.qmd index 658e926..498ac8d 100644 --- a/index.qmd +++ b/index.qmd @@ -122,7 +122,7 @@ A focused literature review on adoption produced limited evidence as we still kn Self-reports suggest high OSP familiarity-but they co-exist with widespread QRPs and are vulnerable to bias. In @chinQuestionableResearchPractices2023, 89% of respondents said they had used at least one OSP, yet 87% also admitted at least one QRP, and some serious QRPs (e.g., hiding known problems) were non-trivial. Survey data indicate that about 25% of researchers across fields have preregistered a study, with higher uptake in psychology (50-60%) and lower prevalence in sociology (~30%) [@fergusonSurveyOpenScience2023a]. Another survey in the field similarly estimated preregistration use at 45% (42-49%) [@chinQuestionableResearchPractices2023]. The reported prevalence of OD varies widely across disciplines. Survey data suggest that more than 60% of researchers report having posted data or code, with higher rates in psychology (>50%) compared to sociology (~35%) [@fergusonSurveyOpenScience2023a]. The prevalence of OM sharing is more limited compared to OD and access. Survey results indicate that 43% (40-47%) of researchers report providing access to their research materials [@chinQuestionableResearchPractices2023]. 
Few or no journals require data sharing in the field, coupled with rare preregistration and a tiny share of replication studies [@pridemoreReplicationCriminologySocial2018]. -The @moneva2025attitudes NSCR (Netherlands Institute for the Study of Crime and Law Enforcement) finds broadly positive attitudes but divergent views by method and career stage, and a long list of cultural, structural, legal/privacy, and cost barriers. @fessingerStateOpenScience2025 also shows strong approval (88% positive) and some experience (58% tried at least one OSP), but routine adoption looks limited (only 44% even hold a repository account). In contrast, an assessment of social science studies between 2014 and 2017 found no preregistered studies at all [@hardwickeEmpiricalAssessmentTransparency2020]. +The @moneva2025attitudes Netherlands Institute for the Study of Crime and Law Enforcement finds broadly positive attitudes but divergent views by method and career stage, and a long list of cultural, structural, legal/privacy, and cost barriers. @fessingerStateOpenScience2025 also shows strong approval (88% positive) and some experience (58% tried at least one OSP), but routine adoption looks limited (only 44% even hold a repository account). In contrast, an assessment of social science studies between 2014 and 2017 found no preregistered studies at all [@hardwickeEmpiricalAssessmentTransparency2020]. Article audits show far lower OSP uptake than surveys, implying either nondisclosure or overestimation. @greenspanOpenSciencePractices2024 coded 722 articles (2018-2022) across five leading journals and found OM in about a third of papers, but \<10% with OD, \<2% with open code or preregistration, and no upward trend. 
@@ -132,7 +132,7 @@ The applied nature of the research in this field means fragile findings can driv # Data and Method -The aim of this methodological work is to compile a sample of publications in the fields of criminology and legal psychology, to classify them as either statistical inference (SI) publications or non-SI publications and further examine the former to assess whether they use any of the OSPs under consideration: preregistration, OD, OM, or OA. OA results are reported as secondary, descriptive analyses to benchmark open-science adoption. The presented OSPs will be operationalized and a text-classification pipeline (keyword dictionaries and machine-learning models) will be used to detect them. OA status will be determined using publicly available metadata, given the relatively high reliability of such information. The fine-tuned models are validated against a hand-coded sample that was extended using a large-language-model (LLM, ChatGPT 4o & ChatGPT 5o), report precision/recall and calibration, and then estimate annual prevalence with uncertainty intervals. +The aim of this methodological work is to compile a sample of publications in the fields of criminology and legal psychology, classify them as either statistical inference (SI) publications or non-SI publications and further examine the former to assess whether any of the OSPs under consideration are used: preregistration, OD, OM, or OA. OA results are reported as secondary, descriptive analyses to benchmark open-science adoption. The presented OSPs will be operationalized and a text-classification pipeline (keyword dictionaries and machine-learning models) will be used to detect them. OA status will be determined using publicly available metadata, given the relatively high reliability of such information. 
The fine-tuned models are validated against a hand-coded sample that was extended using a large language model (LLM, ChatGPT 4o & ChatGPT 5o); precision/recall and calibration are reported, and annual prevalence is then estimated with uncertainty intervals. Full-text data for training the machine learning classification models will be collected with a web application developed specifically for this project. Since software development is not the focus of this work, details of the app's architecture will not be discussed here. A brief description of the application, along with screenshots, is provided in @sec-data-fulltext-collection. @@ -224,9 +224,9 @@ Publications were filtered by the resulting date variable to limit the populatio ### Sample -Using the obtained crossref metadata, the analytical sample was drawn stratified by year according to the calculation in @sec-sampling. The resulting analytical sample contains roughly 10% of the population data. As seen in @fig-freq-pubs-comp, Sample A, that is the training and validation sample for the SI classifier, already visually appears to not resemble the year pattern. This is intended as the proportion of SI papers are expected to not vary by year. As described before, stratification by journal was finally rejected due to the resulting sample sizes as an analysis of 100 journals would have required much more cases. +Using the obtained Crossref metadata, the analytical sample was drawn stratified by year according to the calculation in @sec-sampling. The resulting analytical sample contains roughly 10% of the population data. As seen in @fig-freq-pubs-comp, Sample A, the training and validation sample for the SI classifier, does not follow the year pattern of the population. This is intended: the proportion of SI papers is not expected to vary by year, so Sample A was not stratified by year. Stratification by journal was rejected because an analysis of 100 journals would have required many more cases. 
-The final analytical sample is made up of 4265 publications stratified by year. The OS prevalence classification sample consists of 352 publications stratified by year whereas the unstratified sample A for the training of the SI classifiers consists of 408 publications. +The final analytical sample is made up of 4265 publications. The OS prevalence classification sample consists of 352 publications stratified by year whereas the unstratified sample A for the training of the SI classifiers consists of 408 publications. ```{r} #| fig-cap: "Frequencies: Publications by Year in Population and Sample" @@ -338,9 +338,7 @@ if (isTRUE(debug_mode)) { ### Full Text Retrieval -The initial approach to gathering full texts, which used Zotero to translate DOIs as per Scoggins and Robertson, was unreliable across multiple attempts and versions. Due to the unsuitability of existing software tools - either for technical or legal reasons - a custom web application was developed. - -Legal aspects were carefully considered throughout the development. Within the EU, scraping is legal for scientific purposes [@urhg-60d-tdm], but institutional contracts can override this. Scraping was therefore limited to the university network and only to publishers that permit it while other publishers were scraped outside of the network. Technical details are available in the documents provided while the scraper might be made publicly available in the future. +The initial approach to gathering full texts, which used Zotero to translate DOIs as per Scoggins and Robertson, was unreliable across multiple attempts and versions. Due to the unsuitability of existing software tools, be it for technical or legal reasons, a custom web application was developed. Downloading the analytical sample was mostly successful, though some publisher protections caused dropouts. Due to time constraints, additional more optimized runs were not feasible. Documents under 1,000 words were considered non-full-text papers. 
However, shorter HTML texts were retained for potential keyword matching. Text quality assessment (Flesch-Index) and word count identified missing full texts [@benoitQuantedaPackageQuantitative2018], with further analysis available in the methodological report. Full texts were downloaded for Independent Sample A and the Analytical Sample from which Sample B was drawn. The resulting dropouts should have been implicitly handled by post-stratification. Publisher-level weighting was considered but infeasible due to sparse cells that would have produced unstable weights. Post-stratification was conducted by year only, which does not correct publisher- or journal-specific dropouts. Future iterations should add publisher-level adjustment.