closes #4; refined data collection
parent 95f1683326
commit 22989cf064
@@ -107,7 +107,7 @@ The study will focus on papers in criminal psychology that use data and statisti
## Data Collection
-The process of data collection will closely follow @scogginsMeasuringTransparencySocial2024 and begin with identifying relevant journals in criminal psychology. I will consult the Clarivate Journal Citation Report via their API to obtain a comprehensive list of journals within these fields by filtering for the top 30 journals in the respective fields (originally, @scogginsMeasuringTransparencySocial2024 used a top 100 filter - I will use top 30 journals to limit the amount of data because of technical limitations in my workspace setup). To ensure feasibility, I will filter this list to include only journals that are accessible under the university’s licensing agreements. Once the relevant journals are identified, I will use APIs such as Crossref, Scopus, or Web of Science to download metadata for all papers published between 2013 to 2023.
+The process of data collection will closely follow @scogginsMeasuringTransparencySocial2024 and begin with identifying relevant journals in criminal psychology. I will consult the Clarivate Journal Citation Report to obtain a comprehensive list of journals within the fields by filtering for the top 100 journals. The Transparency and Openness Promotion Factor[^4] (TOP Factor) according to @nosekPromotingOpenResearch2015 will then be used to assess each journal's adoption of open science practices and will be included in the journal dataset. Once the relevant journals are identified, I will use APIs such as Crossref, Scopus, and Web of Science to download metadata for all papers published between 2013 and 2023.
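To make the metadata step concrete, the sketch below pages through Crossref's public REST API for a single journal. The ISSN and contact address are placeholders rather than values from the study, and Scopus or Web of Science would require their own clients:

```python
# Sketch of the metadata step via the Crossref REST API; the ISSN and
# mailto address below are placeholders, not values from the study.
import requests

def fetch_journal_works(issn: str, mailto: str = "me@example.org"):
    """Yield metadata for all works a journal published between 2013 and 2023."""
    url = f"https://api.crossref.org/journals/{issn}/works"
    params = {
        "filter": "from-pub-date:2013-01-01,until-pub-date:2023-12-31",
        "rows": 200,       # results per page
        "cursor": "*",     # enables cursor-based deep paging
        "mailto": mailto,  # identifies the client for Crossref's "polite" pool
    }
    while True:
        message = requests.get(url, params=params, timeout=30).json()["message"]
        if not message["items"]:
            break
        yield from message["items"]
        params["cursor"] = message["next-cursor"]

# Hypothetical usage: collect DOIs and titles for one journal
records = [(w["DOI"], (w.get("title") or [""])[0])
           for w in fetch_journal_works("0000-0000")]
```

Cursor-based paging keeps memory use flat even for journals with thousands of records, since each page is yielded as it arrives.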
After obtaining the metadata, I will proceed to download the full-text versions of the identified papers. Whenever possible, I will prioritize HTML versions of the papers due to their structured format, which simplifies subsequent text extraction. For papers that are not available in HTML, I will download full-text PDFs. Tools such as PyPaperBot or others[^1] can facilitate this process, although I will strictly adhere to ethical and legal guidelines, avoiding unauthorized sources like Sci-Hub or Anna's Archive and using only sources that are either covered by my institution's campus license or available via open access. If access to full-text papers becomes a limiting factor, I will assess alternative strategies such as collaborating with institutional libraries to request specific papers or identifying open-access repositories that may provide supplementary resources. Papers whose full texts remain unavailable will be assigned their own category in the later analysis. Once all available full-text papers are collected, I will preprocess the data by converting HTML and PDF files into plain text using tools such as SciPDF Parser or others[^2]. This preprocessing step ensures that the text is in a standardized format suitable for analysis.
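For the conversion itself, a minimal sketch is given below, assuming BeautifulSoup and pypdf as stand-ins for the tools named above (SciPDF Parser would additionally require a running GROBID server):

```python
# Minimal preprocessing sketch, assuming BeautifulSoup and pypdf as stand-ins
# for the HTML and PDF conversion tools named in the text.
from pathlib import Path
from bs4 import BeautifulSoup
from pypdf import PdfReader

def html_to_text(path: Path) -> str:
    soup = BeautifulSoup(path.read_text(encoding="utf-8"), "html.parser")
    for tag in soup(["script", "style"]):  # drop non-content markup
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)

def pdf_to_text(path: Path) -> str:
    return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)

def to_plain_text(path: Path) -> str:
    """Normalize a downloaded paper to plain text, preferring HTML input."""
    if path.suffix.lower() in {".html", ".htm"}:
        return html_to_text(path)
    return pdf_to_text(path)
```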
@@ -119,6 +119,8 @@ The proposed data collection is resource-intensive but serves multiple purposes.
[^3]: DDoS: Distributed Denial of Service, see @wangDDoSAttackProtection2015.
+[^4]: The TOP Factor according to @nosekPromotingOpenResearch2015 is a score that assesses a journal's adoption of open science practices; it can be obtained from [topfactor.org](https://topfactor.org/journals).
## Classification
The classification process will begin with operationalizing the key open science practices that I aim to study. This involves defining clear criteria for identifying papers that fall into the categories I plan to classify: papers that use statistical inference, papers that were preregistered, papers that provide open data, papers that offer open materials, and papers that are available via open access.
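To illustrate how such criteria could be operationalized, the sketch below applies simple keyword heuristics as a first pass. The patterns are assumptions for demonstration rather than a validated coding scheme, and open access status would be derived from the collected metadata rather than from the full text:

```python
# Illustrative first-pass classifier using keyword heuristics; the patterns
# are assumptions for demonstration, not a validated coding scheme.
import re

CATEGORY_PATTERNS = {
    "statistical_inference": r"p\s*[<=>]\s*\.?\d|confidence interval|bayes factor",
    "preregistration":       r"pre-?registration|pre-?registered|aspredicted\.org",
    "open_data":             r"data (?:are|is) (?:publicly|openly) available|osf\.io|zenodo\.org",
    "open_materials":        r"materials (?:are|is) (?:publicly|openly) available",
}

def classify(full_text: str) -> dict[str, bool]:
    """Flag which open science practices a paper's full text appears to mention."""
    text = full_text.lower()
    return {label: bool(re.search(pattern, text))
            for label, pattern in CATEGORY_PATTERNS.items()}
```

A keyword pass of this kind would only produce candidate labels; manual validation on a sample would still be needed before any substantive analysis.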
lit.bib | 15
@@ -7232,6 +7232,21 @@
  langid = {english}
}
+
+@article{nosekPromotingOpenResearch2015,
+  title = {Promoting an Open Research Culture},
+  author = {Nosek, B. A. and Alter, G. and Banks, G. C. and Borsboom, D. and Bowman, S. D. and Breckler, S. J. and Buck, S. and Chambers, C. D. and Chin, G. and Christensen, G. and Contestabile, M. and Dafoe, A. and Eich, E. and Freese, J. and Glennerster, R. and Goroff, D. and Green, D. P. and Hesse, B. and Humphreys, M. and Ishiyama, J. and Karlan, D. and Kraut, A. and Lupia, A. and Mabry, P. and Madon, T. and Malhotra, N. and {Mayo-Wilson}, E. and McNutt, M. and Miguel, E. and Paluck, E. Levy and Simonsohn, U. and Soderberg, C. and Spellman, B. A. and Turitto, J. and VandenBos, G. and Vazire, S. and Wagenmakers, E. J. and Wilson, R. and Yarkoni, T.},
+  year = {2015},
+  month = jun,
+  journal = {Science},
+  volume = {348},
+  number = {6242},
+  pages = {1422--1425},
+  publisher = {American Association for the Advancement of Science},
+  doi = {10.1126/science.aab2374},
+  urldate = {2024-12-18},
+  file = {/home/michi/Zotero/storage/A32SAIJU/Nosek et al. - 2015 - Promoting an open research culture.pdf}
+}
@article{nosekRegisteredReports2014,
  title = {Registered {{Reports}}},
  author = {Nosek, Brian A. and Lakens, Dani{\"e}l},
make.sh | 8
@@ -9,10 +9,10 @@ OUT="${FILENAME}.pdf"
echo "Generating PDF..."
pandoc -i "$IN" \
    -o "$OUT" \
-    --csl=apa-7th-edition.csl \
+    --csl=resources/apa-7th-edition.csl \
    --citeproc \
    --lua-filter=filters/first-line-indent.lua \
-    --citation-abbreviations=citation-abbreviations.csl
+    --citation-abbreviations=resources/citation-abbreviations.csl
# Check if pandoc ran successfully
if [ $? -ne 0 ]; then
@@ -20,9 +20,9 @@ if [ $? -ne 0 ]; then
    exit 1
fi
-Insert Erklärung.pdf at the end of the PDF
+# Insert Erklärung.pdf at the end of the PDF
echo "Modifying the PDF..."
|
echo "Modifying the PDF..."
|
||||||
-./modify-pdf.sh "$OUT" "Erklärung.pdf" "$OUT"
+./modify-pdf.sh "$OUT" "resources/Erklärung.pdf" "$OUT"
# remove last page for osf.io
echo "Removing last page for OSF.io output and saving to OSF-$OUT"
resources/apa-7th-edition.csl | 1917 (new file; diff suppressed because it is too large)
resources/citation-abbreviations.csl | 8 (new file)
@@ -0,0 +1,8 @@
+{ "default": {
+    "container-title": {
+      "European Social Survey European Research Infrastructure": "ESS ERIC",
+      "Bundeskriminalamt": "BKA",
+      "Scots Law Times": "SLT"
+    }
+  }
+}