Adds AI-generated About page. Still has to be manually edited.
parent 1534dbb0ba
commit a83ce86bf9
110 scipaperloader/templates/about.html Normal file
@@ -0,0 +1,110 @@
{% extends 'base.html' %} {% block content %}

<h1 class="mb-4">📘 About This App</h1>
<p class="lead">
<strong>The Research Paper Scraper</strong> is a lightweight web-based tool
designed to help researchers manage and download large sets of academic papers
efficiently, using only a list of DOIs.
</p>

<hr class="my-4" />
<section class="mb-5">
<h2 class="h4">🔍 What It Does</h2>
<p>
This app automates downloading research paper PDFs based on metadata
provided in a CSV file. It’s especially useful when you want to collect
hundreds or thousands of papers for offline access or analysis.
</p>
<p>
You simply upload a structured CSV file with paper metadata, and the system
takes care of the rest – importing, organizing, and downloading each paper
in the background.
</p>
</section>

<section class="mb-5">
<h2 class="h4">⚙️ How It Works</h2>

<h5 class="mt-4">1. CSV Import</h5>
|
||||
<p>
|
||||
Users start by uploading a CSV file that contains metadata for many papers
|
||||
(such as title, DOI, ISSN, etc.). The app only stores the fields it needs –
|
||||
like the DOI, title, and publication date – and validates each entry before
|
||||
importing it into the internal database.
|
||||
</p>
|
||||
|
||||
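<p>
For illustration, a simplified sketch of what this import step might look
like in Python (the column names and the <code>save_paper</code> helper are
hypothetical, not the app’s actual code):
</p>
<pre class="bg-light p-3"><code>import csv
import re

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")  # basic DOI shape check

def import_csv(path, save_paper):
    """Read a metadata CSV, keep only the needed fields, validate DOIs."""
    imported, skipped = 0, 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            doi = (row.get("doi") or "").strip()
            if not DOI_PATTERN.match(doi):
                skipped += 1  # invalid or missing DOI: do not import
                continue
            save_paper(doi=doi,
                       title=(row.get("title") or "").strip(),
                       published=(row.get("published") or "").strip())
            imported += 1
    return imported, skipped
</code></pre>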
<h5 class="mt-4">2. Metadata Management</h5>
|
||||
<p>Each paper is stored in a local SQLite database, along with its status:</p>
|
||||
<ul>
|
||||
<li><strong>Pending</strong>: Ready to be downloaded.</li>
|
||||
<li><strong>Done</strong>: Successfully downloaded.</li>
|
||||
<li><strong>Failed</strong>: Something went wrong (e.g. PDF not found).</li>
|
||||
</ul>
|
||||
|
||||
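<p>
A minimal sketch of what such a table might look like (the actual schema in
this app may differ):
</p>
<pre class="bg-light p-3"><code>import sqlite3

conn = sqlite3.connect("papers.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS papers (
        doi        TEXT PRIMARY KEY,
        title      TEXT,
        published  TEXT,
        status     TEXT NOT NULL DEFAULT 'Pending',  -- Pending / Done / Failed
        error      TEXT,   -- failure reason when status = 'Failed'
        pdf_path   TEXT    -- where the downloaded PDF lives on disk
    )
""")
conn.commit()
</code></pre>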
<h5 class="mt-4">3. Background Scraping</h5>
|
||||
<p>
|
||||
A separate background process runs 24/7, automatically downloading papers
|
||||
based on a configurable hourly schedule. It uses tools like the Zotero API
|
||||
to fetch the best available version of each paper (ideally as a PDF), and
|
||||
stores them on disk in neatly organized folders, one per paper.
|
||||
</p>
|
||||
<p>
|
||||
To avoid triggering download limits or spam detection, download times are
|
||||
<strong>randomized within each hour</strong> to mimic natural behavior.
|
||||
</p>
|
||||
|
||||
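<p>
The randomization can be as simple as drawing each paper’s start time
uniformly from the current hour – a rough sketch, not the app’s actual
scheduler code:
</p>
<pre class="bg-light p-3"><code>import random
import time

def spread_over_hour(papers, download):
    """Download each paper at a random offset within the current hour."""
    offsets = sorted(random.uniform(0, 3600) for _ in papers)
    start = time.monotonic()
    for paper, offset in zip(papers, offsets):
        delay = offset - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)  # wait until this paper's random slot
        download(paper)
</code></pre>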
<h5 class="mt-4">4. Smart Scheduling</h5>
|
||||
<p>
|
||||
You can set how many papers the system should attempt to download during
|
||||
each hour of the day. This allows you to, for example, schedule more
|
||||
downloads during daytime and pause at night – or tailor usage to match your
|
||||
institution’s bandwidth or rate limits.
|
||||
</p>
|
||||
|
||||
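<p>
Conceptually, the schedule is just a mapping from hour of day to a paper
quota, for example (illustrative values only):
</p>
<pre class="bg-light p-3"><code># papers to attempt per hour of day (0-23); 0 pauses downloads entirely
schedule = {hour: 0 for hour in range(24)}       # start with everything paused
schedule.update({h: 20 for h in range(9, 18)})   # 20 papers/hour during the day
schedule.update({h: 5 for h in range(18, 22)})   # taper off in the evening

def quota_for(now_hour):
    return schedule.get(now_hour, 0)
</code></pre>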
<h5 class="mt-4">5. Easy Web Interface</h5>
|
||||
<p>Everything is managed through a simple, responsive web interface:</p>
|
||||
<ul>
|
||||
<li>📥 Upload CSV files</li>
|
||||
<li>📄 Track the status of each paper</li>
|
||||
<li>⚠️ See which downloads failed, and why</li>
|
||||
<li>📂 Download PDFs directly from the browser</li>
|
||||
<li>🕒 Adjust the hourly download schedule</li>
|
||||
</ul>
|
||||
<p>
|
||||
No command-line tools or scripts required – everything works in your
|
||||
browser.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section class="mb-5">
<h2 class="h4">📦 File Storage</h2>
<p>
Downloaded PDFs are saved to a structured folder on the server, with each
paper in its own directory based on the DOI. The app never stores files
inside the database – only references to where each PDF is located.
</p>
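<p>
A simplified sketch of how a filesystem-safe folder path might be built from
a DOI (the real app’s layout may differ):
</p>
<pre class="bg-light p-3"><code>import re
from pathlib import Path

STORAGE_ROOT = Path("/var/data/papers")  # hypothetical storage root

def paper_dir(doi):
    """Map a DOI like '10.1234/abc.5' to a filesystem-safe directory."""
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", doi)  # replace '/' and other unsafe chars
    path = STORAGE_ROOT / safe
    path.mkdir(parents=True, exist_ok=True)
    return path
</code></pre>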
</section>
<section class="mb-5">
<h2 class="h4">🔒 Simple & Local</h2>
<p>
This app is designed for internal use on a local server or research
workstation. It does not send or expose data to third parties. Everything –
from file storage to scheduling – happens locally, giving you full control
over your paper collection process.
</p>
</section>

<section class="mb-5">
<h2 class="h4">💡 Who It's For</h2>
<p>This tool is ideal for:</p>
<ul>
<li>Research assistants organizing large literature datasets</li>
<li>Labs preparing reading archives for team members</li>
<li>Faculty compiling papers for courses or research reviews</li>
<li>Anyone needing a structured way to fetch and track papers in bulk</li>
</ul>
</section>

{% endblock %}