Adds AI-generated About page. Still has to be manually edited.
parent 1534dbb0ba
commit a83ce86bf9
110 scipaperloader/templates/about.html Normal file
@@ -0,0 +1,110 @@
{% extends 'base.html' %} {% block content %}

<h1 class="mb-4">📘 About This App</h1>
<p class="lead">
<strong>The Research Paper Scraper</strong> is a lightweight web-based tool
designed to help researchers manage and download large sets of academic papers
efficiently, using only a list of DOIs.
</p>

<hr class="my-4" />
<section class="mb-5">
<h2 class="h4">🔍 What It Does</h2>
<p>
This app automates downloading research paper PDFs based on metadata
provided in a CSV file. It’s especially useful when you want to collect
hundreds or thousands of papers for offline access or analysis.
</p>
<p>
You simply upload a structured CSV file with paper metadata, and the system
takes care of the rest – importing, organizing, and downloading each paper
in the background.
</p>
</section>

<section class="mb-5">
<h2 class="h4">⚙️ How It Works</h2>

<h5 class="mt-4">1. CSV Import</h5>
|
||||
<p>
|
||||
Users start by uploading a CSV file that contains metadata for many papers
|
||||
(such as title, DOI, ISSN, etc.). The app only stores the fields it needs –
|
||||
like the DOI, title, and publication date – and validates each entry before
|
||||
importing it into the internal database.
|
||||
</p>
|
||||
|
||||
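<p>
For illustration, a simplified sketch of what this import step might look
like in Python (the column names and the <code>save_paper</code> helper are
hypothetical, not the app’s actual code):
</p>
<pre class="bg-light p-3"><code>import csv
import re

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")  # basic DOI shape check

def import_csv(path, save_paper):
    """Read a metadata CSV, keep only the needed fields, validate DOIs."""
    imported, skipped = 0, 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            doi = (row.get("doi") or "").strip()
            if not DOI_PATTERN.match(doi):
                skipped += 1  # invalid or missing DOI: do not import
                continue
            save_paper(doi=doi,
                       title=(row.get("title") or "").strip(),
                       published=(row.get("published") or "").strip())
            imported += 1
    return imported, skipped
</code></pre>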
<h5 class="mt-4">2. Metadata Management</h5>
|
||||
<p>Each paper is stored in a local SQLite database, along with its status:</p>
|
||||
<ul>
|
||||
<li><strong>Pending</strong>: Ready to be downloaded.</li>
|
||||
<li><strong>Done</strong>: Successfully downloaded.</li>
|
||||
<li><strong>Failed</strong>: Something went wrong (e.g. PDF not found).</li>
|
||||
</ul>
|
||||
|
||||
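<p>
A minimal sketch of what such a table might look like (the actual schema in
this app may differ):
</p>
<pre class="bg-light p-3"><code>import sqlite3

conn = sqlite3.connect("papers.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS papers (
        doi        TEXT PRIMARY KEY,
        title      TEXT,
        published  TEXT,
        status     TEXT NOT NULL DEFAULT 'Pending',  -- Pending / Done / Failed
        error      TEXT,   -- failure reason when status = 'Failed'
        pdf_path   TEXT    -- where the downloaded PDF lives on disk
    )
""")
conn.commit()
</code></pre>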
<h5 class="mt-4">3. Background Scraping</h5>
|
||||
<p>
|
||||
A separate background process runs 24/7, automatically downloading papers
|
||||
based on a configurable hourly schedule. It uses tools like the Zotero API
|
||||
to fetch the best available version of each paper (ideally as a PDF), and
|
||||
stores them on disk in neatly organized folders, one per paper.
|
||||
</p>
|
||||
<p>
|
||||
To avoid triggering download limits or spam detection, download times are
|
||||
<strong>randomized within each hour</strong> to mimic natural behavior.
|
||||
</p>
|
||||
|
||||
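<p>
The randomization can be as simple as drawing each paper’s start time
uniformly from the current hour – a rough sketch, not the app’s actual
scheduler code:
</p>
<pre class="bg-light p-3"><code>import random
import time

def spread_over_hour(papers, download):
    """Download each paper at a random offset within the current hour."""
    offsets = sorted(random.uniform(0, 3600) for _ in papers)
    start = time.monotonic()
    for paper, offset in zip(papers, offsets):
        delay = offset - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)  # wait until this paper's random slot
        download(paper)
</code></pre>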
<h5 class="mt-4">4. Smart Scheduling</h5>
|
||||
<p>
|
||||
You can set how many papers the system should attempt to download during
|
||||
each hour of the day. This allows you to, for example, schedule more
|
||||
downloads during daytime and pause at night – or tailor usage to match your
|
||||
institution’s bandwidth or rate limits.
|
||||
</p>
|
||||
|
||||
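<p>
Conceptually, the schedule is just a mapping from hour of day to a paper
quota, for example (illustrative values only):
</p>
<pre class="bg-light p-3"><code># papers to attempt per hour of day (0-23); 0 pauses downloads entirely
schedule = {hour: 0 for hour in range(24)}       # start with everything paused
schedule.update({h: 20 for h in range(9, 18)})   # 20 papers/hour during the day
schedule.update({h: 5 for h in range(18, 22)})   # taper off in the evening

def quota_for(now_hour):
    return schedule.get(now_hour, 0)
</code></pre>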
<h5 class="mt-4">5. Easy Web Interface</h5>
|
||||
<p>Everything is managed through a simple, responsive web interface:</p>
|
||||
<ul>
|
||||
<li>📥 Upload CSV files</li>
|
||||
<li>📄 Track the status of each paper</li>
|
||||
<li>⚠️ See which downloads failed, and why</li>
|
||||
<li>📂 Download PDFs directly from the browser</li>
|
||||
<li>🕒 Adjust the hourly download schedule</li>
|
||||
</ul>
|
||||
<p>
|
||||
No command-line tools or scripts required – everything works in your
|
||||
browser.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section class="mb-5">
<h2 class="h4">📦 File Storage</h2>
<p>
Downloaded PDFs are saved to a structured folder on the server, with each
paper in its own directory based on the DOI. The app never stores files
inside the database – only references to where each PDF is located.
</p>
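<p>
A simplified sketch of how a filesystem-safe folder path might be built from
a DOI (the real app’s layout may differ):
</p>
<pre class="bg-light p-3"><code>import re
from pathlib import Path

STORAGE_ROOT = Path("/var/data/papers")  # hypothetical storage root

def paper_dir(doi):
    """Map a DOI like '10.1234/abc.5' to a filesystem-safe directory."""
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", doi)  # replace '/' and other unsafe chars
    path = STORAGE_ROOT / safe
    path.mkdir(parents=True, exist_ok=True)
    return path
</code></pre>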
</section>
<section class="mb-5">
<h2 class="h4">🔒 Simple & Local</h2>
<p>
This app is designed for internal use on a local server or research
workstation. It does not send or expose data to third parties. Everything –
from file storage to scheduling – happens locally, giving you full control
over your paper collection process.
</p>
</section>

<section class="mb-5">
<h2 class="h4">💡 Who It's For</h2>
<p>This tool is ideal for:</p>
<ul>
<li>Research assistants organizing large literature datasets</li>
<li>Labs preparing reading archives for team members</li>
<li>Faculty compiling papers for courses or research reviews</li>
<li>Anyone needing a structured way to fetch and track papers in bulk</li>
</ul>
</section>

{% endblock %}