diff --git a/scipaperloader/templates/about.html b/scipaperloader/templates/about.html
new file mode 100644
index 0000000..2a3b168
--- /dev/null
+++ b/scipaperloader/templates/about.html
@@ -0,0 +1,110 @@
+{% extends 'base.html' %} {% block content %}
+

📘 About This App

+ +

+ The Research Paper Scraper is a lightweight web-based tool + designed to help researchers manage and download large sets of academic papers + efficiently, using only a list of DOIs. +

+ +
+ +
+

🔍 What It Does

+

+ This app automates the process of downloading research paper PDFs based on + metadata provided in a CSV file. It’s especially useful when dealing with + hundreds or thousands of papers you want to collect for offline access or + analysis. +

+

+ You simply upload a structured CSV file with paper metadata, and the system + takes care of the rest – importing, organizing, and downloading each paper + in the background. +

+
+ +
+

⚙️ How It Works

+ +
1. CSV Import
+

+ Users start by uploading a CSV file that contains metadata for many papers + (such as title, DOI, ISSN, etc.). The app only stores the fields it needs – + like the DOI, title, and publication date – and validates each entry before + importing it into the internal database. +
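
The import step described above could look roughly like the following. This is a minimal sketch, not the app's actual importer: the column names (`doi`, `title`, `publication_date`), the `parse_papers` helper, and the loose DOI pattern are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical column names; the real CSV layout is not specified on this page.
REQUIRED_FIELDS = ("doi", "title", "publication_date")

# Loose DOI shape check (an assumption): "10.", a 4-9 digit registrant code,
# a slash, then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def parse_papers(csv_text):
    """Return (valid_rows, rejected_rows), keeping only the required fields."""
    valid, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Drop every column the app does not need (for example the ISSN).
        record = {field: (row.get(field) or "").strip() for field in REQUIRED_FIELDS}
        if DOI_PATTERN.match(record["doi"]) and record["title"]:
            valid.append(record)
        else:
            rejected.append(record)
    return valid, rejected


sample = (
    "doi,title,publication_date,issn\n"
    "10.1000/xyz123,Example Paper,2020-01-01,1234-5678\n"
    "not-a-doi,Broken Entry,2021-05-05,0000-0000\n"
)
valid, rejected = parse_papers(sample)
```

Rejected rows are returned rather than silently dropped, so a caller could report them back to the user after the upload.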

+ +
2. Metadata Management
+

Each paper is stored in a local SQLite database, along with its status:

+ + +
3. Background Scraping
+

+ A separate background process runs 24/7, automatically downloading papers + based on a configurable hourly schedule. It uses tools like the Zotero API + to fetch the best available version of each paper (ideally as a PDF), and + stores them on disk in neatly organized folders, one per paper. +

+

+ To avoid triggering download limits or spam detection, download times are + randomized within each hour to mimic natural behavior. +
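
One way to randomize download times within an hour is to draw a uniform random offset for each planned download. The sketch below only illustrates that idea; the `random_offsets` helper and its parameters are hypothetical and not taken from the app.

```python
import random


def random_offsets(count, seconds_per_hour=3600, rng=None):
    """Pick `count` random start offsets (in seconds) inside one hour, sorted."""
    rng = rng or random.Random()
    return sorted(rng.uniform(0, seconds_per_hour) for _ in range(count))


# A seeded generator keeps the example reproducible; omit it in real use.
offsets = random_offsets(5, rng=random.Random(42))
```

Each offset would then be added to the start of the current hour to get the moment at which one download is attempted.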

+ +
4. Smart Scheduling
+

+ You can set how many papers the system should attempt to download during + each hour of the day. This allows you to, for example, schedule more + downloads during daytime and pause at night – or tailor usage to match your + institution’s bandwidth or rate limits. +
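
A per-hour schedule like the one described can be modelled as a mapping from hour of day to a download quota. The following is a sketch under assumptions: the `HOURLY_QUOTA` values and the `quota_for` helper are illustrative, and the app's real configuration format is not shown on this page.

```python
from datetime import datetime

# Hypothetical schedule: hour of day (0-23) -> papers to attempt in that hour.
# Hours missing from the mapping default to 0, meaning the scraper pauses.
HOURLY_QUOTA = {hour: 20 for hour in range(8, 20)}  # daytime only


def quota_for(moment, schedule=HOURLY_QUOTA):
    """Return how many downloads to attempt during the hour containing `moment`."""
    return schedule.get(moment.hour, 0)


daytime = quota_for(datetime(2024, 1, 1, 10, 30))
night = quota_for(datetime(2024, 1, 1, 2, 0))
```

Defaulting missing hours to zero means a night-time pause needs no explicit entries.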

+ +
5. Easy Web Interface
+

Everything is managed through a simple, responsive web interface:

+ +

+ No command-line tools or scripts required – everything works in your + browser. +

+
+ +
+

📦 File Storage

+

+ Downloaded PDFs are saved to a structured folder on the server, with each + paper in its own directory based on the DOI. The app never stores files + inside the database – only references to where each PDF is located. +
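
Because DOIs contain a slash, a DOI-based directory needs some form of name sanitizing. The sketch below shows one possible mapping; the `paper_dir` helper, the `papers` base folder, and the `paper.pdf` file name are assumptions, not the app's actual layout.

```python
import re
from pathlib import Path


def paper_dir(base_dir, doi):
    """Map a DOI to a per-paper directory under `base_dir`."""
    # DOIs contain "/" and other characters that are unsafe in file names,
    # so replace anything outside a conservative set with "_".
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", doi)
    return Path(base_dir) / safe_name


# Only this path (as text) would be stored in the database, not the PDF itself.
pdf_path = paper_dir("papers", "10.1000/xyz123") / "paper.pdf"
```

This replacement is lossy: two DOIs that differ only in replaced characters would map to the same folder, so a real implementation might add a hash or use a reversible encoding.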

+
+ +
+

🔒 Simple & Local

+

+ This app is designed for internal use on a local server or research + workstation. It does not send or expose data to third parties. Everything – + from file storage to scheduling – happens locally, giving you full control + over your paper collection process. +

+
+ +
+

💡 Who It's For

+

This tool is ideal for:

+ +
+{% endblock %}