# SciPaperLoader: Flask Application Initial Structure

## Project Overview

**SciPaperLoader** is a Flask-based web application for managing scientific papers. It provides a web interface (with Jinja2 templates) enhanced by **Alpine.js** for interactive UI components and **HTMX** for partial page updates without full reloads. The application is composed of two main parts: a Flask web app (serving pages for uploading data, configuring schedules, and viewing logs) and a background **scraper daemon** that runs independently to perform long-running tasks (like fetching paper details on a schedule). The project is organized following Flask best practices (using blueprints, separating static files and templates) and is set up for easy development and testing (with configuration files and a pytest test fixture).

## Quick Start

Run the application:

    make run

And open it in the browser at [http://localhost:5000/](http://localhost:5000/).

## Prerequisites

- Python >=3.8
- Redis (for Celery task queue)

## Development environment

- `make venv`: creates a virtualenv with dependencies and this application
  installed in [development mode](http://setuptools.readthedocs.io/en/latest/setuptools.html#development-mode)

- `make run`: runs a development server in debug mode (changes in source code
  are reloaded automatically)

- `make format`: reformats code

- `make lint`: runs flake8

- `make mypy`: runs type checks with mypy

- `make test`: runs tests (see also: [Testing Flask Applications](https://flask.palletsprojects.com/en/3.0.x/testing/))

- `make dist`: creates a wheel distribution (will run tests first)

- `make clean`: removes virtualenv and build artifacts

- add application dependencies in `pyproject.toml` under `project.dependencies`;
  add development dependencies under `project.optional-dependencies.*`; run
  `make clean && make venv` to reinstall the environment

## Asynchronous Task Processing with Celery

SciPaperLoader uses Celery for processing large CSV uploads and other background tasks. This allows the application to handle large datasets reliably without blocking the web interface.

### Running Celery Components

- `make redis`: ensures Redis server is running (required for Celery)

- `make celery`: starts a Celery worker to process background tasks

- `make celery-flower`: starts Flower, a web interface for monitoring Celery tasks at http://localhost:5555

- `make run-all`: runs the entire stack (Flask app + Celery worker + Redis) in development mode

### How It Works

When you upload a CSV file through the web interface:

1. The file is sent to the server
2. A Celery task is created to process the file asynchronously
3. The browser shows a progress bar with real-time updates
4. The results are displayed when processing is complete

This architecture allows SciPaperLoader to handle CSV files with thousands of papers without timing out or blocking the web interface.
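
The steps above can be sketched without Celery itself. The following is a minimal illustration, not the project's actual task: `process_csv` and `report_progress` are hypothetical names, and the `title` column is an assumed CSV layout. In a real worker, a function like this would presumably run inside a Celery task, with the progress callback forwarding to the task's `update_state` method so the browser can poll for status.

```python
import csv
import io
from typing import Callable, Dict, List


def process_csv(
    text: str,
    report_progress: Callable[[int, int], None],
    batch_size: int = 2,
) -> Dict[str, int]:
    """Parse CSV text row by row, reporting progress after each batch.

    Hypothetical helper: the column name "title" is an assumption
    made for this sketch, not the application's real schema.
    """
    rows: List[Dict[str, str]] = list(csv.DictReader(io.StringIO(text)))
    total = len(rows)
    added, skipped = 0, 0

    for index, row in enumerate(rows, start=1):
        # Treat a row without a title as invalid and skip it.
        if (row.get("title") or "").strip():
            added += 1
        else:
            skipped += 1
        # Report after every full batch and once more at the end.
        if index % batch_size == 0 or index == total:
            report_progress(index, total)

    return {"total": total, "added": added, "skipped": skipped}


if __name__ == "__main__":
    sample = "title,doi\nPaper A,10.1/a\n,10.1/b\nPaper C,10.1/c\n"
    result = process_csv(sample, lambda done, total: print(f"{done}/{total}"))
    print(result)
```

Reporting in batches rather than per row keeps the number of status updates small when a file contains thousands of papers.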

## Configuration

Default configuration is loaded from `scipaperloader.defaults` and can be
overridden by environment variables with a `FLASK_` prefix. See
[Configuring from Environment Variables](https://flask.palletsprojects.com/en/3.0.x/config/#configuring-from-environment-variables).
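
To illustrate the override behavior, here is a simplified, standard-library-only sketch of the idea. Flask offers `app.config.from_prefixed_env()` for this purpose; whether SciPaperLoader calls it is an assumption, and `load_prefixed_env` below is a hypothetical stand-in rather than Flask's implementation (it ignores nested keys, for instance).

```python
import json
from typing import Any, Dict, Mapping


def load_prefixed_env(
    defaults: Mapping[str, Any],
    environ: Mapping[str, str],
    prefix: str = "FLASK_",
) -> Dict[str, Any]:
    """Return a copy of defaults overridden by prefixed environment variables.

    Values are parsed as JSON when possible (so "true" becomes True and
    "8" becomes 8) and kept as plain strings otherwise.
    """
    config = dict(defaults)
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue  # Unrelated variables such as HOME are ignored.
        key = name[len(prefix):]
        try:
            config[key] = json.loads(raw)
        except ValueError:
            config[key] = raw
    return config


if __name__ == "__main__":
    defaults = {"DEBUG": False}
    env = {"FLASK_DEBUG": "true", "FLASK_SECRET_KEY": "s3cret", "HOME": "/root"}
    print(load_prefixed_env(defaults, env))
```

For example, with `FLASK_DEBUG=true` set, the resulting `DEBUG` value is the boolean `True` rather than the string `"true"`.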

### Celery Configuration

The following environment variables can be set to configure Celery:

- `FLASK_CELERY_BROKER_URL`: Redis URL for the message broker (default: `redis://localhost:6379/0`)
- `FLASK_CELERY_RESULT_BACKEND`: Redis URL for storing task results (default: `redis://localhost:6379/0`)

Consider using
[dotenv](https://flask.palletsprojects.com/en/3.0.x/cli/#environment-variables-from-dotenv)
to load these variables from a file during development.
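
As a rough sketch of how the two Celery variables and their defaults might be resolved, consider the helper below. `celery_settings` and `DEFAULT_REDIS_URL` are hypothetical names used only for illustration; the variable names and default URL are the ones listed above.

```python
import os
from typing import Dict, Mapping, Optional

# Default taken from the list above; used for both broker and result backend.
DEFAULT_REDIS_URL = "redis://localhost:6379/0"


def celery_settings(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Resolve broker and result backend URLs, falling back to the default."""
    env = os.environ if environ is None else environ
    return {
        "CELERY_BROKER_URL": env.get("FLASK_CELERY_BROKER_URL", DEFAULT_REDIS_URL),
        "CELERY_RESULT_BACKEND": env.get(
            "FLASK_CELERY_RESULT_BACKEND", DEFAULT_REDIS_URL
        ),
    }


if __name__ == "__main__":
    # Override only the broker; the result backend keeps its default.
    print(celery_settings({"FLASK_CELERY_BROKER_URL": "redis://redis:6379/1"}))
```

Passing the environment as an argument keeps the helper easy to test without modifying `os.environ`.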

## Deployment

See [Deploying to Production](https://flask.palletsprojects.com/en/3.0.x/deploying/).

You may use the distribution (`make dist`) to publish it to a package index,
deliver it to your server, or copy it in your `Dockerfile`, and install it with `pip`.

You must set a
[SECRET_KEY](https://flask.palletsprojects.com/en/3.0.x/tutorial/deploy/#configure-the-secret-key)
in production to a secret and stable value.
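
One common way to produce such a value is Python's `secrets` module. The snippet below is a minimal sketch; `generate_secret_key` is a hypothetical helper name. "Stable" here means the key should be generated once and stored in your production configuration, not regenerated on every start, since a changing key invalidates existing sessions.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string suitable for use as a SECRET_KEY."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print once, then store the value in your production configuration.
    print(generate_secret_key())
```

Given the `FLASK_` prefix convention described under Configuration, the stored value would presumably be supplied as `FLASK_SECRET_KEY`; treat that variable name as an assumption and confirm it against the application's configuration loading.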

### Deploying with Celery

When deploying to production:

1. Configure a production-ready Redis instance or use a managed service
2. Run Celery workers as system services or in Docker containers
3. Consider setting up monitoring for your Celery tasks and workers