# SciPaperLoader Diagnostic Guide
This guide explains how to use the diagnostic tools included with SciPaperLoader,
especially for addressing issues with the scraper module.
## Common Issues and Solutions
### 1. Scraper Still Runs After Being Stopped
**Symptoms:**
- Web interface shows scraper as stopped but papers are still being processed
- `/scraper/stop` endpoint returns success but processing continues
- Active tasks show up in APScheduler inspector
**Solution:**
```bash
# Run the emergency stop to force-terminate all tasks
make diagnostics # Then select option 5 (Emergency stop)
# Or directly:
python tools/diagnostics/emergency_stop.py
```
The emergency stop performs these actions:
- Sets scraper state to inactive in the database
- Revokes all running and scheduled APScheduler tasks
- Purges all task queues
- Reverts papers with "Pending" status to their previous state
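
The last step, reverting "Pending" papers, can be illustrated with a minimal sketch. It is not the code in `emergency_stop.py`: the `papers` table, its `status` and `previous_status` columns, and the status names other than "Pending" are assumptions made for illustration, and an in-memory SQLite database stands in for the application's real database.

```python
import sqlite3


def revert_pending_papers(conn):
    """Restore every 'Pending' paper to its saved previous status.

    Returns the number of rows that were reverted.
    """
    cur = conn.execute(
        # SQLite evaluates the right-hand sides against the old row values,
        # so status receives the previous status before that column is cleared.
        "UPDATE papers SET status = previous_status, previous_status = NULL "
        "WHERE status = 'Pending' AND previous_status IS NOT NULL"
    )
    conn.commit()
    return cur.rowcount


def build_demo_db():
    """Create a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers ("
        "id INTEGER PRIMARY KEY, status TEXT NOT NULL, previous_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO papers (id, status, previous_status) VALUES (?, ?, ?)",
        [(1, "Pending", "New"), (2, "Done", None), (3, "Pending", "Failed")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print("reverted rows:", revert_pending_papers(demo))
    print(demo.execute("SELECT id, status FROM papers ORDER BY id").fetchall())
```

Running the reversion a second time changes nothing, because no paper is left in the "Pending" state. That property is worth having in any emergency tool that may be run more than once.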
### 2. Workers Not Picking Up Code Changes
**Symptoms:**
- Code changes don't seem to have any effect
- Bug fixes don't work even though the code is updated
- The running process, including its APScheduler jobs, may still be executing the previously loaded version of modified code
**Solution:**
```bash
# Use the quick fix to stop tasks and restart the application
make diagnostics # Then select option 6 (Quick fix)
# Or directly:
python tools/diagnostics/quick_fix.py
```
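
The underlying cause is general Python behaviour rather than anything specific to SciPaperLoader: a running process keeps each imported module in memory, so edits on disk take effect only after a reload or a restart. The sketch below demonstrates this with a temporary module. The module name `demo_task_module` and the function `demo_stale_module` are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_stale_module():
    """Return what a running process sees before and after reloading an edited module."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep the demo independent of cached .pyc files
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "demo_task_module.py"  # hypothetical module
        module_path.write_text("def run():\n    return 'old behaviour'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_task_module")

            # Simulate deploying a fix while the process keeps running.
            module_path.write_text("def run():\n    return 'new fixed behaviour'\n")
            before_reload = module.run()  # still the code loaded at import time

            importlib.invalidate_caches()
            module = importlib.reload(module)
            after_reload = module.run()  # now the edited code
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_task_module", None)
            sys.dont_write_bytecode = old_flag
    return before_reload, after_reload


if __name__ == "__main__":
    print(demo_stale_module())
```

Restarting the application, as the quick fix does, has the same effect as the reload here but covers every module at once. That is usually safer than reloading modules selectively in a live process.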
### 3. Investigating Task or Scraper Issues
```bash
# Run the full diagnostic tool
make diagnostics # Then select option 3 (Full diagnostic report)
# Or directly:
python tools/diagnostics/diagnose_scraper.py
```
This tool will:
- Show current scraper state
- List all active and scheduled APScheduler tasks
- Display recent activity and error logs
## Preventative Measures
1. **Always stop the scraper properly** through the web interface before:
   - Restarting the application
   - Deploying code changes
   - Modifying the database
2. **Monitor APScheduler jobs** through the diagnostic tools:
   ```bash
   make diagnostics # Then select option 2 (Inspect tasks)
   ```
3. **Check logs for failed tasks** regularly in the Logger tab of the application
## For Developers
To test the paper reversion functionality:
```bash
make diagnostics # Then select option 4 (Test paper reversion)
# Or directly:
python tools/diagnostics/test_reversion.py
```
This is particularly helpful after making changes to the scraper or task handling code.