# Torn User Activity Tracker

> [!WARNING]
> **Development is still in its early stages; do not use it in production!**

## Features

Multiple users can control a single activity tracker that uses Torn's API.

- Start and stop scraping user activity data
- View real-time logs
- Download data and log files
- View scraping results
- Plugin-based analysis system
- Toggle between light and dark mode

**Note:** Many features are not fully implemented yet, but the activity tracker/grabber works as intended.

## Planned Features

- Additional analysis plugins
- Selector for choosing which Torn API data is tracked
- Log viewer

## Requirements

- Python 3.8+
- Flask
- Flask-Bootstrap
- Flask-WTF
- Pandas
- Requests
- Redis
- Celery
- uWSGI

Redis currently has to run locally; this is planned to change in the future. The connection is hard-coded in `tasks.py`:

```python
# tasks.py
import redis


def get_redis():
    # Connection details are fixed to a local Redis instance for now.
    return redis.StrictRedis(
        host='localhost',
        port=6379,
        db=0,
        decode_responses=True
    )
```

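
One possible way to lift the local-only restriction is to read the connection details from environment variables. The sketch below is not part of the project: the function name and the variables `REDIS_HOST`, `REDIS_PORT`, and `REDIS_DB` are assumptions made for illustration, and the defaults simply mirror the hard-coded values in `tasks.py`.

```python
import os


def redis_settings_from_env(env=None):
    """Build keyword arguments for redis.StrictRedis from environment variables.

    REDIS_HOST, REDIS_PORT and REDIS_DB are hypothetical names chosen for
    this sketch; the defaults match the hard-coded values in tasks.py.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("REDIS_HOST", "localhost"),
        "port": int(env.get("REDIS_PORT", "6379")),
        "db": int(env.get("REDIS_DB", "0")),
        "decode_responses": True,
    }


# Usage (requires the redis package):
#     import redis
#     client = redis.StrictRedis(**redis_settings_from_env())
```

Returning a plain dictionary keeps the sketch testable without a running Redis server; `get_redis()` could pass the result straight to `redis.StrictRedis`.
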
## Installation

### Docker

#### Prerequisites

- Docker
- Docker Compose

#### Steps to Deploy

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd TornActivityTracker
   ```

2. Configure environment variables by copying the example `.env` file and modifying it if needed:

   ```bash
   cp .env.example .env
   ```

3. Build and start the containers:

   ```bash
   docker-compose up -d --build
   ```

This will start:

- The main Flask application
- Redis for task queue management
- Nginx as a reverse proxy

The application will be available at `http://localhost:80`.

#### Maintenance

To view logs:

```bash
docker-compose logs -f
```

To stop the application:

```bash
docker-compose down
```

To rebuild and restart:

```bash
docker-compose up -d --build
```

### Manual

1. Clone the repository:

   ```sh
   git clone https://github.com/MichaelB7/TornActivityTracker.git
   cd TornActivityTracker
   ```

2. Create a virtual environment and activate it:

   ```sh
   python3 -m venv venv
   source venv/bin/activate  # On Windows use: .\venv\Scripts\activate
   ```

3. Install the required packages:

   ```sh
   pip install -r requirements.txt
   ```

4. Start the Redis server locally:

   ```sh
   redis-server
   ```

5. Set up your configuration by creating a `config.ini` file in the root directory from `example_config.ini`:

   ```sh
   cp example_config.ini config.ini
   ```

   Then edit `config.ini` with your settings:

   ```ini
   [DEFAULT]
   SECRET_KEY = your_secret_key
   API_KEY = your_api_key
   # ...rest of the config settings...
   ```

6. Start the Celery worker:

   ```sh
   celery -A app.celery_worker worker --loglevel=info
   ```

7. Run the Flask application:

   ```sh
   flask run
   ```

The application will be available at `http://127.0.0.1:5000/`.

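
To check that the `config.ini` from step 5 parses as expected, the file can be read with Python's standard `configparser` module. This is a minimal sketch, not the application's own loading code (which is not shown here); `load_settings` and `EXAMPLE_CONFIG` are hypothetical names, and the values are the placeholders from the example above.

```python
import configparser

# Stand-in for the contents of config.ini; the values are placeholders.
EXAMPLE_CONFIG = """
[DEFAULT]
SECRET_KEY = your_secret_key
API_KEY = your_api_key
"""


def load_settings(text):
    """Parse INI text and return the two keys shown in the example."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    defaults = parser["DEFAULT"]
    return {
        "SECRET_KEY": defaults["SECRET_KEY"],
        "API_KEY": defaults["API_KEY"],
    }


settings = load_settings(EXAMPLE_CONFIG)
```

To read the real file instead of a string, `parser.read("config.ini")` can replace the `read_string` call.
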
## Adding an Analysis Module

This guide explains how to add a new analysis module using the provided base classes: `BasePlotlyAnalysis`, `BasePlotAnalysis`, and `BaseAnalysis`. These base classes ensure a structured workflow for data preparation, transformation, and visualization.

### 1. Choosing the Right Base Class

Before implementing an analysis module, decide on the appropriate base class:

- **`BasePlotlyAnalysis`**: Use this for interactive plots with **Plotly** that generate **HTML** outputs.
- **`BasePlotAnalysis`**: Use this for static plots with **Matplotlib/Seaborn** that generate **PNG** image files.
- **`BaseAnalysis`**: Use this for any other type of analysis with **text** or **HTML** output; it offers the most flexibility.

### 2. Naming Convention

Follow a structured naming convention for consistency:

- **File name:** `plotly_<analysis_name>.py` for Plotly analyses, `plot_<analysis_name>.py` for Matplotlib-based analyses.
- **Class name:** Use PascalCase and a descriptive suffix:
  - Example for Plotly: `PlotlyActivityHeatmap`
  - Example for Matplotlib: `PlotUserSessionDuration`

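
The convention above can be illustrated with a small helper that derives a file name from a class name. This is only a sketch: `module_filename` is a hypothetical function, and it assumes that `<analysis_name>` is the snake_case form of the class name after the `Plotly`/`Plot` prefix, which the convention suggests but does not state explicitly.

```python
import re


def module_filename(class_name: str) -> str:
    """Convert a PascalCase class name to a snake_case module file name."""
    # Insert an underscore before every capital letter except the first one.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()
    return f"{snake}.py"


plotly_file = module_filename("PlotlyActivityHeatmap")
matplotlib_file = module_filename("PlotUserSessionDuration")
```
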
### 3. Data Structure

The following DataFrame structure is passed to analysis classes:

| user_id | name  | last_action         | status | timestamp                     | prev_timestamp | was_active | hour |
|---------|-------|---------------------|--------|-------------------------------|----------------|------------|------|
| XXXXXXX | UserA | 2025-02-08 17:58:11 | Okay   | 2025-02-08 18:09:41.867984056 | NaT            | False      | 18   |
| XXXXXXX | UserB | 2025-02-08 17:00:10 | Okay   | 2025-02-08 18:09:42.427846909 | NaT            | False      | 18   |
| XXXXXXX | UserC | 2025-02-08 16:31:52 | Okay   | 2025-02-08 18:09:42.823201895 | NaT            | False      | 18   |
| XXXXXXX | UserD | 2025-02-06 23:57:24 | Okay   | 2025-02-08 18:09:43.179914951 | NaT            | False      | 18   |
| XXXXXXX | UserE | 2025-02-06 06:33:40 | Okay   | 2025-02-08 18:09:43.434650898 | NaT            | False      | 18   |

Note that the first rows, one per tracked member, always have an empty `prev_timestamp` (`NaT`), because no previous timestamp exists yet for a member's first sample.

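
The `NaT` values in the first rows can be reproduced with a small, made-up frame. The sketch below assumes that `prev_timestamp` is the same user's timestamp from the previous polling round; how the tracker actually computes it is not shown here, and the user IDs and timestamps are invented.

```python
import pandas as pd

# Two polling rounds for two made-up users.
df = pd.DataFrame({
    "user_id": [1, 2, 1, 2],
    "name": ["UserA", "UserB", "UserA", "UserB"],
    "timestamp": pd.to_datetime([
        "2025-02-08 18:09:41",
        "2025-02-08 18:09:42",
        "2025-02-08 18:14:41",
        "2025-02-08 18:14:42",
    ]),
})

# Assumption: prev_timestamp is the same user's timestamp from the previous round.
df = df.sort_values("timestamp").reset_index(drop=True)
df["prev_timestamp"] = df.groupby("user_id")["timestamp"].shift()
df["hour"] = df["timestamp"].dt.hour

# Rows without an earlier sample for that user keep NaT in prev_timestamp.
first_samples = df[df["prev_timestamp"].isna()]
```

An analysis that depends on `prev_timestamp` may want to drop or otherwise handle these first rows explicitly.
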
### 4. Implementing an Analysis Module

Each analysis module should define two key methods:

- `transform_data(self, df: pd.DataFrame) -> pd.DataFrame`: Processes the input data for plotting.
- `plot_data(self, df: pd.DataFrame)`: Generates and saves the plot.

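
How the two methods fit together can be sketched with a stand-in base class. Everything here is an assumption for illustration: `SketchBaseAnalysis`, its `execute` method, and `ActiveSampleCount` are hypothetical names, and the project's real base classes may organize the workflow differently.

```python
from abc import ABC, abstractmethod

import pandas as pd


class SketchBaseAnalysis(ABC):
    """Hypothetical stand-in for the project's base classes."""

    name = "Unnamed analysis"

    def execute(self, df: pd.DataFrame):
        # Work on a copy so the caller's DataFrame is left untouched.
        transformed = self.transform_data(df.copy())
        return self.plot_data(transformed)

    @abstractmethod
    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        ...

    @abstractmethod
    def plot_data(self, df: pd.DataFrame):
        ...


class ActiveSampleCount(SketchBaseAnalysis):
    """Text-only analysis: number of active samples per user."""

    name = "Active sample count"

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        counts = df[df["was_active"]].groupby("name").size()
        return counts.rename("active_samples").reset_index()

    def plot_data(self, df: pd.DataFrame) -> str:
        # Produces text instead of a figure.
        return "\n".join(
            f"{user}: {count}" for user, count in zip(df["name"], df["active_samples"])
        )


sample = pd.DataFrame({
    "name": ["UserA", "UserA", "UserB"],
    "was_active": [True, True, False],
})
report = ActiveSampleCount().execute(sample)
```

Keeping the transformation separate from the output step means the same transformed data can be reused by different kinds of output.
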
#### Example: Adding a Plotly Heatmap

Below is an example of how to create a new analysis module using `BasePlotlyAnalysis`.

```python
import pandas as pd
import plotly.graph_objects as go
from .basePlotlyAnalysis import BasePlotlyAnalysis


class PlotlyActivityHeatmap(BasePlotlyAnalysis):
    """
    Displays user activity trends over multiple days using an interactive heatmap.
    """
    name = "Activity Heatmap (Interactive)"
    description = "Displays user activity trends over multiple days."
    plot_filename = "activity_heatmap.html"

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        # Count active samples per user and hour, then reshape to long format.
        df['hour'] = df['timestamp'].dt.hour
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        ).reset_index()
        return active_counts.melt(id_vars='name', var_name='hour', value_name='activity_count')

    def plot_data(self, df: pd.DataFrame):
        # Pivot back to a users-by-hours matrix for the heatmap.
        df = df.pivot(index='name', columns='hour', values='activity_count').fillna(0)
        self.fig = go.Figure(data=go.Heatmap(
            z=df.values, x=df.columns, y=df.index, colorscale='Viridis',
            colorbar=dict(title='Activity Count')
        ))
        self.fig.update_layout(title='User Activity Heatmap', xaxis_title='Hour', yaxis_title='User')
```

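
To see what the long-format output of `transform_data` looks like, the same pivot-and-melt steps can be run on a tiny hand-made frame without Plotly. This is a sketch: `to_long_activity_counts` is a hypothetical free-function copy of the method above, and the sample data is invented.

```python
import pandas as pd


def to_long_activity_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Same steps as transform_data above, written as a free function."""
    df = df.copy()  # avoid modifying the caller's frame
    df["hour"] = df["timestamp"].dt.hour
    active_counts = df[df["was_active"]].pivot_table(
        index="name",
        columns="hour",
        values="was_active",
        aggfunc="sum",
        fill_value=0,
    ).reset_index()
    return active_counts.melt(id_vars="name", var_name="hour", value_name="activity_count")


sample = pd.DataFrame({
    "name": ["UserA", "UserA", "UserA", "UserB"],
    "timestamp": pd.to_datetime([
        "2025-02-08 18:05:00",
        "2025-02-08 18:35:00",
        "2025-02-08 19:05:00",
        "2025-02-08 19:10:00",
    ]),
    "was_active": [True, True, False, True],
})

# One row per (user, hour) combination with the number of active samples.
long_df = to_long_activity_counts(sample)
```

Each row of `long_df` holds one user, one hour, and the count of active samples, which is the shape `plot_data` pivots back into a matrix.
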
#### Example: Adding a Static Matplotlib Plot

Below is an example of a Matplotlib-based analysis module using `BasePlotAnalysis`.

```python
import pandas as pd
import matplotlib.pyplot as plt
from .basePlotAnalysis import BasePlotAnalysis


class PlotUserSessionDuration(BasePlotAnalysis):
    """
    Displays a histogram of user session durations.
    """
    name = "User Session Duration Histogram"
    description = "Histogram of session durations."
    plot_filename = "session_duration.png"

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        # Seconds between a user's last action and the sample time.
        # last_action precedes timestamp, so subtract in this order to get positive values.
        df['session_duration'] = (df['timestamp'] - df['last_action']).dt.total_seconds()
        return df

    def plot_data(self, df: pd.DataFrame):
        plt.figure(figsize=(10, 6))
        plt.hist(df['session_duration'].dropna(), bins=30, edgecolor='black')
        plt.xlabel('Session Duration (seconds)')
        plt.ylabel('Frequency')
        plt.title('User Session Duration Histogram')
```

## License

All assets and code are under the [CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/) license and in the public domain unless specified otherwise.