# Viva++

Viva++ is a tool that scrapes Viva+ and provides a TV-friendly user interface with some additional features not present on Viva+.

## Environment Variables

The following environment variables can be used to configure Viva++:

- `VIVAPLUS_USER`: The username for logging into Viva+.
- `VIVAPLUS_PASS`: The password for logging into Viva+.
- `VIVAPLUS_SLEEPTIME`: The amount of time to sleep between scrapes, in minutes. Defaults to 15.
- `VIVAPLUS_DATABASE`: Path to the SQLite database holding metadata, scrape links, and scrape statuses for videos. Defaults to `videos.db3`.
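
The README does not say what language Viva++ is written in; purely as an illustration, here is a Python-flavoured sketch of reading these variables with the documented defaults (`load_config` is a hypothetical helper, not part of the tool):

```python
import os

def load_config() -> dict:
    """Read the Viva++ settings from the environment, falling back to the documented defaults."""
    return {
        "user": os.environ["VIVAPLUS_USER"],      # required, no default
        "password": os.environ["VIVAPLUS_PASS"],  # required, no default
        "sleep_minutes": int(os.environ.get("VIVAPLUS_SLEEPTIME", "15")),
        "database": os.environ.get("VIVAPLUS_DATABASE", "videos.db3"),
    }
```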
## Database Structure

The database contains a single table called `videos` with the following columns:

- `id`: The standard ID column from SQLite.
- `title`: The title of an episode.
- `url`: The URL of the video page. Typically `/supports/videos/XYZ`.
- `inserted_on`: The date and time when the record was inserted.
- `upload_date`: The date on which the video was uploaded.
- `cast`: The URL of a stream mux which can be downloaded directly using yt-dlp. Note that these URLs contain identifying information and are time-restricted. If a video needs to be redownloaded, set this column to `NULL` before starting vivaplusdl.
- `description`: The description added to the video. Not always present.
- `year`: The year part of `upload_date`. Added as a separate column to make some queries a little easier.
- `episode`: A number that orders the videos uploaded on a single day. E.g., when two videos are uploaded on 13/03/2025, the first one will be episode 1 and the second will be episode 2.
- `run`: The run during which this episode was scraped. Every time a run starts that finds at least one new video, the run number is incremented by one. This field is necessary to properly calculate the episode numbers across runs.
- `state`: The download state of the video. `done` indicates that the video has already been imported. `pending` means that no import attempt has been made yet, or that all import attempts have resulted in errors. `local` is similar to `done`, but additionally implies that the video is available locally.
- `thumbnail`: Link to the video thumbnail.
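
Column types and constraints are not documented here; for orientation only, a hypothetical schema covering these columns might look like the sketch below, written as a Python `sqlite3` snippet. The actual migration may use different types, constraints, or defaults.

```python
import sqlite3

# Hypothetical schema matching the documented columns; the real migration may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS videos (
    id          INTEGER PRIMARY KEY,
    title       TEXT,
    url         TEXT NOT NULL UNIQUE,
    inserted_on TEXT NOT NULL,
    upload_date TEXT,
    "cast"      TEXT,
    description TEXT,
    year        INTEGER,
    episode     INTEGER,
    run         INTEGER,
    state       TEXT NOT NULL DEFAULT 'pending',
    thumbnail   TEXT
);
"""

with sqlite3.connect("videos.db3") as conn:
    conn.executescript(SCHEMA)
```

Note that `cast` has to be quoted in SQL statements because `CAST` is an SQL keyword.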
## Fetching Process

This tool uses Playwright to interact with the website (as there is no API to find this information).
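
As a rough, non-authoritative illustration of how steps 1 and 2 below could look with Playwright's Python API, here is a minimal sketch; the URLs, CSS selectors, and function name are assumptions, not taken from the actual implementation:

```python
from playwright.sync_api import sync_playwright

# Hypothetical URLs and selectors; the real site structure may differ.
LOGIN_URL = "https://example-vivaplus.tld/login"
VIDEOS_URL = "https://example-vivaplus.tld/videos?sort=newest"

def collect_new_video_urls(user: str, password: str, known_urls: set[str]) -> list[str]:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        # Step 1: log in with the configured credentials.
        page.goto(LOGIN_URL)
        page.fill("input[name=username]", user)
        page.fill("input[name=password]", password)
        page.click("button[type=submit]")

        # Step 2: scroll the newest-first video list until a known URL appears.
        page.goto(VIDEOS_URL)
        found: list[str] = []
        while True:
            page.keyboard.press("End")
            page.wait_for_timeout(1000)  # give lazy-loaded items time to appear
            hrefs = [a.get_attribute("href") for a in page.query_selector_all("a.video-link")]
            found = [u for u in hrefs if u is not None]
            if any(u in known_urls for u in found):
                break

        browser.close()
        return [u for u in found if u not in known_urls]
```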
It goes through these steps to download episodes:

1) Log in to the website using your credentials.
2) Go to the all-videos page, sorted from newest to oldest, and press the *End* key until a video that is already present in the database is found. During the one-time seeding process, the oldest video is manually added to the database using a SQL migration. For each new video, the `url` and `run` are stored in the database, and the `state` for the episode is set to `pending`.
3) For each video that does not yet have metadata (its `cast` column is `NULL`), fetch the video page and extract the title, upload date, description, and cast URL, then update the record with this information. A sketch of this step follows the list.
4) Calculate the proper episode numbers (or at least the `EE` part of them). This step performs no network requests; a sketch also follows the list.
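
To make step 3 concrete, here is a hedged sketch of the database side of the metadata pass; `fetch_video_page` is a hypothetical helper standing in for the Playwright scraping, not part of the tool's actual API:

```python
import sqlite3

def update_missing_metadata(db_path: str, fetch_video_page) -> None:
    """Fill in title, upload_date, description, and cast for rows where cast is NULL.

    `fetch_video_page` is a hypothetical callable that scrapes a single video page
    and returns a dict with the extracted fields.
    """
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute('SELECT id, url FROM videos WHERE "cast" IS NULL').fetchall()
        for video_id, url in rows:
            meta = fetch_video_page(url)
            conn.execute(
                'UPDATE videos SET title = ?, upload_date = ?, description = ?, "cast" = ? WHERE id = ?',
                (meta["title"], meta["upload_date"], meta["description"], meta["cast"], video_id),
            )
```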
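
Step 4 can be pictured as a purely local computation that numbers the videos within each upload date. The following sketch assumes that ordering by `run` and then `id` reproduces the original upload order; the actual algorithm may differ:

```python
import sqlite3

def assign_episode_numbers(db_path: str) -> None:
    """Number videos per upload_date, assuming (run, id) order matches the upload order."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT id, upload_date FROM videos ORDER BY upload_date, run, id"
        ).fetchall()

        episode = 0
        current_date = None
        for video_id, upload_date in rows:
            if upload_date != current_date:
                # New day: restart the per-day episode counter.
                current_date = upload_date
                episode = 0
            episode += 1
            conn.execute("UPDATE videos SET episode = ? WHERE id = ?", (episode, video_id))
```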
If any errors occur during the process, the program will log an error and quit.

When the tool runs as a Docker container, Docker will automatically start it again.