Replace hardcoded config and directory scan with YAML config and explicit clip list

- config.py constants -> config/config.yaml (user-editable, git-ignored)
- Questions and defaults now defined in the YAML, including per-question defaults
- ClipSelector no longer scans the data dir; reads a user-provided clips.txt instead
- Removed --daily / --time / --skip-existing-day args
- video_loader now samples frames evenly across the full clip
- pyyaml added as a dependency

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
112
README.md
@@ -24,42 +24,87 @@ python -m venv .venv
 pip install -e .
 ```

+## Setup
+
+Before running, create your config and clip list from the provided examples:
+
+```sh
+cp config/config.example.yaml config/config.yaml
+cp config/clips.example.txt config/clips.txt
+```
+
+Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate.
+
 ## Usage

 ```sh
-python -m river_annotation_tool.annotation_script --data <path/to/zips> --out <path/to/output>
+python -m river_annotation_tool.annotation_script
 ```

 ### Arguments

 | Argument | Default | Description |
 |---|---|---|
-| `--data` | *(hardcoded path)* | Directory containing ZIP archives of clips |
-| `--out` | `data/annotation_results/` | Directory where annotations are written |
-| `--clip` | *(first unannotated clip)* | Open a specific clip by stem name (e.g. `left_20230501`) |
-| `--time` | — | Target time of day `HH:MM` — picks the clip closest to this time for each day |
-| `--daily` | off | Annotate one clip per day (at `--time`, default noon); advances to the next day on **Next** |
-| `--skip-existing-day` | off | With `--daily`, skip entire days that already have any annotated clip |
+| `--config` | `config/config.yaml` | Path to the config YAML file |
+| `--data` | *(from config)* | Override `data_dir` from config |
+| `--out` | *(from config)* | Override `out_dir` from config |
+| `--clips` | *(from config)* | Override `clips_file` from config |
+| `--clip` | *(first unannotated in list)* | Open a specific clip by stem name (e.g. `left_20230501`) |
 | `--extras` | off | Also save GIFs and extra PNGs (see Output section) |

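The override rows in the arguments table suggest a simple precedence: a value passed on the command line wins, otherwise the value from the YAML file is used. A minimal sketch of that merge, assuming the config has already been parsed into a plain dict. `build_parser` and `resolve_settings` are hypothetical names for illustration, not functions taken from the repository:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (a sketch, not the tool's own code)."""
    parser = argparse.ArgumentParser(prog="annotation_script")
    parser.add_argument("--config", default="config/config.yaml")
    parser.add_argument("--data", default=None)    # overrides data_dir
    parser.add_argument("--out", default=None)     # overrides out_dir
    parser.add_argument("--clips", default=None)   # overrides clips_file
    parser.add_argument("--clip", default=None)    # open one clip by stem name
    parser.add_argument("--extras", action="store_true")
    return parser


def resolve_settings(args: argparse.Namespace, config: dict) -> dict:
    """Return a copy of the config with command-line overrides applied."""
    settings = dict(config)  # copy so the loaded config stays untouched
    overrides = {
        "data_dir": args.data,
        "out_dir": args.out,
        "clips_file": args.clips,
    }
    for key, value in overrides.items():
        if value is not None:  # only flags that were actually passed
            settings[key] = value
    return settings


config = {
    "data_dir": "data/clips",
    "out_dir": "data/annotation_results",
    "clips_file": "config/clips.txt",
}
args = build_parser().parse_args(["--out", "data/out"])
settings = resolve_settings(args, config)
print(settings["out_dir"])   # data/out
print(settings["data_dir"])  # data/clips
```

Using `None` as the flag default keeps "not passed" distinguishable from any real path, so the YAML value is only replaced when the user asked for it.
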
 ### Typical workflows

 ```sh
-# Annotate clips in chronological order (default)
+# Annotate clips listed in config/clips.txt (default)
+python -m river_annotation_tool.annotation_script
+
+# Use a different config file
+python -m river_annotation_tool.annotation_script --config config/my_config.yaml
+
+# Override paths from the command line
 python -m river_annotation_tool.annotation_script --data data/clips --out data/out

-# One clip per day, always at the noon recording
-python -m river_annotation_tool.annotation_script --data data/clips --out data/out --daily --time 12:00
-
-# Resume a daily run, skip days already touched
-python -m river_annotation_tool.annotation_script --data data/clips --out data/out \
-  --daily --time 12:00 --skip-existing-day
-
 # Annotate a single specific clip
-python -m river_annotation_tool.annotation_script --data data/clips --out data/out \
-  --clip left_20230615T120000
+python -m river_annotation_tool.annotation_script --clip left_20230615T120000
 ```

+## Configuration
+
+All settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
+
+```yaml
+display_max: 720 # longest side in pixels for display
+fps_fallback: 25 # FPS to use if the video header is missing
+max_frames: 100 # max frames to extract per clip
+
+data_dir: data/clips # directory containing ZIP archives
+out_dir: data/annotation_results
+clips_file: config/clips.txt
+
+questions:
+  - section: River
+    items:
+      - key: flow
+        label: "Flow Regime"
+        options: [Turbulent, Laminar, Uncertain]
+        default: Laminar
+      # add more items or sections as needed
+```
+
+Add, remove, or reorder questions directly in the YAML — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
+
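The `key` and `default` semantics described above can be sketched on the parsed data. This is an illustration only: it assumes `questions` is the list of dicts a YAML loader would return for the example config, `initial_answers` is a hypothetical helper, and rejecting a default that is not one of the options is an added assumption rather than documented behaviour.

```python
def initial_answers(questions: list) -> dict:
    """Map each question key to its default option, or None if unselected."""
    answers = {}
    for section in questions:
        for item in section.get("items", []):
            default = item.get("default")  # omitted and null both give None
            if default is not None and default not in item["options"]:
                raise ValueError(
                    f"{item['key']}: default {default!r} is not one of the options"
                )
            answers[item["key"]] = default
    return answers


# Parsed form of the example config, plus one hypothetical question
# ("ice") that has no default and therefore starts unselected.
questions = [
    {
        "section": "River",
        "items": [
            {
                "key": "flow",
                "label": "Flow Regime",
                "options": ["Turbulent", "Laminar", "Uncertain"],
                "default": "Laminar",
            },
            {"key": "ice", "label": "Ice Cover", "options": ["Yes", "No"]},
        ],
    }
]
print(initial_answers(questions))  # {'flow': 'Laminar', 'ice': None}
```
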
+## Clip list file
+
+`config/clips.txt` lists the clip filenames to annotate, one per line. Lines starting with `#` are ignored. Clips are processed in order; already-annotated clips (those with an existing `mask.png`) are skipped automatically.
+
+```
+# Example clips.txt
+left_20230501T120000.zip
+left_20230502T120000.zip
+```
+
+Copy `config/clips.example.txt` as a starting point.
+
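A minimal sketch of the clip-list rules described above: read the file in order, drop comment lines, and pick the first clip whose output folder has no `mask.png`. The function names are hypothetical. Skipping blank lines and stripping surrounding whitespace are assumptions, since the README only mentions `#` lines.

```python
from pathlib import Path
from typing import List, Optional
import tempfile


def read_clip_list(clips_file: Path) -> List[str]:
    """Return clip filenames in file order, skipping blanks and # comments."""
    clips = []
    for line in clips_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            clips.append(line)
    return clips


def next_unannotated(clips: List[str], out_dir: Path) -> Optional[str]:
    """Return the first clip without <out_dir>/<clip_stem>/mask.png, if any."""
    for name in clips:
        if not (out_dir / Path(name).stem / "mask.png").exists():
            return name
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    clips_file = root / "clips.txt"
    clips_file.write_text(
        "# Example clips.txt\n"
        "left_20230501T120000.zip\n"
        "\n"
        "left_20230502T120000.zip\n",
        encoding="utf-8",
    )
    out_dir = root / "out"
    done = out_dir / "left_20230501T120000"
    done.mkdir(parents=True)
    (done / "mask.png").touch()  # pretend the first clip is already annotated

    clips = read_clip_list(clips_file)
    upcoming = next_unannotated(clips, out_dir)
    print(upcoming)  # left_20230502T120000.zip
```
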
 ## Controls

 The window shows the video on the left (auto-playing) and the survey panel on the right.
@@ -78,7 +123,7 @@ The window shows the video on the left (auto-playing) and the survey panel on th

 ## Output

-Each annotated clip produces a folder `<output_dir>/<clip_stem>/` with:
+Each annotated clip produces a folder `<out_dir>/<clip_stem>/` with:

 ```
 mask.png # Binary water mask at full source resolution (always)
@@ -96,6 +141,8 @@ video_overlay_lowres.gif # Overlay GIF at 50% of display resolution

 ### Survey answers (`metadata.json`)

+Keys and values are determined by the `questions` section in `config/config.yaml`. With the default config:
+
 ```json
 {
   "flow": "Turbulent | Laminar | Uncertain",
@@ -112,20 +159,16 @@ video_overlay_lowres.gif # Overlay GIF at 50% of display resolution

 ### Clip format

-Each clip is a ZIP archive containing a `left.mp4` video. The filename encodes the recording timestamp (e.g. `left_20230615T120000.zip`), which is used for sorting and daily filtering.
+Each clip is a ZIP archive containing a `left.mp4` video. The filename encodes the recording timestamp (e.g. `left_20230615T120000.zip`).

 ### Frame loading

-Up to 100 frames are extracted from the video and scaled so the longest side is 480 px. This display-resolution copy is what the annotator works on; the full-resolution dimensions are remembered separately so the saved mask is upscaled back to the original size on export.
+Up to `max_frames` frames are extracted from the video and scaled so the longest side is `display_max` px. This display-resolution copy is what the annotator works on; the full-resolution dimensions are remembered separately so the saved mask is upscaled back to the original size on export.

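The commit message says frames are now sampled evenly across the full clip. One way to do that, together with the `display_max` scaling, is sketched below. Both helpers are hypothetical and not taken from `video_loader.py`. In particular, this sketch scales small videos up as well as large ones down, which the README does not specify.

```python
from typing import List, Tuple


def even_frame_indices(total_frames: int, max_frames: int) -> List[int]:
    """Pick up to max_frames indices spread evenly over [0, total_frames)."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    if total_frames <= max_frames:
        return list(range(total_frames))  # short clip: keep every frame
    if max_frames == 1:
        return [0]
    # Spacing chosen so the first and last frames are both included.
    step = (total_frames - 1) / (max_frames - 1)
    return [round(i * step) for i in range(max_frames)]


def display_size(width: int, height: int, display_max: int) -> Tuple[int, int]:
    """Scale (width, height) so the longest side equals display_max."""
    scale = display_max / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


indices = even_frame_indices(total_frames=1500, max_frames=100)
print(len(indices), indices[0], indices[-1])      # 100 0 1499
print(display_size(1920, 1080, display_max=720))  # (720, 405)
```
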
 ### Mask drawing

 The mask is a binary NumPy array matching the display frame size. Each brush stroke stamps a filled circle of the selected radius, setting pixels to 1 (draw) or 0 (erase). The history stack stores a copy of the mask before each stroke, enabling unlimited undo. On save the mask is resized to the original video resolution with nearest-neighbour interpolation and written as an 8-bit PNG (0 or 255).

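The three mechanisms in this paragraph (circular stamp, per-stroke undo snapshot, nearest-neighbour upscale to 0/255) can be sketched with NumPy alone. `MaskSketch` is a hypothetical class, not the code in `mask_canvas.py`. The index-based resize here stands in for whatever resize routine the tool actually calls, and writing the PNG is left out.

```python
import numpy as np


class MaskSketch:
    """Binary mask with circular brush stamps and an undo history."""

    def __init__(self, height: int, width: int) -> None:
        self.mask = np.zeros((height, width), dtype=np.uint8)
        self.history = []  # one mask copy per stroke

    def begin_stroke(self) -> None:
        # Snapshot once per stroke so a single undo reverts the whole stroke.
        self.history.append(self.mask.copy())

    def stamp(self, cx: int, cy: int, radius: int, erase: bool = False) -> None:
        # Filled circle: every pixel within `radius` of the centre (cx, cy).
        yy, xx = np.ogrid[: self.mask.shape[0], : self.mask.shape[1]]
        inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
        self.mask[inside] = 0 if erase else 1

    def undo(self) -> None:
        if self.history:
            self.mask = self.history.pop()

    def export(self, full_height: int, full_width: int) -> np.ndarray:
        # Nearest-neighbour upscale: each output pixel copies one source pixel.
        rows = np.arange(full_height) * self.mask.shape[0] // full_height
        cols = np.arange(full_width) * self.mask.shape[1] // full_width
        return self.mask[rows[:, None], cols] * np.uint8(255)


canvas = MaskSketch(height=4, width=6)
canvas.begin_stroke()
canvas.stamp(cx=2, cy=1, radius=1)
full = canvas.export(full_height=8, full_width=12)
print(full.shape, np.unique(full).tolist())  # (8, 12) [0, 255]
canvas.undo()
print(int(canvas.mask.sum()))  # 0
```

Nearest-neighbour is the natural choice for a binary mask because it never produces intermediate grey values at the edges.
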
-### Clip selection
-
-`ClipSelector` scans the data directory, builds a sorted DataFrame of clips ordered by timestamp, and filters out clips that already have a `mask.png`. In daily mode it groups the remaining clips by calendar day and picks the one whose recording time is closest to the target hour; on **Next**, it moves to the first clip of the following day.
-
 ### Resuming

 When a clip is loaded that already has a saved `mask.png` and `metadata.json`, the mask is restored at display resolution and the survey answers are pre-filled. **Reload Saved** lets you revert to the last save at any point during the current session.
@@ -133,15 +176,20 @@ When a clip is loaded that already has a saved `mask.png` and `metadata.json`, t
 ## Repository structure

 ```
+config/
+  config.yaml # Your local config (git-ignored, copy from example)
+  config.example.yaml # Example config to copy and edit
+  clips.txt # Your clip list (git-ignored, copy from example)
+  clips.example.txt # Example clip list
 src/river_annotation_tool/
-  annotation_script.py # Entry point — argument parsing and app launch
-  annotator.py # Main QMainWindow — orchestrates all components
-  clip_selector.py # Clip-picking logic (daily mode, time filtering)
-  mask_canvas.py # Drawing widget — brush, undo, erase, mouse events
-  video_loader.py # ZIP extraction and frame resizing
-  config.py # Config constants, question definitions, defaults
-  __init__.py # Package version
-pyproject.toml # Project metadata and dependencies
+  annotation_script.py # Entry point — argument parsing and app launch
+  annotator.py # Main QMainWindow — orchestrates all components
+  clip_selector.py # Reads the clip list and picks the next clip
+  mask_canvas.py # Drawing widget — brush, undo, erase, mouse events
+  video_loader.py # ZIP extraction and frame resizing
+  config.py # AppConfig dataclass and YAML loader
+  __init__.py # Package version
+pyproject.toml # Project metadata and dependencies
 ```

 ## Development