Files
clip-annotator/README.md
2026-05-27 10:02:20 +02:00

361 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Video Annotation Tool
A desktop GUI application for manually annotating video clips. Annotators draw pixel-level segmentation masks over footage and answer structured survey questions defined in a config file.
## Requirements
- Python 3.12
- [uv](https://docs.astral.sh/uv/) (recommended) or pip
## Quick start
```sh
# 1. Clone and install
git clone https://gitlab.datascience.ch/industry/aimsight/river-annotation-tool
cd river-annotation-tool
uv sync
# 2. Create config and clip list from examples
cp config/config.example.yaml config/config.yaml
cp config/clips.example.txt config/clips.txt
# 3. Edit config/config.yaml (set data_dir and out_dir)
# Edit config/clips.txt (list clips to annotate)
# Edit config/questions.yaml to customise survey questions (optional)
# 4. Run
uv run python -m clip_annotator.annotation_script
```
## Installation
```sh
# Install with uv (creates the virtual environment automatically)
uv sync
# Or with pip
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # macOS/Linux
pip install -e .
```
## Setup
Before running, create your config and clip list from the provided examples:
```sh
cp config/config.example.yaml config/config.yaml
cp config/clips.example.txt config/clips.txt
```
Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. Survey questions are defined in `config/questions.yaml` (committed to the repo; edit to customise). See the [Configuration](#configuration) section for all available options.
### S3 storage (optional)
By default the tool reads clips from and writes annotations to the local filesystem (`storage: local`). To use an S3-compatible object store instead, set `storage: s3` in `config/config.yaml` and give `data_dir` / `out_dir` as `bucket/prefix` paths:
```yaml
storage: s3
data_dir: my-bucket/clips
out_dir: my-bucket/annotation_results
```
Copy `.env.example` to `.env` and fill in your credentials — the app loads this file automatically at startup:
```sh
cp .env.example .env
# edit .env with your credentials
```
| Variable | Description |
|---|---|
| `S3_ACCESS_KEY` | Access key ID |
| `S3_SECRET_ACCESS_KEY` | Secret access key |
| `S3_ENDPOINT_URL` | Endpoint URL (defaults to `https://os.zhdk.cloud.switch.ch` if not set) |
| `AWS_REQUEST_CHECKSUM_CALCULATION` | Set to `when_required` to avoid checksum errors on SwitchEngines/Ceph |
| `AWS_RESPONSE_CHECKSUM_VALIDATION` | Set to `when_required` to avoid checksum errors on SwitchEngines/Ceph |
The `clips_file` (the list of clip filenames to annotate) is always read from the local filesystem even when `storage: s3`.
## Usage
```sh
uv run python -m clip_annotator.annotation_script
# or, if you have the venv activated:
python -m clip_annotator.annotation_script
```
### Arguments
| Argument | Default | Description |
|---|---|---|
| `--config` | `config/config.yaml` | Path to the config YAML file |
| `--data` | *(from config)* | Override `data_dir` from config |
| `--out` | *(from config)* | Override `out_dir` from config |
| `--clips` | *(from config)* | Override `clips_file` from config |
| `--clip` | *(first unannotated in list)* | Open a specific clip by its stem name (filename without extension, e.g. `clip_20230501T120000`) |
| `--extras` | off | Also save GIFs and extra PNGs (see Output section) |
| `--no-skip` | off | Show already-annotated clips instead of skipping them |
### Typical workflows
```sh
# Annotate clips listed in config/clips.txt (default)
uv run python -m clip_annotator.annotation_script
# Use a different config file
uv run python -m clip_annotator.annotation_script --config config/my_config.yaml
# Override paths from the command line
uv run python -m clip_annotator.annotation_script --data data/clips --out data/out
# Annotate a single specific clip
uv run python -m clip_annotator.annotation_script --clip clip_20230615T120000
```
## Configuration
Main settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
```yaml
storage: local # required: 'local' or 's3'
data_dir: # required: read-only source of ZIP archives (local path or bucket/prefix for S3)
out_dir: # required: write destination for annotations (can be same bucket as data_dir with a different prefix, or a separate location)
clips_file: config/clips.txt
optical_flow_config_file: config/optical_flow_config.yaml
questions_config_file: config/questions.yaml
display_max: 720 # longest side in pixels for display
fps_fallback: 25 # FPS to use if the video header is missing
max_frames: 100 # max frames to extract per clip
# Override input filenames only if your ZIP archives differ from the defaults
filenames:
video_in_zip: left.mp4
video_tmp_suffix: .mp4
zip_extension: .zip
```
Output filenames (`mask.png`, `metadata.json`, etc.) have sensible defaults and can be overridden in the `filenames:` block — see [`config.py`](src/clip_annotator/config.py) for the full list.
### Survey questions
Survey questions are defined in `config/questions.yaml` (committed to the repo). Add, remove, or reorder sections and items — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
### Optical flow segmentation
`config/optical_flow_config.yaml` controls the **Auto Segment** button. When pressed, the tool computes a segmentation mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
- **Optical flow magnitude** — pixels where the temporal median of frame-to-frame flow (scaled by FPS) exceeds a fraction of the maximum are considered moving.
- **Brightness** — pixels outside a brightness window are excluded (removes sky, saturated glare, etc.).
```yaml
# config/optical_flow_config.yaml
enabled: true
norm_squared_threshold: 0.06 # fraction of max flow² that counts as moving
gaussian_kernel: [5, 5] # blur kernel applied to the reference frame before brightness check
brightness_range: [2, 253] # [min, max] greyscale brightness to keep
```
`enabled: false` disables the button without removing the config file.
## Clip list file
`config/clips.txt` lists bare clip filenames (not full paths) to annotate, one per line — the tool prepends `data_dir` automatically. Each entry must be the exact filename of the ZIP archive as it appears in `data_dir` (e.g. `clip_20230501T120000.zip`). Lines starting with `#` are ignored. Clips are processed in order; a clip is considered already-annotated when `<out_dir>/<clip_stem>/mask.png` exists and is skipped automatically. Pass `--no-skip` to include already-annotated clips. When the last clip is reached, a dialog appears and the app exits.
```
# Example clips.txt
clip_20230501T120000.zip
clip_20230502T120000.zip
```
Copy `config/clips.example.txt` as a starting point.
## Multi-annotator setup
To distribute clips across multiple annotators, create one clips file per annotator (e.g. `config/annotator_A.txt`) listing their assigned clip filenames, then pass it via `--clips`:
```sh
uv run python -m clip_annotator.annotation_script --clips config/annotator_A.txt
```
Assigning non-overlapping clip lists lets each annotator work independently. Intentionally overlapping a subset of clips across annotators enables inter-annotator agreement checks.
## Controls
The window is split into two panels: the **video canvas** on the left (~70% of the width) and the **survey panel** on the right. The video auto-plays as a looping preview. Drawing tools and mask controls are arranged above and beside the canvas; navigation buttons (**Previous / Next / Skip**) sit at the top.
### Tool modes
Three drawing tools are available in the tool row. The active tool is highlighted in blue.
| Tool | How to activate | Description |
|---|---|---|
| **Brush** | Click **Brush** | Click and drag to paint the mask with a circular brush (default) |
| **Polygon** | Click **Polygon** | Click to place vertices and build closed shapes; use **Fill** mode to commit them |
| **Fill** | Click **Fill** | Click inside a closed polygon to fill it onto the mask |
### Brush tool
| Action | How |
|---|---|
| Draw mask | Click and drag on the video |
| Erase mask | Toggle **Eraser** button (turns orange when active), then drag |
| Brush preview | A white circle follows the cursor showing the current brush size |
| Adjust brush size | **Brush size** slider (250 px, default 5); click **↺** to reset |
### Polygon tool
Polygons are drawn as overlays and do not affect the mask until you use **Fill** mode.
| Action | How |
|---|---|
| Add vertex | Left-click on the canvas |
| Remove last vertex | Right-click |
| Close a shape | Left-click near the first vertex (red dot) when ≥ 3 vertices are placed; completed shapes turn bold cyan |
| Draw multiple shapes | Each closed shape is kept independently; draw as many as needed |
| Cancel in-progress polygon | **Cancel Current Poly** — discards the unfinished polygon, keeps completed shapes |
| Delete last completed shape | **Del Shape** |
### Fill tool
| Action | How |
|---|---|
| Fill a shape | Left-click anywhere inside a closed polygon; that shape's interior is painted onto the mask |
| Nested shapes | If a closed polygon lies entirely inside the target, its interior is left unfilled (acts as a hole) |
| Innermost shape | Clicking inside nested shapes always fills the innermost (smallest) polygon containing the click |
| Undo fill | **Undo** — each fill is a single undoable step |
### Mask editing
| Action | How |
|---|---|
| Undo last action | **Undo** |
| Undo 10 actions | **Undo×10** |
| Redo | **Redo** |
| Clear entire mask | **Clear** |
| Toggle mask overlay | **Hide Mask / Show Mask** — button turns red when hidden; does not affect mask data |
| Mask transparency | **Mask Alpha** slider (0100%, default 15%); click **↺** to reset |
### Starting-point shortcuts
| Action | How |
|---|---|
| Load mask from previous clip | **Load Prev Mask** — copies the saved mask of the previous clip onto the current one; undoable |
| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic segmentation based on motion and brightness; undoable. Disabled when `enabled: false` in `config/optical_flow_config.yaml`. |
### Image display adjustments
Three vertical sliders sit to the left of the video and affect display only — they do not change what is saved.
| Slider | Effect | Range |
|---|---|---|
| Brightness | Shifts all pixel values up or down | 100 to +100 |
| Contrast | Scales pixel values around the midpoint | 100 to +100 |
| Gamma | Applies a power-law correction (higher = brighter) | 0.1× to 3.0× |
Click **↺** below any slider to restore its default value.
### Navigation
| Action | How |
|---|---|
| Save and continue | **Next** — saves current clip and loads the next one. If the clip already has a saved annotation a dialog asks whether to replace it or keep the existing save. |
| Go back | **Previous** — saves current clip and returns to the previously viewed clip. Disabled on the first clip. |
| Skip without saving | **Skip** — discards any unsaved changes and loads the next clip without writing anything to disk. |
## Output
Each annotated clip produces a folder `<out_dir>/<clip_stem>/` with:
```
mask.png # Binary segmentation mask at full source resolution (always)
metadata.json # Survey answers as JSON (always)
frame.png # Middle frame of the clip (always)
overlay.png # That frame with the mask blended in green (always)
# Only with --extras:
mask_vis.png # Mask rendered as a greyscale PNG
video_original_hires.gif # All frames at display resolution
video_original_lowres.gif # All frames at 50% of display resolution
video_overlay_hires.gif # Overlay GIF at display resolution
video_overlay_lowres.gif # Overlay GIF at 50% of display resolution
```
All output filenames can be overridden via the `filenames:` section in `config/config.yaml`.
### Survey answers (`metadata.json`)
Keys and values are determined by `config/questions.yaml`. With the default questions:
```json
{
"flow": "Turbulent | Laminar | Uncertain",
"shadows": "Yes | No | Uncertain",
"artifacts": "Yes | No | Uncertain",
"lighting": "Day | Night | Uncertain",
"exposure": "Overexposed | Underexposed | Both | Normal | Uncertain",
"snowing": "Yes | No | Uncertain",
"snow_on_ground":"Yes | No | Uncertain"
}
```
## Internals
### Clip format
Each clip is a ZIP archive containing a video file. The video inside the archive must be named `left.mp4` by default; if your archives use a different internal name, set `filenames.video_in_zip` in `config.yaml`. The ZIP filename becomes the output folder name (e.g. `clip_20230501T120000.zip``<out_dir>/clip_20230501T120000/`).
### Frame loading
Up to `max_frames` frames are extracted from the video and scaled so the longest side is `display_max` px. This display-resolution copy is what the annotator works on; the full-resolution dimensions are remembered separately so the saved mask is upscaled back to the original size on export.
### Mask drawing
The mask is a binary array at display resolution. **Brush** strokes stamp a filled circle (draw or erase). **Polygon** shapes are stored as overlays and don't touch the mask until a **Fill** click rasterises them — the innermost polygon containing the click is filled, and any polygon whose centroid falls inside it is punched out as a hole.
Every mask-changing operation is pushed onto an undo stack before it executes. On save, the mask is upscaled to the original video resolution and written as an 8-bit PNG (0 or 255).
### Resuming
When a clip is loaded that already has a saved `mask.png` and `metadata.json`, the mask is restored at display resolution and the survey answers are pre-filled.
## Repository structure
```
.env.example # S3 credential template (copy to .env and fill in)
config/
config.yaml # Your local config (git-ignored, copy from example)
config.example.yaml # Example config to copy and edit
clips.txt # Your clip list (git-ignored, copy from example)
clips.example.txt # Example clip list
questions.yaml # Survey question definitions
optical_flow_config.yaml # Optical flow parameters (set enabled: false to disable Auto Segment)
src/clip_annotator/
annotation_script.py # Entry point — argument parsing and app launch
annotator.py # Main QMainWindow — orchestrates all components
clip_selector.py # Reads the clip list and picks the next clip
filesystem.py # Storage backend — local passthrough or S3 via s3fs
mask_canvas.py # Drawing widget — brush, undo, erase, mouse events
video_loader.py # ZIP extraction and frame resizing
compute_optical_flow.py # Optical flow segmentation (Auto Segment button)
config.py # AppConfig dataclass and YAML loader
__init__.py # Package version
pyproject.toml # Project metadata and dependencies
```
## Development
```sh
# Install pre-commit hooks
uv run pre-commit install
uv run pre-commit run --all-files # Run manually once
# Add a dependency
uv add <package>
uv add --dev <package> # Development-only
```