River Annotation Tool

A desktop application for manually annotating river video clips as part of the HydroScan project. Annotators draw pixel-level water masks over river footage and answer structured survey questions about flow conditions, lighting, and scene quality.

Requirements

  • Python 3.12
  • uv (recommended) or pip

Installation

# Clone the repository
git clone <repo-url>
cd river-annotation-tool

# Install with uv (creates the virtual environment automatically)
uv sync

# Or with pip
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux
pip install -e .

Usage

python -m river_annotation_tool.annotation_script --data <path/to/zips> --out <path/to/output>

Arguments

| Argument | Default | Description |
|---|---|---|
| --data | (hardcoded path) | Directory containing the ZIP archives of clips |
| --out | data/annotation_results/ | Directory where annotations are written |
| --clip | (first unannotated clip) | Open a specific clip by stem name (e.g. left_20230615T120000) |
| --time | | Target time of day as HH:MM; picks the clip closest to this time for each day |
| --daily | off | Annotate one clip per day (the one closest to --time, default noon); Next advances to the following day |
| --skip-existing-day | off | With --daily, skip any day that already has at least one annotated clip |
| --extras | off | Also save GIFs and extra PNGs (see the Output section) |
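
For orientation, the flags above could be declared with argparse roughly as follows. This is a sketch based only on the table, not the tool's actual parser: build_parser and its default_data_dir parameter are hypothetical, and default_data_dir stands in for the hardcoded default path, which this README does not spell out.

```python
import argparse


def build_parser(default_data_dir: str) -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (illustrative only).

    default_data_dir is a hypothetical stand-in for the hardcoded
    default of --data.
    """
    parser = argparse.ArgumentParser(prog="annotation_script")
    parser.add_argument("--data", default=default_data_dir,
                        help="Directory containing ZIP archives of clips")
    parser.add_argument("--out", default="data/annotation_results/",
                        help="Directory where annotations are written")
    parser.add_argument("--clip", default=None,
                        help="Open a specific clip by stem name")
    parser.add_argument("--time", default=None,
                        help="Target time of day as HH:MM")
    # Boolean switches are off unless given on the command line.
    parser.add_argument("--daily", action="store_true")
    parser.add_argument("--skip-existing-day", action="store_true")
    parser.add_argument("--extras", action="store_true")
    return parser


args = build_parser("data/clips").parse_args(["--daily", "--time", "12:00"])
print(args.daily, args.time, args.skip_existing_day)
```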

Typical workflows

# Annotate clips in chronological order (default)
python -m river_annotation_tool.annotation_script --data data/clips --out data/out

# One clip per day, always at the noon recording
python -m river_annotation_tool.annotation_script --data data/clips --out data/out --daily --time 12:00

# Resume a daily run, skip days already touched
python -m river_annotation_tool.annotation_script --data data/clips --out data/out \
    --daily --time 12:00 --skip-existing-day

# Annotate a single specific clip
python -m river_annotation_tool.annotation_script --data data/clips --out data/out \
    --clip left_20230615T120000

Controls

The window shows the video on the left (auto-playing) and the survey panel on the right.

| Action | How |
|---|---|
| Draw water mask | Click and drag on the video |
| Erase mask | Toggle the Eraser button, then drag |
| Undo last stroke | Undo |
| Clear entire mask | Clear |
| Adjust brush size | Slider next to the erase controls |
| Save and continue | Next: saves the current clip and loads the next one |
| Skip without saving | Skip: discards changes and loads the next one |
| Save only | Save: writes to disk without advancing |
| Restore last save | Reload Saved: reverts mask and answers to what was last written |

Output

Each annotated clip produces a folder <output_dir>/<clip_stem>/ with:

mask.png          # Binary water mask at full source resolution (always)
metadata.json     # Survey answers as JSON (always)
frame.png         # Middle frame of the clip (always)
overlay.png       # That frame with the mask blended in green (always)

# Only with --extras:
mask_vis.png               # Mask rendered as a greyscale PNG
video_original_hires.gif   # All frames at display resolution
video_original_lowres.gif  # All frames at 50% of display resolution
video_overlay_hires.gif    # Overlay GIF at display resolution
video_overlay_lowres.gif   # Overlay GIF at 50% of display resolution

Survey answers (metadata.json)

{
  "flow":          "Turbulent | Laminar | Uncertain",
  "shadows":       "Yes | No | Uncertain",
  "artifacts":     "Yes | No | Uncertain",
  "lighting":      "Day | Night | Uncertain",
  "exposure":      "Overexposed | Underexposed | Both | Normal | Uncertain",
  "snowing":       "Yes | No | Uncertain",
  "snow_on_ground":"Yes | No | Uncertain"
}
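
When consuming the annotations downstream, it can help to check a metadata.json against the options listed above. The sketch below assumes that every key is always present and holds exactly one of the listed values; the README does not say how unanswered questions are stored. ALLOWED_ANSWERS and find_invalid_answers are illustrative names, not part of the tool.

```python
# Allowed values per survey question, copied from the listing above.
YES_NO = {"Yes", "No", "Uncertain"}
ALLOWED_ANSWERS = {
    "flow": {"Turbulent", "Laminar", "Uncertain"},
    "shadows": YES_NO,
    "artifacts": YES_NO,
    "lighting": {"Day", "Night", "Uncertain"},
    "exposure": {"Overexposed", "Underexposed", "Both", "Normal", "Uncertain"},
    "snowing": YES_NO,
    "snow_on_ground": YES_NO,
}


def find_invalid_answers(answers: dict) -> list[str]:
    """Return one message per missing key or unexpected value."""
    problems = []
    for key, allowed in ALLOWED_ANSWERS.items():
        if key not in answers:
            problems.append(f"missing: {key}")
        elif answers[key] not in allowed:
            problems.append(f"invalid value for {key}: {answers[key]!r}")
    return problems
```

An empty list means the file matches the documented schema under these assumptions.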

How it works

Clip format

Each clip is a ZIP archive containing a left.mp4 video. The filename encodes the recording timestamp (e.g. left_20230615T120000.zip), which is used for sorting and daily filtering.
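
As an illustration of how the timestamp can be recovered from such a filename, the sketch below assumes the stem is always a camera prefix, an underscore, and a YYYYMMDDTHHMMSS stamp, as in the example above. parse_clip_timestamp is a hypothetical helper, not the tool's own function.

```python
from datetime import datetime
from pathlib import Path


def parse_clip_timestamp(path: str) -> datetime:
    """Extract the recording time from a name like left_20230615T120000.zip."""
    stem = Path(path).stem                 # "left_20230615T120000"
    _, _, stamp = stem.partition("_")      # drop the "left" prefix
    # Raises ValueError if the name does not follow the assumed pattern.
    return datetime.strptime(stamp, "%Y%m%dT%H%M%S")


names = ["left_20230616T080000.zip", "left_20230615T120000.zip"]
print(sorted(names, key=parse_clip_timestamp))
```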

Frame loading

Up to 100 frames are extracted from the video and scaled so that the longest side is 480 px. The annotator works on this display-resolution copy; the original frame dimensions are stored separately so that the mask can be upscaled back to the source resolution on save.
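
The scaling rule can be sketched as a small helper. It assumes the aspect ratio is preserved and the result is rounded to whole pixels; whether the tool also enlarges videos smaller than 480 px is not stated here, and this sketch simply scales in both directions. display_size is an illustrative name.

```python
def display_size(width: int, height: int, longest_side: int = 480) -> tuple[int, int]:
    """Scale (width, height) so that the longer side equals longest_side."""
    scale = longest_side / max(width, height)
    # Guard against a zero-sized dimension for extremely narrow frames.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(display_size(1920, 1080))   # landscape source
print(display_size(1080, 1920))   # portrait source
```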

Mask drawing

The mask is a binary NumPy array matching the display frame size. Each brush stroke stamps a filled circle of the selected radius, setting pixels to 1 (draw) or 0 (erase). The history stack stores a copy of the mask before each stroke, enabling unlimited undo. On save the mask is resized to the original video resolution with nearest-neighbour interpolation and written as an 8-bit PNG (0 or 255).
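
A minimal NumPy sketch of this scheme is shown below. MaskSketch and its method names are hypothetical and do not come from mask_canvas.py; the nearest-neighbour resize is done with plain index arithmetic instead of an imaging library, and writing the PNG itself is left out.

```python
import numpy as np


class MaskSketch:
    """Binary mask with circular brush stamps and an undo history."""

    def __init__(self, height: int, width: int) -> None:
        self.mask = np.zeros((height, width), dtype=np.uint8)
        self.history: list[np.ndarray] = []

    def begin_stroke(self) -> None:
        # Keep a copy of the mask as it was before the stroke starts.
        self.history.append(self.mask.copy())

    def stamp(self, x: int, y: int, radius: int, erase: bool = False) -> None:
        # Filled circle: every pixel within `radius` of (x, y).
        yy, xx = np.ogrid[: self.mask.shape[0], : self.mask.shape[1]]
        inside = (xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2
        self.mask[inside] = 0 if erase else 1

    def undo(self) -> None:
        if self.history:
            self.mask = self.history.pop()

    def export(self, full_height: int, full_width: int) -> np.ndarray:
        # Nearest-neighbour resize: map each output pixel to a source pixel.
        rows = np.arange(full_height) * self.mask.shape[0] // full_height
        cols = np.arange(full_width) * self.mask.shape[1] // full_width
        return self.mask[rows[:, None], cols] * np.uint8(255)
```

Calling begin_stroke once per mouse press and stamp for every mouse-move position gives one undo step per stroke rather than one per stamp.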

Clip selection

ClipSelector scans the data directory, builds a DataFrame of clips sorted by timestamp, and filters out clips that already have a mask.png. In daily mode it groups the remaining clips by calendar day and picks the one whose recording time is closest to the target time; on Next, it moves to the first clip of the following day.
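
The daily pick can be illustrated without pandas. The sketch below uses plain dictionaries instead of the DataFrame that ClipSelector builds, assumes already-annotated clips have been removed beforehand, and resolves ties in favour of the earlier clip; pick_daily_clips is a hypothetical name.

```python
from datetime import date, datetime, time


def pick_daily_clips(timestamps: dict[str, datetime],
                     target: time = time(12, 0)) -> dict[date, str]:
    """For each calendar day, return the clip stem closest to `target`."""

    def distance(ts: datetime) -> int:
        # Absolute difference in seconds between clip time and target time.
        target_seconds = target.hour * 3600 + target.minute * 60
        clip_seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
        return abs(clip_seconds - target_seconds)

    best: dict[date, str] = {}
    # Iterating in chronological order keeps the days sorted in the result.
    for stem, ts in sorted(timestamps.items(), key=lambda item: item[1]):
        day = ts.date()
        if day not in best or distance(ts) < distance(timestamps[best[day]]):
            best[day] = stem
    return best
```

Advancing on Next then amounts to stepping to the next key of the returned dictionary.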

Resuming

When a clip is loaded that already has a saved mask.png and metadata.json, the mask is restored at display resolution and the survey answers are pre-filled. Reload Saved lets you revert to the last save at any point during the current session.

Repository structure

src/river_annotation_tool/
    annotation_script.py   # Entry point — argument parsing and app launch
    annotator.py           # Main QMainWindow — orchestrates all components
    clip_selector.py       # Clip-picking logic (daily mode, time filtering)
    mask_canvas.py         # Drawing widget — brush, undo, erase, mouse events
    video_loader.py        # ZIP extraction and frame resizing
    config.py              # Config constants, question definitions, defaults
    __init__.py            # Package version
pyproject.toml             # Project metadata and dependencies

Development

# Install pre-commit hooks
pre-commit install
pre-commit run --all-files   # Run manually once

# Add a dependency
uv add <package>
uv add --dev <package>       # Development-only