River Annotation Tool
A desktop GUI application for manually annotating river video clips as part of the HydroScan project. Annotators draw pixel-level water masks over river footage and answer structured survey questions about flow conditions, lighting, and scene quality.
Requirements
- Python 3.12
- uv (recommended) or pip
Quick start
# 1. Clone and install
git clone https://github.com/HydroScan/river-annotation-tool
cd river-annotation-tool
uv sync
# 2. Create config and clip list from examples
cp config/config.example.yaml config/config.yaml
cp config/clips.example.txt config/clips.txt
# 3. Edit config/config.yaml (set data_dir and out_dir)
# Edit config/clips.txt (list clips to annotate)
# 4. Run
uv run python -m river_annotation_tool.annotation_script
Installation
# Install with uv (creates the virtual environment automatically)
uv sync
# Or with pip
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # macOS/Linux
pip install -e .
Setup
Before running, create your config and clip list from the provided examples:
cp config/config.example.yaml config/config.yaml
cp config/clips.example.txt config/clips.txt
Edit config/config.yaml to set your data_dir and out_dir, then edit config/clips.txt to list the clips you want to annotate. See the Configuration section for all available options.
S3 storage (optional)
By default the tool reads clips from and writes annotations to the local filesystem (storage: local). To use an S3-compatible object store instead, set storage: s3 in config/config.yaml and give data_dir / out_dir as bucket/prefix paths:
storage: s3
data_dir: my-bucket/clips
out_dir: my-bucket/annotation_results
Copy .env.example to .env and fill in your credentials — the app loads this file automatically at startup:
cp .env.example .env
# edit .env with your credentials
| Variable | Description |
|---|---|
| S3_ACCESS_KEY | Access key ID |
| S3_SECRET_ACCESS_KEY | Secret access key |
| S3_ENDPOINT_URL | Endpoint URL (defaults to https://os.zhdk.cloud.switch.ch if not set) |
| AWS_REQUEST_CHECKSUM_CALCULATION | Set to when_required to avoid checksum errors on SwitchEngines/Ceph |
| AWS_RESPONSE_CHECKSUM_VALIDATION | Set to when_required to avoid checksum errors on SwitchEngines/Ceph |
The clips_file (the list of clip filenames to annotate) is always read from the local filesystem, even when storage: s3 is set.
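As a rough illustration of how the credential variables and the endpoint default fit together, here is a minimal sketch using only the standard library. `resolve_s3_settings` and `DEFAULT_ENDPOINT` are hypothetical names, not part of the tool, and the app's own startup code may resolve these values differently.

```python
import os

# Default endpoint as documented for S3_ENDPOINT_URL.
DEFAULT_ENDPOINT = "https://os.zhdk.cloud.switch.ch"


def resolve_s3_settings(env=None):
    """Collect S3 settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    required = ("S3_ACCESS_KEY", "S3_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"missing S3 credentials: {', '.join(missing)}")
    return {
        "key": env["S3_ACCESS_KEY"],
        "secret": env["S3_SECRET_ACCESS_KEY"],
        # Fall back to the documented default when the variable is unset or empty.
        "endpoint_url": env.get("S3_ENDPOINT_URL") or DEFAULT_ENDPOINT,
    }
```

Passing a plain dict instead of `os.environ` makes it easy to check a `.env` file's contents before launching the GUI.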
Usage
uv run python -m river_annotation_tool.annotation_script
# or, if you have the venv activated:
python -m river_annotation_tool.annotation_script
Arguments
| Argument | Default | Description |
|---|---|---|
| --config | config/config.yaml | Path to the config YAML file |
| --data | (from config) | Override data_dir from config |
| --out | (from config) | Override out_dir from config |
| --clips | (from config) | Override clips_file from config |
| --clip | (first unannotated in list) | Open a specific clip by stem name |
| --extras | off | Also save GIFs and extra PNGs (see Output section) |
| --no-skip | off | Show already-annotated clips instead of skipping them |
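The arguments above map naturally onto `argparse`. The following is a sketch of one way to express them and to layer the overrides on top of the config values. `build_parser` and `apply_overrides` are hypothetical names, and the actual parser in annotation_script.py may be organised differently.

```python
import argparse


def build_parser():
    """Argument parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="annotation_script")
    parser.add_argument("--config", default="config/config.yaml",
                        help="Path to the config YAML file")
    # A value of None means "keep whatever the config file says".
    parser.add_argument("--data", default=None, help="Override data_dir from config")
    parser.add_argument("--out", default=None, help="Override out_dir from config")
    parser.add_argument("--clips", default=None, help="Override clips_file from config")
    parser.add_argument("--clip", default=None, help="Open a specific clip by stem name")
    parser.add_argument("--extras", action="store_true",
                        help="Also save GIFs and extra PNGs")
    parser.add_argument("--no-skip", action="store_true",
                        help="Show already-annotated clips instead of skipping them")
    return parser


def apply_overrides(config, args):
    """Return a copy of the config dict with command-line overrides applied."""
    merged = dict(config)
    for arg_name, config_key in (("data", "data_dir"),
                                 ("out", "out_dir"),
                                 ("clips", "clips_file")):
        value = getattr(args, arg_name)
        if value is not None:
            merged[config_key] = value
    return merged
```

Keeping the override defaults at `None` lets the config file remain the single source of truth unless a flag is given explicitly.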
Typical workflows
# Annotate clips listed in config/clips.txt (default)
uv run python -m river_annotation_tool.annotation_script
# Use a different config file
uv run python -m river_annotation_tool.annotation_script --config config/my_config.yaml
# Override paths from the command line
uv run python -m river_annotation_tool.annotation_script --data data/clips --out data/out
# Annotate a single specific clip
uv run python -m river_annotation_tool.annotation_script --clip left_20230615T120000
Configuration
All settings live in config/config.yaml. Copy config/config.example.yaml to get started.
storage: local # required: 'local' or 's3'
data_dir: data/clips # directory containing ZIP archives (local path or bucket/prefix for S3)
out_dir: data/annotation_results
clips_file: config/clips.txt
# optical_flow_config_file: config/optical_flow_config.yaml # optional, enables Auto Segment
display_max: 720 # longest side in pixels for display
fps_fallback: 25 # FPS to use if the video header is missing
max_frames: 100 # max frames to extract per clip
questions:
- section: River
items:
- key: flow
label: "Flow Regime"
options: [Turbulent, Laminar, Uncertain]
default: Laminar
# add more items or sections as needed
filenames:
video_in_zip: left.mp4 # video filename inside each ZIP archive
video_tmp_suffix: .mp4 # suffix for the extraction temp file
zip_extension: .zip # extension used when resolving clip names
mask: mask.png # saved water mask
metadata: metadata.json # saved survey answers
frame: frame.png # middle frame snapshot
overlay: overlay.png # frame with mask blended in green
mask_vis: mask_vis.png # greyscale mask PNG (--extras only)
gif_original_hires: video_original_hires.gif
gif_original_lowres: video_original_lowres.gif
gif_overlay_hires: video_overlay_hires.gif
gif_overlay_lowres: video_overlay_lowres.gif
Add, remove, or reorder questions directly in the YAML — the UI rebuilds automatically. key is what gets saved in metadata.json; default selects the pre-checked option (omit or set to null to leave unselected).
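To make the relationship between key, options, and default concrete, here is a small sketch that turns a parsed questions section into the initial answer set. `default_answers` is a hypothetical helper, the questions are written as a plain Python structure rather than loaded from YAML, and the "Shadows" label is invented for the example.

```python
def default_answers(questions):
    """Map each question key to its default option (None when no default is set)."""
    answers = {}
    for section in questions:
        for item in section["items"]:
            default = item.get("default")
            if default is not None and default not in item["options"]:
                raise ValueError(
                    f"default {default!r} is not an option for {item['key']!r}"
                )
            answers[item["key"]] = default
    return answers


# Same shape as the questions block in config.yaml, after YAML parsing.
questions = [
    {
        "section": "River",
        "items": [
            {"key": "flow", "label": "Flow Regime",
             "options": ["Turbulent", "Laminar", "Uncertain"], "default": "Laminar"},
            # No default: the question starts unselected.
            {"key": "shadows", "label": "Shadows",
             "options": ["Yes", "No", "Uncertain"]},
        ],
    }
]

initial = default_answers(questions)
```

Here `initial` holds "Laminar" for flow and None for shadows, which corresponds to a pre-checked option and an unselected question respectively.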
Optical flow segmentation (optional)
Set optical_flow_config_file in config.yaml to point to a YAML file that enables the Auto Segment button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
- Optical flow magnitude — pixels where the temporal median of frame-to-frame flow (scaled by FPS) exceeds a fraction of the maximum are considered moving water.
- Brightness — pixels outside a brightness window are excluded (removes sky, saturated glare, etc.).
# config/optical_flow_config.yaml
enabled: true
norm_squared_threshold: 0.06 # fraction of max flow² that counts as moving
gaussian_kernel: [5, 5] # blur kernel applied to the reference frame before brightness check
brightness_range: [2, 253] # [min, max] greyscale brightness to keep
enabled: false disables the button without removing the config file.
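The way the two criteria combine can be sketched without any image library. The example below assumes the squared flow magnitudes have already been computed per frame pair (the flow estimation itself is out of scope here) and works on small nested lists; a real implementation would use array operations. `segment_river` is a hypothetical name. Taking the maximum over the median field and treating the brightness window as inclusive are both assumptions, and FPS scaling is omitted because a uniform scale does not change a fraction-of-maximum test.

```python
from statistics import median


def segment_river(flow_sq, brightness,
                  norm_squared_threshold=0.06, brightness_range=(2, 253)):
    """Combine the flow and brightness criteria into a boolean mask.

    flow_sq: list of 2-D grids, one per frame pair, of squared flow magnitude.
    brightness: 2-D grid of greyscale values (0-255) from the blurred reference frame.
    """
    height, width = len(brightness), len(brightness[0])
    # Temporal median of the squared flow at every pixel.
    med = [[median(frame[y][x] for frame in flow_sq) for x in range(width)]
           for y in range(height)]
    peak = max(max(row) for row in med)
    low, high = brightness_range
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Moving water: median flow above a fraction of the peak.
            moving = peak > 0 and med[y][x] > norm_squared_threshold * peak
            # Drop sky, glare and deep shadow outside the brightness window.
            well_exposed = low <= brightness[y][x] <= high
            row.append(moving and well_exposed)
        mask.append(row)
    return mask
```

Using the temporal median rather than the mean makes the mask less sensitive to a single frame with a passing bird or a compression glitch.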
Clip list file
config/clips.txt lists the clip filenames to annotate, one per line. Lines starting with # are ignored. Clips are processed in order; already-annotated clips (those with an existing mask.png) are skipped automatically. Pass --no-skip to include them. When the last clip is reached, a dialog appears and the app exits.
# Example clips.txt
left_20230501T120000.zip
left_20230502T120000.zip
Copy config/clips.example.txt as a starting point.
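The parsing and skipping rules can be sketched in a few lines. `read_clip_list` and `pending_clips` are hypothetical helpers rather than the functions in clip_selector.py, and ignoring blank lines is an assumption, since only # comments are documented.

```python
from pathlib import Path


def read_clip_list(clips_file):
    """Return clip filenames in file order, ignoring blank lines and # comments."""
    clips = []
    for line in Path(clips_file).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            clips.append(line)
    return clips


def pending_clips(clips, out_dir, skip_annotated=True, mask_name="mask.png"):
    """Drop clips whose output folder already holds a mask, unless skipping is off."""
    if not skip_annotated:
        return list(clips)
    return [
        clip for clip in clips
        if not (Path(out_dir) / Path(clip).stem / mask_name).exists()
    ]
```

Calling `pending_clips(clips, out_dir, skip_annotated=False)` corresponds to running with --no-skip.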
Controls
The window is split into two panels: the video canvas on the left (~70% of the width) and the survey panel on the right. The video auto-plays as a looping preview. Drawing tools and mask controls are arranged above and beside the canvas; navigation buttons (Previous / Next / Skip) sit at the top.
Tool modes
Three drawing tools are available in the tool row. The active tool is highlighted in blue.
| Tool | How to activate | Description |
|---|---|---|
| Brush | Click Brush | Click and drag to paint the mask with a circular brush (default) |
| Polygon | Click Polygon | Click to place vertices and build closed shapes; use Fill mode to commit them |
| Fill | Click Fill | Click inside a closed polygon to fill it onto the mask |
Brush tool
| Action | How |
|---|---|
| Draw water mask | Click and drag on the video |
| Erase mask | Toggle Eraser button (turns orange when active), then drag |
| Brush preview | A white circle follows the cursor showing the current brush size |
| Adjust brush size | Brush size slider (2–50 px, default 5); click ↺ to reset |
Polygon tool
Polygons are drawn as overlays and do not affect the mask until you use Fill mode.
| Action | How |
|---|---|
| Add vertex | Left-click on the canvas |
| Remove last vertex | Right-click |
| Close a shape | Left-click near the first vertex (red dot) when ≥ 3 vertices are placed; completed shapes turn bold cyan |
| Draw multiple shapes | Each closed shape is kept independently; draw as many as needed |
| Cancel in-progress polygon | Cancel Current Poly — discards the unfinished polygon, keeps completed shapes |
| Delete last completed shape | Del Shape |
Fill tool
| Action | How |
|---|---|
| Fill a shape | Left-click anywhere inside a closed polygon; that shape's interior is painted onto the mask |
| Nested shapes | If a closed polygon lies entirely inside the target, its interior is left unfilled (acts as a hole) |
| Innermost shape | Clicking inside nested shapes always fills the innermost (smallest) polygon containing the click |
| Undo fill | Undo — each fill is a single undoable step |
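The innermost-shape and hole rules can be illustrated with plain geometry. This sketch picks the smallest polygon containing the click and treats any other polygon whose vertices all lie inside it as a hole. The function names are hypothetical, and the all-vertices test is a simplification of "lies entirely inside" that is exact only for convex targets.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon, a list of (x, y)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon):
    """Absolute area via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def pick_fill_target(click, polygons):
    """Return (target, holes), or (None, []) if the click is outside every polygon."""
    hits = [p for p in polygons if point_in_polygon(click, p)]
    if not hits:
        return None, []
    # Innermost shape: the smallest polygon that contains the click.
    target = min(hits, key=polygon_area)
    holes = [
        p for p in polygons
        if p is not target and all(point_in_polygon(v, target) for v in p)
    ]
    return target, holes
```

With an outer square and a smaller square nested inside it, a click in the inner square fills only the inner one, while a click in the surrounding ring fills the outer square with the inner one left as a hole.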
Mask editing
| Action | How |
|---|---|
| Undo last action | Undo |
| Undo 10 actions | Undo×10 |
| Redo | Redo |
| Clear entire mask | Clear |
| Toggle mask overlay | Hide Mask / Show Mask — button turns red when hidden; does not affect mask data |
| Mask transparency | Mask Alpha slider (0–100%, default 15%); click ↺ to reset |
Starting-point shortcuts
| Action | How |
|---|---|
| Load mask from previous clip | Load Prev Mask — copies the saved mask of the previous clip onto the current one; undoable |
| Optical flow first guess | Auto Segment — replaces the current mask with an automatic river segmentation; undoable. Only available when optical_flow_config_file is set in config.yaml. |
Image display adjustments
Three vertical sliders sit to the left of the video and affect display only — they do not change what is saved.
| Slider | Effect | Range |
|---|---|---|
| Brightness | Shifts all pixel values up or down | −100 to +100 |
| Contrast | Scales pixel values around the midpoint | −100 to +100 |
| Gamma | Applies a power-law correction (higher = brighter) | 0.1× to 3.0× |
Click ↺ below any slider to restore its default value.
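For readers who want to see what the three sliders mean numerically, here is one plausible per-pixel mapping. It is a sketch under stated assumptions: the order of operations, the mapping of the contrast range onto a scale factor, and the use of 1/gamma as the exponent (so that higher values brighten) are guesses consistent with the table, not the tool's confirmed formula. `adjust_pixel` is a hypothetical name.

```python
def adjust_pixel(value, brightness=0, contrast=0, gamma=1.0):
    """Apply display-only brightness, contrast and gamma to one 0-255 value."""
    # Contrast: scale around the midpoint; -100 flattens to grey, +100 doubles the slope.
    factor = 1.0 + contrast / 100.0
    v = (value - 127.5) * factor + 127.5
    # Brightness: shift all values up or down.
    v += brightness
    # Clamp to the valid range before the power law.
    v = min(max(v, 0.0), 255.0)
    # Gamma: power law on the normalised value; exponent 1/gamma so higher = brighter.
    v = 255.0 * (v / 255.0) ** (1.0 / gamma)
    return int(round(v))
```

Because the function only transforms the value used for display, applying it never alters the frames or the mask that are written to disk.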
Navigation
| Action | How |
|---|---|
| Save and continue | Next — saves current clip and loads the next one. If the clip already has a saved annotation a dialog asks whether to replace it or keep the existing save. |
| Go back | Previous — saves current clip and returns to the previously viewed clip. Disabled on the first clip. |
| Skip without saving | Skip — discards any unsaved changes and loads the next clip without writing anything to disk. |
Output
Each annotated clip produces a folder <out_dir>/<clip_stem>/ with:
mask.png # Binary water mask at full source resolution (always)
metadata.json # Survey answers as JSON (always)
frame.png # Middle frame of the clip (always)
overlay.png # That frame with the mask blended in green (always)
# Only with --extras:
mask_vis.png # Mask rendered as a greyscale PNG
video_original_hires.gif # All frames at display resolution
video_original_lowres.gif # All frames at 50% of display resolution
video_overlay_hires.gif # Overlay GIF at display resolution
video_overlay_lowres.gif # Overlay GIF at 50% of display resolution
All output filenames can be overridden via the filenames: section in config/config.yaml.
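The output layout and the filename overrides can be summarised as a small lookup. This is an illustrative sketch: `output_paths` is a hypothetical helper, and the defaults simply restate the filenames section of the example config.

```python
from pathlib import Path

# Keys written for every clip, and keys written only with --extras.
ALWAYS = ("mask", "metadata", "frame", "overlay")
EXTRAS = ("mask_vis", "gif_original_hires", "gif_original_lowres",
          "gif_overlay_hires", "gif_overlay_lowres")

DEFAULT_FILENAMES = {
    "mask": "mask.png",
    "metadata": "metadata.json",
    "frame": "frame.png",
    "overlay": "overlay.png",
    "mask_vis": "mask_vis.png",
    "gif_original_hires": "video_original_hires.gif",
    "gif_original_lowres": "video_original_lowres.gif",
    "gif_overlay_hires": "video_overlay_hires.gif",
    "gif_overlay_lowres": "video_overlay_lowres.gif",
}


def output_paths(out_dir, clip_stem, extras=False, filenames=None):
    """Return the files expected under <out_dir>/<clip_stem>/ keyed by config name."""
    names = {**DEFAULT_FILENAMES, **(filenames or {})}
    keys = ALWAYS + (EXTRAS if extras else ())
    folder = Path(out_dir) / clip_stem
    return {key: folder / names[key] for key in keys}
```

A downstream script can use the same mapping to check that an annotation folder is complete before consuming it.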
Survey answers (metadata.json)
Keys and values are determined by the questions section in config/config.yaml. With the default config:
{
"flow": "Turbulent | Laminar | Uncertain",
"shadows": "Yes | No | Uncertain",
"artifacts": "Yes | No | Uncertain",
"lighting": "Day | Night | Uncertain",
"exposure": "Overexposed | Underexposed | Both | Normal | Uncertain",
"snowing": "Yes | No | Uncertain",
"snow_on_ground": "Yes | No | Uncertain"
}
Internals
Clip format
Each clip is a ZIP archive containing a video file (default left.mp4, configurable via filenames.video_in_zip). The filename encodes the recording timestamp (e.g. left_20230615T120000.zip).
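Reading a clip in this format needs only the standard library. The sketch below parses the timestamp and copies the video out of the archive into a temporary file. Both function names are hypothetical, and the timestamp pattern is inferred from the example filenames rather than taken from a specification.

```python
import tempfile
import zipfile
from datetime import datetime
from pathlib import Path


def clip_timestamp(clip_name):
    """Parse the recording time from a name such as left_20230615T120000.zip."""
    stamp = Path(clip_name).stem.rsplit("_", 1)[-1]
    return datetime.strptime(stamp, "%Y%m%dT%H%M%S")


def extract_video(zip_path, video_in_zip="left.mp4", tmp_suffix=".mp4"):
    """Copy the video out of the archive into a temp file and return its path.

    The caller is responsible for deleting the temp file afterwards.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Raises KeyError if the archive does not contain the expected member.
        data = archive.read(video_in_zip)
    with tempfile.NamedTemporaryFile(suffix=tmp_suffix, delete=False) as tmp:
        tmp.write(data)
    return Path(tmp.name)
```

Extracting to a real file rather than reading from memory is convenient because most video decoders expect a filesystem path.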
Frame loading
Up to max_frames frames are extracted from the video and scaled so the longest side is display_max px. This display-resolution copy is what the annotator works on; the full-resolution dimensions are remembered separately so the saved mask is upscaled back to the original size on export.
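The two resolutions involved can be made concrete with a short sketch: one function computes the display size from the source size, the other scales a display-resolution mask back up. The names are hypothetical, and nearest-neighbour resampling is an assumption chosen because it keeps the mask strictly binary; the tool may use a different interpolation.

```python
def display_size(width, height, display_max=720):
    """Scale (width, height) so that the longest side equals display_max."""
    scale = display_max / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def upscale_mask(mask, width, height):
    """Nearest-neighbour resize of a 2-D 0/1 grid to width x height, as 0/255 values."""
    src_h, src_w = len(mask), len(mask[0])
    return [
        [255 if mask[y * src_h // height][x * src_w // width] else 0
         for x in range(width)]
        for y in range(height)
    ]
```

For a 1920x1080 source and display_max of 720, the annotator works on a 720x405 copy and the saved mask is resized back to 1920x1080.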
Mask drawing
The mask is a binary array at display resolution. Brush strokes stamp a filled circle (draw or erase). Polygon shapes are stored as overlays and don't touch the mask until a Fill click rasterises them — the innermost polygon containing the click is filled, and any polygon whose centroid falls inside it is punched out as a hole.
Every mask-changing operation is pushed onto an undo stack before it executes. On save, the mask is upscaled to the original video resolution and written as an 8-bit PNG (0 or 255).
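The push-before-execute pattern can be sketched with a snapshot-based history. `MaskHistory` is a hypothetical class working on nested lists, not the widget in mask_canvas.py, and it records every stamp as one step, whereas the real tool may group a whole drag into a single undo entry.

```python
import copy


class MaskHistory:
    """Binary mask with snapshot-based undo and redo (illustrative only)."""

    def __init__(self, width, height):
        self.mask = [[0] * width for _ in range(height)]
        self._undo = []
        self._redo = []

    def _checkpoint(self):
        # Save the current state before changing it; a new edit invalidates redo.
        self._undo.append(copy.deepcopy(self.mask))
        self._redo.clear()

    def stamp(self, cx, cy, radius, erase=False):
        """Draw or erase a filled circle centred on (cx, cy)."""
        self._checkpoint()
        value = 0 if erase else 1
        rows, cols = len(self.mask), len(self.mask[0])
        for y in range(max(0, cy - radius), min(rows, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(cols, cx + radius + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    self.mask[y][x] = value

    def undo(self, steps=1):
        """Step back up to `steps` snapshots, stopping when the history is empty."""
        for _ in range(min(steps, len(self._undo))):
            self._redo.append(self.mask)
            self.mask = self._undo.pop()

    def redo(self):
        """Reapply the most recently undone change, if any."""
        if self._redo:
            self._undo.append(self.mask)
            self.mask = self._redo.pop()
```

Full snapshots are simple and robust for masks at display resolution; storing per-stroke diffs would reduce memory use at the cost of more bookkeeping.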
Resuming
When a clip is loaded that already has a saved mask.png and metadata.json, the mask is restored at display resolution and the survey answers are pre-filled.
Repository structure
.env.example # S3 credential template (copy to .env and fill in)
config/
config.yaml # Your local config (git-ignored, copy from example)
config.example.yaml # Example config to copy and edit
clips.txt # Your clip list (git-ignored, copy from example)
clips.example.txt # Example clip list
optical_flow_config.yaml # Optional optical flow parameters (enable via config.yaml)
src/river_annotation_tool/
annotation_script.py # Entry point — argument parsing and app launch
annotator.py # Main QMainWindow — orchestrates all components
clip_selector.py # Reads the clip list and picks the next clip
filesystem.py # Storage backend — local passthrough or S3 via s3fs
mask_canvas.py # Drawing widget — brush, undo, erase, mouse events
video_loader.py # ZIP extraction and frame resizing
compute_optical_flow.py # Optical flow river segmentation (Auto Segment button)
config.py # AppConfig dataclass and YAML loader
__init__.py # Package version
pyproject.toml # Project metadata and dependencies
Development
# Install pre-commit hooks
uv run pre-commit install
uv run pre-commit run --all-files # Run manually once
# Add a dependency
uv add <package>
uv add --dev <package> # Development-only