Split into several files

2026-05-20 13:26:03 +02:00
parent 07aaac08ef
commit 5f8c579247
7 changed files with 715 additions and 621 deletions

138
README.md

@@ -1,14 +1,6 @@
# River Annotation Tool
A desktop application for manually annotating river video clips as part of the [HydroScan](https://github.com/HydroScan) project. It lets annotators draw pixel-level masks over river regions of interest and answer structured survey questions about flow conditions, lighting, and scene quality.
## Features
- Load river video clips from ZIP archives (containing MP4s)
- Draw and erase masks with an adjustable brush on video frames
- Cycle through all frames with auto-playback at native FPS
- Answer structured questions across three categories: **River**, **Scene**, and **Weather**
- Resume saved annotation sessions; exports masks, metadata, and overlay GIFs
A desktop application for manually annotating river video clips as part of the [HydroScan](https://github.com/HydroScan) project. Annotators draw pixel-level water masks over river footage and answer structured survey questions about flow conditions, lighting, and scene quality.
## Requirements
@@ -22,66 +14,134 @@ A desktop application for manually annotating river video clips as part of the [
git clone <repo-url>
cd river-annotation-tool
# Install dependencies (creates a virtual environment automatically with uv)
# Install with uv (creates the virtual environment automatically)
uv sync
# Or with pip
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # macOS/Linux
.venv\Scripts\activate # Windows
# source .venv/bin/activate # macOS/Linux
pip install -e .
```
## Usage
```sh
python -m river_annotation_tool.annotation_script \
--data <path/to/zip/files> \
--out <path/to/output/dir> \
[--clip <clip_name>]
python -m river_annotation_tool.annotation_script --data <path/to/zips> --out <path/to/output>
```
### Arguments
| Argument | Default | Description |
|---|---|---|
| `--data` | `../torrent-flow/data/examples_for_annotations/` | Directory containing ZIP files |
| `--out` | `data/annotation_results/` | Output directory for saved annotations |
| `--clip` | *(first clip)* | Specific clip to open (e.g. `left_20230501`) |
| `--data` | *(hardcoded path)* | Directory containing ZIP archives of clips |
| `--out` | `data/annotation_results/` | Directory where annotations are written |
| `--clip` | *(first unannotated clip)* | Open a specific clip by stem name (e.g. `left_20230501`) |
| `--time` | — | Target time of day `HH:MM` — picks the clip closest to this time for each day |
| `--daily` | off | Annotate one clip per day (at `--time`, default noon); advances to the next day on **Next** |
| `--skip-existing-day` | off | With `--daily`, skip entire days that already have any annotated clip |
| `--extras` | off | Also save GIFs and extra PNGs (see Output section) |
### Controls
### Typical workflows
```sh
# Annotate clips in chronological order (default)
python -m river_annotation_tool.annotation_script --data data/clips --out data/out
# One clip per day, always at the noon recording
python -m river_annotation_tool.annotation_script --data data/clips --out data/out --daily --time 12:00
# Resume a daily run, skip days already touched
python -m river_annotation_tool.annotation_script --data data/clips --out data/out \
--daily --time 12:00 --skip-existing-day
# Annotate a single specific clip
python -m river_annotation_tool.annotation_script --data data/clips --out data/out \
--clip left_20230615T120000
```
## Controls
The window shows the video on the left (auto-playing) and the survey panel on the right.
| Action | How |
|---|---|
| Draw mask | Click and drag on the canvas |
| Draw water mask | Click and drag on the video |
| Erase mask | Toggle **Eraser** button, then drag |
| Undo last stroke | **Undo** button |
| Play/pause frames | **Play / Pause** button |
| Save annotation | **Save** button |
| Change brush size | Slider in the toolbar |
| Undo last stroke | **Undo** |
| Clear entire mask | **Clear** |
| Adjust brush size | Slider next to the erase controls |
| Save and continue | **Next** — saves current clip and loads the next one |
| Skip without saving | **Skip** — discards changes and loads the next one |
| Save only | **Save** — writes to disk without advancing |
| Restore last save | **Reload Saved** — reverts mask and answers to what was last written |
## Output
Each clip is saved to `<output_dir>/<clip_stem>/`:
Each annotated clip produces a folder `<output_dir>/<clip_stem>/` with:
```
mask.png # Binary mask at full resolution
metadata.json # Survey answers
frame.png # Key frame
mask_vis.png # Mask visualisation
overlay.png # Frame + mask overlay
video_original_hires.gif
video_original_lowres.gif
video_overlay_hires.gif
video_overlay_lowres.gif
mask.png # Binary water mask at full source resolution (always)
metadata.json # Survey answers as JSON (always)
frame.png # Middle frame of the clip (always)
overlay.png # That frame with the mask blended in green (always)
# Only with --extras:
mask_vis.png # Mask rendered as a greyscale PNG
video_original_hires.gif # All frames at display resolution
video_original_lowres.gif # All frames at 50% of display resolution
video_overlay_hires.gif # Overlay GIF at display resolution
video_overlay_lowres.gif # Overlay GIF at 50% of display resolution
```
## Repository Structure
### Survey answers (`metadata.json`)
```json
{
"flow": "Turbulent | Laminar | Uncertain",
"shadows": "Yes | No | Uncertain",
"artifacts": "Yes | No | Uncertain",
"lighting": "Day | Night | Uncertain",
"exposure": "Overexposed | Underexposed | Both | Normal | Uncertain",
"snowing": "Yes | No | Uncertain",
"snow_on_ground":"Yes | No | Uncertain"
}
```
## How it works
### Clip format
Each clip is a ZIP archive containing a `left.mp4` video. The filename encodes the recording timestamp (e.g. `left_20230615T120000.zip`), which is used for sorting and daily filtering.
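As a rough illustration of the naming convention above, the timestamp could be recovered from a clip name as follows. The helper `clip_timestamp` is hypothetical, and the `%Y%m%dT%H%M%S` layout is assumed from the example filename; the tool itself parses the stem with `pd.to_datetime`.

```python
from datetime import datetime
from pathlib import Path


def clip_timestamp(path: Path) -> datetime:
    """Parse the recording time from a name like 'left_20230615T120000.zip'.

    Hypothetical helper; assumes the '<camera>_<YYYYMMDDTHHMMSS>' layout
    shown in the example filename.
    """
    # Everything after the first underscore in the stem is the timestamp.
    stamp = path.stem.split("_", 1)[1]
    return datetime.strptime(stamp, "%Y%m%dT%H%M%S")


ts = clip_timestamp(Path("left_20230615T120000.zip"))
print(ts.isoformat())  # 2023-06-15T12:00:00
```

Sorting clips by this value gives the chronological order used for the default workflow.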
### Frame loading
Up to 100 frames are extracted from the video and scaled so the longest side is 480 px. This display-resolution copy is what the annotator works on; the full-resolution dimensions are remembered separately so the saved mask is upscaled back to the original size on export.
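The scaling rule can be sketched in a few lines. This is a minimal sketch that mirrors the arithmetic only (one scale factor taken from the longer side, then truncation to whole pixels); `display_size` is a hypothetical name, and the actual resizing in the tool is done with OpenCV.

```python
from typing import Tuple


def display_size(height: int, width: int, display_max: int = 480) -> Tuple[int, int]:
    """Return (display_height, display_width) with the longest side at display_max."""
    # One scale factor for both axes keeps the aspect ratio.
    scale = display_max / max(height, width)
    return int(height * scale), int(width * scale)


print(display_size(1080, 1920))  # (270, 480)
```

Note that the same factor is applied to clips smaller than 480 px, so those are enlarged rather than left at their native size.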
### Mask drawing
The mask is a binary NumPy array matching the display frame size. Each brush stroke stamps a filled circle of the selected radius, setting pixels to 1 (draw) or 0 (erase). The history stack stores a copy of the mask before each stroke, enabling unlimited undo. On save the mask is resized to the original video resolution with nearest-neighbour interpolation and written as an 8-bit PNG (0 or 255).
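The stamp-and-undo idea can be illustrated without any dependencies. The sketch below uses plain nested lists instead of NumPy arrays, so it is not the tool's implementation; the function name `stamp` and the 7x7 mask are chosen only for the example.

```python
def stamp(mask, cx, cy, radius, erase=False):
    """Set every pixel within `radius` of (cx, cy) to 1, or to 0 when erasing."""
    height, width = len(mask), len(mask[0])
    value = 0 if erase else 1
    # Only visit the bounding box of the circle, clipped to the mask edges.
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                mask[y][x] = value


mask = [[0] * 7 for _ in range(7)]
history = []

history.append([row[:] for row in mask])  # snapshot before the stroke
stamp(mask, cx=3, cy=3, radius=2)
painted = sum(map(sum, mask))
print(painted)  # 13

mask = history.pop()  # undo restores the snapshot
```

Storing a full copy per snapshot is simple but memory-hungry; at a 480 px display size that cost is small.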
### Clip selection
`ClipSelector` scans the data directory, builds a DataFrame of clips sorted by timestamp, and filters out clips that already have a `mask.png`. In daily mode it groups the remaining clips by calendar day and, for each day, picks the clip whose recording time is closest to the target time (`--time`, noon by default); on **Next**, it moves to the closest-to-target clip of the next later day that still has unannotated clips.
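The per-day selection can be sketched with the standard library alone. This is an illustrative stand-in for the pandas-based logic, assuming the distance is measured in hours and minutes only; `closest_per_day` is a hypothetical name.

```python
from datetime import datetime
from itertools import groupby


def closest_per_day(stamps, target="12:00"):
    """Map each calendar day to the timestamp closest to `target` (HH:MM)."""
    hour, minute = map(int, target.split(":"))
    target_s = hour * 3600 + minute * 60

    def distance(ts):
        # Seconds are ignored; only hour and minute count.
        return abs(ts.hour * 3600 + ts.minute * 60 - target_s)

    picks = {}
    # groupby needs sorted input so that each day forms one contiguous run.
    for day, group in groupby(sorted(stamps), key=lambda ts: ts.date()):
        picks[day] = min(group, key=distance)
    return picks


stamps = [
    datetime(2023, 6, 15, 8, 0),
    datetime(2023, 6, 15, 11, 30),
    datetime(2023, 6, 15, 16, 0),
    datetime(2023, 6, 16, 13, 0),
]
picks = closest_per_day(stamps)
for day, ts in picks.items():
    print(day, ts.time())
```

In daily mode only the earliest day in this mapping is opened; the later days are reached one at a time via **Next**.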
### Resuming
When a clip is loaded that already has a saved `mask.png` and `metadata.json`, the mask is restored at display resolution and the survey answers are pre-filled. **Reload Saved** lets you revert to the last save at any point during the current session.
## Repository structure
```
src/river_annotation_tool/
annotation_script.py # Main GUI application
annotation_script.py # Entry point — argument parsing and app launch
annotator.py # Main QMainWindow — orchestrates all components
clip_selector.py # Clip-picking logic (daily mode, time filtering)
mask_canvas.py # Drawing widget — brush, undo, erase, mouse events
video_loader.py # ZIP extraction and frame resizing
config.py # Config constants, question definitions, defaults
__init__.py # Package version
pyproject.toml # Project metadata and dependencies
requirements.txt # Pinned dependencies (generated)
```
## Development
@@ -89,7 +149,7 @@ requirements.txt # Pinned dependencies (generated)
```sh
# Install pre-commit hooks
pre-commit install
pre-commit run --all-files # Run hooks manually once
pre-commit run --all-files # Run manually once
# Add a dependency
uv add <package>


@@ -1,595 +1,49 @@
import os
import zipfile
import tempfile
import json
import argparse
from pathlib import Path
import cv2
import numpy as np
import pandas as pd
from PIL import Image
from matplotlib import use
use("QtAgg")
from PySide6.QtWidgets import (
QApplication,
QMainWindow,
QWidget,
QPushButton,
QVBoxLayout,
QHBoxLayout,
QLabel,
QRadioButton,
QButtonGroup,
QGroupBox,
QSlider,
)
from PySide6.QtCore import Qt, QTimer
from PySide6.QtWidgets import QApplication
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.figure import Figure
from .annotator import Annotator
# ─────────────────────────────────────────────
# CONFIG
# ─────────────────────────────────────────────
class Config:
DISPLAY_MAX = 480
FPS_FALLBACK = 25
MAX_FRAMES = 100
# ─────────────────────────────────────────────
# QUESTIONS
# ─────────────────────────────────────────────
QUESTIONS = [
(
"River",
[
("flow", "Flow Regime", ["Turbulent", "Laminar", "Uncertain"]),
("shadows", "Strong Shadows", ["Yes", "No", "Uncertain"]),
("artifacts", "Artifacts on River", ["Yes", "No", "Uncertain"]),
],
),
(
"Scene",
[
("lighting", "Lighting", ["Day", "Night", "Uncertain"]),
(
"exposure",
"Exposure",
["Overexposed", "Underexposed", "Both", "Normal", "Uncertain"],
),
],
),
(
"Weather",
[
("snowing", "Snowing", ["Yes", "No", "Uncertain"]),
("snow_on_ground", "Snow on Ground", ["Yes", "No", "Uncertain"]),
],
),
]
# ─────────────────────────────────────────────
# DEFAULTS
# ─────────────────────────────────────────────
DEFAULTS = {
"flow": "Laminar",
"shadows": "No",
"artifacts": "No",
"lighting": "Day",
"exposure": "Normal",
"snowing": "No",
"snow_on_ground": "No",
}
# ─────────────────────────────────────────────
# VIDEO LOADING
# ─────────────────────────────────────────────
def load_frames(zip_path: Path, max_frames: int):
video_bytes = zipfile.ZipFile(zip_path).read("left.mp4")
with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
f.write(video_bytes)
tmp_path = f.name
cap = cv2.VideoCapture(tmp_path)
fps = cap.get(cv2.CAP_PROP_FPS) or Config.FPS_FALLBACK
frames = []
while len(frames) < max_frames:
ok, frame = cap.read()
if not ok:
break
frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
cap.release()
os.unlink(tmp_path)
if not frames:
raise RuntimeError(f"No frames found in {zip_path}")
h, w = frames[0].shape[:2]
scale = Config.DISPLAY_MAX / max(h, w)
dh, dw = int(h * scale), int(w * scale)
frames = [cv2.resize(f, (dw, dh)) for f in frames]
return frames, fps, dh, dw, h, w
# ─────────────────────────────────────────────
# MAIN APP
# ─────────────────────────────────────────────
class Annotator(QMainWindow):
def __init__(self, data_dir: Path, out_dir: Path, clip: str = None, target_time: str = None, daily: bool = False, extras: bool = False, skip_existing_day: bool = False):
super().__init__()
self.data_dir = Path(data_dir)
self.out_dir = Path(out_dir)
self.target_time = target_time
self.daily = daily
self.extras = extras
self.skip_existing_day = skip_existing_day
self.current_date = None
self.history = []
self.erase_mode = False
self.frame_i = 0
self.drawing = False
self._pending_answers = None
self.setWindowTitle("River Annotator")
self.df = self._load_dataset()
self._load_clip(specific=clip)
self._init_canvas()
self._init_ui()
self._init_timer()
# ─────────────────────────────
# DATA
# ─────────────────────────────
def _load_dataset(self):
files = list(self.data_dir.glob("*.zip"))
if not files:
raise FileNotFoundError(f"No zip files in {self.data_dir}")
df = pd.DataFrame({"filename": files})
df["datetime"] = df["filename"].apply(
lambda x: pd.to_datetime(x.stem.split("_")[1], errors="coerce")
)
# sort by datetime
df = df.sort_values("datetime").reset_index(drop=True)
return df
def _load_clip(self, specific: str = None, next_day: bool = False):
if specific is not None:
matches = list(self.data_dir.glob(f"{specific}.zip"))
if not matches:
p = self.data_dir / specific
matches = [p] if p.exists() else []
if not matches:
raise FileNotFoundError(f"Clip '{specific}' not found in {self.data_dir}")
self.filename = matches[0]
else:
remaining = [
f
for f in self.df["filename"]
if not (self.out_dir / f.stem / "mask.png").exists()
]
if not remaining:
raise RuntimeError("No remaining clips to annotate")
if self.target_time or self.daily:
# Parse target time (format: HH:MM)
if self.target_time:
target_hour, target_minute = map(int, self.target_time.split(":"))
else:
target_hour, target_minute = 12, 0 # Default to noon
target_seconds = target_hour * 3600 + target_minute * 60
# Get datetimes for remaining files
remaining_datetimes = [
self.df[self.df["filename"] == f]["datetime"].values[0]
for f in remaining
]
# Group by day
df_remaining = pd.DataFrame({
"filename": remaining,
"datetime": remaining_datetimes
})
df_remaining["date"] = df_remaining["datetime"].dt.date
# In daily mode, filter to next day if needed
if self.daily and next_day and self.current_date is not None:
import datetime
next_date = self.current_date + datetime.timedelta(days=1)
df_remaining = df_remaining[df_remaining["date"] >= next_date]
# In daily mode, skip entire days that already have any annotated clip
if self.daily and self.skip_existing_day:
annotated_dates = set()
for f in self.df["filename"]:
if (self.out_dir / f.stem / "mask.png").exists():
dt = self.df[self.df["filename"] == f]["datetime"].values[0]
annotated_dates.add(pd.Timestamp(dt).date())
df_remaining = df_remaining[~df_remaining["date"].isin(annotated_dates)]
if df_remaining.empty:
raise RuntimeError("No remaining clips to annotate")
# For each day, find the clip closest to target time
closest_clips = []
dates_list = []
for date, group in df_remaining.groupby("date"):
group = group.copy()
group["time_seconds"] = group["datetime"].dt.hour * 3600 + group["datetime"].dt.minute * 60
group["time_diff"] = (group["time_seconds"] - target_seconds).abs()
closest = group.loc[group["time_diff"].idxmin()]
closest_clips.append(closest["filename"])
dates_list.append(date)
# In daily mode, take only the first day's clip
if self.daily:
self.filename = closest_clips[0]
self.current_date = dates_list[0]
else:
# Take the first one (earliest by date/time)
self.filename = closest_clips[0]
self.current_date = dates_list[0]
else:
# take the earliest one (after sorting by datetime)
self.filename = remaining[0]
# Extract date from filename
import datetime
dt = self.df[self.df["filename"] == self.filename]["datetime"].values[0]
self.current_date = pd.Timestamp(dt).date()
self.frames, self.fps, self.dh, self.dw, self.h, self.w = load_frames(
self.filename, Config.MAX_FRAMES
)
self.history = []
self.mask = np.zeros((self.dh, self.dw), dtype=np.uint8)
self._pending_answers = None
out = self.out_dir / self.filename.stem
mask_path = out / "mask.png"
meta_path = out / "metadata.json"
if mask_path.exists():
mask_full = np.array(Image.open(mask_path).convert("L"))
self.mask = cv2.resize(
(mask_full > 127).astype(np.uint8),
(self.dw, self.dh),
interpolation=cv2.INTER_NEAREST,
)
if meta_path.exists():
with open(meta_path) as f:
self._pending_answers = json.load(f)
def _set_answers(self, answers: dict):
for key, value in answers.items():
if key not in self.q_widgets:
continue
_, buttons, options = self.q_widgets[key]
for i, btn in enumerate(buttons):
btn.setChecked(options[i] == value)
# ─────────────────────────────
# UI
# ─────────────────────────────
def _init_canvas(self):
self.fig = Figure()
self.canvas = FigureCanvas(self.fig)
self.ax = self.fig.add_subplot(111)
self.ax.axis("off")
self.img = self.ax.imshow(self.frames[0])
self.mask_img = self.ax.imshow(np.zeros((self.dh, self.dw, 4)))
self.title_text = self.ax.set_title(self.filename.name, fontsize=10, pad=4)
def _init_ui(self):
self.q_widgets = {}
question_box = QVBoxLayout()
for section, qs in QUESTIONS:
group = QGroupBox(section)
vbox = QVBoxLayout()
for key, label, options in qs:
vbox.addWidget(QLabel(label))
btn_group = QButtonGroup(self)
row = QHBoxLayout()
buttons = []
default_value = DEFAULTS.get(key)
for opt in options:
btn = QRadioButton(opt)
btn_group.addButton(btn)
row.addWidget(btn)
buttons.append(btn)
if default_value == opt:
btn.setChecked(True)
if default_value is None and buttons:
buttons[-1].setChecked(True)
self.q_widgets[key] = (btn_group, buttons, options)
vbox.addLayout(row)
group.setLayout(vbox)
question_box.addWidget(group)
# Controls
self.btn_save = QPushButton("Save")
self.btn_next = QPushButton("Next")
self.btn_skip = QPushButton("Skip")
self.btn_clear = QPushButton("Clear")
self.btn_erase = QPushButton("Eraser")
self.btn_undo = QPushButton("Undo")
self.btn_reload = QPushButton("Reload Saved")
self.brush_slider = QSlider(Qt.Horizontal)
self.brush_slider.setRange(2, 50)
self.brush_slider.setValue(5)
row1 = QHBoxLayout()
for b in [self.btn_save, self.btn_next, self.btn_skip]:
row1.addWidget(b)
row2 = QHBoxLayout()
for b in [self.btn_clear, self.btn_erase, self.btn_undo, self.btn_reload]:
row2.addWidget(b)
row2.addWidget(QLabel("Brush"))
row2.addWidget(self.brush_slider)
left = QVBoxLayout()
left.addWidget(self.canvas)
left.addLayout(row1)
left.addLayout(row2)
main = QHBoxLayout()
left_widget = QWidget()
left_widget.setLayout(left)
right_widget = QWidget()
right_widget.setLayout(question_box)
main.addWidget(left_widget, 3)
main.addWidget(right_widget, 2)
container = QWidget()
container.setLayout(main)
self.setCentralWidget(container)
# events
self.btn_save.clicked.connect(self.save)
self.btn_next.clicked.connect(self.next_clip)
self.btn_skip.clicked.connect(self.skip_clip)
self.btn_clear.clicked.connect(self.clear_mask)
self.btn_erase.clicked.connect(self.toggle_eraser)
self.btn_undo.clicked.connect(self.undo)
self.btn_reload.clicked.connect(self.reload_saved)
self.canvas.mpl_connect("button_press_event", self.on_press)
self.canvas.mpl_connect("motion_notify_event", self.on_move)
self.canvas.mpl_connect("button_release_event", self.on_release)
if self._pending_answers:
self._set_answers(self._pending_answers)
self._pending_answers = None
def _init_timer(self):
self.timer = QTimer()
self.timer.timeout.connect(self.update_frame)
self.timer.start(int(1000 / self.fps))
# ─────────────────────────────
# ANNOTATION
# ─────────────────────────────
def get_answers(self):
out = {}
for key, (group, buttons, options) in self.q_widgets.items():
for i, btn in enumerate(buttons):
if btn.isChecked():
out[key] = options[i]
return out
def stamp(self, x, y):
if x is None or y is None:
return
self.history.append(self.mask.copy())
r = self.brush_slider.value()
ix, iy = int(x), int(y)
y0, y1 = max(0, iy - r), min(self.dh, iy + r + 1)
x0, x1 = max(0, ix - r), min(self.dw, ix + r + 1)
Y, X = np.ogrid[y0:y1, x0:x1]
circle = (X - ix) ** 2 + (Y - iy) ** 2 <= r**2
self.mask[y0:y1, x0:x1][circle] = 0 if self.erase_mode else 1
self.redraw_mask()
def redraw_mask(self):
rgba = np.zeros((self.dh, self.dw, 4))
rgba[..., 1] = self.mask * 0.7
rgba[..., 3] = self.mask * 0.4
self.mask_img.set_data(rgba)
self.canvas.draw_idle()
# ─────────────────────────────
# EVENTS
# ─────────────────────────────
def on_press(self, e):
if e.xdata is None:
return
self.drawing = True
self.stamp(e.xdata, e.ydata)
def on_move(self, e):
if self.drawing:
self.stamp(e.xdata, e.ydata)
def on_release(self, _):
self.drawing = False
def update_frame(self):
self.frame_i = (self.frame_i + 1) % len(self.frames)
self.img.set_data(self.frames[self.frame_i])
self.canvas.draw_idle()
# ─────────────────────────────
# HELPERS
# ─────────────────────────────
def _make_overlay(self, frame, alpha=0.4):
overlay = frame.copy()
green = np.zeros_like(frame)
green[..., 1] = 255
m = self.mask.astype(bool)
overlay[m] = (1 - alpha) * overlay[m] + alpha * green[m]
return overlay.astype(np.uint8)
def _save_gif(self, frames, out_path, scale=1.0):
h, w = frames[0].shape[:2]
nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
pil_frames = [Image.fromarray(cv2.resize(f, (nw, nh))) for f in frames]
pil_frames[0].save(
out_path,
save_all=True,
append_images=pil_frames[1:],
duration=int(1000 / self.fps),
loop=0,
)
# ─────────────────────────────
# ACTIONS
# ─────────────────────────────
def reload_saved(self):
out = self.out_dir / self.filename.stem
mask_path = out / "mask.png"
meta_path = out / "metadata.json"
if not mask_path.exists():
return
mask_full = np.array(Image.open(mask_path).convert("L"))
self.mask = cv2.resize(
(mask_full > 127).astype(np.uint8),
(self.dw, self.dh),
interpolation=cv2.INTER_NEAREST,
)
self.history = []
self.redraw_mask()
if meta_path.exists():
with open(meta_path) as f:
self._set_answers(json.load(f))
def clear_mask(self):
self.mask[:] = 0
self.redraw_mask()
def undo(self):
if self.history:
self.mask = self.history.pop()
self.redraw_mask()
def toggle_eraser(self):
self.erase_mode = not self.erase_mode
self.btn_erase.setText("Eraser ON" if self.erase_mode else "Eraser")
def save(self):
out = self.out_dir / self.filename.stem
out.mkdir(parents=True, exist_ok=True)
mask_full = cv2.resize(
self.mask.astype(np.uint8),
(self.w, self.h),
interpolation=cv2.INTER_NEAREST,
)
Image.fromarray(mask_full * 255).save(out / "mask.png")
with open(out / "metadata.json", "w") as f:
json.dump(self.get_answers(), f, indent=2)
mid = len(self.frames) // 2
frame = self.frames[mid]
overlay_frame = self._make_overlay(frame)
Image.fromarray(frame).save(out / "frame.png")
Image.fromarray(overlay_frame).save(out / "overlay.png")
if self.extras:
Image.fromarray((self.mask * 255).astype(np.uint8)).save(out / "mask_vis.png")
overlay_frames = [self._make_overlay(f) for f in self.frames]
self._save_gif(self.frames, out / "video_original_hires.gif", scale=1.0)
self._save_gif(self.frames, out / "video_original_lowres.gif", scale=0.5)
self._save_gif(overlay_frames, out / "video_overlay_hires.gif", scale=1.0)
self._save_gif(overlay_frames, out / "video_overlay_lowres.gif", scale=0.5)
print("Saved:", out)
def next_clip(self):
self.save()
self._load_clip(next_day=self.daily)
self.frame_i = 0
self.img.set_data(self.frames[0])
self.title_text.set_text(self.filename.name)
self.redraw_mask()
if self._pending_answers:
self._set_answers(self._pending_answers)
self._pending_answers = None
def skip_clip(self):
self._load_clip(next_day=self.daily)
self.frame_i = 0
self.img.set_data(self.frames[0])
self.title_text.set_text(self.filename.name)
self.redraw_mask()
if self._pending_answers:
self._set_answers(self._pending_answers)
self._pending_answers = None
# ─────────────────────────────────────────────
# ENTRY POINT
# ─────────────────────────────────────────────
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument("--data", default=r"C:\Users\sieverin\HydroScan\Code\river-annotation-tool\data\filtered_data")
parser.add_argument(
"--data",
default=r"C:\Users\sieverin\HydroScan\Code\river-annotation-tool\data\filtered_data",
)
parser.add_argument("--out", default="data/annotation_results/")
parser.add_argument("--clip", default=None, help="Stem name of a specific clip to load (e.g. 'left_20230501')")
parser.add_argument("--time", default=None, help="Target time to filter clips by day (format: HH:MM, e.g. '14:30'). Selects the closest clip to this time for each day.")
parser.add_argument("--daily", action="store_true", help="Load only 1 clip per day at the specified time (requires --time).")
parser.add_argument("--extras", action="store_true", help="Also save GIFs, frame PNG, overlay PNG, and mask_vis PNG alongside the mask.")
parser.add_argument("--skip-existing-day", action="store_true", help="In --daily mode, skip days that already have any annotated clip.")
parser.add_argument(
"--clip",
default=None,
help="Stem name of a specific clip to load (e.g. 'left_20230501')",
)
parser.add_argument(
"--time",
default=None,
help="Target time to filter clips by day (format: HH:MM, e.g. '14:30'). "
"Selects the closest clip to this time for each day.",
)
parser.add_argument(
"--daily",
action="store_true",
help="Load only 1 clip per day, the one closest to --time (defaults to 12:00 when --time is omitted).",
)
parser.add_argument(
"--extras",
action="store_true",
help="Also save the mask_vis PNG and the four GIFs; mask, metadata, frame PNG, and overlay PNG are always written.",
)
parser.add_argument(
"--skip-existing-day",
action="store_true",
help="In --daily mode, skip days that already have any annotated clip.",
)
return parser.parse_args()
@@ -597,8 +51,14 @@ if __name__ == "__main__":
args = parse_args()
app = QApplication([])
win = Annotator(Path(args.data), Path(args.out), clip=args.clip, target_time=args.time, daily=args.daily, extras=args.extras, skip_existing_day=args.skip_existing_day)
win = Annotator(
Path(args.data),
Path(args.out),
clip=args.clip,
target_time=args.time,
daily=args.daily,
extras=args.extras,
skip_existing_day=args.skip_existing_day,
)
win.show()
app.exec()


@@ -0,0 +1,268 @@
import json
from pathlib import Path
import cv2
import numpy as np
from PIL import Image
from PySide6.QtCore import QTimer
from PySide6.QtWidgets import (
QButtonGroup,
QGroupBox,
QHBoxLayout,
QLabel,
QMainWindow,
QPushButton,
QRadioButton,
QVBoxLayout,
QWidget,
)
from .clip_selector import ClipSelector
from .config import DEFAULTS, QUESTIONS, Config
from .mask_canvas import MaskCanvas
from .video_loader import load_frames
class Annotator(QMainWindow):
def __init__(
self,
data_dir: Path,
out_dir: Path,
clip: str = None,
target_time: str = None,
daily: bool = False,
extras: bool = False,
skip_existing_day: bool = False,
):
super().__init__()
self.out_dir = Path(out_dir)
self.extras = extras
self.selector = ClipSelector(
data_dir=Path(data_dir),
out_dir=self.out_dir,
target_time=target_time,
daily=daily,
skip_existing_day=skip_existing_day,
)
self.setWindowTitle("River Annotator")
self._load_clip(specific=clip)
self._init_ui()
self._init_timer()
# ── clip loading ───────────────────────────────────────────────
def _load_clip(self, specific: str = None, next_day: bool = False):
self.filename = self.selector.next(specific=specific, next_day=next_day)
self.frames, self.fps, self.dh, self.dw, self.h, self.w = load_frames(
self.filename, Config.MAX_FRAMES
)
self._pending_answers = self._read_saved_answers()
def _read_saved_mask(self):
mask_path = self.out_dir / self.filename.stem / "mask.png"
if not mask_path.exists():
return None
mask_full = np.array(Image.open(mask_path).convert("L"))
return cv2.resize(
(mask_full > 127).astype(np.uint8),
(self.dw, self.dh),
interpolation=cv2.INTER_NEAREST,
)
def _read_saved_answers(self):
meta_path = self.out_dir / self.filename.stem / "metadata.json"
if not meta_path.exists():
return None
with open(meta_path) as f:
return json.load(f)
# ── UI setup ───────────────────────────────────────────────────
def _init_ui(self):
self.mc = MaskCanvas(self.frames, self.dh, self.dw)
self.mc.set_title(self.filename.name)
self.mc.reset(self._read_saved_mask())
self.q_widgets = {}
question_panel = self._build_question_panel()
btn_save = QPushButton("Save")
btn_next = QPushButton("Next")
btn_skip = QPushButton("Skip")
btn_clear = QPushButton("Clear")
btn_undo = QPushButton("Undo")
btn_reload = QPushButton("Reload Saved")
row1 = QHBoxLayout()
for b in [btn_save, btn_next, btn_skip]:
row1.addWidget(b)
row2 = QHBoxLayout()
for b in [btn_clear, self.mc.btn_erase, btn_undo, btn_reload]:
row2.addWidget(b)
row2.addWidget(QLabel("Brush"))
row2.addWidget(self.mc.brush_slider)
left = QVBoxLayout()
left.addWidget(self.mc.canvas)
left.addLayout(row1)
left.addLayout(row2)
left_widget = QWidget()
left_widget.setLayout(left)
right_widget = QWidget()
right_widget.setLayout(question_panel)
main = QHBoxLayout()
main.addWidget(left_widget, 3)
main.addWidget(right_widget, 2)
container = QWidget()
container.setLayout(main)
self.setCentralWidget(container)
btn_save.clicked.connect(self.save)
btn_next.clicked.connect(self.next_clip)
btn_skip.clicked.connect(self.skip_clip)
btn_clear.clicked.connect(self.mc.clear)
btn_undo.clicked.connect(self.mc.undo)
btn_reload.clicked.connect(self.reload_saved)
if self._pending_answers:
self._set_answers(self._pending_answers)
self._pending_answers = None
def _build_question_panel(self) -> QVBoxLayout:
vbox = QVBoxLayout()
for section, qs in QUESTIONS:
group = QGroupBox(section)
gvbox = QVBoxLayout()
for key, label, options in qs:
gvbox.addWidget(QLabel(label))
btn_group = QButtonGroup(self)
row = QHBoxLayout()
buttons = []
default_value = DEFAULTS.get(key)
for opt in options:
btn = QRadioButton(opt)
btn_group.addButton(btn)
row.addWidget(btn)
buttons.append(btn)
if default_value == opt:
btn.setChecked(True)
if default_value is None and buttons:
buttons[-1].setChecked(True)
self.q_widgets[key] = (btn_group, buttons, options)
gvbox.addLayout(row)
group.setLayout(gvbox)
vbox.addWidget(group)
return vbox
def _set_answers(self, answers: dict):
for key, value in answers.items():
if key not in self.q_widgets:
continue
_, buttons, options = self.q_widgets[key]
for i, btn in enumerate(buttons):
btn.setChecked(options[i] == value)
def _init_timer(self):
self.frame_i = 0
self.timer = QTimer()
self.timer.timeout.connect(self._tick)
self.timer.start(int(1000 / self.fps))
def _tick(self):
self.frame_i = (self.frame_i + 1) % len(self.frames)
self.mc.set_frame(self.frames[self.frame_i])
# ── answers ────────────────────────────────────────────────────
def get_answers(self) -> dict:
out = {}
for key, (_, buttons, options) in self.q_widgets.items():
for i, btn in enumerate(buttons):
if btn.isChecked():
out[key] = options[i]
return out
# ── save helpers ───────────────────────────────────────────────
def _make_overlay(self, frame, alpha=0.4):
overlay = frame.copy()
green = np.zeros_like(frame)
green[..., 1] = 255
m = self.mc.mask.astype(bool)
overlay[m] = (1 - alpha) * overlay[m] + alpha * green[m]
return overlay.astype(np.uint8)
def _save_gif(self, frames, out_path, scale=1.0):
h, w = frames[0].shape[:2]
nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
pil_frames = [Image.fromarray(cv2.resize(f, (nw, nh))) for f in frames]
pil_frames[0].save(
out_path,
save_all=True,
append_images=pil_frames[1:],
duration=int(1000 / self.fps),
loop=0,
)
# ── actions ────────────────────────────────────────────────────
def save(self):
out = self.out_dir / self.filename.stem
out.mkdir(parents=True, exist_ok=True)
mask_full = cv2.resize(
self.mc.mask.astype(np.uint8),
(self.w, self.h),
interpolation=cv2.INTER_NEAREST,
)
Image.fromarray(mask_full * 255).save(out / "mask.png")
with open(out / "metadata.json", "w") as f:
json.dump(self.get_answers(), f, indent=2)
mid = len(self.frames) // 2
frame = self.frames[mid]
Image.fromarray(frame).save(out / "frame.png")
Image.fromarray(self._make_overlay(frame)).save(out / "overlay.png")
if self.extras:
Image.fromarray((self.mc.mask * 255).astype(np.uint8)).save(out / "mask_vis.png")
overlay_frames = [self._make_overlay(f) for f in self.frames]
self._save_gif(self.frames, out / "video_original_hires.gif", scale=1.0)
self._save_gif(self.frames, out / "video_original_lowres.gif", scale=0.5)
self._save_gif(overlay_frames, out / "video_overlay_hires.gif", scale=1.0)
self._save_gif(overlay_frames, out / "video_overlay_lowres.gif", scale=0.5)
print("Saved:", out)
def reload_saved(self):
mask = self._read_saved_mask()
if mask is None:
return
self.mc.reset(mask)
answers = self._read_saved_answers()
if answers:
self._set_answers(answers)
def _advance_clip(self, next_day: bool):
self._load_clip(next_day=next_day)
self.frame_i = 0
self.mc.load_clip(
self.frames,
self.dh,
self.dw,
mask=self._read_saved_mask(),
title=self.filename.name,
)
if self._pending_answers:
self._set_answers(self._pending_answers)
self._pending_answers = None
def next_clip(self):
self.save()
self._advance_clip(next_day=self.selector.daily)
def skip_clip(self):
self._advance_clip(next_day=self.selector.daily)


@@ -0,0 +1,108 @@
import datetime
from pathlib import Path

import pandas as pd


class ClipSelector:
    """Picks which clip to annotate next, handling daily/time-based filtering."""

    def __init__(
        self,
        data_dir: Path,
        out_dir: Path,
        target_time: str = None,
        daily: bool = False,
        skip_existing_day: bool = False,
    ):
        self.data_dir = data_dir
        self.out_dir = out_dir
        self.target_time = target_time
        self.daily = daily
        self.skip_existing_day = skip_existing_day
        self.current_date = None
        self.df = self._load_dataset()

    def _load_dataset(self) -> pd.DataFrame:
        files = list(self.data_dir.glob("*.zip"))
        if not files:
            raise FileNotFoundError(f"No zip files in {self.data_dir}")
        df = pd.DataFrame({"filename": files})
        df["datetime"] = df["filename"].apply(
            lambda x: pd.to_datetime(x.stem.split("_")[1], errors="coerce")
        )
        return df.sort_values("datetime").reset_index(drop=True)

    def is_annotated(self, path: Path) -> bool:
        return (self.out_dir / path.stem / "mask.png").exists()

    def next(self, specific: str = None, next_day: bool = False) -> Path:
        if specific is not None:
            return self._resolve_specific(specific)
        return self._pick_next(next_day=next_day)

    def _resolve_specific(self, specific: str) -> Path:
        matches = list(self.data_dir.glob(f"{specific}.zip"))
        if not matches:
            p = self.data_dir / specific
            matches = [p] if p.exists() else []
        if not matches:
            raise FileNotFoundError(f"Clip '{specific}' not found in {self.data_dir}")
        return matches[0]

    def _pick_next(self, next_day: bool = False) -> Path:
        remaining = [f for f in self.df["filename"] if not self.is_annotated(f)]
        if not remaining:
            raise RuntimeError("No remaining clips to annotate")
        # Without time or daily filtering, take the earliest unannotated clip.
        if not (self.target_time or self.daily):
            filename = remaining[0]
            dt = self.df[self.df["filename"] == filename]["datetime"].values[0]
            self.current_date = pd.Timestamp(dt).date()
            return filename
        return self._pick_by_time(remaining, next_day)

    def _pick_by_time(self, remaining: list, next_day: bool) -> Path:
        # Target time of day in seconds; defaults to noon when none is given.
        if self.target_time:
            target_hour, target_minute = map(int, self.target_time.split(":"))
        else:
            target_hour, target_minute = 12, 0
        target_seconds = target_hour * 3600 + target_minute * 60

        remaining_datetimes = [
            self.df[self.df["filename"] == f]["datetime"].values[0] for f in remaining
        ]
        df_remaining = pd.DataFrame({"filename": remaining, "datetime": remaining_datetimes})
        df_remaining["date"] = df_remaining["datetime"].dt.date

        # In daily mode, move on to the day after the current one.
        if self.daily and next_day and self.current_date is not None:
            next_date = self.current_date + datetime.timedelta(days=1)
            df_remaining = df_remaining[df_remaining["date"] >= next_date]

        # Optionally drop every date that already has an annotated clip.
        if self.daily and self.skip_existing_day:
            annotated_dates = set()
            for f in self.df["filename"]:
                if self.is_annotated(f):
                    dt = self.df[self.df["filename"] == f]["datetime"].values[0]
                    annotated_dates.add(pd.Timestamp(dt).date())
            df_remaining = df_remaining[~df_remaining["date"].isin(annotated_dates)]

        if df_remaining.empty:
            raise RuntimeError("No remaining clips to annotate")

        # For each date, keep the clip closest to the target time (seconds ignored).
        closest_clips, dates_list = [], []
        for date, group in df_remaining.groupby("date"):
            group = group.copy()
            group["time_seconds"] = (
                group["datetime"].dt.hour * 3600 + group["datetime"].dt.minute * 60
            )
            group["time_diff"] = (group["time_seconds"] - target_seconds).abs()
            closest = group.loc[group["time_diff"].idxmin()]
            closest_clips.append(closest["filename"])
            dates_list.append(date)

        # groupby sorts by date, so the first entry belongs to the earliest date.
        self.current_date = dates_list[0]
        return closest_clips[0]
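
The per-day selection in `_pick_by_time` can be tried in isolation. The following is a minimal sketch using only the standard library instead of pandas; `closest_per_day` is a hypothetical helper written for illustration and is not part of this commit.

```python
from datetime import datetime
from itertools import groupby


def closest_per_day(stamps, target_time="12:00"):
    """For each calendar date, return the timestamp nearest to target_time."""
    hour, minute = map(int, target_time.split(":"))
    target_seconds = hour * 3600 + minute * 60

    def time_diff(dt):
        # Only hours and minutes are compared; seconds are ignored.
        return abs(dt.hour * 3600 + dt.minute * 60 - target_seconds)

    # Sorting first lets groupby collect all timestamps of one date together.
    ordered = sorted(stamps)
    return [min(group, key=time_diff) for _, group in groupby(ordered, key=lambda dt: dt.date())]


stamps = [
    datetime(2024, 1, 1, 8, 0),
    datetime(2024, 1, 1, 11, 45),
    datetime(2024, 1, 2, 13, 30),
    datetime(2024, 1, 2, 18, 0),
]
picked = closest_per_day(stamps)
```

With the default target of noon, `picked` holds one timestamp per date, in date order.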


@@ -0,0 +1,44 @@
class Config:
    DISPLAY_MAX = 480
    FPS_FALLBACK = 25
    MAX_FRAMES = 100

    QUESTIONS = [
        (
            "River",
            [
                ("flow", "Flow Regime", ["Turbulent", "Laminar", "Uncertain"]),
                ("shadows", "Strong Shadows", ["Yes", "No", "Uncertain"]),
                ("artifacts", "Artifacts on River", ["Yes", "No", "Uncertain"]),
            ],
        ),
        (
            "Scene",
            [
                ("lighting", "Lighting", ["Day", "Night", "Uncertain"]),
                (
                    "exposure",
                    "Exposure",
                    ["Overexposed", "Underexposed", "Both", "Normal", "Uncertain"],
                ),
            ],
        ),
        (
            "Weather",
            [
                ("snowing", "Snowing", ["Yes", "No", "Uncertain"]),
                ("snow_on_ground", "Snow on Ground", ["Yes", "No", "Uncertain"]),
            ],
        ),
    ]

    DEFAULTS = {
        "flow": "Laminar",
        "shadows": "No",
        "artifacts": "No",
        "lighting": "Day",
        "exposure": "Normal",
        "snowing": "No",
        "snow_on_ground": "No",
    }
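
`QUESTIONS` and `DEFAULTS` have to stay in sync by hand: every question key needs a default, and each default must be one of that question's options. A small consistency check could look like the sketch below. `check_defaults` is a hypothetical helper, and the data is a trimmed copy of the structure above, used only for illustration.

```python
def check_defaults(questions, defaults):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    # Flatten (category, [(key, label, options), ...]) into {key: options}.
    options_by_key = {
        key: options for _, items in questions for key, _, options in items
    }
    for key, options in options_by_key.items():
        if key not in defaults:
            problems.append(f"missing default for '{key}'")
        elif defaults[key] not in options:
            problems.append(f"default '{defaults[key]}' not an option for '{key}'")
    for key in defaults:
        if key not in options_by_key:
            problems.append(f"default for unknown question '{key}'")
    return problems


QUESTIONS = [
    ("River", [("flow", "Flow Regime", ["Turbulent", "Laminar", "Uncertain"])]),
    ("Scene", [("lighting", "Lighting", ["Day", "Night", "Uncertain"])]),
]
ok = check_defaults(QUESTIONS, {"flow": "Laminar", "lighting": "Day"})
bad = check_defaults(QUESTIONS, {"flow": "Fast"})
```

Here `ok` is empty, while `bad` reports both the invalid `flow` default and the missing `lighting` default.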


@@ -0,0 +1,114 @@
import numpy as np
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.figure import Figure
from PySide6.QtCore import Qt
from PySide6.QtWidgets import QPushButton, QSlider


class MaskCanvas:
    """Matplotlib canvas with brush-based mask drawing, undo, and erase."""

    def __init__(self, frames, dh: int, dw: int):
        self.dh = dh
        self.dw = dw
        self.mask = np.zeros((dh, dw), dtype=np.uint8)
        self.history: list[np.ndarray] = []
        self.erase_mode = False
        self.drawing = False
        self._build_figure(frames)
        self._build_controls()
        self._connect_events()

    def _build_figure(self, frames):
        self.fig = Figure()
        self.canvas = FigureCanvas(self.fig)
        self.ax = self.fig.add_subplot(111)
        self.ax.axis("off")
        self.img_artist = self.ax.imshow(frames[0])
        self.mask_artist = self.ax.imshow(np.zeros((self.dh, self.dw, 4)))
        self.title_text = self.ax.set_title("", fontsize=10, pad=4)

    def _build_controls(self):
        self.btn_erase = QPushButton("Eraser")
        self.brush_slider = QSlider(Qt.Horizontal)
        self.brush_slider.setRange(2, 50)
        self.brush_slider.setValue(5)

    def _connect_events(self):
        self.canvas.mpl_connect("button_press_event", self._on_press)
        self.canvas.mpl_connect("motion_notify_event", self._on_move)
        self.canvas.mpl_connect("button_release_event", self._on_release)
        self.btn_erase.clicked.connect(self.toggle_erase)

    # ── clip transition ────────────────────────────────────────────
    def load_clip(self, frames, dh: int, dw: int, mask=None, title: str = ""):
        self.dh = dh
        self.dw = dw
        self.mask = mask if mask is not None else np.zeros((dh, dw), dtype=np.uint8)
        self.history = []
        self.img_artist.set_data(frames[0])
        self.set_title(title)
        self.redraw()

    # ── frame / title ──────────────────────────────────────────────
    def set_frame(self, frame):
        self.img_artist.set_data(frame)
        self.canvas.draw_idle()

    def set_title(self, text: str):
        self.title_text.set_text(text)

    # ── mask ops ───────────────────────────────────────────────────
    def reset(self, mask=None):
        self.mask = mask if mask is not None else np.zeros((self.dh, self.dw), dtype=np.uint8)
        self.history = []
        self.redraw()

    def redraw(self):
        rgba = np.zeros((self.dh, self.dw, 4))
        rgba[..., 1] = self.mask * 0.7
        rgba[..., 3] = self.mask * 0.4
        self.mask_artist.set_data(rgba)
        self.canvas.draw_idle()

    def clear(self):
        self.mask[:] = 0
        self.redraw()

    def undo(self):
        if self.history:
            self.mask = self.history.pop()
            self.redraw()

    def toggle_erase(self):
        self.erase_mode = not self.erase_mode
        self.btn_erase.setText("Eraser ON" if self.erase_mode else "Eraser")

    def stamp(self, x, y):
        if x is None or y is None:
            return
        self.history.append(self.mask.copy())
        r = self.brush_slider.value()
        ix, iy = int(x), int(y)
        y0, y1 = max(0, iy - r), min(self.dh, iy + r + 1)
        x0, x1 = max(0, ix - r), min(self.dw, ix + r + 1)
        Y, X = np.ogrid[y0:y1, x0:x1]
        circle = (X - ix) ** 2 + (Y - iy) ** 2 <= r**2
        self.mask[y0:y1, x0:x1][circle] = 0 if self.erase_mode else 1
        self.redraw()

    # ── mouse events ───────────────────────────────────────────────
    def _on_press(self, e):
        if e.xdata is None:
            return
        self.drawing = True
        self.stamp(e.xdata, e.ydata)

    def _on_move(self, e):
        if self.drawing:
            self.stamp(e.xdata, e.ydata)

    def _on_release(self, _):
        self.drawing = False
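
The brush in `stamp` paints a filled circle by building a boolean mask over only the bounding box of the brush, clipped to the image. The same idea as a standalone function, for experimenting outside the GUI; `stamp_circle` and its `value` parameter are illustrative names, not part of this commit, and the sketch assumes a non-negative centre as delivered by the canvas events.

```python
import numpy as np


def stamp_circle(mask, x, y, r, value=1):
    """Set a filled circle of radius r centred at (x, y), clipped to the mask."""
    h, w = mask.shape
    ix, iy = int(x), int(y)
    # Bounding box of the brush, clipped to the array bounds.
    y0, y1 = max(0, iy - r), min(h, iy + r + 1)
    x0, x1 = max(0, ix - r), min(w, ix + r + 1)
    # ogrid yields a column of row indices and a row of column indices
    # that broadcast to the shape of the bounding box.
    Y, X = np.ogrid[y0:y1, x0:x1]
    circle = (X - ix) ** 2 + (Y - iy) ** 2 <= r**2
    # Basic slicing returns a view, so the boolean assignment writes into mask.
    mask[y0:y1, x0:x1][circle] = value
    return mask


mask = np.zeros((5, 5), dtype=np.uint8)
stamp_circle(mask, 0, 0, 1)  # centre on the corner, so the circle is clipped
painted = int(mask.sum())
```

Restricting the work to the bounding box keeps each stamp cheap regardless of image size, since only about `(2r + 1)²` pixels are touched.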


@@ -0,0 +1,40 @@
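import os
import tempfile
import zipfile
from pathlib import Path

import cv2

from .config import Config


def load_frames(zip_path: Path, max_frames: int):
    """Read up to max_frames RGB frames from left.mp4 inside zip_path.

    Returns (frames, fps, dh, dw, h, w): frames are resized for display,
    (dh, dw) is the display size and (h, w) the original frame size.
    """
    # Close the archive explicitly instead of leaving the handle open.
    with zipfile.ZipFile(zip_path) as zf:
        video_bytes = zf.read("left.mp4")

    # OpenCV reads from a file path, so write the bytes to a temporary file.
    with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
        f.write(video_bytes)
        tmp_path = f.name

    frames = []
    cap = cv2.VideoCapture(tmp_path)
    try:
        fps = cap.get(cv2.CAP_PROP_FPS) or Config.FPS_FALLBACK
        while len(frames) < max_frames:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    finally:
        # Release the capture and delete the temp file even if decoding fails.
        cap.release()
        os.unlink(tmp_path)

    if not frames:
        raise RuntimeError(f"No frames found in {zip_path}")

    h, w = frames[0].shape[:2]
    scale = Config.DISPLAY_MAX / max(h, w)
    dh, dw = int(h * scale), int(w * scale)
    frames = [cv2.resize(f, (dw, dh)) for f in frames]
    return frames, fps, dh, dw, h, w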
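
The display size is derived by scaling the longer side of the frame to `Config.DISPLAY_MAX` while keeping the aspect ratio. The arithmetic on its own, as a small sketch; `display_size` is a hypothetical helper, and the default of 480 mirrors `DISPLAY_MAX`.

```python
def display_size(h, w, display_max=480):
    """Scale (h, w) so the longer side equals display_max, keeping aspect ratio."""
    scale = display_max / max(h, w)
    return int(h * scale), int(w * scale)


dh, dw = display_size(1080, 1920)   # Full HD landscape frame
small = display_size(240, 320)      # frames smaller than the limit are upscaled
```

Note that the scale factor is not capped at 1, so clips smaller than the limit are enlarged rather than shown at native size.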