Move questions and optical flow to separate config files; clean up config.example.yaml

- questions: extracted from config.yaml into config/questions.yaml (committed, like optical_flow_config.yaml)
- optical_flow_config_file and questions_config_file are now required fields
- data_dir and out_dir are now required (no defaults)
- filenames: trimmed to input-only in example; output filenames stay as code defaults
- annotator: remove optional guard around optical flow config loading

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 16:30:59 +02:00
parent 9e135ea28c
commit 3036a93d04
5 changed files with 89 additions and 101 deletions


@@ -21,6 +21,7 @@ cp config/clips.example.txt config/clips.txt
# 3. Edit config/config.yaml (set data_dir and out_dir)
# Edit config/clips.txt (list clips to annotate)
# Edit config/questions.yaml to customise survey questions (optional)
# 4. Run
uv run python -m river_annotation_tool.annotation_script
@@ -48,7 +49,7 @@ cp config/config.example.yaml config/config.yaml
cp config/clips.example.txt config/clips.txt
```
Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. See the [Configuration](#configuration) section for all available options.
Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. Survey questions are defined in `config/questions.yaml` (committed to the repo; edit to customise). See the [Configuration](#configuration) section for all available options.
### S3 storage (optional)
@@ -115,49 +116,38 @@ uv run python -m river_annotation_tool.annotation_script --clip left_20230615T12
## Configuration
All settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
Main settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
```yaml
storage: local # required: 'local' or 's3'
data_dir: data/clips # directory containing ZIP archives (local path or bucket/prefix for S3)
out_dir: data/annotation_results
data_dir: # required: directory containing ZIP archives (local path or bucket/prefix for S3)
out_dir: # required: where to write annotations
clips_file: config/clips.txt
# optical_flow_config_file: config/optical_flow_config.yaml # optional, enables Auto Segment
optical_flow_config_file: config/optical_flow_config.yaml
questions_config_file: config/questions.yaml
display_max: 720 # longest side in pixels for display
fps_fallback: 25 # FPS to use if the video header is missing
max_frames: 100 # max frames to extract per clip
questions:
- section: River
items:
- key: flow
label: "Flow Regime"
options: [Turbulent, Laminar, Uncertain]
default: Laminar
# add more items or sections as needed
# Override input filenames only if your ZIP archives differ from the defaults
filenames:
video_in_zip: left.mp4 # video filename inside each ZIP archive
video_tmp_suffix: .mp4 # suffix for the extraction temp file
zip_extension: .zip # extension used when resolving clip names
mask: mask.png # saved water mask
metadata: metadata.json # saved survey answers
frame: frame.png # middle frame snapshot
overlay: overlay.png # frame with mask blended in green
mask_vis: mask_vis.png # greyscale mask PNG (--extras only)
gif_original_hires: video_original_hires.gif
gif_original_lowres: video_original_lowres.gif
gif_overlay_hires: video_overlay_hires.gif
gif_overlay_lowres: video_overlay_lowres.gif
video_in_zip: left.mp4
video_tmp_suffix: .mp4
zip_extension: .zip
```
Add, remove, or reorder questions directly in the YAML — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
Output filenames (`mask.png`, `metadata.json`, etc.) have sensible defaults and can be overridden in the `filenames:` block — see [`config.py`](src/river_annotation_tool/config.py) for the full list.
### Optical flow segmentation (optional)
### Survey questions
Set `optical_flow_config_file` in `config.yaml` to point to a YAML file that enables the **Auto Segment** button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
Survey questions are defined in `config/questions.yaml` (committed to the repo). Add, remove, or reorder sections and items — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
### Optical flow segmentation
`config/optical_flow_config.yaml` controls the **Auto Segment** button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
- **Optical flow magnitude** — pixels where the temporal median of frame-to-frame flow (scaled by FPS) exceeds a fraction of the maximum are considered moving water.
- **Brightness** — pixels outside a brightness window are excluded (removes sky, saturated glare, etc.).
@@ -245,7 +235,7 @@ Polygons are drawn as overlays and do not affect the mask until you use **Fill**
| Action | How |
|---|---|
| Load mask from previous clip | **Load Prev Mask** — copies the saved mask of the previous clip onto the current one; undoable |
| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic river segmentation; undoable. Only available when `optical_flow_config_file` is set in `config.yaml`. |
| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic river segmentation; undoable. Disabled when `enabled: false` in `config/optical_flow_config.yaml`. |
### Image display adjustments
@@ -289,7 +279,7 @@ All output filenames can be overridden via the `filenames:` section in `config/c
### Survey answers (`metadata.json`)
Keys and values are determined by the `questions` section in `config/config.yaml`. With the default config:
Keys and values are determined by `config/questions.yaml`. With the default questions:
```json
{
@@ -332,7 +322,8 @@ config/
config.example.yaml # Example config to copy and edit
clips.txt # Your clip list (git-ignored, copy from example)
clips.example.txt # Example clip list
optical_flow_config.yaml # Optional optical flow parameters (enable via config.yaml)
questions.yaml # Survey question definitions
optical_flow_config.yaml # Optical flow parameters (set enabled: false to disable Auto Segment)
src/river_annotation_tool/
annotation_script.py # Entry point — argument parsing and app launch
annotator.py # Main QMainWindow — orchestrates all components


@@ -1,66 +1,21 @@
# For local storage, set data_dir and out_dir to file-system paths:
storage: local # 'local' (default) or 's3'
data_dir: data/filtered_data
out_dir: data/annotation_results
# For S3 storage, set storage: s3 and use bucket/prefix paths:
# storage: s3
# data_dir: my-bucket/clips
# out_dir: my-bucket/annotation_results
# Credentials are read from env vars (copy .env.example to .env):
storage: local # 'local' or 's3'
# Required: set these to your actual paths (local path or bucket/prefix for S3)
data_dir:
out_dir:
# For S3 credentials, copy .env.example to .env and fill in:
# S3_ACCESS_KEY, S3_SECRET_ACCESS_KEY, S3_ENDPOINT_URL
clips_file: config/clips.txt
optical_flow_config_file: config/optical_flow_config.yaml
questions_config_file: config/questions.yaml
display_max: 720
fps_fallback: 25
max_frames: 100
questions:
- section: River
items:
- key: flow
label: Flow Regime
options: [Turbulent, Laminar, Uncertain]
default: Laminar
- key: shadows
label: Strong Shadows
options: [Yes, No, Uncertain]
default: No
- key: artifacts
label: Artifacts on River
options: [Yes, No, Uncertain]
default: No
- section: Scene
items:
- key: lighting
label: Lighting
options: [Day, Night, Uncertain]
default: Day
- key: exposure
label: Exposure
options: [Overexposed, Underexposed, Both, Normal, Uncertain]
default: Normal
- section: Weather
items:
- key: snowing
label: Snowing
options: [Yes, No, Uncertain]
default: No
- key: snow_on_ground
label: Snow on Ground
options: [Yes, No, Uncertain]
default: No
# Input filenames (override if your ZIP archives differ)
filenames:
video_in_zip: left.mp4
video_tmp_suffix: .mp4
zip_extension: .zip
mask: mask.png
metadata: metadata.json
frame: frame.png
overlay: overlay.png
mask_vis: mask_vis.png
gif_original_hires: video_original_hires.gif
gif_original_lowres: video_original_lowres.gif
gif_overlay_hires: video_overlay_hires.gif
gif_overlay_lowres: video_overlay_lowres.gif

config/questions.yaml Normal file

@@ -0,0 +1,34 @@
- section: River
items:
- key: flow
label: Flow Regime
options: [Turbulent, Laminar, Uncertain]
default: Laminar
- key: shadows
label: Strong Shadows
options: [Yes, No, Uncertain]
default: No
- key: artifacts
label: Artifacts on River
options: [Yes, No, Uncertain]
default: No
- section: Scene
items:
- key: lighting
label: Lighting
options: [Day, Night, Uncertain]
default: Day
- key: exposure
label: Exposure
options: [Overexposed, Underexposed, Both, Normal, Uncertain]
default: Normal
- section: Weather
items:
- key: snowing
label: Snowing
options: [Yes, No, Uncertain]
default: No
- key: snow_on_ground
label: Snow on Ground
options: [Yes, No, Uncertain]
default: No


@@ -43,11 +43,7 @@ class Annotator(QMainWindow):
self.fs = fs
self.out_dir = config.out_dir
self.extras = extras
self.of_cfg = (
load_optical_flow_config(Path(config.optical_flow_config_file))
if config.optical_flow_config_file
else None
)
self.of_cfg = load_optical_flow_config(Path(config.optical_flow_config_file))
self.selector = ClipSelector(
data_dir=config.data_dir,
@@ -171,7 +167,7 @@ class Annotator(QMainWindow):
btn_redo = QPushButton("Redo")
btn_load_prev_mask = QPushButton("Load Prev Mask")
btn_auto_segment = QPushButton("Auto Segment")
btn_auto_segment.setEnabled(self.of_cfg is not None and self.of_cfg.enabled)
btn_auto_segment.setEnabled(self.of_cfg.enabled)
row1 = QHBoxLayout()
for b in [


@@ -22,16 +22,17 @@ class FilenameConfig:
@dataclass
class AppConfig:
storage: str # required: 'local' or 's3'
storage: str
data_dir: str
out_dir: str
optical_flow_config_file: str
questions_config_file: str
display_max: int = 480
fps_fallback: int = 25
max_frames: int = 100
data_dir: str = "data/clips"
out_dir: str = "data/annotation_results"
clips_file: str = "config/clips.txt"
optical_flow_config_file: str = ""
questions: list = field(default_factory=list)
filenames: FilenameConfig = field(default_factory=FilenameConfig)
questions: list = field(default_factory=list, init=False)
def get_questions(self):
return [
@@ -69,14 +70,25 @@ def load_optical_flow_config(path: Path) -> OpticalFlowConfig:
return OpticalFlowConfig(**data)
def load_questions_config(path: Path) -> list:
with open(path) as f:
return yaml.safe_load(f)
def load_config(path: Path) -> AppConfig:
with open(path) as f:
data = yaml.safe_load(f)
if "storage" not in data:
raise ValueError(
f"{path}: missing required field 'storage'. Set it to 'local' or 's3'."
)
for required in (
"storage",
"data_dir",
"out_dir",
"optical_flow_config_file",
"questions_config_file",
):
if not data.get(required):
raise ValueError(f"{path}: missing required field '{required}'.")
fn_data = data.pop("filenames", {})
cfg = AppConfig(**data)
cfg.filenames = FilenameConfig(**fn_data)
cfg.questions = load_questions_config(Path(cfg.questions_config_file))
return cfg
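
The required-field check added to `load_config` can be tried out without a config file. The sketch below is a stand-alone illustration under stated assumptions, not the project's code: `check_required` and `REQUIRED_FIELDS` are hypothetical names, and the dictionaries stand in for what `yaml.safe_load` would return. It shows that a key left blank (as `data_dir:` is in `config.example.yaml`, which parses to `None`) is rejected the same way as a key that is absent.

```python
# Hypothetical names; mirrors the truthiness check in load_config.
REQUIRED_FIELDS = (
    "storage",
    "data_dir",
    "out_dir",
    "optical_flow_config_file",
    "questions_config_file",
)


def check_required(data: dict, source: str = "config.yaml") -> None:
    """Raise ValueError for the first required field that is missing or empty."""
    for required in REQUIRED_FIELDS:
        # dict.get returns None for absent keys; `not` also catches None and "".
        if not data.get(required):
            raise ValueError(f"{source}: missing required field '{required}'.")


# Stand-in for the parsed example config before the paths are filled in.
unedited = {
    "storage": "local",
    "data_dir": None,
    "out_dir": None,
    "optical_flow_config_file": "config/optical_flow_config.yaml",
    "questions_config_file": "config/questions.yaml",
}

try:
    check_required(unedited)
except ValueError as err:
    message = str(err)
    print(message)

# Once both paths are set, the check passes silently.
filled = dict(unedited, data_dir="data/clips", out_dir="data/annotation_results")
check_required(filled)
```

Because the check relies on truthiness, a value such as `0` or `false` would also be reported as missing. That is harmless for these five string fields, but it is worth keeping in mind if a numeric or boolean field is ever added to the required list.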