Move questions and optical flow to separate config files; clean up config.example.yaml

- questions: extracted from config.yaml into config/questions.yaml (committed, like optical_flow_config.yaml)
- optical_flow_config_file and questions_config_file are now required fields
- data_dir and out_dir are now required (no defaults)
- filenames: trimmed to input-only in example; output filenames stay as code defaults
- annotator: remove optional guard around optical flow config loading

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 16:30:59 +02:00
parent 9e135ea28c
commit 3036a93d04
5 changed files with 89 additions and 101 deletions
--- a/README.md
+++ b/README.md
@@ -21,6 +21,7 @@ cp config/clips.example.txt config/clips.txt

 # 3. Edit config/config.yaml (set data_dir and out_dir)
 #    Edit config/clips.txt (list clips to annotate)
+#    Edit config/questions.yaml to customise survey questions (optional)

 # 4. Run
 uv run python -m river_annotation_tool.annotation_script
@@ -48,7 +49,7 @@ cp config/config.example.yaml config/config.yaml
 cp config/clips.example.txt config/clips.txt
 ```

-Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. See the [Configuration](#configuration) section for all available options.
+Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. Survey questions are defined in `config/questions.yaml` (committed to the repo; edit to customise). See the [Configuration](#configuration) section for all available options.

 ### S3 storage (optional)

@@ -115,49 +116,38 @@ uv run python -m river_annotation_tool.annotation_script --clip left_20230615T12

 ## Configuration

-All settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
+Main settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.

 ```yaml
 storage: local            # required: 'local' or 's3'

-data_dir: data/clips      # directory containing ZIP archives (local path or bucket/prefix for S3)
-out_dir: data/annotation_results
+data_dir:                 # required: directory containing ZIP archives (local path or bucket/prefix for S3)
+out_dir:                  # required: where to write annotations
+
 clips_file: config/clips.txt
-# optical_flow_config_file: config/optical_flow_config.yaml   # optional, enables Auto Segment
+optical_flow_config_file: config/optical_flow_config.yaml
+questions_config_file: config/questions.yaml

 display_max: 720          # longest side in pixels for display
 fps_fallback: 25          # FPS to use if the video header is missing
 max_frames: 100           # max frames to extract per clip

-questions:
-  - section: River
-    items:
-      - key: flow
-        label: "Flow Regime"
-        options: [Turbulent, Laminar, Uncertain]
-        default: Laminar
-      # add more items or sections as needed
-
+# Override input filenames only if your ZIP archives differ from the defaults
 filenames:
-  video_in_zip: left.mp4           # video filename inside each ZIP archive
-  video_tmp_suffix: .mp4           # suffix for the extraction temp file
-  zip_extension: .zip              # extension used when resolving clip names
-  mask: mask.png                   # saved water mask
-  metadata: metadata.json          # saved survey answers
-  frame: frame.png                 # middle frame snapshot
-  overlay: overlay.png             # frame with mask blended in green
-  mask_vis: mask_vis.png           # greyscale mask PNG (--extras only)
-  gif_original_hires: video_original_hires.gif
-  gif_original_lowres: video_original_lowres.gif
-  gif_overlay_hires: video_overlay_hires.gif
-  gif_overlay_lowres: video_overlay_lowres.gif
+  video_in_zip: left.mp4
+  video_tmp_suffix: .mp4
+  zip_extension: .zip
 ```

-Add, remove, or reorder questions directly in the YAML — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
+Output filenames (`mask.png`, `metadata.json`, etc.) have sensible defaults and can be overridden in the `filenames:` block — see [`config.py`](src/river_annotation_tool/config.py) for the full list.

-### Optical flow segmentation (optional)
+### Survey questions

-Set `optical_flow_config_file` in `config.yaml` to point to a YAML file that enables the **Auto Segment** button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
+Survey questions are defined in `config/questions.yaml` (committed to the repo). Add, remove, or reorder sections and items — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
+
+### Optical flow segmentation
+
+`config/optical_flow_config.yaml` controls the **Auto Segment** button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:

 - **Optical flow magnitude** — pixels where the temporal median of frame-to-frame flow (scaled by FPS) exceeds a fraction of the maximum are considered moving water.
 - **Brightness** — pixels outside a brightness window are excluded (removes sky, saturated glare, etc.).
@@ -245,7 +235,7 @@ Polygons are drawn as overlays and do not affect the mask until you use **Fill**
 | Action | How |
 |---|---|
 | Load mask from previous clip | **Load Prev Mask** — copies the saved mask of the previous clip onto the current one; undoable |
-| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic river segmentation; undoable. Only available when `optical_flow_config_file` is set in `config.yaml`. |
+| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic river segmentation; undoable. Disabled when `enabled: false` in `config/optical_flow_config.yaml`. |

 ### Image display adjustments

@@ -289,7 +279,7 @@ All output filenames can be overridden via the `filenames:` section in `config/c

 ### Survey answers (`metadata.json`)

-Keys and values are determined by the `questions` section in `config/config.yaml`. With the default config:
+Keys and values are determined by `config/questions.yaml`. With the default questions:

 ```json
 {
@@ -332,7 +322,8 @@ config/
    config.example.yaml             # Example config to copy and edit
    clips.txt                       # Your clip list (git-ignored, copy from example)
    clips.example.txt               # Example clip list
-    optical_flow_config.yaml        # Optional optical flow parameters (enable via config.yaml)
+    questions.yaml                  # Survey question definitions
+    optical_flow_config.yaml        # Optical flow parameters (set enabled: false to disable Auto Segment)
 src/river_annotation_tool/
    annotation_script.py            # Entry point — argument parsing and app launch
    annotator.py                    # Main QMainWindow — orchestrates all components
--- a/config/config.example.yaml
+++ b/config/config.example.yaml
@@ -1,66 +1,21 @@
-# For local storage, set data_dir and out_dir to file-system paths:
-storage: local   # 'local' (default) or 's3'
-data_dir: data/filtered_data
-out_dir: data/annotation_results
-# For S3 storage, set storage: s3 and use bucket/prefix paths:
-# storage: s3
-# data_dir: my-bucket/clips
-# out_dir: my-bucket/annotation_results
-# Credentials are read from env vars (copy .env.example to .env):
+storage: local   # 'local' or 's3'
+
+# Required: set these to your actual paths (local path or bucket/prefix for S3)
+data_dir:
+out_dir:
+# For S3 credentials, copy .env.example to .env and fill in:
 # S3_ACCESS_KEY, S3_SECRET_ACCESS_KEY, S3_ENDPOINT_URL
+
 clips_file: config/clips.txt
 optical_flow_config_file: config/optical_flow_config.yaml
+questions_config_file: config/questions.yaml

 display_max: 720
 fps_fallback: 25
 max_frames: 100

-questions:
-  - section: River
-    items:
-      - key: flow
-        label: Flow Regime
-        options: [Turbulent, Laminar, Uncertain]
-        default: Laminar
-      - key: shadows
-        label: Strong Shadows
-        options: [Yes, No, Uncertain]
-        default: No
-      - key: artifacts
-        label: Artifacts on River
-        options: [Yes, No, Uncertain]
-        default: No
-  - section: Scene
-    items:
-      - key: lighting
-        label: Lighting
-        options: [Day, Night, Uncertain]
-        default: Day
-      - key: exposure
-        label: Exposure
-        options: [Overexposed, Underexposed, Both, Normal, Uncertain]
-        default: Normal
-  - section: Weather
-    items:
-      - key: snowing
-        label: Snowing
-        options: [Yes, No, Uncertain]
-        default: No
-      - key: snow_on_ground
-        label: Snow on Ground
-        options: [Yes, No, Uncertain]
-        default: No
-
+# Input filenames (override if your ZIP archives differ)
 filenames:
  video_in_zip: left.mp4
  video_tmp_suffix: .mp4
  zip_extension: .zip
-  mask: mask.png
-  metadata: metadata.json
-  frame: frame.png
-  overlay: overlay.png
-  mask_vis: mask_vis.png
-  gif_original_hires: video_original_hires.gif
-  gif_original_lowres: video_original_lowres.gif
-  gif_overlay_hires: video_overlay_hires.gif
-  gif_overlay_lowres: video_overlay_lowres.gif
--- a/config/questions.yaml
+++ b/config/questions.yaml
@@ -0,0 +1,34 @@
+- section: River
+  items:
+    - key: flow
+      label: Flow Regime
+      options: [Turbulent, Laminar, Uncertain]
+      default: Laminar
+    - key: shadows
+      label: Strong Shadows
+      options: [Yes, No, Uncertain]
+      default: No
+    - key: artifacts
+      label: Artifacts on River
+      options: [Yes, No, Uncertain]
+      default: No
+- section: Scene
+  items:
+    - key: lighting
+      label: Lighting
+      options: [Day, Night, Uncertain]
+      default: Day
+    - key: exposure
+      label: Exposure
+      options: [Overexposed, Underexposed, Both, Normal, Uncertain]
+      default: Normal
+- section: Weather
+  items:
+    - key: snowing
+      label: Snowing
+      options: [Yes, No, Uncertain]
+      default: No
+    - key: snow_on_ground
+      label: Snow on Ground
+      options: [Yes, No, Uncertain]
+      default: No
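
Review note on `config/questions.yaml`: PyYAML follows YAML 1.1 rules, so unquoted `Yes` and `No` are loaded as the booleans `True` and `False` rather than as strings. The diff does not show how the UI or `metadata.json` handle such values, so the following is only a minimal sketch of a structural check that could be run on the parsed list. `validate_questions` is a hypothetical helper, not part of the tool, and it operates on plain Python data so it does not depend on any YAML library.

```python
def validate_questions(questions):
    """Return a list of problems found in a parsed questions structure.

    Hypothetical helper: expects the list-of-sections layout used in
    config/questions.yaml (section -> items -> key/label/options/default).
    """
    if not isinstance(questions, list) or not questions:
        return ["top level must be a non-empty list of sections"]
    problems = []
    seen_keys = set()
    for section in questions:
        if not isinstance(section, dict):
            problems.append(f"section entry {section!r} is not a mapping")
            continue
        name = section.get("section")
        items = section.get("items")
        if not isinstance(items, list):
            problems.append(f"section {name!r}: 'items' must be a list")
            continue
        for item in items:
            if not isinstance(item, dict):
                problems.append(f"section {name!r}: item {item!r} is not a mapping")
                continue
            key = item.get("key")
            options = item.get("options") or []
            if not key or "label" not in item:
                problems.append(f"section {name!r}: item needs 'key' and 'label'")
            if key in seen_keys:
                problems.append(f"duplicate key {key!r}")
            seen_keys.add(key)
            # Unquoted Yes/No arrive here as True/False, so flag non-strings.
            for opt in options:
                if not isinstance(opt, str):
                    problems.append(
                        f"{key}: option {opt!r} is not a string (quote it in YAML)"
                    )
            default = item.get("default")
            if default is not None and default not in options:
                problems.append(f"{key}: default {default!r} is not one of the options")
    return problems


# Example input shaped like what a YAML 1.1 parser would yield for the
# 'flow' and 'shadows' items above (Yes/No already turned into booleans).
parsed = [
    {
        "section": "River",
        "items": [
            {"key": "flow", "label": "Flow Regime",
             "options": ["Turbulent", "Laminar", "Uncertain"], "default": "Laminar"},
            {"key": "shadows", "label": "Strong Shadows",
             "options": [True, False, "Uncertain"], "default": False},
        ],
    }
]
issues = validate_questions(parsed)
for issue in issues:
    print(issue)
```

If string answers are intended, quoting the values in the YAML (`options: ["Yes", "No", Uncertain]`, `default: "No"`) keeps them as strings. Whether that is needed depends on code not shown in this commit.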
--- a/src/river_annotation_tool/annotator.py
+++ b/src/river_annotation_tool/annotator.py
@@ -43,11 +43,7 @@ class Annotator(QMainWindow):
        self.fs = fs
        self.out_dir = config.out_dir
        self.extras = extras
-        self.of_cfg = (
-            load_optical_flow_config(Path(config.optical_flow_config_file))
-            if config.optical_flow_config_file
-            else None
-        )
+        self.of_cfg = load_optical_flow_config(Path(config.optical_flow_config_file))

        self.selector = ClipSelector(
            data_dir=config.data_dir,
@@ -171,7 +167,7 @@ class Annotator(QMainWindow):
        btn_redo = QPushButton("Redo")
        btn_load_prev_mask = QPushButton("Load Prev Mask")
        btn_auto_segment = QPushButton("Auto Segment")
-        btn_auto_segment.setEnabled(self.of_cfg is not None and self.of_cfg.enabled)
+        btn_auto_segment.setEnabled(self.of_cfg.enabled)

        row1 = QHBoxLayout()
        for b in [
--- a/src/river_annotation_tool/config.py
+++ b/src/river_annotation_tool/config.py
@@ -22,16 +22,17 @@ class FilenameConfig:

@dataclass
 class AppConfig:
-    storage: str  # required: 'local' or 's3'
+    storage: str
+    data_dir: str
+    out_dir: str
+    optical_flow_config_file: str
+    questions_config_file: str
    display_max: int = 480
    fps_fallback: int = 25
    max_frames: int = 100
-    data_dir: str = "data/clips"
-    out_dir: str = "data/annotation_results"
    clips_file: str = "config/clips.txt"
-    optical_flow_config_file: str = ""
-    questions: list = field(default_factory=list)
    filenames: FilenameConfig = field(default_factory=FilenameConfig)
+    questions: list = field(default_factory=list, init=False)

    def get_questions(self):
        return [
@@ -69,14 +70,25 @@ def load_optical_flow_config(path: Path) -> OpticalFlowConfig:
    return OpticalFlowConfig(**data)


+def load_questions_config(path: Path) -> list:
+    with open(path) as f:
+        return yaml.safe_load(f)
+
+
 def load_config(path: Path) -> AppConfig:
    with open(path) as f:
        data = yaml.safe_load(f)
-    if "storage" not in data:
-        raise ValueError(
-            f"{path}: missing required field 'storage'. Set it to 'local' or 's3'."
-        )
+    for required in (
+        "storage",
+        "data_dir",
+        "out_dir",
+        "optical_flow_config_file",
+        "questions_config_file",
+    ):
+        if not data.get(required):
+            raise ValueError(f"{path}: missing required field '{required}'.")
    fn_data = data.pop("filenames", {})
    cfg = AppConfig(**data)
    cfg.filenames = FilenameConfig(**fn_data)
+    cfg.questions = load_questions_config(Path(cfg.questions_config_file))
    return cfg
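
Review note on `load_config`: because `questions` is now declared with `init=False`, it is no longer a constructor argument. An existing `config.yaml` that still carries an inline `questions:` block would therefore make `AppConfig(**data)` raise a `TypeError` instead of a targeted message. The sketch below illustrates this with a reduced stand-in class. `MiniConfig` and `split_legacy_questions` are hypothetical names used only for illustration, and the sketch assumes nothing about the rest of `load_config` beyond what the diff shows.

```python
from dataclasses import dataclass, field


@dataclass
class MiniConfig:
    """Hypothetical stand-in for AppConfig with only the relevant fields."""

    storage: str
    questions_config_file: str
    # init=False: populated after construction, not accepted by __init__.
    questions: list = field(default_factory=list, init=False)


def split_legacy_questions(data: dict) -> tuple:
    """Return (copy of data without 'questions', removed inline questions or None)."""
    cleaned = dict(data)  # copy so the caller's dict is left untouched
    legacy = cleaned.pop("questions", None)
    return cleaned, legacy


# A config dict as it might look when loaded from a pre-commit config.yaml.
old_style = {
    "storage": "local",
    "questions_config_file": "config/questions.yaml",
    "questions": [{"section": "River", "items": []}],
}

try:
    MiniConfig(**old_style)
    rejected = False
except TypeError:
    # 'questions' is not a valid keyword argument for the generated __init__.
    rejected = True

# Dropping the legacy key first lets construction succeed; the caller can
# then decide whether to warn about, or migrate, the inline questions.
cleaned, legacy = split_legacy_questions(old_style)
cfg = MiniConfig(**cleaned)
print(rejected, cfg.questions, legacy is not None)
```

One option would be to pop a leftover `questions` key in `load_config` and raise a `ValueError` pointing at `questions_config_file`, in the same style as the required-field check. That is a suggestion, not something this commit does.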