Move questions and optical flow to separate config files; clean up config.example.yaml

- questions: extracted from config.yaml into config/questions.yaml (committed, like optical_flow_config.yaml)
- optical_flow_config_file and questions_config_file are now required fields
- data_dir and out_dir are now required (no defaults)
- filenames: trimmed to input-only in example; output filenames stay as code defaults
- annotator: remove optional guard around optical flow config loading

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 16:30:59 +02:00
parent 9e135ea28c
commit 3036a93d04
5 changed files with 89 additions and 101 deletions
--- a/README.md
+++ b/README.md
@@ -21,6 +21,7 @@ cp config/clips.example.txt config/clips.txt
 # 3. Edit config/config.yaml (set data_dir and out_dir)
 #    Edit config/clips.txt (list clips to annotate)
+#    Edit config/questions.yaml to customise survey questions (optional)
 # 4. Run
 uv run python -m river_annotation_tool.annotation_script
@@ -48,7 +49,7 @@ cp config/config.example.yaml config/config.yaml
 cp config/clips.example.txt config/clips.txt
 ```
-Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. See the [Configuration](#configuration) section for all available options.
+Edit `config/config.yaml` to set your `data_dir` and `out_dir`, then edit `config/clips.txt` to list the clips you want to annotate. Survey questions are defined in `config/questions.yaml` (committed to the repo; edit to customise). See the [Configuration](#configuration) section for all available options.
 ### S3 storage (optional)
@@ -115,49 +116,38 @@ uv run python -m river_annotation_tool.annotation_script --clip left_20230615T12
 ## Configuration
-All settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
+Main settings live in `config/config.yaml`. Copy `config/config.example.yaml` to get started.
 ```yaml
 storage: local            # required: 'local' or 's3'
-data_dir: data/clips      # directory containing ZIP archives (local path or bucket/prefix for S3)
-out_dir: data/annotation_results
+data_dir:                 # required: directory containing ZIP archives (local path or bucket/prefix for S3)
+out_dir:                  # required: where to write annotations
 clips_file: config/clips.txt
-# optical_flow_config_file: config/optical_flow_config.yaml   # optional, enables Auto Segment
+optical_flow_config_file: config/optical_flow_config.yaml
+questions_config_file: config/questions.yaml
 display_max: 720          # longest side in pixels for display
 fps_fallback: 25          # FPS to use if the video header is missing
 max_frames: 100           # max frames to extract per clip
-questions:
-  - section: River
-    items:
-      - key: flow
-        label: "Flow Regime"
-        options: [Turbulent, Laminar, Uncertain]
-        default: Laminar
-      # add more items or sections as needed
+# Override input filenames only if your ZIP archives differ from the defaults
 filenames:
-  video_in_zip: left.mp4           # video filename inside each ZIP archive
-  video_tmp_suffix: .mp4           # suffix for the extraction temp file
-  zip_extension: .zip              # extension used when resolving clip names
-  mask: mask.png                   # saved water mask
-  metadata: metadata.json          # saved survey answers
-  frame: frame.png                 # middle frame snapshot
-  overlay: overlay.png             # frame with mask blended in green
-  mask_vis: mask_vis.png           # greyscale mask PNG (--extras only)
-  gif_original_hires: video_original_hires.gif
-  gif_original_lowres: video_original_lowres.gif
-  gif_overlay_hires: video_overlay_hires.gif
-  gif_overlay_lowres: video_overlay_lowres.gif
+  video_in_zip: left.mp4
+  video_tmp_suffix: .mp4
+  zip_extension: .zip
 ```
-Add, remove, or reorder questions directly in the YAML — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
+Output filenames (`mask.png`, `metadata.json`, etc.) have sensible defaults and can be overridden in the `filenames:` block — see [`config.py`](src/river_annotation_tool/config.py) for the full list.
-### Optical flow segmentation (optional)
+### Survey questions
-Set `optical_flow_config_file` in `config.yaml` to point to a YAML file that enables the **Auto Segment** button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
+Survey questions are defined in `config/questions.yaml` (committed to the repo). Add, remove, or reorder sections and items — the UI rebuilds automatically. `key` is what gets saved in `metadata.json`; `default` selects the pre-checked option (omit or set to `null` to leave unselected).
+### Optical flow segmentation
+`config/optical_flow_config.yaml` controls the **Auto Segment** button. When pressed, the tool computes a river mask from the loaded frames and replaces the current mask (undoable). The segmentation combines two criteria:
 - **Optical flow magnitude** — pixels where the temporal median of frame-to-frame flow (scaled by FPS) exceeds a fraction of the maximum are considered moving water.
 - **Brightness** — pixels outside a brightness window are excluded (removes sky, saturated glare, etc.).
@@ -245,7 +235,7 @@ Polygons are drawn as overlays and do not affect the mask until you use **Fill**
 | Action | How |
 |---|---|
 | Load mask from previous clip | **Load Prev Mask** — copies the saved mask of the previous clip onto the current one; undoable |
-| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic river segmentation; undoable. Only available when `optical_flow_config_file` is set in `config.yaml`. |
+| Optical flow first guess | **Auto Segment** — replaces the current mask with an automatic river segmentation; undoable. Disabled when `enabled: false` in `config/optical_flow_config.yaml`. |
 ### Image display adjustments
@@ -289,7 +279,7 @@ All output filenames can be overridden via the `filenames:` section in `config/c
 ### Survey answers (`metadata.json`)
-Keys and values are determined by the `questions` section in `config/config.yaml`. With the default config:
+Keys and values are determined by `config/questions.yaml`. With the default questions:
 ```json
 {
@@ -332,7 +322,8 @@ config/
     config.example.yaml             # Example config to copy and edit
     clips.txt                       # Your clip list (git-ignored, copy from example)
     clips.example.txt               # Example clip list
-    optical_flow_config.yaml        # Optional optical flow parameters (enable via config.yaml)
+    questions.yaml                  # Survey question definitions
+    optical_flow_config.yaml        # Optical flow parameters (set enabled: false to disable Auto Segment)
 src/river_annotation_tool/
     annotation_script.py            # Entry point — argument parsing and app launch
     annotator.py                    # Main QMainWindow — orchestrates all components
--- a/config/config.example.yaml
+++ b/config/config.example.yaml
@@ -1,66 +1,21 @@
-# For local storage, set data_dir and out_dir to file-system paths:
-storage: local   # 'local' (default) or 's3'
-data_dir: data/filtered_data
-out_dir: data/annotation_results
-# For S3 storage, set storage: s3 and use bucket/prefix paths:
-# storage: s3
-# data_dir: my-bucket/clips
-# out_dir: my-bucket/annotation_results
-# Credentials are read from env vars (copy .env.example to .env):
+storage: local   # 'local' or 's3'
+
+# Required: set these to your actual paths (local path or bucket/prefix for S3)
+data_dir:
+out_dir:
+# For S3 credentials, copy .env.example to .env and fill in:
 # S3_ACCESS_KEY, S3_SECRET_ACCESS_KEY, S3_ENDPOINT_URL
 clips_file: config/clips.txt
 optical_flow_config_file: config/optical_flow_config.yaml
+questions_config_file: config/questions.yaml
 display_max: 720
 fps_fallback: 25
 max_frames: 100
-questions:
-  - section: River
-    items:
-      - key: flow
-        label: Flow Regime
-        options: [Turbulent, Laminar, Uncertain]
-        default: Laminar
-      - key: shadows
-        label: Strong Shadows
-        options: [Yes, No, Uncertain]
-        default: No
-      - key: artifacts
-        label: Artifacts on River
-        options: [Yes, No, Uncertain]
-        default: No
-  - section: Scene
-    items:
-      - key: lighting
-        label: Lighting
-        options: [Day, Night, Uncertain]
-        default: Day
-      - key: exposure
-        label: Exposure
-        options: [Overexposed, Underexposed, Both, Normal, Uncertain]
-        default: Normal
-  - section: Weather
-    items:
-      - key: snowing
-        label: Snowing
-        options: [Yes, No, Uncertain]
-        default: No
-      - key: snow_on_ground
-        label: Snow on Ground
-        options: [Yes, No, Uncertain]
-        default: No
+# Input filenames (override if your ZIP archives differ)
 filenames:
   video_in_zip: left.mp4
   video_tmp_suffix: .mp4
   zip_extension: .zip
-  mask: mask.png
-  metadata: metadata.json
-  frame: frame.png
-  overlay: overlay.png
-  mask_vis: mask_vis.png
-  gif_original_hires: video_original_hires.gif
-  gif_original_lowres: video_original_lowres.gif
-  gif_overlay_hires: video_overlay_hires.gif
-  gif_overlay_lowres: video_overlay_lowres.gif
--- a/config/questions.yaml
+++ b/config/questions.yaml
@@ -0,0 +1,34 @@
+- section: River
+  items:
+    - key: flow
+      label: Flow Regime
+      options: [Turbulent, Laminar, Uncertain]
+      default: Laminar
+    - key: shadows
+      label: Strong Shadows
+      options: [Yes, No, Uncertain]
+      default: No
+    - key: artifacts
+      label: Artifacts on River
+      options: [Yes, No, Uncertain]
+      default: No
+- section: Scene
+  items:
+    - key: lighting
+      label: Lighting
+      options: [Day, Night, Uncertain]
+      default: Day
+    - key: exposure
+      label: Exposure
+      options: [Overexposed, Underexposed, Both, Normal, Uncertain]
+      default: Normal
+- section: Weather
+  items:
+    - key: snowing
+      label: Snowing
+      options: [Yes, No, Uncertain]
+      default: No
+    - key: snow_on_ground
+      label: Snow on Ground
+      options: [Yes, No, Uncertain]
+      default: No
--- a/src/river_annotation_tool/annotator.py
+++ b/src/river_annotation_tool/annotator.py
@@ -43,11 +43,7 @@ class Annotator(QMainWindow):
         self.fs = fs
         self.out_dir = config.out_dir
         self.extras = extras
-        self.of_cfg = (
-            load_optical_flow_config(Path(config.optical_flow_config_file))
-            if config.optical_flow_config_file
-            else None
-        )
+        self.of_cfg = load_optical_flow_config(Path(config.optical_flow_config_file))
         self.selector = ClipSelector(
             data_dir=config.data_dir,
@@ -171,7 +167,7 @@ class Annotator(QMainWindow):
         btn_redo = QPushButton("Redo")
         btn_load_prev_mask = QPushButton("Load Prev Mask")
         btn_auto_segment = QPushButton("Auto Segment")
-        btn_auto_segment.setEnabled(self.of_cfg is not None and self.of_cfg.enabled)
+        btn_auto_segment.setEnabled(self.of_cfg.enabled)
         row1 = QHBoxLayout()
         for b in [
--- a/src/river_annotation_tool/config.py
+++ b/src/river_annotation_tool/config.py
@@ -22,16 +22,17 @@ class FilenameConfig:
 @dataclass
 class AppConfig:
-    storage: str  # required: 'local' or 's3'
+    storage: str
+    data_dir: str
+    out_dir: str
+    optical_flow_config_file: str
+    questions_config_file: str
     display_max: int = 480
     fps_fallback: int = 25
     max_frames: int = 100
-    data_dir: str = "data/clips"
-    out_dir: str = "data/annotation_results"
     clips_file: str = "config/clips.txt"
-    optical_flow_config_file: str = ""
-    questions: list = field(default_factory=list)
     filenames: FilenameConfig = field(default_factory=FilenameConfig)
+    questions: list = field(default_factory=list, init=False)
     def get_questions(self):
         return [
@@ -69,14 +70,25 @@ def load_optical_flow_config(path: Path) -> OpticalFlowConfig:
     return OpticalFlowConfig(**data)
+def load_questions_config(path: Path) -> list:
+    with open(path) as f:
+        return yaml.safe_load(f)
 def load_config(path: Path) -> AppConfig:
     with open(path) as f:
         data = yaml.safe_load(f)
-    if "storage" not in data:
-        raise ValueError(
-            f"{path}: missing required field 'storage'. Set it to 'local' or 's3'."
-        )
+    for required in (
+        "storage",
+        "data_dir",
+        "out_dir",
+        "optical_flow_config_file",
+        "questions_config_file",
+    ):
+        if not data.get(required):
+            raise ValueError(f"{path}: missing required field '{required}'.")
     fn_data = data.pop("filenames", {})
     cfg = AppConfig(**data)
     cfg.filenames = FilenameConfig(**fn_data)
+    cfg.questions = load_questions_config(Path(cfg.questions_config_file))
     return cfg
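
Note on the new validation: the sketch below is a minimal, self-contained illustration of how the required-field loop and the `init=False` field appear to interact. It is not the project's code. `AppConfigSketch`, `REQUIRED_FIELDS`, and `build_config` are hypothetical names, and the function takes an already-parsed dict so that it runs without PyYAML. A bare `data_dir:` in YAML loads as `None` under `yaml.safe_load`, so an unedited copy of `config.example.yaml` should be rejected by this check.

```python
from dataclasses import dataclass, field

# Hypothetical constant; the diff inlines this tuple in load_config().
REQUIRED_FIELDS = (
    "storage",
    "data_dir",
    "out_dir",
    "optical_flow_config_file",
    "questions_config_file",
)


@dataclass
class AppConfigSketch:  # hypothetical, trimmed stand-in for AppConfig
    storage: str
    data_dir: str
    out_dir: str
    optical_flow_config_file: str
    questions_config_file: str
    max_frames: int = 100
    # init=False keeps `questions` out of __init__, so it can only be
    # assigned after construction (as load_config() does in the diff).
    questions: list = field(default_factory=list, init=False)


def build_config(data: dict, source: str = "config.yaml") -> AppConfigSketch:
    """Validate a parsed mapping with the same truthiness test as the diff."""
    for required in REQUIRED_FIELDS:
        # data.get() is falsy for a missing key, for None, and for "".
        if not data.get(required):
            raise ValueError(f"{source}: missing required field '{required}'.")
    return AppConfigSketch(**data)
```

One consequence of `init=False`: a leftover `questions:` key in an old `config.yaml` would reach the constructor as an unexpected keyword argument and raise `TypeError`, so existing configs likely need that block removed when migrating.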