support for multiple data streams (#187)
* Multi view (#115)

* [add] multiview dictionary and multiviewdataset

* [add] multiview to get_dataset

* [fix] copy package & type for dataset

* [add] dynamic keys

* [add] Multiviewheatmaplabeled to base.basesupervisedtracker type

* [add] mouse mirror dataset

* [add] test dataset for Multiviewheatmap

* [add] test for multiview heatmap data module

* [add] hydra raises NotImplementedError for regression and multiview

* [add] prediction handler and keypoints structure for Multiview

* [fix] multiview keypoints in predictions are now saved to csvs correctly

* [add] make_multiview_dataset

* [fix] datacheck is more strict now

* [change] the dataset is now stacked along the channel dimension. [fix] tests are valid now

* [fix] images are stacked along the batch dimension and representation was reshaped
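
A minimal sketch of this stacking convention (tensor names and shapes are illustrative, not the repo's exact code):

```python
import torch

# two hypothetical views, each a (B, C, H, W) batch
views = [torch.randn(2, 3, 256, 256) for _ in range(2)]

# stack along the batch dimension -> (V*B, C, H, W), so the backbone
# treats every view as an independent image
images = torch.cat(views, dim=0)

# after the backbone, reshape features back to per-view form (B, V, ...)
features = images  # stand-in for the backbone's output
per_view = features.reshape(2, 2, *features.shape[1:]).transpose(0, 1)
print(per_view.shape)  # torch.Size([2, 2, 3, 256, 256])
```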

* [add] data type to fusion

* [fix] datadicts in data utils

* [add] test_supervised_multiview_heatmap and skipped video_dataloader

* update code to use original image and video pixel coordinates for all predictions (#117)

* Squashed commit of the following:

commit 9b7de3c
Author: Selmaan <[email protected]>
Date:   Mon Sep 18 16:39:04 2023 -0400

    Update config_birdCOM.yaml

commit 1aae9de
Author: Selmaan <[email protected]>
Date:   Mon Sep 18 15:59:43 2023 -0400

    update configs for detector net

    Added params to the default config needed for the dynamic crop algorithm and detector network. Then created a new bird config with edited fields

* add bbox to dataset dictionaries

add an xyhw bbox entry to the dictionary of dataset results, for both examples and batches. Also update the base dataset class (for backwards compatibility) to output a bounding box in the __getitem__ call. This 'default' bbox is just the full image dimensions
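
A minimal sketch of this default (class and key names are illustrative, not the repo's exact code):

```python
import torch
from torch.utils.data import Dataset

class ExampleDataset(Dataset):
    """Illustrates the 'default bbox' idea only."""

    def __init__(self, images, keypoints):
        self.images = images        # list of (C, H, W) tensors
        self.keypoints = keypoints  # list of (K, 2) tensors

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        h, w = image.shape[-2:]
        # default bbox is the full image, in x, y, height, width order
        bbox = torch.tensor([0.0, 0.0, float(h), float(w)])
        return {"images": image, "keypoints": self.keypoints[idx], "bbox": bbox}
```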

* Update datasets.py

fix automatic detection of image height and width

* Update heatmap_tracker.py

add code to convert from bounding box to original image dimension coordinates

* Update test_pca.py

fix keys in unit test

* add bbox conversion

get_loss_inputs_labeled and predict_step now call convert_bbox_coords, which converts predicted keypoints from (potentially cropped) image intrinsic coordinates to original image coordinates, using bbox info
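
The conversion itself is a rescale-and-shift; a sketch of the idea, assuming keypoints in (possibly cropped) image pixels and an x, y, height, width bbox in original-image pixels:

```python
import torch

def to_original_coords(keypoints, bbox, crop_h, crop_w):
    """Map (..., K, 2) keypoints from crop pixels to original-image pixels.

    bbox is a (..., 4) tensor of (x, y, height, width) in original pixels.
    Illustrative only; the repo's convert_bbox_coords may differ in detail.
    """
    x0, y0, h, w = bbox.unbind(-1)
    out = keypoints.clone()
    out[..., 0] = keypoints[..., 0] / crop_w * w.unsqueeze(-1) + x0.unsqueeze(-1)
    out[..., 1] = keypoints[..., 1] / crop_h * h.unsqueeze(-1) + y0.unsqueeze(-1)
    return out
```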

* code linting

fixed code formatting following flake8

* Update heatmap_tracker.py

add label reminding me to implement bbox conversion later for unlabeled data

* Update augmentations.py

do not resize data in the augmentation pipeline when using dynamic crop algorithm (cropping will be handled by dynamic pipeline)
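
A sketch of that conditional, assuming an imgaug-style pipeline (augmenter choices are illustrative):

```python
import imgaug.augmenters as iaa

def build_pipeline(resize_hw, dynamic_crop=False):
    augs = [iaa.Fliplr(0.5), iaa.Affine(rotate=(-10, 10))]
    if not dynamic_crop:
        # with dynamic crop enabled, resizing is deferred to the crop pipeline
        augs.append(iaa.Resize({"height": resize_hw[0], "width": resize_hw[1]}))
    return iaa.Sequential(augs)
```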

* Update config_birdCOM.yaml

* Update scripts.py

* Update .gitignore

* Update datasets.py

add new detector dataset, and correct previous image height and width calculation in BaseTrackingDataset

* Update heatmap_tracker.py

transform target along with predicted keypoints from transformed image coordinates to original image coordinates

* Update scripts.py

fix typo in get_detector_model

* Update config_birdCOM.yaml

set useful parameter values for this dataset

* add dynamic labeled dicts

these are unused for now; they will be implemented in the future for multi-instance detection

* Update config_birdCOM.yaml

redo COM config to be an independent pipeline from the later POS pipeline

* Update scripts.py

remove image size checking when setting up dataset.

unrelated: also set 'columns_for_singleview_pca' to None automatically if it's not set in the config

* remove keypoints rescaling in predict

remove rescaling of keypoints according to static config info (which is now removed); keypoints are already dynamically rescaled in the model's predict_step

* add bbox conversion for unlabeled data

update predict step and get unsupervised losses to convert keypoints predicted on unlabeled video frame data to original image coords using bbox info

* update dali dataloader with frame sizes

dali dataloader now outputs the size of loaded video frames in bbox info

* pre-computing heatmaps is very slow

I think there is very little upside to having these precomputed; it creates a big lag for my use case whenever I try to create a dataset.

* add bbox to DynamicDict

* image_orig_dims no longer in configs

removed image_orig_dims from the config, so it also does not need to be copied here. It will always be inferred from video/image data during inference

* Update config_birdCOM.yaml

* Create config_birdCOM_backup.yaml

* delete commented out code

* code linting

* add bbox to unlabeled batch keys

* changes from pull request review

mostly moving convert_bbox_coords to be a function in models.base rather than a method of the heatmap tracker class

* fix bbox to device with images

fixes an edge case where frames and bbox were on separate devices
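
The fix boils down to moving the bbox with the frames; a minimal sketch (hypothetical batch keys):

```python
def align_devices(batch):
    # keep bbox on the same device as the frames it describes
    batch["bbox"] = batch["bbox"].to(batch["frames"].device)
    return batch
```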

* add bbox conversion unit test

the test artificially crops and shifts an image, offsets the detected keypoint locations accordingly, and verifies that bbox conversion for the original and re-cropped data match
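
A sketch of that logic, reusing the hypothetical to_original_coords helper sketched earlier in this log:

```python
import torch

def test_bbox_conversion_roundtrip():
    full_bbox = torch.tensor([0.0, 0.0, 128.0, 128.0])  # x, y, h, w: whole image
    crop_bbox = torch.tensor([32.0, 16.0, 64.0, 64.0])  # artificial crop

    kps_full = torch.tensor([[48.0, 40.0], [80.0, 64.0]])
    # the same keypoints as seen inside the (unresized) crop
    kps_crop = kps_full - crop_bbox[:2]

    a = to_original_coords(kps_full[None], full_bbox[None], 128, 128)
    b = to_original_coords(kps_crop[None], crop_bbox[None], 64, 64)
    assert torch.allclose(a, b)
```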

* combine multiview and dynamic crop PRs

* [add] compute_metrics for multiview; it loops over each file (#119)

* bug fix for multi-view metric computation (#121)

* final bug fixes for multiview

* bbox bug fix; closes #109, #120

* [docs] multiview separate

* Multiview (#126)

* [fix] preds_file typo to list

* [fix] list of preds_file is being processed now

* [add] hydra for compute_metrics

* [add] multiview heatmap context

* [add] multiview heatmap context conftest

* [add] dataset test

* [add] mview data module test

* [add] context and their tests

* [add] dynamic naming dataset basic

* [fix] flake8

* PR fixes

* add bbox coord transform to context models

---------

Co-authored-by: themattinthehatt <[email protected]>

* tweaks to streamlit to show labeled data results from all views

* [fix] fiftyone app now compatible with multiple views

* [fix] dataset typechecking error, new unit tests

* remove detector code from multiview branch

* remove detector code from multiview branch

* update IO code to properly find multiview videos

* update video_pipe to work for multiple views

* update LitDaliWrapper to work for multiple views

* semisupervised multiview training without error

* bug fixes with dali augmentations

* multiview semisupervised context dataloader + model tests passing

* affine transform bug fix + refactoring + unit test
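
The refactored helper appears below in the API docs as undo_affine_transform_batch; a sketch of the core math (signature is illustrative):

```python
import torch

def undo_affine(keypoints, M):
    """Invert a 2x3 affine transform applied to (K, 2) keypoints.

    M maps original -> transformed coords: p' = A @ p + t.
    """
    A, t = M[:, :2], M[:, 2]
    # invert: p = A^{-1} (p' - t); points are rows, so right-multiply by A^{-T}
    return (keypoints - t) @ torch.inverse(A).T
```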

---------

Co-authored-by: Farzad Ziaie Nezhad <[email protected]>
Co-authored-by: Selmaan <[email protected]>
3 people authored Jul 18, 2024
1 parent 4967266 commit 3c74e8b
Showing 54 changed files with 2,832 additions and 603 deletions.
28 changes: 14 additions & 14 deletions .gitignore
@@ -1,12 +1,3 @@
lightning_logs/
grid_artifacts/
.vscode/
outputs/
multirun/
preds/
scripts/.ipynb_checkpoints
#scripts/configs_*

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
@@ -144,10 +135,19 @@ dmypy.json
**/.DS_Store

# specific stuff like csv predictions on test_vid.mp4
toy_datasets/toymouseRunningData/unlabeled_videos/*.csv
toy_datasets/toymouseRunningData/unlabeled_videos/test_vid_labeled.mp4
toy_datasets/toymouseRunningData/unlabeled_videos/test_vid_*.mp4
toy_datasets/toymouseRunningData/barObstacleScaling1/*.npy
data/mirror-mouse-example/videos/*.csv
data/mirror-mouse-example/videos/test_vid_*.mp4

# split dataset that is computed on the fly
# other datasets
data/mirror-mouse-example_split/
data/Chickadee

# random other outputs that might end up in the repo
lightning_logs/
grid_artifacts/
.vscode/
outputs/
multirun/
preds/
scripts/.ipynb_checkpoints
tb_logs
12 changes: 12 additions & 0 deletions docs/api/lightning_pose.data.datasets.BaseTrackingDataset.rst
@@ -5,3 +5,15 @@ BaseTrackingDataset

.. autoclass:: BaseTrackingDataset
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~BaseTrackingDataset.height
~BaseTrackingDataset.width

.. rubric:: Attributes Documentation

.. autoattribute:: height
.. autoattribute:: width
22 changes: 22 additions & 0 deletions docs/api/lightning_pose.data.datasets.HeatmapDataset.rst
@@ -5,3 +5,25 @@ HeatmapDataset

.. autoclass:: HeatmapDataset
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~HeatmapDataset.output_shape

.. rubric:: Methods Summary

.. autosummary::

~HeatmapDataset.compute_heatmap
~HeatmapDataset.compute_heatmaps

.. rubric:: Attributes Documentation

.. autoattribute:: output_shape

.. rubric:: Methods Documentation

.. automethod:: compute_heatmap
.. automethod:: compute_heatmaps
35 changes: 35 additions & 0 deletions docs/api/lightning_pose.data.datasets.MultiviewHeatmapDataset.rst
@@ -0,0 +1,35 @@
MultiviewHeatmapDataset
=======================

.. currentmodule:: lightning_pose.data.datasets

.. autoclass:: MultiviewHeatmapDataset
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~MultiviewHeatmapDataset.height
~MultiviewHeatmapDataset.num_views
~MultiviewHeatmapDataset.output_shape
~MultiviewHeatmapDataset.width

.. rubric:: Methods Summary

.. autosummary::

~MultiviewHeatmapDataset.check_data_images_names
~MultiviewHeatmapDataset.fusion

.. rubric:: Attributes Documentation

.. autoattribute:: height
.. autoattribute:: num_views
.. autoattribute:: output_shape
.. autoattribute:: width

.. rubric:: Methods Documentation

.. automethod:: check_data_images_names
.. automethod:: fusion
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewHeatmapLabeledBatchDict.rst
@@ -0,0 +1,7 @@
MultiviewHeatmapLabeledBatchDict
================================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewHeatmapLabeledBatchDict
:show-inheritance:
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewHeatmapLabeledExampleDict.rst
@@ -0,0 +1,7 @@
MultiviewHeatmapLabeledExampleDict
==================================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewHeatmapLabeledExampleDict
:show-inheritance:
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewLabeledBatchDict.rst
@@ -0,0 +1,7 @@
MultiviewLabeledBatchDict
=========================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewLabeledBatchDict
:show-inheritance:
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewLabeledExampleDict.rst
@@ -0,0 +1,7 @@
MultiviewLabeledExampleDict
===========================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewLabeledExampleDict
:show-inheritance:
17 changes: 17 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewUnlabeledBatchDict.rst
@@ -0,0 +1,17 @@
MultiviewUnlabeledBatchDict
===========================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewUnlabeledBatchDict
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~MultiviewUnlabeledBatchDict.is_multiview

.. rubric:: Attributes Documentation

.. autoattribute:: is_multiview
10 changes: 10 additions & 0 deletions docs/api/lightning_pose.data.utils.UnlabeledBatchDict.rst
@@ -5,3 +5,13 @@ UnlabeledBatchDict

.. autoclass:: UnlabeledBatchDict
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~UnlabeledBatchDict.is_multiview

.. rubric:: Attributes Documentation

.. autoattribute:: is_multiview
6 changes: 6 additions & 0 deletions docs/api/lightning_pose.data.utils.undo_affine_transform_batch.rst
@@ -0,0 +1,6 @@
undo_affine_transform_batch
===========================

.. currentmodule:: lightning_pose.data.utils

.. autofunction:: undo_affine_transform_batch
6 changes: 6 additions & 0 deletions docs/api/lightning_pose.models.base.convert_bbox_coords.rst
@@ -0,0 +1,6 @@
convert_bbox_coords
===================

.. currentmodule:: lightning_pose.models.base

.. autofunction:: convert_bbox_coords
6 changes: 6 additions & 0 deletions docs/api/lightning_pose.models.base.normalized_to_bbox.rst
@@ -0,0 +1,6 @@
normalized_to_bbox
==================

.. currentmodule:: lightning_pose.models.base

.. autofunction:: normalized_to_bbox
docs/api/lightning_pose.utils.fiftyone.FiftyOneImagePlotter.rst
@@ -10,9 +10,6 @@ FiftyOneImagePlotter

.. autosummary::

~FiftyOneImagePlotter.image_paths
~FiftyOneImagePlotter.img_height
~FiftyOneImagePlotter.img_width
~FiftyOneImagePlotter.model_names
~FiftyOneImagePlotter.num_keypoints

@@ -27,13 +24,11 @@ FiftyOneImagePlotter
~FiftyOneImagePlotter.get_keypoints_per_image
~FiftyOneImagePlotter.get_model_abs_paths
~FiftyOneImagePlotter.get_pred_keypoints_dict
~FiftyOneImagePlotter.img_height_width
~FiftyOneImagePlotter.load_model_predictions

.. rubric:: Attributes Documentation

.. autoattribute:: image_paths
.. autoattribute:: img_height
.. autoattribute:: img_width
.. autoattribute:: model_names
.. autoattribute:: num_keypoints

@@ -46,4 +41,5 @@
.. automethod:: get_keypoints_per_image
.. automethod:: get_model_abs_paths
.. automethod:: get_pred_keypoints_dict
.. automethod:: img_height_width
.. automethod:: load_model_predictions
2 changes: 1 addition & 1 deletion docs/roadmap.md
@@ -11,7 +11,7 @@
## Multi-view support for non-mirrored setups
- [x] implement supervised datasets/dataloaders that work with multiple views ([#115](https://github.com/danbider/lightning-pose/pull/115))
- [x] context frames for multi-view ([#126](https://github.com/danbider/lightning-pose/pull/126))
- [ ] unsupervised losses for multi-view
- [x] unsupervised losses for multi-view ([#187](https://github.com/danbider/lightning-pose/pull/187))

## Single-view dynamic crop (small animals in large frames)
- [ ] implement dynamic cropping pipeline with detector model and pose estimator
2 changes: 2 additions & 0 deletions docs/source/user_guide/inference.rst
@@ -1,3 +1,5 @@
.. _inference:

#########
Inference
#########
96 changes: 93 additions & 3 deletions docs/source/user_guide_advanced/multiview_separate.rst
@@ -4,6 +4,96 @@
Multiview: separate data streams
################################

As of November 2023 we are actively working to support this feature.
If you would like to use this feature please add a comment to
`this open issue <https://github.com/danbider/lightning-pose/issues/120>`_.
In addition to the mirrored setups discussed on the previous page, Lightning Pose also supports
more traditional multiview data, where the same scene is captured from different angles with
different cameras.
Each view is treated as an independent input to a single network.
This way, the network can learn from different perspectives and be agnostic to the correlations
between the different views.
Similar to the single view setup, Lightning Pose produces a separate csv file with the predicted
keypoints for each video.

.. note::

As of July 2024, the non-mirrored multiview feature of Lightning Pose now supports context
frames and some unsupervised losses.
The Multiview PCA loss operates across all views, while the temporal loss operates on single
views.
The Pose PCA loss is not yet implemented for the multiview case.

Organizing your data
====================

As an example, let’s assume a dataset has two camera views from a given session ("session0"),
which we’ll call “view0” and “view1”.
Lightning Pose assumes the following project directory structure:

.. code-block::

    /path/to/project/
    ├── <LABELED_DATA_DIR>/
    │   ├── session0_view0/
    │   └── session0_view1/
    ├── <VIDEO_DIR>/
    │   ├── session0_view0.mp4
    │   └── session0_view1.mp4
    ├── view0.csv
    └── view1.csv

* ``<LABELED_DATA_DIR>/``: The directory name, any subdirectory names, and image names are all flexible, as long as they are consistent with the first column of the ``<view_name>.csv`` files (see below). As an example, each session/view pair can have its own subdirectory, which contains images that correspond to the labels. The same frames from all views must have the same names; for example, the images corresponding to time point 39 should be named ``<LABELED_DATA_DIR>/session0_view0/img000039.png`` and ``<LABELED_DATA_DIR>/session0_view1/img000039.png``.

* ``<VIDEO_DIR>/``: This is a single directory of videos, which **must** follow the naming convention ``<session_name>_<view_name>.mp4``. So in our example there should be two videos, named ``session0_view0.mp4`` and ``session0_view1.mp4``.

* ``<view_name>.csv``: For each view (camera) there should be a table with keypoint labels (rows: frames; columns: keypoints). Note that these files can take any name, and need to be listed in the config file under the ``data.csv_file`` section. Each csv file must contain the same set of keypoints, and each must have the same number of rows (corresponding to specific points in time).
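
For a quick sanity check of these requirements before training, something like the following can be run (a sketch that assumes DLC-style csv files with three header rows; adapt to your data):

.. code-block:: python

    import pandas as pd

    views = ["view0.csv", "view1.csv"]
    indices = [pd.read_csv(f, header=[0, 1, 2], index_col=0).index for f in views]

    # every view must have the same number of rows (same time points)
    assert all(len(ix) == len(indices[0]) for ix in indices)

    # image names must match across views, up to the view-specific directory
    def strip(p):
        # e.g. labeled-data/session0_view0/img000039.png -> img000039.png
        return p.split("/")[-1]

    for ix in indices[1:]:
        assert [strip(p) for p in ix] == [strip(p) for p in indices[0]]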


The configuration file
======================

Like the single view case, users interact with Lightning Pose through a single configuration file.
This file points to data directories, defines the type of models to fit, and specifies a wide range
of hyperparameters.

A template file can be found
`here <https://github.com/danbider/lightning-pose/blob/main/scripts/configs/config_default.yaml>`_.
When training a model on a new dataset, you must copy/paste this template onto your local machine
and update the arguments to match your data.

To switch to multiview from single view you need to change two data parameters.
Again, assume that we are working with the two-view dataset used as an example above:

.. code-block:: yaml

    data:
      csv_file:
        - view0.csv
        - view1.csv
      view_names:
        - view0
        - view1
      mirrored_column_matches: [see bullet below]
      columns_for_singleview_pca: [see bullet below]

* ``csv_file``: list of csv filenames for each view
* ``view_names``: list of view names
* ``mirrored_column_matches``: if you would like to use the Multiview PCA loss, you must ensure
  the following:
  (1) the same set of keypoints is labeled across all views (though there can be missing data);
  (2) this config field should be a list of the indices corresponding to a *single view* which are
  included in the loss for all views; for example, if you have 10 keypoints in each view and you
  want to include the zeroth, first, and fifth in the Multiview PCA loss, this field should look
  like ``mirrored_column_matches: [0, 1, 5]``;
  (3) as in the non-multiview case, you must specify that you want to use this loss
  :ref:`elsewhere in the config file <unsup_config>` (see the sketch after this list).
* ``columns_for_singleview_pca``: NOT YET IMPLEMENTED
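
As noted in the ``mirrored_column_matches`` bullet, the loss must also be turned on elsewhere in
the config; a sketch of what that might look like (field names follow the template config;
double-check them against your version):

.. code-block:: yaml

    model:
      losses_to_use:
        - pca_multiview
    losses:
      pca_multiview:
        log_weight: 5.0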

Training and inference
======================

Once the data are properly organized and the config files updated, :ref:`training <training>` and
:ref:`inference <inference>` in this multiview setup proceed exactly the same as for the single
view case.
Because the trained network is view-agnostic,
during inference videos are processed and saved one view at a time.

