support for multiple data streams (#187)
* Multi view (#115)

* [add] multiview dictionary and multiviewdataset

* [add] multiview to get_dataset

* [fix] copy package & type for dataset

* [add] dynamic keys

* [add] Multiviewheatmaplabeled to base.basesupervisedtracker type

* [add] mouse mirror dataset

* [add] test dataset for Multiviewheatmap

* [add] test for multiview heatmap data module

* [add] hydra raises NotImplementedError for regression and multiview

* [add] prediction handler and keypoints structure for Multiview

* [fix] multiview keypoints in predictions are now saved to csvs correctly

* [add] make_multiview_dataset

* [fix] datacheck is more strict now

* [change] the dataset is now stacked along the channel dimension. [fix] tests are valid now

* [fix] images are stacked along the batch dimension and representation was reshaped
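
A minimal sketch of this stacking convention (tensor names and shapes are illustrative, not the repo's exact code):

```python
import torch

# two hypothetical views, each a (B, C, H, W) batch
views = [torch.randn(2, 3, 256, 256) for _ in range(2)]

# stack along the batch dimension -> (V*B, C, H, W), so the backbone
# treats every view as an independent image
images = torch.cat(views, dim=0)

# after the backbone, reshape features back to per-view form (B, V, ...)
features = images  # stand-in for the backbone's output
per_view = features.reshape(2, 2, *features.shape[1:]).transpose(0, 1)
print(per_view.shape)  # torch.Size([2, 2, 3, 256, 256])
```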

* [add] data type to fusion

* [fix] datadicts in data utils

* [add] test_supervised_multiview_heatmap and skipped video_dataloader

* update code to use original image and video pixel coordinates for all predictions (#117)

* Squashed commit of the following:

commit 9b7de3c
Author: Selmaan <[email protected]>
Date:   Mon Sep 18 16:39:04 2023 -0400

    Update config_birdCOM.yaml

commit 1aae9de
Author: Selmaan <[email protected]>
Date:   Mon Sep 18 15:59:43 2023 -0400

    update configs for detector net

    Added params to the default config needed for the dynamic crop algorithm and detector network. Then created a new bird config with edited fields

* add bbox to dataset dictionaries

add an xyhw bbox entry to the dictionary of dataset results, for both examples and batches. Also update the base dataset class (for backwards compatibility) to output a bounding box in the __getitem__ call. This 'default' bbox is just the full image dimensions
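
A minimal sketch of this default (class and key names are illustrative, not the repo's exact code):

```python
import torch
from torch.utils.data import Dataset

class ExampleDataset(Dataset):
    """Illustrates the 'default bbox' idea only."""

    def __init__(self, images, keypoints):
        self.images = images        # list of (C, H, W) tensors
        self.keypoints = keypoints  # list of (K, 2) tensors

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        h, w = image.shape[-2:]
        # default bbox is the full image, in x, y, height, width order
        bbox = torch.tensor([0.0, 0.0, float(h), float(w)])
        return {"images": image, "keypoints": self.keypoints[idx], "bbox": bbox}
```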

* Update datasets.py

fix automatic detection of image height and width

* Update heatmap_tracker.py

add code to convert from bounding box to original image dimension coordinates

* Update test_pca.py

fix keys in unit test

* add bbox conversion

get_loss_inputs_labeled and predict_step now call convert_bbox_coords, which converts predicted keypoints from (potentially cropped) image intrinsic coordinates to original image coordinates, using bbox info
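
The conversion itself is a rescale-and-shift; a sketch of the idea, assuming keypoints in (possibly cropped) image pixels and an x, y, height, width bbox in original-image pixels:

```python
import torch

def to_original_coords(keypoints, bbox, crop_h, crop_w):
    """Map (..., K, 2) keypoints from crop pixels to original-image pixels.

    bbox is a (..., 4) tensor of (x, y, height, width) in original pixels.
    Illustrative only; the repo's convert_bbox_coords may differ in detail.
    """
    x0, y0, h, w = bbox.unbind(-1)
    out = keypoints.clone()
    out[..., 0] = keypoints[..., 0] / crop_w * w.unsqueeze(-1) + x0.unsqueeze(-1)
    out[..., 1] = keypoints[..., 1] / crop_h * h.unsqueeze(-1) + y0.unsqueeze(-1)
    return out
```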

* code linting

fixed code formatting following flake8

* Update heatmap_tracker.py

add label reminding me to implement bbox conversion later for unlabeled data

* Update augmentations.py

do not resize data in the augmentation pipeline when using dynamic crop algorithm (cropping will be handled by dynamic pipeline)
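
A sketch of that conditional, assuming an imgaug-style pipeline (augmenter choices are illustrative):

```python
import imgaug.augmenters as iaa

def build_pipeline(resize_hw, dynamic_crop=False):
    augs = [iaa.Fliplr(0.5), iaa.Affine(rotate=(-10, 10))]
    if not dynamic_crop:
        # with dynamic crop enabled, resizing is deferred to the crop pipeline
        augs.append(iaa.Resize({"height": resize_hw[0], "width": resize_hw[1]}))
    return iaa.Sequential(augs)
```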

* Update config_birdCOM.yaml

* Update scripts.py

* Update .gitignore

* Update datasets.py

add new detector dataset, and correct previous image height and width calculation in BaseTrackingDataset

* Update heatmap_tracker.py

transform target along with predicted keypoints from transformed image coordinates to original image coordinates

* Update scripts.py

fix typo in get_detector_model

* Update config_birdCOM.yaml

set useful parameter values for this dataset

* add dynamic labeled dicts

these are unused for now; they will be implemented in the future for multi-instance detection

* Update config_birdCOM.yaml

redo COM config to be an independent pipeline from the later POS pipeline

* Update scripts.py

remove image size checking when setting up dataset.

unrelated: also set 'columns_for_singleview_pca' to None automatically if it's not set in the config

* remove keypoints rescaling in predict

remove rescaling of keypoints according to static config info (which is now removed); keypoints are already dynamically rescaled in the model's predict_step

* add bbox conversion for unlabeled data

update predict step and get unsupervised losses to convert keypoints predicted on unlabeled video frame data to original image coords using bbox info

* update dali dataloader with frame sizes

dali dataloader now outputs the size of loaded video frames in bbox info

* pre-computing heatmaps is very slow

I think there is very little upside to having these precomputed; it creates a big lag for my use case whenever I try to create a dataset.

* add bbox to DynamicDict

* image_orig_dims no longer in configs

removed image_orig_dims from the config, so it also does not need to be copied here. It will always be inferred from video/image data during inference

* Update config_birdCOM.yaml

* Create config_birdCOM_backup.yaml

* delete commented out code

* code linting

* add bbox to unlabeled batch keys

* changes from pull request review

mostly moving convert_bbox_coords to be a function in models.base rather than a method of the heatmap tracker class

* fix bbox to device with images

fixes an edge case where frames and bbox were on separate devices
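
The fix boils down to moving the bbox with the frames; a minimal sketch (hypothetical batch keys):

```python
def align_devices(batch):
    # keep bbox on the same device as the frames it describes
    batch["bbox"] = batch["bbox"].to(batch["frames"].device)
    return batch
```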

* add bbox conversion unit test

the test artificially crops and shifts an image, offsets the detected keypoint locations accordingly, and verifies that bbox conversion for the original and re-cropped data match
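
A sketch of that logic, reusing the hypothetical to_original_coords helper sketched earlier in this log:

```python
import torch

def test_bbox_conversion_roundtrip():
    full_bbox = torch.tensor([0.0, 0.0, 128.0, 128.0])  # x, y, h, w: whole image
    crop_bbox = torch.tensor([32.0, 16.0, 64.0, 64.0])  # artificial crop

    kps_full = torch.tensor([[48.0, 40.0], [80.0, 64.0]])
    # the same keypoints as seen inside the (unresized) crop
    kps_crop = kps_full - crop_bbox[:2]

    a = to_original_coords(kps_full[None], full_bbox[None], 128, 128)
    b = to_original_coords(kps_crop[None], crop_bbox[None], 64, 64)
    assert torch.allclose(a, b)
```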

* combine multiview and dynamic crop PRs

* [add] compute_metrics for multiview; it loops over each file (#119)

* bug fix for multi-view metric computation (#121)

* final bug fixes for multiview

* bbox bug fix; closes #109, #120

* [docs] multiview separate

* Multiview (#126)

* [fix] preds_file typo to list

* [fix] list of preds_file is being processed now

* [add] hydra for compute_metrics

* [add] multiview heatmap context

* [add] multiview heatmap context conftest

* [add] dataset test

* [add] mview data module test

* [add] context and their tests

* [add] dynamic naming dataset basic

* [fix] flake8

* PR fixes

* add bbox coord transform to context models

---------

Co-authored-by: themattinthehatt <[email protected]>

* tweaks to streamlit to show labeled data results from all views

* [fix] fiftyone app now compatible with multiple views

* [fix] dataset typechecking error, new unit tests

* remove detector code from multiview branch

* remove detector code from multiview branch

* update IO code to properly find multiview videos

* update video_pipe to work for multiple views

* update LitDaliWrapper to work for multiple views

* semisupervised multiview training without error

* bug fixes with dali augmentations

* multiview semisupervised context dataloader + model tests passing

* affine transform bug fix + refactoring + unit test
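
The refactored helper appears below in the API docs as undo_affine_transform_batch; a sketch of the core math (signature is illustrative):

```python
import torch

def undo_affine(keypoints, M):
    """Invert a 2x3 affine transform applied to (K, 2) keypoints.

    M maps original -> transformed coords: p' = A @ p + t.
    """
    A, t = M[:, :2], M[:, 2]
    # invert: p = A^{-1} (p' - t); points are rows, so right-multiply by A^{-T}
    return (keypoints - t) @ torch.inverse(A).T
```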

---------

Co-authored-by: Farzad Ziaie Nezhad <[email protected]>
Co-authored-by: Selmaan <[email protected]>
3 people authored Jul 18, 2024
1 parent 4967266 commit 3c74e8b
Showing 54 changed files with 2,832 additions and 603 deletions.
28 changes: 14 additions & 14 deletions .gitignore
@@ -1,12 +1,3 @@
lightning_logs/
grid_artifacts/
.vscode/
outputs/
multirun/
preds/
scripts/.ipynb_checkpoints
#scripts/configs_*

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
@@ -144,10 +135,19 @@ dmypy.json
**/.DS_Store

# specific stuff like csv predictions on test_vid.mp4
toy_datasets/toymouseRunningData/unlabeled_videos/*.csv
toy_datasets/toymouseRunningData/unlabeled_videos/test_vid_labeled.mp4
toy_datasets/toymouseRunningData/unlabeled_videos/test_vid_*.mp4
toy_datasets/toymouseRunningData/barObstacleScaling1/*.npy
data/mirror-mouse-example/videos/*.csv
data/mirror-mouse-example/videos/test_vid_*.mp4

# split dataset that is computed on the fly
# other datasets
data/mirror-mouse-example_split/
data/Chickadee

# random other outputs that might end up in the repo
lightning_logs/
grid_artifacts/
.vscode/
outputs/
multirun/
preds/
scripts/.ipynb_checkpoints
tb_logs
12 changes: 12 additions & 0 deletions docs/api/lightning_pose.data.datasets.BaseTrackingDataset.rst
@@ -5,3 +5,15 @@ BaseTrackingDataset

.. autoclass:: BaseTrackingDataset
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~BaseTrackingDataset.height
~BaseTrackingDataset.width

.. rubric:: Attributes Documentation

.. autoattribute:: height
.. autoattribute:: width
22 changes: 22 additions & 0 deletions docs/api/lightning_pose.data.datasets.HeatmapDataset.rst
@@ -5,3 +5,25 @@ HeatmapDataset

.. autoclass:: HeatmapDataset
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~HeatmapDataset.output_shape

.. rubric:: Methods Summary

.. autosummary::

~HeatmapDataset.compute_heatmap
~HeatmapDataset.compute_heatmaps

.. rubric:: Attributes Documentation

.. autoattribute:: output_shape

.. rubric:: Methods Documentation

.. automethod:: compute_heatmap
.. automethod:: compute_heatmaps
35 changes: 35 additions & 0 deletions docs/api/lightning_pose.data.datasets.MultiviewHeatmapDataset.rst
@@ -0,0 +1,35 @@
MultiviewHeatmapDataset
=======================

.. currentmodule:: lightning_pose.data.datasets

.. autoclass:: MultiviewHeatmapDataset
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~MultiviewHeatmapDataset.height
~MultiviewHeatmapDataset.num_views
~MultiviewHeatmapDataset.output_shape
~MultiviewHeatmapDataset.width

.. rubric:: Methods Summary

.. autosummary::

~MultiviewHeatmapDataset.check_data_images_names
~MultiviewHeatmapDataset.fusion

.. rubric:: Attributes Documentation

.. autoattribute:: height
.. autoattribute:: num_views
.. autoattribute:: output_shape
.. autoattribute:: width

.. rubric:: Methods Documentation

.. automethod:: check_data_images_names
.. automethod:: fusion
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewHeatmapLabeledBatchDict.rst
@@ -0,0 +1,7 @@
MultiviewHeatmapLabeledBatchDict
================================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewHeatmapLabeledBatchDict
:show-inheritance:
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewHeatmapLabeledExampleDict.rst
@@ -0,0 +1,7 @@
MultiviewHeatmapLabeledExampleDict
==================================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewHeatmapLabeledExampleDict
:show-inheritance:
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewLabeledBatchDict.rst
@@ -0,0 +1,7 @@
MultiviewLabeledBatchDict
=========================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewLabeledBatchDict
:show-inheritance:
7 changes: 7 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewLabeledExampleDict.rst
@@ -0,0 +1,7 @@
MultiviewLabeledExampleDict
===========================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewLabeledExampleDict
:show-inheritance:
17 changes: 17 additions & 0 deletions docs/api/lightning_pose.data.utils.MultiviewUnlabeledBatchDict.rst
@@ -0,0 +1,17 @@
MultiviewUnlabeledBatchDict
===========================

.. currentmodule:: lightning_pose.data.utils

.. autoclass:: MultiviewUnlabeledBatchDict
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~MultiviewUnlabeledBatchDict.is_multiview

.. rubric:: Attributes Documentation

.. autoattribute:: is_multiview
10 changes: 10 additions & 0 deletions docs/api/lightning_pose.data.utils.UnlabeledBatchDict.rst
@@ -5,3 +5,13 @@ UnlabeledBatchDict

.. autoclass:: UnlabeledBatchDict
:show-inheritance:

.. rubric:: Attributes Summary

.. autosummary::

~UnlabeledBatchDict.is_multiview

.. rubric:: Attributes Documentation

.. autoattribute:: is_multiview
6 changes: 6 additions & 0 deletions docs/api/lightning_pose.data.utils.undo_affine_transform_batch.rst
@@ -0,0 +1,6 @@
undo_affine_transform_batch
===========================

.. currentmodule:: lightning_pose.data.utils

.. autofunction:: undo_affine_transform_batch
6 changes: 6 additions & 0 deletions docs/api/lightning_pose.models.base.convert_bbox_coords.rst
@@ -0,0 +1,6 @@
convert_bbox_coords
===================

.. currentmodule:: lightning_pose.models.base

.. autofunction:: convert_bbox_coords
6 changes: 6 additions & 0 deletions docs/api/lightning_pose.models.base.normalized_to_bbox.rst
@@ -0,0 +1,6 @@
normalized_to_bbox
==================

.. currentmodule:: lightning_pose.models.base

.. autofunction:: normalized_to_bbox
docs/api/lightning_pose.utils.fiftyone.FiftyOneImagePlotter.rst
@@ -10,9 +10,6 @@ FiftyOneImagePlotter

.. autosummary::

~FiftyOneImagePlotter.image_paths
~FiftyOneImagePlotter.img_height
~FiftyOneImagePlotter.img_width
~FiftyOneImagePlotter.model_names
~FiftyOneImagePlotter.num_keypoints

@@ -27,13 +24,11 @@ FiftyOneImagePlotter
~FiftyOneImagePlotter.get_keypoints_per_image
~FiftyOneImagePlotter.get_model_abs_paths
~FiftyOneImagePlotter.get_pred_keypoints_dict
~FiftyOneImagePlotter.img_height_width
~FiftyOneImagePlotter.load_model_predictions

.. rubric:: Attributes Documentation

.. autoattribute:: image_paths
.. autoattribute:: img_height
.. autoattribute:: img_width
.. autoattribute:: model_names
.. autoattribute:: num_keypoints

@@ -46,4 +41,5 @@
.. automethod:: get_keypoints_per_image
.. automethod:: get_model_abs_paths
.. automethod:: get_pred_keypoints_dict
.. automethod:: img_height_width
.. automethod:: load_model_predictions
2 changes: 1 addition & 1 deletion docs/roadmap.md
@@ -11,7 +11,7 @@
## Multi-view support for non-mirrored setups
- [x] implement supervised datasets/dataloaders that work with multiple views ([#115](https://github.com/danbider/lightning-pose/pull/115))
- [x] context frames for multi-view ([#126](https://github.com/danbider/lightning-pose/pull/126))
- [ ] unsupervised losses for multi-view
- [x] unsupervised losses for multi-view ([#187](https://github.com/danbider/lightning-pose/pull/187))

## Single-view dynamic crop (small animals in large frames)
- [ ] implement dynamic cropping pipeline with detector model and pose estimator
2 changes: 2 additions & 0 deletions docs/source/user_guide/inference.rst
@@ -1,3 +1,5 @@
.. _inference:

#########
Inference
#########
96 changes: 93 additions & 3 deletions docs/source/user_guide_advanced/multiview_separate.rst
@@ -4,6 +4,96 @@
Multiview: separate data streams
################################

As of November 2023 we are actively working to support this feature.
If you would like to use this feature please add a comment to
`this open issue <https://github.com/danbider/lightning-pose/issues/120>`_.
In addition to the mirrored setups discussed on the previous page, Lightning Pose also supports
more traditional multiview data, where the same scene is captured from different angles with
different cameras.
Each view is treated as an independent input to a single network.
This way, the network can learn from different perspectives and be agnostic to the correlations
between the different views.
Similar to the single view setup, Lightning Pose produces a separate csv file with the predicted
keypoints for each video.

.. note::

As of July 2024, the non-mirrored multiview feature of Lightning Pose now supports context
frames and some unsupervised losses.
The Multiview PCA loss operates across all views, while the temporal loss operates on single
views.
The Pose PCA loss is not yet implemented for the multiview case.

Organizing your data
====================

As an example, let’s assume a dataset has two camera views from a given session ("session0"),
which we’ll call “view0” and “view1”.
Lightning Pose assumes the following project directory structure:

.. code-block::

    /path/to/project/
    ├── <LABELED_DATA_DIR>/
    │   ├── session0_view0/
    │   └── session0_view1/
    ├── <VIDEO_DIR>/
    │   ├── session0_view0.mp4
    │   └── session0_view1.mp4
    ├── view0.csv
    └── view1.csv

* ``<LABELED_DATA_DIR>/``: The directory name, any subdirectory names, and image names are all flexible, as long as they are consistent with the first column of the ``<view_name>.csv`` files (see below). As an example, each session/view pair can have its own subdirectory, which contains images that correspond to the labels. The same frames from all views must have the same names; for example, the images corresponding to time point 39 should be named ``<LABELED_DATA_DIR>/session0_view0/img000039.png`` and ``<LABELED_DATA_DIR>/session0_view1/img000039.png``.

* ``<VIDEO_DIR>/``: This is a single directory of videos, which **must** follow the naming convention ``<session_name>_<view_name>.mp4``. So in our example there should be two videos, named ``session0_view0.mp4`` and ``session0_view1.mp4``.

* ``<view_name>.csv``: For each view (camera) there should be a table with keypoint labels (rows: frames; columns: keypoints). Note that these files can take any name, and need to be listed in the config file under the ``data.csv_file`` section. Each csv file must contain the same set of keypoints, and each must have the same number of rows (corresponding to specific points in time).
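
For a quick sanity check of these requirements before training, something like the following can be run (a sketch that assumes DLC-style csv files with three header rows; adapt to your data):

.. code-block:: python

    import pandas as pd

    views = ["view0.csv", "view1.csv"]
    indices = [pd.read_csv(f, header=[0, 1, 2], index_col=0).index for f in views]

    # every view must have the same number of rows (same time points)
    assert all(len(ix) == len(indices[0]) for ix in indices)

    # image names must match across views, up to the view-specific directory
    def strip(p):
        # e.g. labeled-data/session0_view0/img000039.png -> img000039.png
        return p.split("/")[-1]

    for ix in indices[1:]:
        assert [strip(p) for p in ix] == [strip(p) for p in indices[0]]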


The configuration file
======================

Like the single view case, users interact with Lightning Pose through a single configuration file.
This file points to data directories, defines the type of models to fit, and specifies a wide range
of hyperparameters.

A template file can be found
`here <https://github.com/danbider/lightning-pose/blob/main/scripts/configs/config_default.yaml>`_.
When training a model on a new dataset, you must copy/paste this template onto your local machine
and update the arguments to match your data.

To switch to multiview from single view you need to change two data parameters.
Again, assume that we are working with the two-view dataset used as an example above:

.. code-block:: yaml

    data:
      csv_file:
        - view0.csv
        - view1.csv
      view_names:
        - view0
        - view1
      mirrored_column_matches: [see bullet below]
      columns_for_singleview_pca: [see bullet below]

* ``csv_file``: list of csv filenames for each view
* ``view_names``: list of view names
* ``mirrored_column_matches``: if you would like to use the Multiview PCA loss, you must ensure
  the following:
  (1) the same set of keypoints is labeled across all views (though there can be missing data);
  (2) this config field should be a list of the indices corresponding to a *single view* which are
  included in the loss for all views; for example, if you have 10 keypoints in each view and you
  want to include the zeroth, first, and fifth in the Multiview PCA loss, this field should look
  like ``mirrored_column_matches: [0, 1, 5]``;
  (3) as in the non-multiview case, you must specify that you want to use this loss
  :ref:`elsewhere in the config file <unsup_config>` (see the sketch after this list).
* ``columns_for_singleview_pca``: NOT YET IMPLEMENTED
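
As noted in the ``mirrored_column_matches`` bullet, the loss must also be turned on elsewhere in
the config; a sketch of what that might look like (field names follow the template config;
double-check them against your version):

.. code-block:: yaml

    model:
      losses_to_use:
        - pca_multiview
    losses:
      pca_multiview:
        log_weight: 5.0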

Training and inference
======================

Once the data are properly organized and the config files updated, :ref:`training <training>` and
:ref:`inference <inference>` in this multiview setup proceed exactly the same as for the single
view case.
Because the trained network is view-agnostic,
during inference videos are processed and saved one view at a time.

