---
layout: publication
title: "Three Pillars improving Vision Foundation Model Distillation for Lidar"
image: assets/img/publications/2024_fungi/overview.jpg
hide: false
category: [3d-perception, limited-supervision, foundation, unsupervised, self-supervised]
authors: Gilles Puy, Spyros Gidaris, Alexandre Boulch, Oriane Siméoni, Corentin Sautier, Patrick Pérez, Andrei Bursuc, Renaud Marlet
venue: CVPR
venue_long: Computer Vision and Pattern Recognition
year: 2024
month: 6
code_url: https://github.com/valeoai/ScaLR
paper_url: https://arxiv.org/abs/2310.17504
blog_url:
slides_url:
bib_url:
permalink: /publications/scalr/
---

<h1 align="center"> {{page.title}} </h1>
<!-- Simple call of authors -->
<!-- <h3 align="center"> {{page.authors}} </h3> -->
<!-- Alternatively you can add links to author pages -->
<h3 align="center"> <a href="https://sites.google.com/site/puygilles/home">Gilles Puy</a> <a href="https://scholar.google.fr/citations?user=7atfg7EAAAAJ&hl=en">Spyros Gidaris</a> <a href="https://www.boulch.eu/">Alexandre Boulch</a> <a href="https://osimeoni.github.io/">Oriane Siméoni</a> <a href="https://csautier.github.io/">Corentin Sautier</a> <a href="https://ptrckprz.github.io/">Patrick Pérez</a> <a href="https://abursuc.github.io/">Andrei Bursuc</a> <a href="http://imagine.enpc.fr/~marletr/">Renaud Marlet</a></h3>

<h3 align="center"> {{page.venue}} {{page.year}} </h3>

<div align="center">
<p>
{% if page.paper_url %}
<a href="{{ page.paper_url }}"><i class="far fa-file-pdf"></i> Paper</a>
{% endif %}
{% if page.code_url %}
<a href="{{ page.code_url }}"><i class="fab fa-github"></i> Code</a>
{% endif %}
{% if page.blog_url %}
<a href="{{ page.blog_url }}"><i class="fab fa-blogger"></i> Blog</a>
{% endif %}
{% if page.slides_url %}
<a href="{{ page.slides_url }}"><i class="far fa-file-pdf"></i> Slides</a>
{% endif %}
{% if page.bib_url %}
<a href="{{ page.bib_url }}"><i class="far fa-file-alt"></i> BibTeX</a>
{% endif %}
</p>
</div>

<div class="publication-teaser">
    <img src="../../{{ page.image }}" alt="project teaser"/>
</div>

<hr>

<h2 align="center">Abstract</h2>

<p align="justify">Self-supervised image backbones can be used to address complex 2D tasks (e.g., semantic segmentation, object discovery) very efficiently and with little or no downstream supervision. Ideally, 3D backbones for lidar should be able to inherit these properties after distillation of these powerful 2D features. The most recent methods for image-to-lidar distillation on autonomous driving data show promising results, obtained thanks to distillation methods that keep improving. Yet, we still notice a large performance gap when measuring the quality of distilled and fully supervised features by linear probing. In this work, instead of focusing only on the distillation method, we study the effect of three pillars for distillation: the 3D backbone, the pretrained 2D backbones, and the pretraining dataset. In particular, thanks to our scalable distillation method named ScaLR, we show that scaling the 2D and 3D backbones and pretraining on diverse datasets leads to a substantial improvement of the feature quality. This allows us to significantly reduce the gap between the quality of distilled and fully-supervised 3D features, and to improve the robustness of the pretrained backbones to domain gaps and perturbations. </p> | ||

<hr>

<h2 align="center">BibTeX</h2> | ||
<left> | ||
<pre class="bibtex-box"> | ||
@inproceedings{puy24scalr, | ||
title={Three Pillars improving Vision Foundation Model Distillation for Lidar}, | ||
author={Puy, Gilles and Gidaris, Spyros and Boulch, Alexandre and Sim\'eoni, Oriane and Sautier, Corentin and P\'erez, Patrick and Bursuc, Andrei and Marlet, Renaud}, | ||
booktitle={CVPR}, | ||
year={2024} | ||
}</pre> | ||
</left> | ||

<br>