About
I am a third-year Machine Learning PhD student at Georgia Tech, advised by Prof. Zsolt Kira. My research interests lie at the intersection of computer vision and natural language processing.
I am currently working on vision-and-language pretraining, (egocentric) video understanding, and domain adaptation.
At Georgia Tech, I have been building vision-and-language models capable of understanding the open world. Previously, I worked on domain generalization and robustness at USC/Meta AI, and on embodied visual navigation at CMU.
Publications
Grounding Descriptions in Images informs Zero-shot Visual Recognition
Shaunak Halbe* et al.
Under Review
preprint
We propose a new pretraining strategy for CLIP to learn fine-grained visual representations that exhibit strong zero-shot transfer performance.
Continual Adaptation of Vision Transformers for Federated Learning
Shaunak Halbe*, James Smith, Junjiao Tian, Zsolt Kira
Transactions on Machine Learning Research (TMLR) 2024
Short Version: FL@FM Workshop, NeurIPS 2023 (Oral)
paper /
talk
We propose a novel prompt learning and aggregation scheme for distributed training of foundation models.
Open-World Dialogue Driven Object Navigation
Conference on Robot Learning (CoRL) 2023 (Demo Track)
Coming Soon
We demonstrate robot navigation to an open set of objects described in natural language.
Robustness through Data Augmentation Loss Consistency
Tianjian Huang*, Shaunak Halbe*, Chinnadhurai Sankar, Pooyan Amini, Satwik Kottur, Alborz Geramifard, Meisam Razaviyayn, Ahmad Beirami
Transactions on Machine Learning Research (TMLR) 2022
paper
We introduce a novel loss-level regularizer to improve robustness to spurious correlations in generative models.
A Closer Look at Rehearsal-Free Continual Learning
James Smith, Junjiao Tian, Shaunak Halbe, Yen-Chang Hsu, Zsolt Kira
CVPR-W 2023
paper
We introduce knowledge distillation and regularization baselines using foundation models for rehearsal-free continual learning.
Reason & Act: A Modular Approach to Explanation Driven Agents for Vision and Language Navigation
Shaunak Halbe, Ingrid Navarro, Jean Oh
CMU Robotics Institute Working Papers Journal
paper /
poster /
talk
We present a modular agent for navigation with improved cross-modal grounding and semantic reasoning.
Exploring Weaknesses of VQA Models through Attribution Driven Insights
Shaunak Halbe
ACL-W 2020
Short Version: CVPR-W 2020
paper /
talk
We present a consistency analysis of VQA models through the lens of attribution to evaluate adversarial robustness.
Service & Teaching
- Graduate Teaching Assistant: CS 7643 Deep Learning, Fall 2023
- Reviewer: CVPR 2023, NeurIPS-W 2023, CMU RI Working Papers Journal 2021
- Volunteer: NeurIPS 2023, CoRL 2023, NAACL 2021, ACL 2020
Website template cloned from here!