Shaunak Halbe

ML PhD student
Georgia Institute of Technology


Email  /  CV  /  LinkedIn  /  Google Scholar

About

I am a second-year Machine Learning PhD student at Georgia Tech, advised by Prof. Zsolt Kira. My research interests lie at the intersection of computer vision and natural language processing.

Currently, I am interested in vision-and-language pretraining, representation learning, domain adaptation and video understanding.

At Georgia Tech, I have been working on building vision-and-language models capable of understanding the open world. Previously, I worked on domain generalization and robustness at USC/Meta AI and on embodied visual navigation at CMU.

Research
Grounding Descriptions in Images informs Zero-shot Visual Recognition

Shaunak Halbe* et al.

Under Review
preprint

We propose a new pretraining strategy for CLIP to learn fine-grained visual representations that exhibit strong zero-shot transfer performance.

HePCo: Data-Free Heterogeneous Prompt Consolidation for Continual Federated Learning

Shaunak Halbe*, James Smith, Junjiao Tian, Zsolt Kira

Workshop on Federated Learning in the Age of Foundation Models (Oral)
NeurIPS 2023
Long Version: Under Submission
preprint

We propose a novel prompt learning and aggregation scheme for distributed training of foundation models.

Open-World Dialogue Driven Object Navigation

Conference on Robot Learning (CoRL) 2023 (Demo Track)
Coming Soon

We demonstrate robot navigation to an open set of objects described in natural language.

Robustness through Data Augmentation Loss Consistency

Tianjian Huang*, Shaunak Halbe*, Chinnadhurai Sankar, Pooyan Amini, Satwik Kottur, Alborz Geramifard, Meisam Razaviyayn, Ahmad Beirami

Transactions on Machine Learning Research (TMLR) 2022
preprint

We introduce a novel loss-level regularizer to improve robustness to spurious correlations in generative models.

A Closer Look at Rehearsal-Free Continual Learning

James Smith, Junjiao Tian, Shaunak Halbe, Yen-Chang Hsu, Zsolt Kira

Workshop on Continual Learning in Computer Vision (CLVision)
CVPR 2023
preprint

We introduce knowledge distillation and regularization baselines using foundation models for rehearsal-free continual learning.

Reason & Act: A Modular Approach to Explanation Driven Agents for Vision and Language Navigation

Shaunak Halbe, Ingrid Navarro, Jean Oh
CMU Robotics Institute of Summer Scholars Working Papers Journal
paper / poster / talk

We present a modular agent composed of a global planner and a local planner. Additionally, we incorporate two novel components within the agent to encourage cross-modal grounding and visual reasoning.

Exploring Weaknesses of VQA Models through Attribution Driven Insights

Shaunak Halbe
Second Grand-Challenge and Workshop on Multimodal Language
ACL 2020
Visual Question Answering and Dialog Workshop
CVPR 2020
paper / talk

We present a consistency analysis of VQA models through the lens of attribution to evaluate adversarial robustness.

Awards

Service & Teaching

  • Graduate Teaching Assistant: CS 7643 Deep Learning, Fall 2023

  • Reviewer: CVPR 2023, NeurIPS-W 2023, CMU RI Working Papers Journal 2021

  • Volunteer: NeurIPS 2023, CoRL 2023, NAACL 2021, ACL 2020

