Visual Question Answering from Image Sets
Research Project, Carnegie Mellon University, LTI, 2020
- Adapted a Transformer-based VQA model (LXMERT) to the task of Image-Set VQA and developed an Adversarial Regularization method to reduce dependence on language biases and improve performance on out-of-domain data
- Introduced a new pre-training objective that uses object bounding boxes extracted by an R-CNN, improving model performance on object-description questions by 4%