  • 2023: I am excited to be spending my summer as an AI Resident at Google X (Mineral), working on adapting multimodal language models for object localization.
  • 2023: Our paper got accepted to CVPR!
  • 2022: We started [AI+X], a group where we brainstorm research/venture ideas on how AI can impact concurrent research areas.
  • 2022: I am excited to be spending my summer as a Research Scientist Intern at Meta (Facebook) AI (FAIR), working on the compositionality of large vision-language models.
  • 2019: Two papers accepted to EMNLP and AAAI HCOMP.
  • 2019: I was a runner-up at the SRI CVT Shark Tank Competition, which supported my mini-project on understanding image-text content to reduce radicalization of opinions on social media.
  • 2017: The startup I interned at (Blue River Technology) got acquired by John Deere for $305 million! The weed detection technology I helped develop played a key part in this process.
  • 2016: Our paper got accepted to EMNLP.
  • 2016: I won the Employee of the Fortnight award as an intern at Blue River Technology for developing a model to detect weeds (vs. lettuce plants) on farms.
  • 2016: I will be serving as the Vice President of Tau Beta Pi, Virginia Beta Chapter, for a year starting Fall 2016.
  • 2014: Our UAV for helping locate natural disaster victims was featured in national news: Deccan Chronicle, Indian Express.
  • 2013: I won a silver medal at SRM University Research Day for my white-paper presentation on an Electro-Mechanical Exoskeleton.
  • 2012: I won an Academic Merit Scholarship from SRM University that waives a part of my undergraduate tuition.



Arijit Ray, Filip Radenovic, Abhimanyu Dubey, Bryan Plummer, Ranjay Krishna, Kate Saenko, Cola: How to adapt vision-language models to Compose Objects Localized with Attributes?, [arxiv] [project page, data]


Reuben Tan, Arijit Ray, Andrea Burns, Bryan A. Plummer, Justin Salamon, Oriol Nieto, Bryan Russell, Kate Saenko, Language-Guided Audio-Visual Source Separation via Trimodal Consistency, CVPR 2023, [arxiv] [code]


Ajay Divakaran, Karan Sikka, Arijit Ray, Xiao Lin, Yi Yao, User-targeted content generation using multimodal embeddings, US Patent App. 17/191,698 [webpage]

Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen Schulze, Yi Yao, Giedrius Burachas, Improving Users' Mental Model with Attention-directed Counterfactual Edits, 2021 Applied AI Letters (Wiley), [pdf]

Arijit Ray, Michael Cogswell, Xiao Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas, Knowing What VQA Does Not: Pointing to Error-Inducing Regions to Improve Explanation Helpfulness, 2021 Applied AI Letters (Wiley), [pdf] [arXiv] [Project Page]


Arijit Ray, Karan Sikka, Ajay Divakaran, Stefan Lee, Giedrius Burachas, Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation, 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), also at CVPR-W 2019 VQA and Visual Dialog Workshop, [arXiv], [bibTex] [Data]

Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas, Can You Explain That: Lucid Explanations Help Human-AI Collaborative Image Retrieval, 2019 AAAI Conference on Human Computation and Crowdsourcing (AAAI-HCOMP 2019), [arXiv], [bibTex] [press coverage]


Arijit Ray, Gordon Christie, Mohit Bansal, Dhruv Batra, and Devi Parikh, "Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions.", 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016). [pdf] [code] [Video]

Prashant Chandrasekar, Xuan Zhang, Saurabh Chakravarty, Arijit Ray, John Krulick, and Alla Rozovskaya, "The Virginia Tech System at CoNLL-2016 Shared Task on Shallow Discourse Parsing", CoNLL Shared Task (2016).

The Art of Deep Connection - Towards Natural and Pragmatic Conversational Agent Interactions. [Master's Thesis], Virginia Tech E-Library, 2017

Make RBF Networks Fast Again: Exploiting Multi-Threaded Computing to Speed Up RBF Networks, Multiprocessor Programming Class Project, Fall 2016, [draft paper] [code]

Object Prediction using Image Context: Predict the next object in an image sequentially, reasoning from the current image context, Computer Vision Class Project, Fall 2015

Online Demo for Predicting Plausibility of Common Sense Assertions: Enter a three-phrase tuple to get a plausibility score based on joint language-vision common-sense reasoning, Class Project, Fall 2015

Learning to Listen: Matching Cover Songs with Original Productions: Match original songs to cover songs using an ensemble of supervised and unsupervised approaches, Machine Learning Class Project, Fall 2015.

Ray, Arijit, Kishan Prudhvi Guddanti, and N. Chellammal. "An Approach to Intelligent Traction Control Using Regression Networks and Anomaly Detection.", Junior (3rd Year) Semester Project, Fall 2013, published in Applied Artificial Intelligence 29.6 (2015): 597-616.


Press Coverage


Some of the amazing people I have been fortunate to work with:

Prof. Dhruv Batra, Prof. Stefan Lee, Dr. Dhruv Mahajan, Dr. Filip Radenovic, Dr. Abhimanyu Dubey, Dr. Ajay Divakaran, Dr. Yi Yao, Dr. Giedrius Burachas


When I am not training LLMs, I like creating music and engineering simple gadgets. In middle school, I opened an informal research society to encourage fellow students to take an interest in science by constructing simple gadgets. We won multiple accolades in school and city-level exhibitions.


Contact Me

Have a question?

The best way to reach me is to drop an email to array at bu dot edu. Please don't spam me!