Vijay Murari Tiyyala

Hi, I am a Research Assistant at the Center for Language and Speech Processing at Johns Hopkins University, where I am fortunate to be advised by Prof. Mark Dredze. Broadly, my research interests are in NLP and Machine Learning.

Previously, I received my Master’s degree in Computer Science from Johns Hopkins University in 2023, where I was fortunate to work with Prof. Mark Dredze, Prof. Daniel Khashabi, and Prof. David Yarowsky. I also frequently collaborated with Prof. John W. Ayers on AI-related projects in health. CV

Currently, I am working on and constantly thinking about the following problems:

Interpretability: Understanding the internal mechanisms of large language models (LLMs)
- Knowledge localization
- Memorization and knowledge editing
- Using the findings from above to build more robust and trustworthy LLMs
Data & Reasoning in LLMs: LLMs are very good at certain difficult tasks yet fail to perform well on simple ones.
- Using the findings from interpretability to continually improve models through interaction and feedback, updating their knowledge and generalizing them to perform better on downstream tasks
- Understanding the role that pre-training data (distribution, ordering, etc.) and the training process play in developing robust reasoning capabilities
Evaluations: How can we evaluate the capabilities of LLMs in a more comprehensive manner?
- Faithfulness, Hallucinations

Please feel free to reach out if you are interested in any of the above topics or just want to chat about research!

In Progress/Preprint

BabyData: Exploring the Trade-offs Between Dataset Size and Pre-training Progression: Insights into Fine-tunability and In-context Learning
Vijay M. Tiyyala*, Kaiser Sun*, Naomi Saphra, Jessica Forde, Mark Dredze [In Progress]
Training and Aligning Large Language Models with Augmented Clinical Responses: Improving Empathy, Accuracy, and Trust in Healthcare Communication
Matthew R. Allen*, Vijay M. Tiyyala*, Nimit Desai, Karthik Ramesh, Job Shiach, Mark Dredze, Mike Hogarth, John W. Ayers [In Progress]
Evaluating Clinician and AI Chatbot Responses to Clinical Questions Posed in a Health System Using the CREATE TRUST Framework
Armaan Johal, Atharva Yeola, Vijay M. Tiyyala, …, Mark Dredze, John W. Ayers [In Progress]

*Equal contribution

Publications

Kreyòl-MT: Building MT for Latin American, Caribbean, and Colonial African Creole Languages
Nathaniel R. Robinson, Raj Dabre, …, Vijay M. Tiyyala, …, Sanjeev Khudanpur, Stephen D. Richardson, Kenton Murray [NAACL 2024]
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies
Xiao Ye, Andrew Wang, Jacob Choi, Yining Lu, Shreya Sharma, Lingfeng Shen, Vijay M. Tiyyala, Nicholas Andrews, Daniel Khashabi [EMNLP 2024]
Waldo: Automated Discovery of Adverse Events from Unstructured Self Reports
Karan Desai*, Vijay M. Tiyyala*, …, Mark Dredze, John W. Ayers [JAMIA Under Review]
HIVTrends.org: Public, Real-Time, and Validated HIV Testing Sales Trends from Search Query Surveillance
Vijay M. Tiyyala, Atharva Yeola, Karan Desai, Nimit Desai, Matthew R. Allen, Vin Somasundaram, Mark Dredze, Mike Hogarth, Nadir Weibel, Ravi Goyal, Davey M. Smith, John W. Ayers [JAMIA Under Review]

Research Experience

Research Assistant - CLSP, Johns Hopkins University 2023-2024

Developed methods to enhance LLM fine-tuning and improve empathetic AI responses in healthcare applications.

Research Intern - CLSP, Johns Hopkins University 2022-2023

Worked on retrieval-augmented generation (RAG) systems and evaluated clinician-AI collaboration frameworks in healthcare.
Worked on improving the instruction-following ability of code LLMs using Self-Instruct.

Graduate Research Assistant - CLSP, Johns Hopkins University 2022-2023

Improved machine translation accuracy for medical terminologies in low-resource languages, enhancing accessibility and precision.

Projects

Empathy-Enhanced LLMs: Refining AI Responses with Fine-Grained Human Feedback

Designed and fine-tuned models to improve empathy, quality, and factuality in healthcare chatbot responses, ensuring they are both accurate and compassionate.

Developed a tool using LLMs to simulate public responses to health policies, aiding decision-making during the COVID-19 pandemic.

Medical Terminology Translation and Multilingual Matrix Construction

Created a massive multilingual matrix for medical terms, enhancing machine translation for low-resource languages.

Public Health Data Science

Collaborated with [John W. Ayers](https://ayersresearch.org/) on multiple projects to leverage AI and data for public health insights. This included developing models to monitor trends in health behaviors and improve disease forecasting, contributing to impactful public health research.

Feel free to reach out for potential collaborations. You can find more about my work and projects on LinkedIn and GitHub.