Hi, I'm Yida Chen 😊



< CS Ph.D. Student at Harvard >

About Me

I am a Computer Science Ph.D. candidate at the Harvard Insight + Interaction Lab, led by Prof. Fernanda Viegas and Prof. Martin Wattenberg. 📖 I am interested in making generative AI more controllable through mechanistic interpretability and internal world models.

Before coming to Harvard, I was a Computer Science and Engineering student at Bucknell University 🦬, advised by Prof. Joshua Stough (Bucknell University) and Prof. Christopher Haggerty (Columbia University & New York-Presbyterian). My past projects focused on segmenting sparsely annotated medical videos using multi-task learning 🧡. My work was funded by the Ciffolillo Healthcare Technology Inventors Program (HTIP) 🏥. Our papers ( [1] [2] ) were published at the SPIE Medical Imaging 2021 & 2022 Conferences with oral presentations 📝.

I also developed a color analysis toolkit for films (KALMUS) 🎬. You can find the project's GitHub repo here: KALMUS-Color-Toolkit. KALMUS's development was supported by the Mellon Academic Year Research Fellowship awarded 🥇 by the Bucknell Humanities Center, and the toolkit is now used as instructional software at Bucknell.

Reviewer for EMNLP, NeurIPS 2024 Creative AI Track, and NeurIPS Interpretable AI Workshop.
First-year/Pre-concentration Advisor for Harvard College.
Judge for National Collegiate Research Conference 2024 (NCRC) at Harvard.

Research 📋

Not all users are turned away by chatbots when making suspicious requests. Your identity plays a key role when a chatbot decides whether to refuse a potentially problematic inquiry.

Our new study shows that ChatGPT's guardrails on sensitive knowledge respond differently to users of different gender, racial, and political demographics. In particular, younger, female, and Asian-American users are more likely to trigger a refusal from ChatGPT when querying sensitive information. We propose a new evaluation framework to identify such bias in a chatbot's refusal behavior.

guardrail sensitivity
Paper Preprint (to appear at EMNLP 2024 Main)

Demo
EMNLP 2024 Main Chatbot Human-AI Interaction Bias

Have you ever wondered whether chatbot LLMs are internally modeling your profile? If they are, how might this model of you influence the answers they give to your questions?

Our experiments provide evidence that a conversational AI internally profiles its user during the chat. We designed an end-to-end prototype, TalkTuner, that exposes this internal user model to users. Our user study shows that this new chatbot UI design affects users' trust in AI and exposes biases of LLM systems.

dashboard-overview
Paper Preprint

Demo
preprint Chatbot NLP Interpretability

Recent work from Gurnee et al. shows that the activations of neural language models correlate strongly with the spatial and temporal properties of their inputs. However, that work did not establish a causal link between the two. Our experiments fill part of this gap, showing that editing an LLM's spatial representations can improve the model's performance on a simple spatial task.

language_logit_change
Paper Preprint

Demo
preprint NLP Interpretability

Does a 2D image generative model have an internal model of 3D geometry? Can a 2D neural network see beyond the X-Y dimensions of a matrix of pixels? Our project found controllable representations of 3D geometry inside a diffusion model.

diffusion_model
Paper

Youtube Video (55K Views)

Demo
NeurIPS 2023 CV Interpretability

Ever wonder how attention flows inside your Vision Transformer? What visual patterns are recognized by the machine's attention? Does machine attention resemble human visual cognition? This is a collaboration with Catherine Yeh; my main contribution is the visualization of learned attention in vision transformer models.

attention_viz
Paper

Demo
IEEE VIS 2023 Visualization Interpretability

KALMUS is a Python package for the computational analysis of colors in films. It provides quantitative tools to study and compare the use of color in film. The package serves two purposes: (1) various ways to measure, calculate, and compare a film's colors, and (2) various ways to visualize a film's colors.

kalmus_img kalmus_img
kalmus_img

Demo
JOSS 2021 Visualization Digital Humanities
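To give a flavor of the kind of measurement KALMUS performs, here is a standalone sketch (the function names are hypothetical, not the package's actual API): reduce each frame to its average color, then stack the per-frame averages into a color "barcode" of the film.

```python
import numpy as np

def average_frame_color(frame: np.ndarray) -> np.ndarray:
    """Mean RGB color of one frame, given as an H x W x 3 array."""
    return frame.reshape(-1, 3).mean(axis=0)

def film_barcode(frames: list) -> np.ndarray:
    """Stack per-frame average colors into an N x 3 'color barcode'."""
    return np.stack([average_frame_color(f) for f in frames])

# Two synthetic frames: one pure red, one pure blue
red = np.zeros((4, 4, 3)); red[..., 0] = 255
blue = np.zeros((4, 4, 3)); blue[..., 2] = 255
barcode = film_barcode([red, blue])
```

In practice the frames would come from a decoded video rather than synthetic arrays, and KALMUS offers several color metrics beyond the simple mean shown here.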

We aim to further improve the accuracy and clinical applicability of echocardiography video segmentation by extending the analysis from half-heartbeat (end-diastolic to end-systolic phases) to multi-heartbeat videos. We proposed a sliding-window data augmentation technique for efficiently learning motion tracking and semantic segmentation from sparsely annotated echo videos (only 2 annotations per video). Paper.

fully_automated_video_segmentation

Demo
SPIE Medical Imaging 2022 CV Medical Imaging
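The sliding-window idea can be sketched as follows (a minimal illustration of the general scheme, with made-up window and stride values; the paper's exact windowing and loss setup differ): enumerate overlapping clips from the video and keep those containing at least one annotated frame, so every training clip supplies a segmentation target while all of its frames feed the motion-tracking task.

```python
def sliding_windows(num_frames, window, stride, annotated):
    """Enumerate overlapping clips; keep those with >= 1 annotated frame.

    num_frames: total frames in the echo video
    window, stride: clip length and step between clip starts
    annotated: set of frame indices with segmentation labels
    """
    clips = []
    for start in range(0, num_frames - window + 1, stride):
        clip = range(start, start + window)
        if any(t in annotated for t in clip):
            clips.append(clip)
    return clips

# A 30-frame echo video annotated only at frames 5 (ED) and 20 (ES)
clips = sliding_windows(num_frames=30, window=8, stride=2, annotated={5, 20})
```

Overlapping windows multiply the number of training clips drawn from each sparsely labeled video, which is the augmentation effect the blurb above describes.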

We assessed a 3D-UNet's performance on jointly segmenting sparsely annotated half-heartbeat echocardiography videos and estimating the motion of cardiac structures. The 3D-UNet was trained on the CAMUS dataset, and we evaluated its generalizability on Stanford's EchoNet dataset. Compared with a traditional frame-based segmentation method, our results show that joint learning of motion tracking and segmentation enhances segmentation performance on video data. The video model also generalizes better to unseen datasets than the frame-based model.

4ch_video_segmentation

Demo
SPIE Medical Imaging 2021 CV Medical Imaging

Project Repos 💻

Designing a Dashboard for Transparency and Control of Conversational AI, https://arxiv.org/abs/2406.07882

Demo Github
interface-design interpretability large-language-models

Linear probe found representations of scene attributes in a text-to-image diffusion model

Demo Github
explainability image-editing interpretability scene stable-diffusion
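A linear probe in this spirit can be sketched on synthetic data (a toy stand-in for the real diffusion-model activations; the feature dimension, sample count, and planted attribute direction below are all made up for illustration): fit a linear map from activations to a binary scene attribute and check how well it decodes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for model activations: 200 samples of 16-dim features,
# where one hidden direction linearly encodes a binary scene attribute
# (e.g., indoor vs. outdoor).
direction = rng.normal(size=16)
labels = rng.integers(0, 2, size=200)
acts = rng.normal(size=(200, 16)) + np.outer(labels - 0.5, direction) * 3.0

# Linear probe: least-squares weights mapping activations -> centered label
w, *_ = np.linalg.lstsq(acts, labels - 0.5, rcond=None)
preds = (acts @ w > 0).astype(int)
accuracy = (preds == labels).mean()
```

High probe accuracy is evidence that the attribute is linearly represented in the activations; the project's actual probes are trained on features extracted from a text-to-image diffusion model rather than this synthetic setup.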

The implementation of CLAS-FV described in "Fully automated multi-heartbeat echocardiography video segmentation and motion tracking".

Demo Github
computer-vision deep-learning echocardiography

KALMUS Color Toolkit for color analysis and visualization in film studies.

Demo Github
Digital Humanities Film Studies

Skills 🤺

Python
Java
Haskell
Ruby
C++
LaTeX
MATLAB
Bash
SQLite
Android
PyTorch
Keras
Matplotlib
seaborn
Scikit-Learn
Scikit-Image
NumPy
Pandas
OpenCV
SciPy
PIL
JavaFX
pytest
Git
Scrum
GitHub CI/CD
Codecov
PyCharm
Intellij

My journey so far... 🛸

For more information, have a look at my curriculum vitae.

  • Served as a judge for the National Collegiate Research Conference 2024 at Harvard. Reviewed and oversaw 8 undergraduate student projects.

    Harvard NCRC Judge
  • I am now a member of the Insight + Interaction Lab at Harvard SEAS. It's so exciting to work with everyone here! Check out our group: Insight + Interaction Lab.

    Harvard Ph.D. Insight + Interaction
  • I graduated from Bucknell with a Bachelor's degree in Computer Science & Engineering and Summa Cum Laude distinction. I was honored to receive the Bucknell Prize in Computer Science and Engineering (1 per class year), the University Prize for Men, and the President's Award for Distinguished Academic Achievement.

    This fall, I will join Harvard SEAS to pursue a doctorate in Computer Science. It's my honor to be mentored by Prof. Fernanda Viegas and Prof. Martin Wattenberg (Insight + Interaction Lab).

    Bucknell Undergraduate Graduation
  • Our paper, "Fully automated multi-heartbeat echocardiography video segmentation and motion tracking", has been published at the SPIE Medical Imaging 2022: Image Processing Conference.

    You can find the manuscript and recorded presentation here: link.

    Published SPIE Medical Imaging Oral Presentation
  • Our work, "Fully automated multi-heartbeat echocardiography video segmentation and motion tracking", has been accepted for an oral presentation at the SPIE Medical Imaging 2022: Image Processing Conference.
    Accepted SPIE Medical Imaging Oral Presentation
  • Bucknell University June 2021 - Present
    HTIP research fellow for fully automated multi-heartbeat echocardiography video segmentation project
    PyTorch Scikit-Learn Matplotlib Python
  • Our paper, "KALMUS: tools for color analysis of films", has been published in the Journal of Open Source Software! Our manuscript is open access, and you can find it here: link.

    The GitHub repo of the associated Python package, KALMUS, is here: link.

    For installation instructions and a detailed usage guide, please refer to the KALMUS documentation page.

    Published JOSS Open Access KALMUS Python Package
  • Our paper, "Assessing the generalizability of temporally-coherent echocardiography video segmentation", has been published at the SPIE Medical Imaging 2021: Image Processing Conference!

    You can find the manuscript here: link.
    The presentation slides are available here: link.

    Published SPIE Oral Presentation Ei Compendex
  • Our paper, "Assessing the generalizability of temporally-coherent echocardiography video segmentation", has been accepted for an oral presentation at the SPIE Medical Imaging 2021: Image Processing Conference!

    The preprint of the paper is available here: link.

    Accepted SPIE Oral Presentation
  • Bucknell University, Res Ed Aug. 2020 - Dec. 2020
    Residential Advisor
    Leadership Communication
  • Bucknell University June 2020 - May 2021
    HTIP research fellow for joint motion tracking & echocardiography segmentation project.
    PyTorch Scikit-Learn Matplotlib Python
  • Bucknell University Nov. 2019 - May 2021
    Mellon academic year research fellow for Film Color Analysis. Built a tool, KALMUS, for quantitative color analysis in film production.
    OpenCV Scikit-Image Matplotlib GitHub CI Codecov Python
  • B.S. Computer Science and Engineering
    C++ Haskell Android Java JavaFX

Contact 📪

Thank you so much for visiting my website!