MLLM Safety · Alignment · Vulnerability Detection

Boyuan Chen

PhD Candidate at NYU Tandon & NYU Abu Dhabi, working to make multimodal large language models safer, more robust, and more trustworthy.

The Scholar

Curiosity, disciplined.

I am a PhD candidate in Computer Science & Engineering at NYU Tandon School of Engineering and NYU Abu Dhabi, co-advised by Prof. Siddharth Garg and Prof. Muhammad Shafique.

My research investigates the safety, adversarial robustness, and vulnerability detection of multimodal large language models — spanning jailbreak attacks, multimodal hallucination, and automated code vulnerability analysis.

Born and raised in Beijing, China, I attended the first year of the Early Development Program (EDP) at RDFZ (The High School Affiliated to Renmin University of China) before moving abroad. I then completed a double major in Computer Science and Mathematics at Pomona College, California. I care about work that is both technically sharp and genuinely useful, and I believe the best ideas come from the meeting of different disciplines and cultures.

At a Glance

Role: PhD Candidate, CSE
Affiliation: NYU Tandon & NYU Abu Dhabi
Advisors: S. Garg · M. Shafique
Focus: MLLM Safety & Robustness
Undergrad: Pomona College (CS + Math)
Languages: Chinese · English · French · Arabic
Based in: Abu Dhabi, UAE

Publications

Peer-reviewed work on adversarial machine learning, multimodal reasoning, and secure code — presented at AAAI, IJCNN, and WCCI.

AAAI ’26 · Oral · 2025

MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks

Boyuan Chen, Minghao Shao, Abdul Basit, Siddharth Garg, Muhammad Shafique

An RL-driven multi-agent framework that uses dynamic cipher selection to bypass LLM safety guardrails. A modular Q-table selector adapts cipher strategy against evolving defenses, achieving an attack success rate (ASR) above 92% on non-reasoning LLMs and above 74% on reasoning-capable models within 10 queries.

Jailbreak · Multi-Agent RL · LLM Safety

WCCI ’26 · 2026

GroundCount: Grounding Vision-Language Models with Object Detection for Mitigating Counting Hallucinations

Boyuan Chen, Minghao Shao, Siddharth Garg, Ramesh Karri, Muhammad Shafique

Three object-detection-model / VLM integration strategies for counting tasks. Prompt augmentation achieves up to 81.3% counting accuracy (+6.6pp) while cutting inference time by 22%, with ablations revealing when positional grounding helps versus hurts.

Vision-Language · Hallucination · Grounding

WCCI ’26 · 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis

Parteek Jamwal*, Minghao Shao*, Boyuan Chen*, et al.

A synthetic data-collection pipeline for code vulnerability reports, powering a 7B LLM fine-tuned with QLoRA via curriculum training. Led a research team of 12 to analyze memory corruption across user code and binary programs.

Vulnerability Detection · RAG · Fine-Tuning

IJCNN ’25 · Oral · 2025

Model Cascading for Code: A Cascaded Black-Box Multi-Model Framework for Cost-Efficient Code Completion

Boyuan Chen, Mingzhi Zhu, Brendan Dolan-Gavitt, Muhammad Shafique, Siddharth Garg

A black-box multi-model cascading framework for cost-efficient code completion with self-testing — routing queries to smaller models when feasible and escalating only when necessary, validated through large-scale ablations on HPC.
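
The routing idea above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Model` type, `passes_self_test`, and `cascade` are hypothetical names, and the acceptance rule (accept the first candidate that passes its own generated test) is an assumption made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a model maps a prompt to (candidate_code, test_code).
Model = Callable[[str], Tuple[str, str]]


def passes_self_test(candidate: str, test: str) -> bool:
    """Run the model-written test against the candidate code.

    Any exception (syntax error, failed assertion, runtime error) counts
    as a failure. A real system should run this in a sandbox; exec() on
    untrusted code is unsafe and is used here only to keep the sketch short.
    """
    namespace: dict = {}
    try:
        exec(candidate, namespace)
        exec(test, namespace)
        return True
    except Exception:
        return False


def cascade(prompt: str, models: List[Model]) -> Tuple[int, str]:
    """Try models from cheapest to most expensive.

    Returns (index, code) for the first model whose output passes its own
    test. If none passes, falls back to the last model's output.
    """
    if not models:
        raise ValueError("at least one model is required")
    candidate = ""
    for index, model in enumerate(models):
        candidate, test = model(prompt)
        if passes_self_test(candidate, test):
            return index, candidate
    return len(models) - 1, candidate
```

In this sketch the cost saving comes from ordering: the larger model is called only when the smaller model's answer fails its self-test.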

Code Generation · Model Cascading · Efficiency

Symposium ’22 · 2022

Pre-Syndromic Surveillance for Improved Detection of Emerging Public Health Threats

Boyuan Chen — Lead Engineer & Mentor; Advised by Prof. Daniel Neill (NYU Courant)

An open-source public health surveillance system deployed for the NYC Department of Health, processing syndromic data from 50 NYC hospitals. Built contrastive topic modeling and Poisson scan-statistic pipelines, with an 11× speedup via Numba JIT.
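
For readers unfamiliar with Poisson scan statistics, here is a minimal sketch of one common form, the expectation-based Poisson log-likelihood ratio, applied to a single time series. The function names, the window search, and the example inputs are assumptions for illustration; the deployed system's actual scoring, data layout, and Numba-accelerated code are not shown in the source.

```python
import math
from typing import Sequence, Tuple


def poisson_score(count: float, baseline: float) -> float:
    """Expectation-based Poisson log-likelihood ratio.

    Scores how surprising an observed count is relative to its expected
    baseline. Returns 0 when the count does not exceed the baseline.
    """
    if baseline <= 0 or count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count


def best_window(
    counts: Sequence[float],
    baselines: Sequence[float],
    max_len: int,
) -> Tuple[float, int, int]:
    """Scan all contiguous windows of up to max_len steps.

    Returns (score, start, end) for the highest-scoring window, where
    end is exclusive. Returns (0.0, 0, 0) if no window scores above 0.
    """
    best = (0.0, 0, 0)
    n = len(counts)
    for start in range(n):
        c_sum = 0.0
        b_sum = 0.0
        for end in range(start + 1, min(start + max_len, n) + 1):
            c_sum += counts[end - 1]
            b_sum += baselines[end - 1]
            score = poisson_score(c_sum, b_sum)
            if score > best[0]:
                best = (score, start, end)
    return best
```

The nested loops are the kind of numeric hot path that JIT compilation typically targets, which is consistent with the speedup mentioned above, though the sketch itself uses only the standard library.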

Public Health · Anomaly Detection · Systems

Beyond the Lab

The other games I play.

Research sharpens the mind, but the body and the board keep it honest. A few pursuits keep me grounded, competitive, and endlessly curious.

Chess

The Royal Game

Rated around 1800, with multiple UAE tournament championships. Chess taught me to think many moves ahead — a habit that quietly shapes how I design experiments and adversarial strategies.

🏓

Table Tennis

Speed & Spin

A lifelong love of the fastest racket sport in the world. There is nothing quite like a long rally to reset the mind — pure reflex, rhythm, and the joy of a perfectly placed loop drive.

🏸

Badminton

Precision in Flight

Explosive footwork and delicate touch in equal measure. Badminton is where I go to move fast, laugh loud, and remember that the best play is often the deceptively simple one.

Honor of Kings

5v5 MOBA

#1 Arlie in the Middle East, with 10,000+ power heroes across all five lanes. Years of ranked play sharpened my instincts for tempo, positioning, and reading an opponent — the same intuitions I bring to adversarial research.