About Me
I am a final-year Ph.D. student at the CISPA Helmholtz Center for Information Security, supervised by Michael Backes. Prior to that, I obtained my bachelor's (2018) and master's (2021) degrees from the University of Science and Technology of China (USTC).
Research Interests
- Agentic RL
- LLM Post-Training & Alignment
- Certifiable Robustness Methods
Publications
- Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race?
- Provably Cost-Sensitive Adversarial Defense via Randomized Smoothing
- Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders
- Label Incorporated Graph Neural Networks for Text Classification
Research Experience
Summer 2021: NLP Research Intern, Alibaba DAMO
- Use self-training methods to improve machine translation performance.
- Optimize evaluation metrics for machine translation, enabling the model to evaluate its own performance without external references.
Summer 2020: NLP Research Intern, Baidu Talent Intelligence Center
- Extract hierarchical relations between skills in the JD dataset.
- Skill representation learning.