Secure and Trustworthy Large Language Models
(SeT LLM @ ICLR 2024)

May 11, Vienna, Austria

About The Workshop

The rapid advances of large language models (LLMs) are revolutionizing many long-standing natural language processing tasks, ranging from machine translation to question answering and dialog systems. However, as LLMs are often built upon massive amounts of text data and subsequently applied in a variety of downstream tasks, building, deploying and operating LLMs entails profound security and trustworthiness challenges, which have attracted intensive research efforts in recent years.

Call For Papers

The primary aim of the proposed workshop is to identify such emerging challenges, discuss novel solutions to address them, and explore new perspectives and constructive views across the full theory/algorithm/application stack.


Potential topics include, but are not limited to:
  • Reliability assurance and assessment of LLMs
  • Privacy leakage issues of LLMs
  • Copyright protection
  • Interpretability of LLMs
  • Plagiarism detection and prevention
  • Security of LLM deployment
  • Backdoor attacks and defenses in LLMs
  • Adversarial attacks and defenses in LLMs
  • Toxic speech detection and mitigation
  • Challenges in new learning paradigms of LLMs (e.g., prompt engineering)
  • Fact verification (e.g., hallucinated generation)


All papers can be submitted through OpenReview.

Important Dates

In response to requests for an extension of the submission deadline, and in consideration of the Lunar Spring Festival, we have decided to extend the deadline for workshop submissions to February 19th.

  • Submission Open: Jan 15
  • Submission Deadline: Feb 19 (extended from Feb 12)
  • Final Decision Notification: Mar 3
  • Camera Ready Deadline: Apr 12 (originally Apr 3)

All times are in UTC+0.


Please format your submissions with the ICLR 2024 LaTeX style file. The review process for this workshop is double-blind. Please anonymize your submissions and remove any links that may reveal your identity. Submissions are limited to 4 pages of main content, with unlimited pages for references and appendices. Accepted submissions are allowed 1 additional page (5 pages in total of main content) for the camera-ready version.


Submissions that are concurrently under review at other venues are acceptable. All accepted papers are non-archival and will be made publicly available on OpenReview without official proceedings or reviews. For any questions, please contact us at

Reviewer Recruitment

If you are interested in reviewing submissions, please fill out this form.

Invited Speakers

Bo Li

University of Chicago

Tom Goldstein

University of Maryland

Chaowei Xiao

University of Wisconsin, Madison

Event Schedule

Opening Remarks

Invited Talk 1

Tatsu Hashimoto

Oral Paper Presentation 1

On Prompt-Driven Safeguarding for Large Language Models
Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng

Oral Paper Presentation 2

Explorations of Self-Repair in Language Models
Cody Rushing, Neel Nanda

Invited Talk 2

Graham Neubig

Oral Paper Presentation 3

TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary Chase Lipton, J Zico Kolter

Oral Paper Presentation 4

Are Large Language Models Bayesian? A Martingale Perspective on In-Context Learning
Fabian Falck, Ziyu Wang, Christopher C. Holmes

Poster Session A

For all accepted papers

Lunch Break

Invited Talk 3

Bo Li

Invited Talk 4

Robin Jia

Invited Talk 5

Tom Goldstein

Invited Talk 6

Chaowei Xiao

Invited Talk 7

Eric Wallace

Oral Paper Presentation 5

How Susceptible are Large Language Models to Ideological Manipulation?
Kai Chen, Zihao He, Jun Yan, Taiwei Shi, Kristina Lerman

Oral Paper Presentation 6

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson

Poster Session B

For all accepted papers

Closing Remarks


This workshop is organized by

Yisen Wang

Peking University

Ting Wang

Stony Brook University

Jinghui Chen

Penn State University

Chaowei Xiao

University of Wisconsin, Madison