ICLR 2024 Workshop on

Reliable and Responsible Foundation Models

May 11, 2024

The workshop will be held in a hybrid format.


In the era of AI-driven transformations, foundation models (FMs), like large-scale language and vision models, have become pivotal in various applications, from natural language processing to computer vision. These models, with their immense capabilities, offer a plethora of benefits but also introduce challenges related to reliability, transparency, and ethics. The workshop on reliable and responsible FMs (R2-FM) delves into the urgent need to ensure that such models are trustworthy and aligned with human values. The significance of this topic cannot be overstated, as the real-world implications of these models impact everything from daily information access to critical decision-making in fields like medicine and finance. Stakeholders, from developers to end-users, care deeply about this because the responsible design, deployment, and oversight of these models dictate not only the success of AI solutions but also the preservation of societal norms, equity, and fairness. Some of the fundamental questions that this workshop aims to address are:

  • How can we identify and characterize unreliable and irresponsible behaviors in FMs? Topics include susceptibility to spurious features, prompt sensitivity, lack of self-consistency, and issues of nonfactuality or “hallucinations.”
  • How should we assess the potentially harmful capabilities of FMs and quantify their societal impact? For example, how can we predict the consequences of misuse of highly capable large language models?
  • How can we pinpoint and understand the causes behind known or emerging sources of FM unreliability? This may involve examining training data, objectives, architectural design, learned weights, or other facets.
  • What principles or guidelines should inform the design of the next generation of FMs to ensure they are both reliable and responsible?
  • Can we establish theoretical frameworks that guarantee the reliability and responsibility of FMs?
  • In practical applications, how might we leverage domain-specific knowledge to guide FMs towards improved reliability and responsibility across diverse areas, such as drug discovery, education, or clinical health?

Call for Papers

We invite submissions from researchers in the fields of reliability and responsibility pertaining to foundation models. Additionally, we welcome contributions from scholars in the natural sciences (such as physics, chemistry, and biology) and social sciences (including pedagogy and sociology) that necessitate the use of reliable and responsible foundation models. In summary, our topics of interest include, but are not limited to:

  • Theoretical foundations of FMs and related domains
  • Empirical investigations into the reliability and responsibility of various FMs
  • In-depth discussions exploring new dimensions of foundation model reliability and responsibility
  • Interventions during pre-training to enhance the reliability and responsibility of FMs
  • Innovations in fine-tuning processes to bolster the reliability and responsibility of FMs
  • Discussions on aligning models with potentially superhuman capabilities to human values
  • Benchmark methodologies for assessing the reliability and responsibility of FMs
  • Issues of reliability and responsibility of FMs in broad applications

Submission URL:   https://openreview.net/group?id=ICLR.cc/2024/Workshop/R2-FM

Format:  All submissions must be in PDF format. Submissions are limited to four content pages, including all figures and tables; unlimited additional pages containing references and supplementary materials are allowed. Reviewers may choose to read the supplementary materials but will not be required to. Camera-ready versions may go up to five content pages.

Style file:   You must format your submission using the ICLR 2024 LaTeX style file. For your convenience, we have modified the main conference style file to refer to the R2-FM workshop: iclr_r2fm.sty. Please include the references and supplementary materials in the same PDF as the main paper. The maximum file size for submissions is 50MB. Submissions that violate the ICLR style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review.

Dual-submission policy:  We welcome ongoing and unpublished work. We will also accept papers that are under review at the time of submission, or that have been recently accepted without published proceedings.

Non-archival:  The workshop is a non-archival venue and will not have official proceedings. Workshop submissions can be subsequently or concurrently submitted to other venues.

Visibility:  Submissions and reviews will not be public. Only accepted papers will be made public.

For any questions, please contact us at r2fm2024@googlegroups.com.

Important Dates


    Submission deadline: February 10, 2024, AOE (extended from February 3, 2024, AOE)

    Notification to authors: March 5, 2024, AOE (moved from March 3, 2024, AOE)

    Final workshop program, camera-ready deadline: April 12, 2024, AOE (moved from April 3, 2024, AOE)


This is the tentative schedule of the workshop. All slots are provided in Central European Time (CET).

Morning Session

08:50 - 09:00 Introduction and opening remarks
09:00 - 09:30 Invited Talk 1: Lilian Weng
09:30 - 10:00 Invited Talk 2: Been Kim
10:00 - 10:15 Contributed Talk 1: Watermark Stealing in Large Language Models
10:15 - 11:15 Poster Session 1
11:15 - 11:45 Invited Talk 3: Denny Zhou
11:45 - 12:15 Invited Talk 4: Mor Geva Pipek
12:15 - 13:30 Break

Afternoon Session

13:30 - 14:00 Invited Talk 5: Andrew Wilson
14:00 - 14:30 Invited Talk 6: Weijie Su
14:30 - 14:45 Contributed Talk 2: Value Augmented Sampling: Predict Your Rewards To Align Language Models
14:45 - 15:00 Contributed Talk from AISI
15:00 - 15:45 Poster Session 2
15:45 - 16:15 Invited Talk 7: James Zou
16:15 - 16:30 Contributed Talk 3: Questioning the Survey Responses of Large Language Models
16:30 - 17:00 Invited Talk 8: Nicolas Papernot

Invited Speakers

Andrew Wilson

New York University

Denny Zhou

Google DeepMind

Weijie Su

University of Pennsylvania

Been Kim

Google DeepMind

Nicolas Papernot

University of Toronto

Mor Geva Pipek

Tel Aviv University

James Zou

Stanford University

Workshop Organizers

Huaxiu Yao

UNC-Chapel Hill

Mohit Bansal

UNC-Chapel Hill

Zhun Deng

Columbia University

Pavel Izmailov

OpenAI & New York University

Chelsea Finn

Stanford University

He He

New York University

Pang Wei Koh

University of Washington

Eric Mitchell

Stanford University

Cihang Xie

University of California Santa Cruz