Jhilmil

If AI is going to be deployed globally, it should work globally. We are conducting cultural and social evaluations to better measure and improve the contextually specific capabilities of frontier AI. We are partnering with international NGOs, local experts, and AI labs to build regional AI benchmarks.

The Collective Intelligence Project, in partnership with Microsoft Research India and Karya, is launching a pilot to test a community‐driven approach for evaluating frontier AI models on real‐world, social impact criteria in India.

Locally grounded perspectives are vital to shaping large language models that serve diverse cultural communities and societal needs. To gather these perspectives, people on the ground must be consulted and become active participants in validating model responses. The challenge is finding an accessible and engaging way to elicit meaningful input from people.

Our solution is to integrate civil society organizations (CSOs) into the process. They are often best placed to provide both contextual understanding and subject-matter expertise, and they have strong incentives to shape the conversation around AI development and governance. Through participatory engagement with CSOs, we are crafting culturally resonant evaluations: a structured process for individuals across India’s varied communities to review a model’s responses to prompts. If successful, this pilot will serve as the foundation for community-led, culturally specific evaluations of frontier AI models.

The Challenge

AI is on track to lead to profound and pervasive societal shifts. This year, choices that are consequential for the global public at large—how and when to release models, determining underlying principles for AI behavior, building for cultural pluralism and language diversity—will be made. By default, these decisions fall to a small percentage of those likely to be affected: a recipe for blind spots, overlooked points of failure, and monoculture. The disconnect between high-impact decisions and meaningful public input may grow as AI capabilities accelerate.

Our goal is to change this dynamic: first, by eliciting a broad spectrum of data (values, stories, perspectives, preferences) from around the globe; second, by using this knowledge to steer model development and AI policy.

The Project

Global AI Dialogues, built in partnership with Remesh and Prolific, creates the infrastructure for regular global public input into the future of AI. 

Our approach uses a structured collective dialogue process, combining demographic data collection via Prolific with deliberative discussion and consensus-building through Remesh.ai. Participants engage in 15-60 minute sessions where they deliberate on key issues.

Each Global Dialogue will include:

  1. Longitudinal benchmarks to track AI’s impact and progress

  2. Specific scenarios for input on model responses and policy decision-making

  3. Partner questions from relevant organizations (research labs, AI companies, civil society organizations, governments) who use Global Dialogues in pursuit of their own agendas. 

The results will be used to:

  1. Create an open, longitudinal dataset of humanity’s views, values, preferences, and experiences with AI; we already have multiple partners interested in building on this data. 

  2. Inform specific decisions being made about AI development.

  3. Enable technology, such as evaluations, to be built on top of the data.

  4. Make common priorities obvious and easier to advocate for.
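As a rough illustration of what building on an open longitudinal dataset could look like, the sketch below aggregates responses across dialogue waves. The schema (`wave`, `region`, `agree`) and the toy data are invented for illustration; they are not the real Global Dialogues format.

```python
# Hypothetical sketch: tracking a longitudinal benchmark from dialogue data.
# The (wave, region, agree) schema is an assumed stand-in, not the GD schema.
from collections import defaultdict

def agreement_by_wave(responses):
    """Return the share of 'agree' answers per dialogue wave."""
    totals = defaultdict(int)
    agrees = defaultdict(int)
    for r in responses:
        totals[r["wave"]] += 1
        agrees[r["wave"]] += r["agree"]
    return {wave: agrees[wave] / totals[wave] for wave in totals}

# Toy data standing in for two dialogue waves.
sample = [
    {"wave": 1, "region": "South Asia", "agree": 1},
    {"wave": 1, "region": "Europe", "agree": 0},
    {"wave": 2, "region": "South Asia", "agree": 1},
    {"wave": 2, "region": "Europe", "agree": 1},
]
print(agreement_by_wave(sample))  # {1: 0.5, 2: 1.0}
```

Repeating an aggregation like this over successive waves is what turns one-off survey answers into a longitudinal benchmark.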

Call for Partnerships

Global Dialogues is a collective intelligence project; as such, it will be most impactful as a coalitional effort.

We invite you to partner with us by doing the following (in order of increasing commitment):

  • Submit a question to join our rideshare mission: In spaceflight, a rideshare mission integrates multiple independent payloads into a single launch. Global Dialogues is intended to be such a mission—add your question to our launches! Ask people what they think of your current pet hypotheses, about the risks you’re most concerned about, or how they expect AI will impact their lives. We will work with you to design an effective question, whether you are interested in scenarios, value-based questions, or experience-based questions. Our core purpose is to ensure AI development is done in service of all; ask any question that will help you reach this goal.

  • Join the benchmarking coalition: We are working to build a set of longitudinal benchmarks. These will be constructed from questions that we will ask in every Global Dialogue (GD), enabling us to track a core set of views on, and impacts of, accelerating AI.

  • Work with us to create a topic- or decision-specific Global Dialogue: We are working with partners who are considering specific decisions or are interested in a core set of topics (e.g. human-AI relationships, interspecies communication, Global South conceptions of AI safety, AI and faith) to deploy partner-specific Global Dialogues that dig into a specific topic.

  • Build on the data: The goal of open data is to enable open science: contribute by building tools or extracting insights from the GD data to support your work.

  • Be a committed audience: CIP ran a series of alignment assemblies in 2023-24 alongside committed audiences like OpenAI, Anthropic, and the UK AISI, where partners committed to taking public voice into account in their decision-making. Join GDs as a committed audience to do the same.

Contact: hi@cip.org