Methodology

How we read the submissions

To turn 1,534 written submissions into something you can compare and count, we classify each one against two fixed vocabularies — the concerns it raises and the kinds of measures it proposes. Every label a machine assigns must be backed by a quote from the author's own words, and contested labels are adjudicated by a stronger model. This page explains exactly how that works, and shows both vocabularies in full.

This is an independent CeSIA project. Classification is machine-assisted and human-checked; it is our reading of the submissions, not an official UN record. The authoritative texts are the original submissions on un.org.

1,534: written submissions classified
2: independent axes (risk · governance measure)
31: risk codes across 9 domains
15: measure types across 7 categories

1 · The source material

Ahead of the first Global Dialogue session, organisations sent written inputs to the United Nations. We work from the full set of 1,534 submissions published on un.org, captured once with a provenance record (a content hash of the source, so any later change is detectable). Each submission answers a fixed set of questions; we read the 4 free-text answers in which authors describe the risks they worry about and the measures they propose.
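A provenance record of this kind can be sketched in a few lines. This is a minimal illustration, not the project's actual schema: the function names, the field names, the example URL and the choice of SHA-256 for the source hash are all assumptions.

```python
import hashlib
from datetime import datetime, timezone


def provenance_record(source_url: str, payload: bytes) -> dict:
    """Describe one captured source (hypothetical field names)."""
    return {
        "source_url": source_url,
        "sha256": hashlib.sha256(payload).hexdigest(),
        "n_bytes": len(payload),
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }


def has_changed(record: dict, payload_now: bytes) -> bool:
    """True if a later download no longer matches the recorded hash."""
    return hashlib.sha256(payload_now).hexdigest() != record["sha256"]


# Illustrative capture; the URL and content are placeholders.
captured = b"org,answer\nExample Org,We worry about accountability gaps.\n"
record = provenance_record("https://example.org/submissions.csv", captured)
```

Because the hash covers the raw bytes, even a one-character change to the source makes `has_changed` return `True`.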

Identity fields — the organisation's name, its stakeholder category and its region — are deliberately withheld from the model while it classifies, so a label can never be influenced by who is speaking. Those fields are joined back only afterwards, for the breakdowns on the analysis page.
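The blinding step amounts to splitting each row before classification and rejoining it afterwards. The sketch below assumes hypothetical column names (`organisation`, `stakeholder_category`, `region`, `submission_id`); the real form's columns may differ.

```python
# Hypothetical column names for the identity fields.
IDENTITY_FIELDS = {"organisation", "stakeholder_category", "region"}


def split_for_blinding(row: dict) -> tuple:
    """Return (model_input, identity); only model_input is shown to a model."""
    identity = {k: v for k, v in row.items() if k in IDENTITY_FIELDS}
    model_input = {k: v for k, v in row.items() if k not in IDENTITY_FIELDS}
    return model_input, identity


def rejoin(labels: dict, identity: dict) -> dict:
    """Attach identity fields to finished labels, for breakdowns only."""
    return {**identity, **labels}


row = {
    "submission_id": "s-001",
    "organisation": "Example Org",
    "stakeholder_category": "Civil society",
    "region": "Africa",
    "q1": "We worry about accountability gaps.",
}
model_input, identity = split_for_blinding(row)
labelled = rejoin({"submission_id": "s-001", "risk_codes": ["9.3"]}, identity)
```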

The original submissions on un.org

2 · Preparing the text

Submissions arrive in many languages and pass through several systems before reaching us, which mangles punctuation and accents along the way. Before any classification we:

Repair the encoding. Curly quotes, dashes and accented characters that were corrupted in transit are restored, so the text a model reads matches what the author actually wrote — this matters because every label has to be matched back to a verbatim quote.
Guard the form's structure. We assert that the submission form's columns are exactly as expected before reading them, so a silent change to the source cannot pair a question with the wrong answer.
Tidy the labels. Stakeholder categories and regions are normalised to a fixed, closed set (collapsing stray spelling and spacing variants) so the breakdowns are clean.

We keep both the raw and the repaired text; the repaired version is what models see, and we record its hash so every downstream label is traceable to an exact input.
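The three preparation steps and the input hash could look roughly like this. It is a sketch under stated assumptions: the column names and the region alias table are hypothetical, and the encoding repair covers only one common corruption (UTF-8 text that was wrongly decoded as Windows-1252), which is not necessarily the repair the pipeline uses.

```python
import hashlib

# Hypothetical form layout and closed label set.
EXPECTED_COLUMNS = ["submission_id", "organisation", "stakeholder_category",
                    "region", "q1", "q2", "q3", "q4"]
REGION_ALIASES = {"africa": "Africa",
                  "asia-pacific": "Asia-Pacific",
                  "asia pacific": "Asia-Pacific"}


def repair_mojibake(text: str) -> str:
    """Undo UTF-8-read-as-cp1252 corruption; return the input if it does not apply."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:  # clean text, or a corruption this trick cannot undo
        return text


def guard_columns(header) -> None:
    """Fail loudly if the form's columns are not exactly as expected."""
    if list(header) != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {list(header)!r}")


def normalise_label(raw: str, aliases: dict) -> str:
    """Map a spelling or spacing variant onto the closed set, or fail."""
    key = " ".join(raw.split()).lower()
    if key not in aliases:
        raise ValueError(f"label outside the closed set: {raw!r}")
    return aliases[key]


def text_hash(text: str) -> str:
    """SHA-256 of the exact text a model will see."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw = "don\u00e2\u20ac\u2122t"    # a corrupted right single quote
repaired = repair_mojibake(raw)    # "don\u2019t"
```

Keeping both `raw` and `repaired`, and storing `text_hash(repaired)`, is what lets a label be traced to the exact input it was assigned on.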

3 · The two ontologies

Every submission is classified along two independent axes. Keeping them separate lets us ask distinct questions — what are people worried about? and what do they want done about it? — without the two blurring together.

Risk — the concern raised

What harm or concern a passage is about. The backbone is the MIT AI Risk Repository (domains 1–7); we add two domains (8–9) for concerns the submissions raise repeatedly but that the base taxonomy under-covers — global power asymmetry and sovereignty, and the governance vacuum. Codes in those two domains are flagged EXT throughout.

31 codes · 9 domains

Vehicle — the form of the measure

The concrete form a proposed governance measure takes — a treaty, a standard, an audit, a new institution — independent of the harm it addresses. Drawn from the OECD STIP instrument taxonomy, the MIT mitigation taxonomy, and Maas & Villalobos on international AI institutions. Each measure is assigned one primary vehicle.

15 codes · 7 categories

The risk taxonomy in full


Domain 1 · Discrimination & Toxicity 3 codes

1.1

unfair discrimination & misrepresentation

Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and representation of those groups.

1.2

exposure to toxic content (incl. child safety / CSAM / self-harm)

AI exposing users to harmful, abusive, unsafe or inappropriate content. May involve AI creating, describing, providing advice, or encouraging action. Examples of toxic content include hate-speech, violence, extremism, illegal acts, child sexual abuse material, as well as content that violates community norms such as profanity, inflammatory political speech, or pornography.

1.3

unequal performance across groups

Accuracy and effectiveness of AI decisions and actions is dependent on group membership, where decisions in AI system design and biased training data lead to unequal outcomes, reduced benefits, increased effort, and alienation of users.

Domain 2 · Privacy & Security 2 codes

2.1

compromise of privacy

AI systems that memorize and leak sensitive personal data or infer private information about individuals without their consent. Unexpected or unauthorized sharing of data and information can compromise user expectation of privacy, assist identity theft, or cause loss of confidential intellectual property.

2.2

AI security vulnerabilities & attacks

Vulnerabilities in AI systems, software development toolchains, and hardware that can be exploited, resulting in unauthorized access, data and privacy breaches, or system manipulation causing unsafe outputs or behavior.

Domain 3 · Misinformation 2 codes

3.1

false / misleading information

AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans that make decisions based on false beliefs can experience physical, emotional or material harms.

3.2

pollution of information ecosystem / loss of consensus reality

Highly personalized AI-generated misinformation creating “filter bubbles” where individuals only see what matches their existing beliefs, undermining shared reality, weakening social cohesion and political processes.

Domain 4 · Malicious Actors & Misuse 3 codes

4.1

disinformation / surveillance / influence at scale

Using AI systems to conduct large-scale disinformation campaigns, malicious surveillance, or targeted and sophisticated automated censorship and propaganda, with the aim to manipulate political processes, public opinion and behavior.

4.2

cyberattacks, weapons development & mass harm

Using AI systems to develop cyber weapons (e.g., coding cheaper, more effective malware), develop new or enhance existing weapons (e.g., Lethal Autonomous Weapons or CBRNE), or use weapons to cause mass harm.

4.3

fraud, scams & targeted manipulation

Using AI systems to gain a personal advantage over others such as through cheating, fraud, scams, blackmail or targeted manipulation of beliefs or behavior. Examples include AI-facilitated plagiarism for research or education, impersonating a trusted or fake individual for illegitimate financial benefit, or creating humiliating or sexual imagery.

Domain 5 · Human-Computer Interaction 2 codes

5.1

overreliance & unsafe use

Users anthropomorphizing, trusting, or relying on AI systems, leading to emotional or material dependence and inappropriate relationships with or expectations of AI systems. Trust can be exploited by malicious actors (e.g., to harvest personal information or enable manipulation), or result in harm from inappropriate use of AI in critical situations (e.g., medical emergency). Overreliance on AI systems can compromise autonomy and weaken social ties.

5.2

loss of human agency & autonomy

Humans delegating key decisions to AI systems, or AI systems making decisions that diminish human control and autonomy, potentially leading to humans feeling disempowered, losing the ability to shape a fulfilling life trajectory or becoming cognitively enfeebled.

Domain 6 · Socioeconomic & Environmental 6 codes

6.1

power centralization & unfair benefit distribution (supply-side)

AI-driven concentration of power and resources within certain entities or groups, especially those with access to or ownership of powerful AI systems, leading to inequitable distribution of benefits and increased societal inequality.

6.2

increased inequality & decline in employment quality

Widespread use of AI increasing social and economic inequalities, such as by automating jobs, reducing the quality of employment, or producing exploitative dependencies between workers and their employers.

6.3

economic & cultural devaluation of human effort

AI systems capable of creating economic or cultural value, including through reproduction of human innovation or creativity (e.g., art, music, writing, code, invention), can destabilize economic and social systems that rely on human effort. This may lead to reduced appreciation for human skills, disruption of creative and knowledge-based industries, and homogenization of cultural experiences due to the ubiquity of AI-generated content.

6.4

competitive dynamics — race-to-the-bottom

AI developers or state-like actors competing in an AI ‘race’ by rapidly developing, deploying, and applying AI systems to maximize strategic or economic advantage, increasing the risk they release unsafe and error-prone systems.

6.5

governance failure (residual)

Inadequate regulatory frameworks and oversight mechanisms failing to keep pace with AI development, leading to ineffective governance and the inability to manage AI risks appropriately.

6.6

environmental harm

The development and operation of AI systems causing environmental harm, such as through energy consumption of data centers, or material and carbon footprints associated with AI hardware.

Domain 7 · AI System Safety, Failures, & Limitations 6 codes

7.1

AI pursuing its own goals

AI systems acting in conflict with human goals or values, especially the goals of designers or users, or ethical standards. These misaligned behaviors may be introduced by humans during design and development, such as through reward hacking and goal misgeneralisation, or may result from AI using dangerous capabilities such as manipulation, deception, or situational awareness to seek power, self-proliferate, or achieve other goals.

7.2

dangerous capabilities

AI systems that develop, access, or are provided with capabilities that increase their potential to cause mass harm through deception, weapons development and acquisition, persuasion and manipulation, political strategy, cyber-offense, AI development, situational awareness, and self-proliferation. These capabilities may cause mass harm due to malicious human actors, misaligned AI systems, or failure in the AI system.

7.3

lack of capability / robustness

AI systems that fail to perform reliably or effectively under varying conditions, exposing them to errors and failures that can have significant consequences, especially in critical applications or areas that require moral reasoning.

7.4

lack of transparency / interpretability (technical)

Challenges in understanding or explaining the decision-making processes of AI systems, which can lead to mistrust, difficulty in enforcing compliance standards or holding relevant actors accountable for harms, and the inability to identify and correct errors.

7.5

AI welfare & rights

Ethical considerations regarding the treatment of potentially sentient AI entities, including discussions around their potential rights and welfare, particularly as AI systems become more advanced and autonomous.

7.6

multi-agent risks

Risks from multi-agent interactions, due to incentives (which can lead to conflict or collusion) and/or the structure of multi-agent systems, which can create cascading failures, selection pressures, new security vulnerabilities, and a lack of shared information and trust.

Domain 8 · Global AI power asymmetry, sovereignty & capacity EXT 4 codes

8.1 EXT

exclusion from AI rule-making & norm-setting

Being governed by AI rules one had no role in shaping ("rule-takers, not rule-makers").

8.2 EXT

digital/data sovereignty erosion & dependency / lock-in

Loss of self-determination over AI via dependence on foreign-built/controlled models, compute, cloud or data.

8.3 EXT

national AI capacity, compute & connectivity deficit

Absolute lack of compute, data, skills, institutions or connectivity to build, evaluate or govern AI.

8.4 EXT

cultural-linguistic erasure & data/knowledge extraction (data colonialism)

A community's language, culture or knowledge appropriated without consent or erased from the AI ecosystem.

Domain 9 · Governance vacuum, fragmentation & accountability gaps EXT 3 codes

9.1 EXT

regulatory vacuum & governance lag

AI deployed faster than any binding rules exist; harm occurs today, in the gap.

9.2 EXT

governance fragmentation & regulatory arbitrage

Divergent, uncoordinated rules create loopholes, "splinternet," forum-shopping and races to the bottom.

9.3 EXT

accountability & liability gap

No responsible actor and/or no redress when AI causes harm; agentic-system liability voids; unenforceable pledges.

The governance-measure taxonomy in full


The concrete form a governance measure takes: the tool or mechanism through which governance is enacted, independent of what it substantively demands (content) and of the harm it addresses (risk). Each measure is assigned one primary vehicle.

Law & agreements 3 types

Non-binding norms & declarations

soft_law

Shared principles, values, codes of conduct, voluntary commitments, and political declarations that guide behaviour without creating legally enforceable obligations. Choose this when a measure sets expectations or signals intent but stops short of binding law.

e.g. Recommendation of the Council on Artificial Intelligence (OECD AI Principles) · Recommendation on the Ethics of Artificial Intelligence · Hiroshima Process International Code of Conduct for Organizations Developing Advanced AI Systems · The Bletchley Declaration

Binding international agreement

treaty

A legally binding instrument negotiated between states or international organisations — a treaty, convention, protocol, or binding coalition agreement — that creates enforceable obligations across more than one jurisdiction. Choose this over soft law when the commitment is legally binding, and over domestic regulation when it operates between states rather than within one.

e.g. Nuclear Suppliers Group · Wassenaar Arrangement · Missile Technology Control Regime · OPCW · BWC Implementation Unit

Domestic & regional regulation

regulation

Legally binding rules enacted by a national or regional authority — statutes, directives, risk-tiered frameworks, sectoral rules, licensing regimes, and other mandated requirements enforceable within that jurisdiction or bloc. Choose this when a public authority imposes binding rules inside its own borders.

e.g. the General Data Protection Regulation (GDPR) · bioethics legislation · scientific codes of conduct

Standards & assurance 3 types

Technical standards & benchmarks

standards

Technical specifications, evaluation protocols, benchmarks, conformity-assessment criteria, and interoperability specifications that define how AI systems should be built, measured, or certified. Concerns setting the specification itself — not the law that makes it mandatory, nor the audit that checks compliance with it.

e.g. metrology · inspection · certification · accreditation · conformity assessments

Testing, auditing & evaluation

audit_eval

Structured examination of an AI system or the organisation behind it, before or after release — red-teaming, model and capability evaluations, algorithmic audits, impact assessments, and third-party conformity testing. This is the act of assessing, distinct from the standard that sets the criteria and from ongoing monitoring of systems in operation.

e.g. Audits · Benchmarks · Model Evaluation · Red Teaming

Monitoring, incident reporting & registries

monitoring

Continuous observation of AI systems in operation — post-deployment monitoring, incident-disclosure and reporting mechanisms, harm or model registries, and early-warning systems. Choose this over testing when the measure is ongoing and operational rather than a one-off assessment.

e.g. Post-deployment monitoring reports · Incident Reporting · Model registration

Institutions & coordination 2 types

Institutional creation

institution

Establishing a new body to carry a governance role — an agency, expert or scientific panel, secretariat, oversight authority, fund administrator, or regional hub. Concerns creating the organisation itself; the specific role it plays is captured separately.

e.g. mergers of STI-related ministries · reform of an innovation agency · creation of a new oversight body · IPCC · IAEA (Department of Safeguards) · CERN

Coordination & networking mechanisms

coordination

Mechanisms that connect and align existing actors or regimes rather than creating a new authority — mutual-recognition arrangements, working groups, knowledge-sharing networks, interoperability and framework-mapping efforts, and peer-learning platforms. Choose this over institutional creation when the measure links existing bodies instead of founding one.

e.g. research and innovation councils and committees · WTO · ICAO · IMO · FATF

Capacity & resources 2 types

Capacity-building & technical assistance

capacity_building

Programmes that build human or institutional capability — training, education, AI literacy, fellowships, centres of excellence, regulator and judiciary upskilling, and transfer of technical expertise. Provides a skill or expertise, as distinct from providing a physical or data resource.

e.g. fellowships · loans and scholarships · Capacity Development Network

Infrastructure & resource provision

infrastructure

Provision of shared physical or informational resources — public compute, data commons, connectivity, research funding, shared testing facilities, and open-model access. Provides a resource, as distinct from building a skill.

e.g. major scientific equipment · e-infrastructures such as data and computing systems · communication networks

Market & fiscal 2 types

Public procurement & market-shaping

procurement

Use of the state's role as a buyer or market actor to steer AI development — conditions attached to public procurement, competition and anti-monopoly measures, and promotion of public-interest goods. Works through purchasing power and market structure rather than direct regulation.

e.g. Advancing the Responsible Acquisition of Artificial Intelligence in Government (OMB M-24-18) · Driving Efficient Acquisition of Artificial Intelligence in Government (OMB M-25-22)

Fiscal instruments

fiscal

Financial levers that raise or redistribute resources — taxes and levies, subsidies, redistribution and transition funds, dedicated participation funds, and links between governance commitments and development finance. Concerns money raised or reallocated, as distinct from resources or infrastructure directly provided.

e.g. corporate tax income benefits · reductions in tariffs for imported research equipment · reimbursements of value added tax · reductions to social insurance contributions

Experimentation 1 type

Regulatory sandboxes & pilots

sandbox

Controlled, time-bound settings for supervised experimentation — regulatory sandboxes, policy innovation labs, live demonstrations, and pilot or cross-border test projects that trial an approach before wider adoption.

e.g. AI regulatory sandboxes (EU AI Act, Article 57) · Spain's national AI regulatory sandbox (Royal Decree 817/2023)

Firm-level controls 2 types

Organizational governance & internal controls

org_controls

Governance measures internal to the organisations that build or deploy AI — internal ethics or risk committees, board-level oversight, whistleblower protections, conflict-of-interest and anti-capture rules, and value-chain due diligence. Concerns how a provider governs itself, through structure and process.

e.g. Board of directors · Risk Register · Whistleblower protection · Anonymous Reporting

Technical safety & security controls

technical_controls

Technical measures applied to AI systems and their infrastructure — safety-by-design, model and model-weight security, content-safety filtering, access and usage controls, staged deployment, and compute or export controls. Concerns engineering and access safeguards built into or around the system.

e.g. Security measures · AI-Generated Content Watermarking · KYC-Based Model Access Threshold · Staged deployment

4 · How each submission is classified

No single model decides a label on its own. For each submission and each axis, a panel of independent models proposes codes; codes only survive if more than one model agrees and if they can be quoted; anything in between is escalated, never guessed. The steps:

1. Independent proposals. Several models read the four answers separately and each proposes the codes it sees, with a supporting quote for every one. Working independently means a shared mistake has to be made twice to matter.
2. Grounding gate. Each proposed code is checked against the submission text: the quote must really appear in the author's words (see §5). A code with no quotable basis does not proceed on that model's vote.
3. One bounded re-ask. If a model's quote can't be found, the model is asked once more to point to the exact words: a single, bounded retry, not an open-ended loop.
4. Quorum. A code proposed with a valid quote by two or more independent models is accepted. A code proposed by only one becomes contested rather than accepted outright.
5. Arbitration. Contested codes are decided in one pass by a stronger arbiter model (Claude Opus), which re-checks the quote and either confirms or rejects each. A fabricated or unquotable code is rejected, with the reason recorded.

The panel currently pairs an open-weight 120-billion-parameter model (GPT-OSS-120B) with a fast frontier model as proposers, and uses Claude Opus as the arbiter; the exact roster is settled in a throughput pilot that precedes the full run. Every decision (each vote, quote, tier, re-ask and arbitration, with the model versions and token usage) is recorded per submission, so any label can be traced back to its evidence.
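The quorum and arbitration logic can be illustrated with a small sketch. It assumes each proposer's votes have already passed the grounding gate, uses made-up proposer names, and stands in a plain Python callable for the arbiter; the real arbiter is a model call, which is not shown here.

```python
from collections import defaultdict

QUORUM = 2  # two or more independent models, as described in the steps


def tally(grounded_votes: dict) -> tuple:
    """Split codes into (accepted, contested).

    grounded_votes maps a proposer name to the set of codes for which
    that proposer supplied a quote that passed the grounding gate.
    """
    supporters = defaultdict(set)
    for model, codes in grounded_votes.items():
        for code in codes:
            supporters[code].add(model)
    accepted = sorted(c for c, m in supporters.items() if len(m) >= QUORUM)
    contested = sorted(c for c, m in supporters.items() if len(m) < QUORUM)
    return accepted, contested


def arbitrate(contested, arbiter) -> dict:
    """One pass over contested codes; arbiter(code) returns True to confirm."""
    return {code: bool(arbiter(code)) for code in contested}


votes = {
    "proposer_a": {"9.1", "8.2"},  # hypothetical proposer names
    "proposer_b": {"9.1", "6.6"},
}
accepted, contested = tally(votes)
# Stub arbiter for illustration only: confirms 8.2, rejects everything else.
decisions = arbitrate(contested, lambda code: code == "8.2")
```

Here `9.1` is accepted by quorum, while `8.2` and `6.6` each have a single supporter and go to the arbiter.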

5 · Every code must be quotable

The single most important rule: a code only counts if the model can point to words the author actually wrote. This is what keeps a language model from quietly inventing concerns nobody raised. When a model proposes a code and a quote, we verify the quote against the text at one of four levels:

Exact

The quote appears verbatim in the submission. Accepted.

Stitched

The quote joins two verbatim fragments across an ellipsis, each still a real run of words. Accepted, and stored as the rejoined fragments.

Snapped

The quote is a near-miss (light paraphrase); we snap it back to the longest matching verbatim span and accept that span instead of the paraphrase.

None

No verbatim basis can be recovered. The model is re-asked once; if that fails, the code is left to arbitration or flagged, never silently kept.

The recovery only ever runs one way: a fuzzy match can rescue a code for review, but it can never upgrade a paraphrase into an auto-accepted exact quote. When we show a quote on the analysis pages, it is the verbatim span we verified — in the author's original language.
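The four levels can be sketched with the standard library alone. This is an illustration of the idea rather than the pipeline's matcher: the whitespace normalisation, the use of `difflib` for the near-miss case and the 20-character minimum for a snapped span are all assumptions.

```python
import re
from difflib import SequenceMatcher

MIN_SNAP_CHARS = 20  # assumed threshold for an acceptable snapped span


def _norm(s: str) -> str:
    """Collapse runs of whitespace so line breaks do not defeat matching."""
    return re.sub(r"\s+", " ", s).strip()


def _in_order(fragments, text: str) -> bool:
    """True if every fragment occurs verbatim in text, in the given order."""
    pos = 0
    for frag in fragments:
        idx = text.find(frag, pos)
        if idx < 0:
            return False
        pos = idx + len(frag)
    return True


def ground(quote: str, text: str) -> tuple:
    """Return (tier, verbatim_spans) for a model-supplied quote."""
    text_n, quote_n = _norm(text), _norm(quote)
    if quote_n and quote_n in text_n:
        return "exact", [quote_n]
    fragments = [f.strip() for f in re.split(r"\.\.\.|\u2026", quote_n) if f.strip()]
    if len(fragments) > 1 and _in_order(fragments, text_n):
        return "stitched", fragments
    matcher = SequenceMatcher(None, text_n, quote_n, autojunk=False)
    m = matcher.find_longest_match(0, len(text_n), 0, len(quote_n))
    if m.size >= MIN_SNAP_CHARS:
        # Keep the author's words, not the model's paraphrase.
        return "snapped", [text_n[m.a:m.a + m.size].strip()]
    return "none", []


TEXT = ("We are concerned that small states become rule-takers, not "
        "rule-makers, in AI governance. Binding rules are needed now.")
```

For example, `ground("small states are becoming rule-takers, not rule-makers", TEXT)` is a near-miss: the returned span is the verbatim run from `TEXT`, and the tier stays `"snapped"` rather than being promoted to `"exact"`.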

6 · Quality & agreement

These figures are placeholders. They are measured in a validation stage that runs after a set of submissions has been hand-labelled by people. The numbers below will be filled in — and this notice removed — once that gold set is complete.

Codes carrying a verbatim quote: to be finalised; share of accepted codes grounded in the author's own words
Agreement with human labels: to be finalised; Cohen's κ against a hand-labelled gold set
Grounding precision: to be finalised; planted-quote audit: how reliably a fabricated quote is rejected
Codes sent to arbitration: to be finalised; share of proposed codes that were contested and adjudicated
Human-labelled gold set: to be finalised; submissions double-labelled by people to measure agreement
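For reference, Cohen's κ compares observed agreement with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes it for two equal-length label sequences; treating each (submission, code) pair as one binary decision is an assumption about how the gold set will be scored, and the labels are made up.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(rater_a)
    # Observed agreement: share of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        # Degenerate case: both raters used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Made-up labels: 1 = code applies to the submission, 0 = it does not.
human = [1, 1, 0, 0, 1, 0, 1, 0]
model = [1, 1, 0, 0, 1, 0, 0, 1]
kappa = cohens_kappa(human, model)  # p_o = 0.75, p_e = 0.5, so kappa = 0.5
```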

7 · How to read the numbers

Counts are of submissions, not endorsements. A code's count is how many submissions raised it — not a vote, a ranking, or a measure of how strongly anyone feels.
Precision first. The pipeline is tuned to miss a code rather than invent one. Real concerns stated very obliquely may go uncounted; that is a deliberate trade-off to keep what we do report trustworthy.
Two domains are our own. Risk domains 8 and 9 are CeSIA extensions beyond the MIT taxonomy, flagged EXT. They reflect a judgement about what the corpus surfaces, and reasonable people could scope them differently.
One primary vehicle per measure. A proposal that blends, say, a standard and an audit is assigned its single most salient form, so the vehicle breakdown reads as "the primary form of each measure."
Machine classification. Labels are model-assigned and human-checked, not curated by hand end-to-end. The agreement figures in §6 are how we quantify how far to trust them.
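The first point, that a count is a number of submissions, can be made concrete: a code is counted at most once per submission, however many passages raise it. A minimal sketch, with hypothetical submission identifiers and a made-up label table:

```python
from collections import Counter


def submission_counts(labels_by_submission: dict) -> Counter:
    """Count, for each code, how many submissions raised it at least once."""
    counts = Counter()
    for codes in labels_by_submission.values():
        counts.update(set(codes))  # de-duplicate within one submission
    return counts


labels = {
    "s-001": ["9.1", "9.1", "8.2"],  # 9.1 raised in two passages, counted once
    "s-002": ["9.1"],
    "s-003": [],
}
counts = submission_counts(labels)
```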

8 · Provenance & reproducibility

The codebook is frozen and content-addressed: the taxonomies shown above are pinned by hash, so what you read here is exactly what the pipeline classified against. Every submission's record carries these identifiers alongside its labels.

Methodology version: v1
Codebook snapshot: 2026-07-02
Risk taxonomy (SHA-256): 760e4d58b997…
Vehicle taxonomy (SHA-256): 44d43fabdd4c…
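Content addressing means the identifier is derived from the content itself. One way to do this, assuming the taxonomy is held as a JSON-like structure and serialised canonically before hashing, is sketched below; the serialisation rules and the sample entries are assumptions, so this will not reproduce the hashes listed above.

```python
import hashlib
import json


def codebook_hash(codebook: dict) -> str:
    """SHA-256 over a canonical JSON serialisation (assumed scheme)."""
    canonical = json.dumps(
        codebook,
        sort_keys=True,         # key order must not affect the hash
        ensure_ascii=False,     # hash the real characters, not escapes
        separators=(",", ":"),  # no incidental whitespace
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Abbreviated, illustrative entries; not the full codebook.
v1 = {"9.1": "regulatory vacuum & governance lag",
      "9.3": "accountability & liability gap"}
same_content = {"9.3": "accountability & liability gap",
                "9.1": "regulatory vacuum & governance lag"}
edited = {**v1, "9.1": "regulatory vacuum"}
```

Under this scheme `v1` and `same_content` hash identically, while any wording change, as in `edited`, yields a different identifier.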

Sources for the taxonomies: the MIT AI Risk Repository (Slattery et al., 2024, arXiv:2408.12622); the OECD/EC STIP Compass policy-instrument taxonomy; the MIT AI risk-mitigation taxonomy; and Maas & Villalobos (2023) on international AI institutions. Full references travel with each source file.