Whose Language, Whose Model? Public-Interest Multilingual LLMs
Abstract
The workshop examined how the development of large language models (LLMs) can be aligned with public‑interest goals, especially for speakers of low‑resource languages in the Global South. Panelists highlighted the shortcomings of current, North‑centric governance models and argued for robust, multi‑stakeholder participation throughout the AI lifecycle—from data collection to post‑deployment monitoring. Drawing on case studies (e.g., Quechua language work) and lessons from social‑media regulation, the discussion surfaced concrete mechanisms for inclusive design, data ownership, evaluation standards, and incentive structures. Breakout groups then explored practical ways to embed these mechanisms, surfacing recurring themes such as data‑set bias, community‑owned datasets, and the need for international standards of inclusivity. The session concluded with calls to continue the conversation, a preview of a forthcoming “trust‑and‑safety in low‑resource languages” project, and an introduction to the European Centre for Not‑for‑Profit Law’s framework for meaningful engagement.
Detailed Summary
1. Opening & Framing (0‑5 min)
- Aliya Bhatia opened the workshop, noting the absence of the scheduled facilitator Marlena Wisniak but assuring participants of a robust agenda.
- She defined the workshop’s purpose: to explore meaningful multi‑stakeholder participation across the AI lifecycle (conception → post‑deployment).
- Emphasis was placed on the current concentration of LLM development decisions in the Global North and the resultant exclusion of civil‑society experts and communities most affected by AI deployments.
- Bhatia announced that after a series of “provocations” from the panel, participants would break into small groups to discuss concrete initiatives and later report back.
2. Policy & Regulatory Landscape (5‑20 min)
2.1 Soft‑law → Hard‑law Trajectory
- Jhalak Kakkar described the typical progression in emerging tech governance:
- Consensus building among stakeholders →
- Voluntary commitments / soft‑law →
- Codification into standards and binding regulations.
- She warned that, mirroring early social‑media regulation, the “consensus‑building” phase has lingered too long, allowing platforms to operate with minimal accountability.
2.2 Need for Process‑Based Legal Requirements
- Kakkar advocated for process‑based legal mandates (e.g., mandatory impact assessments, bias‑audit procedures) rather than outcome‑only rules.
- She argued that such mandates would surface harms early, preventing “reinvention of the wheel” once models are already deployed.
2.3 Audience Reaction
- Bhatia prompted participants to consider why participation is needed beyond a box‑checking exercise, setting up the next speaker’s focus on power dynamics.
3. Meaningful Participation & Power (20‑35 min)
- Dhanaraj Thakur framed participation as fundamentally an issue of how power is distributed.
- Citing the development‑studies literature (Chambers, 1990s), he highlighted the “participatory supply‑chain” approach: start with community needs and let those shape technology use‑cases, rather than having firms dictate use‑cases post‑hoc.
- He noted that communities possess nuanced contextual expertise essential for language‑specific model design (dialectal variation, cultural idioms).
- Thakur stressed that meaningful participation must avoid “one‑off consultations” and instead embed users as insiders throughout data collection, annotation, model training, and evaluation.
4. Linguistic Identity & Low‑Resource Language Case Study (35‑45 min)
- Kakkar revisited her recent four‑report series on trust & safety in low‑resource languages, focusing on Quechua (≈10 M speakers).
- Key findings:
- LLMs trained on synthetic or scant Quechua data risk reinforcing linguistic oppression by privileging dominant languages (e.g., Spanish) in downstream applications.
- Language is tightly linked to identity; mis‑representation can erode cultural continuity.
- She connected this to the broader “human‑in‑the‑loop” debate, suggesting a shift toward “machine‑in‑the‑loop” where technical tools support human experts (e.g., community editors) rather than replace them.
- Kakkar posed two open questions:
- Who should hold accountability for ensuring inclusive participation and redress?
- What concrete participatory mechanisms (beyond ad‑hoc feedback) can be institutionalised?
5. Breakout‑Group Instructions (45‑55 min)
- Bhatia displayed three guiding questions on the screen:
- What does meaningful engagement look like?
- How can we foster and include it?
- What does success look like?
- Participants were asked to form pairs (two‑person rows) and discuss specific AI lifecycle stages (design, development, evaluation, deployment, post‑deployment).
- Emphasis was placed on identifying who should be involved when, and what incentives could motivate private‑sector compliance (including app‑layer considerations).
- Bhatia promised to circulate after ≈15 min to collect reports.
6. Group Report‑Backs (55‑70 min)
6.1 Data‑Centric Participation (Group 1 – Prakash Isral)
- Highlighted ISO/IEC 42001 (the AI management‑system standard), which the group noted defines 13 dimensions; stressed that data is the foundational point for bias and fairness interventions.
- Argued for multi‑stakeholder vetting of training data before model design.
- Mentioned the need for feedback loops during deployment for fine‑tuning.
6.2 Measuring Inclusivity (Group 2 – Richard Brown)
- Discussed the lack of internationally agreed metrics for per‑language inclusivity.
- Proposed developing standards (potentially under the summit’s umbrella) to assess the inclusiveness of models.
- Suggested that governments could incentivise use of country‑specific LLMs rather than generic ones, aligning policy with local linguistic ecosystems.
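No such standard yet exists, so as a rough illustration only: the sketch below (the function name and language tags are invented for this example, not drawn from the workshop) computes each language's share of a corpus, the kind of basic coverage measurement a per‑language inclusivity standard might start from.

```python
from collections import Counter

def language_share(corpus_langs):
    """Return each language's share of documents in a corpus.

    corpus_langs: an iterable of language tags, one per document
    (e.g. produced by an upstream language-identification step).
    """
    counts = Counter(corpus_langs)
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}

# Illustrative corpus: Quechua ("qu") is heavily under-represented
# relative to Spanish ("es") and English ("en").
shares = language_share(["es"] * 70 + ["en"] * 28 + ["qu"] * 2)
print(shares["qu"])  # → 0.02, i.e. a 2% share despite ~10M speakers
```

A real standard would of course go far beyond raw document counts (dialect coverage, annotation quality, domain breadth), but even this crude share makes under‑representation auditable.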
6.3 Data Ownership & Self‑Determination (Group 3 – Unnamed participants)
- Focused on caste‑based hate‑speech and indigenous mis‑representation in Indian LLMs.
- Raised the principle that communities should own the datasets derived from them, not merely supply data.
- Emphasised the need for informed consent and the right to refuse inclusion in AI training pipelines.
6.4 Incentivising the Private Sector (Group 4 – Unnamed participants)
- Identified civil‑society pressure and regulatory benchmarks as primary levers.
- Suggested aligning impact‑investor funding with inclusive AI outcomes and embedding diverse representation within corporate culture.
- Noted that profit‑only motives may neglect low‑demand languages, akin to “orphan disease” research.
6.5 End‑to‑End Lifecycle Integration (Group 5 – Unnamed participants)
- Asserted that every lifecycle stage—from data to feature extraction, development, evaluation, and post‑deployment—requires stakeholder input.
- Warned that even with perfect data, biased feature engineering or evaluation metrics can re‑introduce inequities.
- Anticipated future machine‑to‑machine interactions overseen by human auditors, requiring continuous monitoring of disparate impacts.
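One way such continuous monitoring of disparate impacts could be operationalised — a minimal sketch, with the function name, groups, and numbers invented for illustration — is to track the ratio between the worst and best per‑group error rates of a deployed system:

```python
def error_rate_gap(errors_by_group):
    """Ratio of the worst group's error rate to the best's.

    errors_by_group: {group: (num_errors, num_examples)}
    A value far above 1.0 is a simple disparity signal that
    human auditors could use to trigger a deeper review.
    """
    rates = {g: e / n for g, (e, n) in errors_by_group.items()}
    worst, best = max(rates.values()), min(rates.values())
    return worst / best if best > 0 else float("inf")

# Illustrative deployment stats: moderation errors per language group.
gap = error_rate_gap({"es": (5, 1000), "qu": (40, 1000)})
print(gap)  # → 8.0: Quechua content mis-handled 8x as often as Spanish
```

In practice auditors would slice by many attributes and account for sample sizes, but a single headline ratio like this is enough to make "disparate impact" a number that can be tracked over time rather than a one‑off finding.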
7. Closing Remarks & Announcements (70‑80 min)
- Bhatia thanked participants, acknowledging the session ran over time.
- Shared contact details for continued collaboration.
- Announced two forthcoming initiatives:
- Trust‑and‑Safety in Low‑Resource Languages project (report to be released the next day).
- Framework for Meaningful Engagement (by Marlena Wisniak, European Centre for Not‑for‑Profit Law) – a step‑by‑step guide for inclusive AI governance.
- Encouraged attendees to stay engaged beyond the workshop, emphasizing that the conversation should continue in corridors, future sessions, and through the shared email list.
Key Takeaways
- Power Redistribution Is Central – Meaningful participation is fundamentally about re‑balancing decision‑making power from dominant (Global North, industry) actors to the communities directly affected by AI systems.
- Data Is the First Leverage Point – Multi‑stakeholder oversight must start at data collection/curation; communities should own their data and control its use.
- Soft‑Law Is Insufficient – Voluntary commitments need to transition rapidly into process‑based legal requirements (impact assessments, bias audits) to avoid repeating the slow, low‑accountability path of social‑media regulation.
- Language ≠ Just Text – Low‑resource languages embody cultural identity; neglecting them in LLM training can exacerbate linguistic oppression and erode cultural heritage.
- Inclusive Standards Are Needed – The community lacks universal metrics for per‑language inclusivity; developing international standards (potentially linked to ISO/IEC 42001) is a priority.
- Incentives Must Align Across Sectors – Governments, civil society, impact investors, and corporations all need aligned incentives—regulatory benchmarks, funding criteria, and internal diversity mandates—to promote participatory AI.
- Lifecycle‑Wide Engagement – Participation should be embedded at every stage (design, development, evaluation, deployment, post‑deployment); a single consultation is a token, not a solution.
- Community‑Led Evaluation – Evaluative frameworks must incorporate cultural norms and be conducted by community insiders, not just external auditors.
- Future Directions – Upcoming trust‑and‑safety research and the European Centre’s engagement framework will provide concrete tools for operationalising the workshop’s recommendations.
See Also:
- data-sharing-infrastructures-for-ai-building-for-trust-purpose-and-public-values
- evaluations-and-open-source-software-for-ai-for-social-good-at-scale
- ai-for-democracy-reimagining-governance-in-the-age-of-intelligence
- beyond-the-cloud-the-sovereign-ai-moment
- democratizing-ai-compute-and-digital-data-infrastructures
- empowering-communities-in-the-age-of-advanced-ai-inclusion-and-safety-for-sustainable-development