Towards a Safer South: Launch of the Global South Network on AI Safety and Evaluation
Abstract
The event marked the formal inauguration of the Global South Network on AI Safety & Evaluation, an alliance of research institutes, civil‑society organisations, and governmental bodies. Opening remarks highlighted five flagship projects for the coming year—multilingual AI benchmarks, a gender‑harm taxonomy, procurement‑focused policy work, labour‑market impact studies, and health‑information‑system evaluations. Subsequent keynotes from Kenya’s technology envoy and a UN AI lead underscored the urgency of inclusive, culturally aware safety standards. A panel of experts from industry, academia, and civil society then debated practical challenges—language diversity, red‑team capacity, compute access, governance concentration, and the need for regional hubs—while offering concrete actions for the network’s first year.
Detailed Summary
1. Opening Remarks
- Purpose of the launch – to create a joint research and civil‑society infrastructure that can generate context‑aware AI safety standards for the Global South.
- Flagship Projects (2024‑25)
- Multilingual AI Benchmarks – partnership with the Collective Intelligence Project and CARIA to develop evaluation datasets for low‑resource languages.
- Gender‑Harm Taxonomy – collaboration with the GXD Hub and the Global Centre for AI Governance to build a taxonomy and incident‑reporting database for gender‑related harms.
- Procurement‑Driven Policy – translating benchmarks into procurement criteria that enable governments of the Global South to influence market incentives toward responsible innovation.
- Labour‑Market Impact Evaluation – work with ITS Rio to assess AI’s effects on employment across the Global South.
- Health‑Information‑System Evaluation – testing whether current generative‑AI tools meet clinicians’ needs in low‑resource health systems.
- Strategic framing – emphasised that AI procurement can act as a “third way” of governance (citing India’s emerging model) to set global standards for responsible innovation.
- Call for collaboration – urged participants to engage post‑launch, stressing that the network must be a joint civil‑society and research endeavour.
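In its simplest form, the multilingual benchmark project described above amounts to a per‑language scoring loop over prompt–answer pairs, surfacing the languages where a model underperforms. The sketch below is purely illustrative: the toy model, the example prompts, and the exact‑match metric are assumptions for demonstration, not the network’s actual evaluation design.

```python
# Hypothetical sketch of a multilingual benchmark harness.
# `model` is any callable mapping a prompt string to an answer string;
# the toy model and benchmark items below are invented for illustration.
from collections import defaultdict

def evaluate(model, benchmark):
    """Score a model per language.

    benchmark: list of (language_code, prompt, expected_answer) triples.
    Returns a dict mapping language code -> exact-match accuracy,
    exposing low-resource languages where the model lags.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, prompt, expected in benchmark:
        total[lang] += 1
        if model(prompt).strip().lower() == expected.strip().lower():
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}

# Toy stand-in model: only "knows" one fact, to show uneven coverage.
toy_model = lambda prompt: "Nairobi" if "Kenya" in prompt else "?"

toy_benchmark = [
    ("sw", "Mji mkuu wa Kenya ni upi?", "Nairobi"),  # Kiswahili
    ("sw", "Kenya capital?", "Nairobi"),
    ("ha", "Babban birnin Najeriya?", "Abuja"),      # Hausa
]

scores = evaluate(toy_model, toy_benchmark)
print(scores)  # per-language accuracy dict
```

Even this toy harness illustrates the point made in the keynotes: aggregate accuracy hides per‑language failure, so benchmarks must report results disaggregated by language.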
2. First Keynote: Ambassador Philip Thigo (Kenya)
- Structural exclusion – highlighted that the Global South has been systematically left out of AI safety conversations; Kenya is the only Global South member of the International Network of AI Safety Institutes.
- Urgency of inclusive safety – argued that models trained and deployed without local context invite misuse (e.g., AI chatbots providing emotional companionship rather than productive assistance).
- Four structural gaps identified:
- Red‑team capacity – lack of skilled teams to stress‑test models in local settings.
- Compute access – researchers in the Global South often lack the hardware needed for large‑scale evaluations.
- Linguistic & cultural mismatch – benchmarks often ignore regional dialects (e.g., Kiswahili variations across Kenya).
- Governance concentration – benchmarks are defined by a small set of institutions, consolidating power and risking bias.
- Recommendations
- Create regional nodes (e.g., one per continent) to decentralise governance.
- Produce multilingual benchmark datasets and conduct an annual red‑team exercise.
- Publish a Global South AI Safety Report with a broadened definition of safety (including environmental, gender, misinformation, and water‑resource harms).
- Integrate the network’s outputs into UN‑level AI governance processes (e.g., the UN Scientific Panel on AI).
- Close the accountability loop by ensuring that safety improvements translate into tangible benefits for citizens.
3. Second Keynote: Mr. Quentin Chow‑Lambert (UN Office for Digital and Emerging Technologies)
- Evolution of AI safety discourse – traced the shift from early “existential risk” focus (Bletchley Park era) to a relational, vulnerability‑based perspective that asks who is being protected and from what.
- From centralised to open‑source ecosystems – open‑source models lowered the barrier to widespread deployment, prompting a move from “single‑model safety” to contextual safety.
- Two‑tiered safety concept
- Accuracy & reliability – critical for domains such as healthcare diagnostics, finance, and criminal‑justice decisions.
- Contextual/human‑centric harms – includes cultural appropriateness, religious sensitivities, and environmental impacts.
- Policy implication – a single technical standard cannot address the heterogeneous needs of the 80% of humanity living in the Global South; empirical, locally‑sourced evidence is required.
- Call to action – emphasised that networks like the Global South Network are essential for feeding ground‑level threat examples into global AI governance dialogues and for preventing the marginalisation of the Global Majority.
4. Panel Introduction
- Moderator invited five panelists to the stage:
- Ms. Natasha Crampton (Microsoft)
- Dr. Rachel Sibande (Gates Foundation)
- Ms. Chenai Chair (Masakhane African Languages Hub)
- Mr. Amir Banafetmi (Cognizant)
- Dr. Balaraman Ravindran (IIT Madras)
5. Panel Discussion
5.1. Safety Gaps in Real‑World Deployments (Dr. Rachel Sibande)
- Redefining safety – safety must be measured against social‑cultural norms (gender dynamics, religious beliefs, slang, tone).
- Language nuance – a literal translation can miss critical clinical meaning (e.g., “waters have broken” vs. “thrown away water”).
- Emergent harms – personal‑AI companions may create emotional dependence that is not captured by current benchmarks.
5.2. Civil‑Society Perspective on Missed Impacts (Ms. Chenai Chair)
- User‑experience blind spots – AI tools often ignore local gender inequality and youth unemployment; a voice assistant with a male voice may amplify gender‑based violence.
- Language coverage – Africa hosts ≈2,000 documented languages; Masakhane currently curates data for only ~50, creating a coverage gap.
- Misuse pathways – unintended harms (e.g., misinformation in local languages, deep‑fake election interference) and deliberate surveillance (e.g., tracking devices hidden in consumer goods).
5.3. Industry Constraints on Context‑Sensitive Safety (Ms. Natasha Crampton)
- Scalability of community‑led evaluations – Microsoft’s challenge is to turn deep, local evaluation work (e.g., SAMISCA project) into a sustainable process that can be repeated for thousands of languages and cultural contexts.
- Need for continuous evaluation – safety assessments must be ongoing, not a one‑off pre‑release test.
5.4. Operational Barriers (Mr. Amir Banafetmi)
- Fragmented definition of safety – safety spans models, APIs, data pipelines, and infrastructure; lacking a unified definition hinders regulation.
- Lack of imagination – system designers often do not understand the context of deployment, leading to invisible harms.
- Absence of financial incentives – without penalties or budget line items for safety, organizations deprioritise it.
- Talent pipeline gap – current safety teams lack local expertise; both skilling and inclusion of region‑based talent are required.
- Incident‑reporting infrastructure – suggested building open‑source tools for contextual incident reporting; noted that the Global South lacks rapid feedback loops compared with the Global North.
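To make the incident‑reporting suggestion concrete, a contextual incident record needs fields that most existing trackers omit: the affected language, region, and cultural context. The sketch below is a minimal, hypothetical schema; every field name is an assumption for illustration, not an agreed standard from the panel or any existing tool.

```python
# Illustrative sketch only: a minimal schema for a contextual AI-incident
# report, of the kind the proposed open-source tooling might collect.
# All field names and the example report are hypothetical.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class IncidentReport:
    system: str                  # AI system or model involved
    harm_category: str           # e.g. "misinformation", "gender-harm"
    description: str             # free-text account of the incident
    language: str = "und"        # BCP-47 tag; "und" = undetermined
    region: str = ""             # country or region of deployment
    cultural_context: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the report for submission to a shared database."""
        return json.dumps(asdict(self), ensure_ascii=False)

# Hypothetical example, echoing the clinical-translation case from the panel.
report = IncidentReport(
    system="example-health-chatbot",
    harm_category="clinical-mistranslation",
    description="Literal translation missed the meaning of 'waters have broken'.",
    language="sw",               # Kiswahili
    region="Kenya",
    cultural_context=["maternal health", "regional dialect"],
)
print(report.to_json())
```

Capturing language and cultural context as first‑class fields is what would let such a database feed the rapid, region‑aware feedback loops the panel noted are missing in the Global South.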
5.5. Academic View on Investment & Coordination (Dr. Balaraman Ravindran)
- Proliferation of overlapping initiatives – many regional safety networks (e.g., African capacity‑building network, a Chinese safety institute) risk duplication.
- Need for coordination – a single “node” in a global accountability‑and‑collaboration network could harmonise funding, avoid fragmentation, and amplify impact.
5.6. Rapid‑Fire Action Items (All Panelists)
| Speaker | Concrete next‑step for 2024‑25 |
|---|---|
| Natasha Crampton (Microsoft) | Implement the New Delhi Frontier AI commitments: multilingual & multicultural evaluation pipelines; invest $50 bn in Global South infrastructure to enable scalable evaluation. |
| Rachel Sibande (Gates) | Institutionalise safety evaluation at deployment – embed safety checks early rather than post‑deployment. |
| Chenai Chair (Masakhane) | Deliver an African benchmarking initiative for low‑resource languages. |
| Amir Banafetmi (Cognizant) | Release open‑source, culturally‑contextual incident‑reporting tools and disseminate through the network. |
| Balaraman Ravindran (IIT Madras) | Facilitate cross‑border problem‑solving projects that require collaboration across multiple Global South nodes, demonstrating the network’s added value. |
6. Closing Remarks
- Moderator thanked all participants and reiterated that the launch is only the beginning; the network must now move from discussion to concrete, measured actions.
Key Takeaways
- Launch of a Global South‑focused AI safety network with five flagship research & policy projects for the next year.
- Contextual safety is a two‑dimensional problem: technical reliability and cultural‑social appropriateness (gender, religion, language, environment).
- Structural gaps hindering safe AI in the Global South:
- Red‑team capacity shortages,
- Limited compute resources,
- Linguistic & cultural mismatches,
- Concentrated benchmark governance.
- Regional nodes and multilingual benchmark datasets are essential to decentralise power and ensure inclusive evaluation.
- Industry scalability challenge – community‑led evaluations must become continuous, sustainable processes that can cover thousands of languages.
- Policy recommendation – embed safety metrics into public procurement and create financial incentives/penalties for non‑compliance.
- UN perspective – AI safety should be viewed relationally (who is protected?) and integrated into global AI governance mechanisms (UN Scientific Panel, UN Dialogue).
- Collaboration imperative – avoid duplication among emerging safety initiatives; a coordinated “node” in a global accountability network can harmonise funding and effort.
- Immediate concrete actions (as pledged by panelists): multilingual evaluation pipelines, open‑source incident‑reporting tools, African language benchmarking, infrastructure investment, and embedding safety checks at deployment.
Prepared from the verbatim transcript of the launch event held in Delhi, 2024.
See Also:
- best-practices-from-the-international-network-for-advanced-ai-measurement-evaluation-and-science
- building-sovereign-deep-tech-for-a-resilient-future-solutions-from-finland-and-india
- evaluations-and-open-source-software-for-ai-for-social-good-at-scale
- governing-safe-and-responsible-ai-within-digital-public-infrastructure
- navigating-the-ai-regulatory-landscape-a-cross-compliance-framework-for-safety-and-governance
- scaling-ai-solutions-through-southsouth-collaboration
- responsible-ai-at-scale-governance-integrity-and-cyber-readiness-for-a-changing-world
- thriving-with-ai-human-potential-skills-and-opportunity
- harnessing-ai-for-health-equity-building-inclusive-human-capital-and-strengthening-researchindustry-collaboration