Speaking Everyone’s Language: The Key to Inclusive AI Opportunity
Abstract
The panel examined why language diversity is a decisive factor in making artificial intelligence truly inclusive across the Global South. Speakers highlighted pioneering initiatives, such as the Masakhane African Languages Hub, Africa’s first public‑sector AI compute cluster, and the Lingua Africa open call, that aim to close the data and compute gaps for low‑resource languages in Africa and Asia. The discussion linked these technical efforts to broader development goals, underscoring the role of public‑good funding, responsible AI governance, and culturally aware models in unlocking health, education, agriculture, and financial services for billions of multilingual users.
Detailed Summary
1. Opening Remarks
- Ankur Vora (Gates Foundation) opened the session by outlining three flagship programmes:
  - Masakhane African Languages Hub – a community‑driven effort to develop AI models for more than 40 African languages, described as “genuinely African‑led.”
  - Africa’s First Dedicated Public‑Sector AI Compute Cluster – to be hosted at the University of Cape Town and intended to give African researchers access to high‑performance GPUs and storage, lowering cost barriers.
  - Asia AI for Development Observatory – a new network to promote responsible AI governance across the region.
- He linked these initiatives to the broader AI for Development programme launched three years earlier at Bletchley Park, noting partnerships with Canada’s IDRC, the Gates Foundation, the GSMA Foundation, and the governments of Germany, Japan, and Sweden.
- Four additional startups were announced for support under the partnership, including Torn AI (Morocco), which builds voice interfaces for low‑literacy rural users.
- Vora framed the strategic choice before humanity as one between a divisive AI that concentrates power and an inclusive, equitable AI that can uplift all. He positioned the day’s panel as a means of securing the latter path.
2. Panel Introduction
- The moderator introduced the panelists:
  - His Excellency Ambassador Philip Thigo (Kenya) – Special Technology Envoy.
  - Dr Bärbel Kofler – Parliamentary State Secretary, Germany.
  - Shekar Sivasubramanian – CEO, Wadhwani AI.
  - Chenai Chair – Chair, Masakhane African Languages Hub.
- Hon. David Lammy (Deputy Prime Minister, UK) began the Q&A with a personal anecdote about using a secure AI‑powered research tool, establishing the importance of trustworthy AI for public officials.
3. Discussion Themes
3.1. AI & Local Languages – The African Perspective
- Philip Thigo emphasized that Africa hosts ≈ 2,000 languages and that AI must be tailored to each linguistic community.
- He argued that language is the cultural substrate of the Global South; without representation in AI models, entire oral civilisations risk extinction.
- Thigo highlighted three pillars for progress:
  - Representation & Existence – ensuring African languages appear in AI outputs.
  - Infrastructure & Funding – building compute capacity and research talent as a matter of sovereignty.
  - Domain‑Specific Use Cases – health, agriculture, history, and education require bespoke language models.
3.2. Masakhane African Languages Hub – Vision & Operations
- Chenai Chair (Masakhane) traced the hub’s origin to 2019, when it began as a grassroots, bootstrapped effort to capture African languages digitally.
- Mission: reach 1 billion Africans with AI tools in the 50 most‑spoken languages, targeting economic growth, health, and cultural preservation.
- Four Working Pillars:
  - Data Expansion – starting from the JW300 Bible dataset as a seed and now building high‑quality, diversified corpora.
  - Research & Benchmarking – creating an African‑specific speech‑text benchmark because existing benchmarks ignore local nuance.
  - Innovation & Use‑Case Development – allocating ≈ 40 % of funding to concrete applications, including Project ECHO, a gender‑responsive initiative to empower women economically and improve health outcomes.
  - Sustainability & Capacity Building – institutionalising the NLP community so that open‑source models can spawn commercial ventures, ensuring the hub’s longevity beyond initial grant funding.
- The speaker stressed dialectal diversity (e.g., variations of Shona) and the need to capture these subtleties for AI to be effective.
3.3. Wadhwani AI (India) – Inclusive AI at Scale
- Shekar Sivasubramanian described Wadhwani AI’s focus on applied AI for health, education, and agriculture over its seven years of operation.
- Core design principle: multilingual inclusion – solutions are built for 14–16 Indian languages from the outset.
- Key Projects:
  - Media Disease Surveillance – monitors national news in 16 languages, updates every four hours, and alerts the government to outbreak hotspots.
  - Oral Reading Fluency – collects spoken data from children, uses AI to assess reading proficiency, and equips teachers with actionable insights.
- Emphasis on value‑first design: technology must solve a tangible problem for the user; otherwise adoption stalls.
- Sivasubramanian highlighted work on low‑resource languages (e.g., Tibetan, Kreĭla in Karnataka) through digitisation of libraries and community‑driven employment.
- He stressed the need for long‑term research investment and a balanced public‑private partnership to keep the community involved.
3.4. Funding & Policy Perspectives (UK, Germany, Canada, Microsoft)
- Dr Amandeep Singh Gill (UN) and Dr Bärbel Kofler (Germany) underscored that AI can be a game‑changer for the Sustainable Development Goals (SDGs), but only if language bias is eliminated.
- Kofler stressed that public‑good investment is required because market forces favour high‑resource languages such as English and Mandarin.
- Julie Delahanty (IDRC) reiterated the importance of public‑sector compute resources, noting that cost disparities make it far harder for African institutions to access GPUs than for European labs.
- Natasha Crampton (Microsoft) explained Microsoft’s role:
  - Compute Enablement – providing cloud resources essential for data collection, model fine‑tuning, testing, and day‑to‑day deployment.
  - Lingua Africa Open Call – a multi‑partner initiative (Microsoft AI for Good, Gates Foundation, Masakhane) to fund community‑governed language infrastructure targeting health, education, agriculture, and public services.
- She highlighted that trustworthy AI requires deliberate choices in data, model development, testing, and deployment; it does not happen “by accident.”
3.5. Lingua Africa Initiative – Announcement
- Ankur Vora announced Lingua Africa, a multi‑partner open call focused on creating open, community‑governed language infrastructure.
- Goals:
  - Collect targeted multilingual data in high‑impact domains (health, education, agriculture, public services).
  - Develop and fine‑tune models that are linguistically and culturally aware.
  - Support deployment pathways ensuring tools reach end‑users in their mother tongues.
- Partners include Microsoft AI for Good, Gates Foundation, and the AI for Development network.
- Emphasis on real‑world relevance: models built in labs must hold up under field conditions, so benchmarking and user testing are integral.
4. Closing Remarks & Future Outlook
- David Lammy praised state intervention to bridge market gaps, while recognizing that private‑sector innovation remains vital.
- Julie Delahanty highlighted the upcoming African Compute Initiative at the University of Cape Town – the continent’s first dedicated high‑performance computing cluster for public institutions.
- She reiterated that compute capacity and robust multilingual datasets are the foundational pillars for Africa to shape global AI systems rather than be passive consumers.
- The panel concluded with a call for collaborative public‑good funding, local capacity building, and language‑first AI design to ensure inclusive technology reaches the billions who need it.
Key Takeaways
- Language is the primary gatekeeper for AI adoption in the Global South; without representation of local languages, AI solutions remain inaccessible to the majority.
- Masakhane African Languages Hub aims to impact 1 billion Africans by developing AI tools for the 50 most‑spoken languages, focusing on data expansion, benchmarking, gender‑responsive projects, and long‑term community sustainability.
- Wadhwani AI demonstrates a multilingual‑first approach in India, delivering health, education, and agricultural AI applications across 14–16 languages and emphasizing value‑driven design.
- Public‑good funding (Gates Foundation; the UK, German, Canadian, and Japanese governments; IDRC; Microsoft) is essential because market incentives favor high‑resource languages; coordinated investments can close the data and compute gaps.
- The African Compute Initiative (public‑sector GPU cluster at the University of Cape Town) is intended to reduce cost barriers for African researchers, enabling local model training and testing.
- Lingua Africa is a newly announced, multi‑partner open call that will fund community‑governed language infrastructure, targeting real‑world domains (health, education, agriculture, public services).
- Trustworthy, inclusive AI requires deliberate choices across the pipeline: data collection, model fine‑tuning, testing with native speakers, and deployment in culturally relevant contexts.
- Gender‑responsive projects (e.g., Project ECHO) are critical for addressing inequities and ensuring that AI benefits women’s economic empowerment and health outcomes.
- State‑led initiatives complement private‑sector innovation; both are needed to create an ecosystem where AI can be a genuine catalyst for the Sustainable Development Goals across Africa and Asia.