AI Commons for the Global South: Data, Models and Compute for Half of Humanity

Detailed Summary

1. Opening Remarks

Moderator Rakesh Dubbudu introduced the theme “AI infrastructure as civic infrastructure”, stressing that access to compute, data and models will decide who can build, deploy and govern AI over the next decade.
He positioned the panel as a “full‑stack” view spanning government, academia, industry, the open‑source community and grassroots activists.

2. Connectivity & Community‑Centric Infrastructure

Speaker: Jayesh Ranjan (Government of Telangana)

Sub‑topic	Key Points
BharatNet & household‑level broadband	Telangana laid fiber to every home (a “saturation” model) through trenches already dug for water pipelines. Re‑using these trenches cut costs because trenching traditionally accounts for ~50% of BharatNet budgets.
Unexpected low adoption	After 100+ villages were connected, uptake of smartphones and tablets remained minimal despite Telangana's high per‑capita income.
Root cause – mindset, not money	Rural residents often lacked digital literacy, English proficiency, or confidence in using gadgets. The barrier was cultural/psychological, not affordability.
Two‑pronged response	1. Digital kiosks / MeeSeva centres – expanded to ~1,000 centres, with local women trained as entrepreneurs to act as “digital guides”. 2. Demonstrable use‑cases – show technology solving a concrete problem (e.g., pest‑prediction for farmers). When a clear benefit was demonstrated, adoption surged.
Lesson	“Infrastructure alone is insufficient; it must be coupled with capacity‑building and problem‑oriented pilots.”

3. Universities, Research Institutes & the Compute Gap

Speaker: PJ Narayanan (IIIT‑Hyderabad)

Sub‑topic	Key Points
Shift from “classical AI” to foundation models	Traditional AI research (rule‑based, small‑scale) is being eclipsed by large‑scale models that require massive compute and data.
Compute & data as “fuel & fire”	Data is the “oil”; compute is the “flame” that turns data into working models. Universities generally lack both.
Academic compute cloud proposal	India‑AI and other consortia are calling for a shared academic compute cloud to give researchers access to GPUs/TPUs at low/no cost.
Data as infrastructure	Historically, data was not treated as a public resource. The speaker argued for an Open‑Data movement (mirroring the Open‑Source movement) to make civic datasets – health, transport, agriculture – freely available for research and startups.
Open‑weight models & licensing	While companies like Meta release open‑weight models, usage restrictions still exist. Full openness would level the playing field for Indian academia and startups.
India’s strategic advantage	With 1.4 bn people, India can become a data‑rich hub if policies enable government‑generated civic data to be shared under open licences.

4. Industry Perspective – Meta’s Vision for the Global South

Speaker: Prachi Bhatia (Meta)

Sub‑topic	Key Points
Meta’s “personal super‑intelligence”	Goal: “AI glasses” that act as a personal agent, perceiving what the user sees/hears and responding in real time.
Early impact stories	• Assistive technology for the blind – the “Be My Eyes” app paired with the glasses enabled a blind attendee to navigate a venue. • Multilingual support – the glasses launched in India support Hindi, Kannada and Tamil in addition to English.
Payments & commerce	Piloting UPI payments directly through the glasses, showcasing a “hands‑free” financial interaction.
Open‑source data contributions	Meta contributed 12 billion tokens (≈4 M parallel sentence pairs) covering 10 Indian languages to the government’s open‑source library (AI Kosh).
Omni‑lingual ASR model	Supports >1,500 languages (including 500 low‑resource languages). A single audio sample can be used to adapt the model to a new language, dramatically lowering data‑collection costs.
Call for public‑private‑academic partnership	Emphasised that policy, open data, and shared compute are essential to realise inclusive AI at scale.

5. Open‑Source as the Foundation of Digital Infrastructure

Speaker: Amanda Brock (OpenUK)

Sub‑topic	Key Points
“Digital pizza” metaphor	The “toppings” (AI, ML, cloud) sit on a base of open‑source – the plumbing that must be reliable for any ecosystem to work.
Soft‑sovereignty	Open source enables localisation (language, culture, security) without needing to build everything from scratch. This aligns with many countries’ desire for digital sovereignty.
Business drivers of open‑source	Companies (e.g., Microsoft) now contribute because cloud platforms depend on open‑source; the shift is pragmatic, not purely ideological.
Scale & accessibility	25 M Indians are on GitHub, enabling rapid collaboration and reducing legal/financial barriers (standard licences, no contract negotiations).
Future moment – “open‑source AI at scale”	Anticipates a turning point when open‑source AI models become widely adopted in the Global South, mirroring the earlier open‑source software revolution.
Caveats	Emphasised the need for language diversity (e.g., Scots dialect) and warned that open‑source alone is not a silver bullet – governance and community building remain crucial.

6. Grassroots AI Inclusion & Data Literacy

Speaker: Osama Manzar (Digital Empowerment Foundation)

Sub‑topic	Key Points
Critical view of “inclusion”	Inclusion should not become exclusion; merely increasing user numbers can turn people into passive data sources.
Data‑ownership & consent	AI systems learn from users’ behaviour; without transparent data‑literacy programmes, citizens cannot consent meaningfully.
Human‑in‑the‑loop & trust	A trust loop (human oversight, clear governance, rights education) is required before large‑scale data collection.
Policy recommendation	Enact regulations that require prior consent and ethical review before civic data is fed into AI pipelines.
Community‑driven learning	Peer‑learning models (e.g., neighbours teaching each other) are more effective than top‑down digital‑literacy curricula.
Data exchange concerns	Even well‑intentioned data exchanges (e.g., the Agriculture Data Exchange, ADEX) must be explained in plain language so contributors understand how their data will be used.

7. Audience‑Driven Discussion & Concrete Initiatives

Speaker / Topic	Highlights
Rakesh Dubbudu (moderator)	Described ADEX (Agriculture Data Exchange), a data‑exchange platform live since 2021 with 19 datasets, which has since expanded into TDEX (Telangana Data Exchange) covering >30 sectors (including mobility, weather, health and tax). Emphasised a data‑management framework that specifies contributors, users, responsibilities and oversight.
Startup‑policy link	Telangana’s 2014 Innovation & Startup Policy pledged the government as the “first customer” for startups, helping them overcome the “first‑customer” barrier.
Prof. PJ Narayanan (IIIT‑Hyderabad)	Reinforced the need for localised language models and highlighted the Bhasha speech‑to‑speech API now publicly available.
Policy recommendations (multiple speakers)	• Outcome‑based regulation (focus on harms rather than tech‑specific rules). • Privacy‑enhancing technologies to secure data sharing. • Multilingual accessibility as a core requirement for any AI service.
Open‑source licensing concerns	Amanda Brock warned that licensing complexity can hinder adoption and called for simple, permissive licences for civic AI tools.

8. Closing Remarks & Hackathon Prizes

Rakesh announced the “AI for All” hackathon (partnered with AI Kosh) that focused on making public datasets AI‑ready and multilingual.
Prize winners:
1️⃣ Intelligent Document Processing – CDAX Splash
2️⃣ Biznova – CodePlus
3️⃣ FAP NextGen – AI4Health
The session wrapped with thanks to the panelists, audience, and organizers.

Key Takeaways

Infrastructure ≠ Adoption – Broadband and compute must be paired with digital‑literacy, visible problem‑solving pilots, and locally‑run support centres (Jayesh Ranjan).
Compute & Data are Public Goods – Universities need shared academic compute clouds and open‑data ecosystems to stay competitive (PJ Narayanan).
Multilingual, Human‑Centred AI – Meta’s AI glasses and omni‑lingual ASR illustrate how language inclusivity can drive real‑world impact in the Global South (Prachi Bhatia).
Open‑Source is the “plumbing” of AI – Without a robust, permissive open‑source base, scaling AI solutions becomes costly and fragmented; open‑source also enables soft sovereignty (Amanda Brock).
Data‑Literacy & Ethical Governance – Grassroots inclusion must guarantee informed consent, human‑in‑the‑loop oversight, and transparent data‑exchange policies (Osama Manzar).
Policy Should Be Outcome‑Focused – Regulations that target outcomes and harms, rather than prescribing specific technologies, foster innovation while protecting citizens (multiple speakers).
Public‑Private‑Academic Partnerships – Successful pilots (ADEX/TDEX, startup‑first‑customer policy) show that collaborative frameworks accelerate the AI commons.
Hackathon Momentum – Community‑driven challenges produce concrete tools (e.g., intelligent document processing) that move the commons from theory to practice.

These insights collectively chart a roadmap for building an AI Commons that is open, multilingual, locally relevant, and governed by transparent, outcome‑oriented policy, so that half of humanity can meaningfully participate in the AI era.

See Also:

India AI Impact Summit 2026
