AI Commons for the Global South: Data, Models and Compute for Half of Humanity

Detailed Summary

  • Rakesh introduced the theme “AI infrastructure as civic infrastructure”, stressing that access to compute, data and models will decide who can build, deploy and govern AI in the next decade.
  • He positioned the panel as a “full‑stack” view: government, academia, industry, open‑source community and grassroots activists.

2. Connectivity & Community‑Centric Infrastructure

Speaker: Jayesh Ranjan (Government of Telangana)

Sub‑topic | Key Points
Bharat‑Net & household‑level broadband | Telangana leveraged pre‑existing water‑pipeline trenches to lay fiber to every home (a “saturation” model). The savings came from avoiding new trenching, which traditionally accounts for ~50% of Bharat‑Net budgets.
Unexpected low adoption | After 100+ villages were connected, uptake of smartphones/tablets was minimal despite high per‑capita income in Telangana.
Root cause – mindset, not money | Rural residents often lacked digital literacy, English proficiency, or confidence in using gadgets. The barrier was cultural/psychological, not affordability.
Two‑pronged response | (1) Digital kiosks / MeeSeva centres: expanded to ~1,000 centres, training local women as entrepreneurs to act as “digital guides”. (2) Demonstrable use‑cases: show technology solving a concrete problem (e.g., pest prediction for farmers). When a clear benefit was demonstrated, adoption surged.
Lesson | “Infrastructure alone is insufficient; it must be coupled with capacity‑building and problem‑oriented pilots.”

3. Universities, Research Institutes & the Compute Gap

Speaker: PJ Narayanan (IIIT‑Hyderabad)

Sub‑topic | Key Points
Shift from “classical AI” to foundation models | Traditional AI research (rule‑based, small‑scale) is being eclipsed by large‑scale models that require massive compute and data.
Compute & data as “fuel & fire” | Data is the “oil”; compute is the “flame” that turns data into working models. Universities generally lack both.
Academic compute cloud proposal | India‑AI and other consortia are calling for a shared academic compute cloud to give researchers access to GPUs/TPUs at low/no cost.
Data as infrastructure | Historically, data was not treated as a public resource. The speaker argued for an Open‑Data movement (mirroring the Open‑Source movement) to make civic datasets – health, transport, agriculture – freely available for research and startups.
Open‑weight models & licensing | While companies like Meta release open‑weight models, usage restrictions still exist. Full openness would level the playing field for Indian academia and startups.
India’s strategic advantage | With 1.4 bn people, India can become a data‑rich hub if policies enable government‑generated civic data to be shared under open licences.

4. Industry Perspective – Meta’s Vision for the Global South

Speaker: Prachi Bhatia (Meta)

Sub‑topic | Key Points
Meta’s “personal super‑intelligence” | Goal: “AI glasses” that act as a personal agent, perceiving what the user sees/hears and responding in real time.
Early impact stories | Assistive technology for the blind: the “Be My Eyes” app paired with glasses enabled a blind attendee to navigate a venue. Multilingual support: glasses launched in India support Hindi, Kannada and Tamil (in addition to English).
Payments & commerce | Piloting UPI payments directly through the glasses, showcasing a “hands‑free” financial interaction.
Open‑source data contributions | Meta contributed 12 billion tokens (≈4 M parallel sentence pairs) covering 10 Indian languages to the government’s open‑source library (AI Kosh).
Omni‑lingual ASR model | Supports >1 500 languages (including 500 low‑resource languages). A single audio sample can be used to adapt the model to a new language, dramatically lowering data‑collection costs.
Call for public‑private‑academic partnership | Emphasised that policy, open data, and shared compute are essential to realise inclusive AI at scale.

5. Open‑Source as the Foundation of Digital Infrastructure

Speaker: Amanda Brock (OpenUK)

Sub‑topic | Key Points
“Digital pizza” metaphor | The “toppings” (AI, ML, cloud) sit on a base of open‑source – the plumbing that must be reliable for any ecosystem to work.
Soft‑sovereignty | Open source enables localisation (language, culture, security) without needing to build everything from scratch. This aligns with many countries’ desire for digital sovereignty.
Business drivers of open‑source | Companies (e.g., Microsoft) now contribute because cloud platforms depend on open‑source; the shift is pragmatic, not purely ideological.
Scale & accessibility | 25 million Indians are on GitHub, enabling rapid collaboration and reducing legal/financial barriers (standard licences, no contract negotiations).
Future moment – “open‑source AI at scale” | Anticipates a turning point when open‑source AI models become widely adopted in the Global South, mirroring the earlier open‑source software revolution.
Caveats | Emphasised the need for language diversity (e.g., Scots dialect) and warned that open‑source alone is not a silver bullet – governance and community building remain crucial.

6. Grassroots AI Inclusion & Data Literacy

Speaker: Osama Manzar (Digital Empowerment Foundation)

Sub‑topic | Key Points
Critical view of “inclusion” | Inclusion should not become exclusion; merely increasing user numbers can turn people into passive data sources.
Data‑ownership & consent | AI systems learn from users’ behaviour; without transparent data‑literacy programmes, citizens cannot consent meaningfully.
Human‑in‑the‑loop & trust | A trust loop (human oversight, clear governance, rights education) is required before large‑scale data collection.
Policy recommendation | Enact regulations that require prior consent and ethical review before civic data is fed into AI pipelines.
Community‑driven learning | Peer‑learning models (e.g., neighbours teaching each other) are more effective than top‑down digital‑literacy curricula.
Data exchange concerns | Even well‑intentioned data‑exchanges (e.g., ADEX) must be communicated in plain language so contributors understand how their data will be used.

7. Audience‑Driven Discussion & Concrete Initiatives

Speaker / Contributor | Highlights
Rakesh Dubbudu (moderator) | Described the ADEX (Agriculture Data Exchange) – a live data‑exchange platform (since 2021) with 19 datasets, now expanded to TDEX (Telangana Data Exchange) covering >30 sectors (mobility, weather, health, tax). Emphasised a data‑management framework specifying contributors, users, responsibilities, and oversight.
Startup‑policy link | Telangana’s 2014 Innovation & Startup Policy pledged the government as the “first customer” for startups, helping them overcome the “first‑customer” barrier.
Prof. PJ Narayanan (IIIT‑Hyderabad) | Reinforced the need for localised language models and highlighted the Bhasha speech‑to‑speech API now publicly available.
Policy recommendations (multiple speakers) | Outcome‑based regulation (focus on harms rather than tech‑specific rules); privacy‑enhancing technologies to secure data sharing; multilingual accessibility as a core requirement for any AI service.
Open‑source licensing concerns | Amanda warned that licensing complexity can hinder adoption and called for simple, permissive licences for civic AI tools.

8. Closing Remarks & Hackathon Prizes

  • Rakesh announced the “AI for All” hackathon (partnered with AI Kosh) that focused on making public datasets AI‑ready and multilingual.
  • Prize winners:
    1️⃣ Intelligent Document Processing – CDAX Splash
    2️⃣ Biznova – CodePlus
    3️⃣ FAP NextGen – AI4Health
  • The session wrapped with thanks to the panelists, audience, and organizers.

Key Takeaways

  • Infrastructure ≠ Adoption – Broadband and compute must be paired with digital‑literacy, visible problem‑solving pilots, and locally‑run support centres (Jayesh Ranjan).
  • Compute & Data are Public Goods – Universities need shared academic compute clouds and open‑data ecosystems to stay competitive (PJ Narayanan).
  • Multilingual, Human‑Centred AI – Meta’s AI glasses and omni‑lingual ASR illustrate how language inclusivity can drive real‑world impact in the Global South (Prachi Bhatia).
  • Open‑Source is the “plumbing” of AI – Without a robust, permissive open‑source base, scaling AI solutions becomes costly and fragmented; open‑source also enables soft sovereignty (Amanda Brock).
  • Data‑Literacy & Ethical Governance – Grassroots inclusion must guarantee informed consent, human‑in‑the‑loop oversight, and transparent data‑exchange policies (Osama Manzar).
  • Policy Should Be Outcome‑Focused – Regulations that target outcomes and harms, rather than prescribing specific technologies, foster innovation while protecting citizens (multiple speakers).
  • Public‑Private‑Academic Partnerships – Successful pilots (ADEX/TDEX, startup‑first‑customer policy) show that collaborative frameworks accelerate the AI commons.
  • Hackathon Momentum – Community‑driven challenges produce concrete tools (e.g., intelligent document processing) that move the commons from theory to practice.

These insights collectively chart a roadmap for building an AI Commons that is open, multilingual, locally relevant, and governed by transparent, outcome‑oriented policy, so that half of humanity can meaningfully participate in the AI era.
