Leveraging Artificial Intelligence in Public Audit for Greater Transparency and Accountability

Abstract

The session opened with a technical presentation that detailed the Comptroller and Auditor General of India's (CAG) AI strategy, its four‑pillar framework, ongoing AI‑enabled audit projects, and the supporting infrastructure—including cyber‑security audit pilots, big‑data analytics, OCR/NLP pipelines, and a sovereign large‑language model (LLM) under development. The presentation highlighted extensive capacity‑building programmes aimed at up‑skilling a 45 000‑strong workforce. The session then transitioned into a panel discussion moderated by Ms. Priyanka Sharma, in which senior officials and AI experts examined data readiness, governance safeguards, and the practicalities of scaling AI across public‑audit functions. The dialogue underscored the need for robust data lakes, reuse of existing government AI components, and multi‑stakeholder collaboration to ensure responsible, transparent, and accountable AI adoption in public finance oversight.

Detailed Summary

1. Opening Remarks

  • The moderator thanked the audience and emphasized that the session would build on earlier deliberations about AI’s opportunities and responsibilities in public audit.
  • The agenda was described as a “technical presentation” that would precede a panel discussion, functioning as a “precursor” to the broader conversation.

2. Technical Presentation – Institutional Perspective

2.1. Presenter: Shri K. Surjith (Director, Office of the CAG)

  • Outlined the institutional context for AI‑enabled public assurance, stressing the need for a clear vision and an actionable roadmap.

2.2. Presenter: Prof. Madhusudhan (IIT Madras)

  • Discussed the enabling infrastructure required for sovereign LLM initiatives, highlighting the role of academia and research partnerships in building secure, context‑specific AI models for audit.

3. AI Strategy Framework (April 2023)

  • Four foundational pillars:

    1. Embedding AI/ML into audit processes and broader business operations.
    2. Auditing AI systems deployed by government departments (i.e., AI‑as‑auditee).
    3. Capacity‑building for staff to operate and evaluate AI tools.
    4. Research & Development in AI/ML to sustain innovation.
  • Noted that most Supreme Audit Institutions (SAIs) worldwide lack a formal AI strategy; the CAG’s framework positions India as a pioneer among peers.

4. Evolution of Information‑Systems Audit

  • Historical backdrop: Since 2000, the CAG has audited over 500 IT applications (e.g., income‑tax systems, GST, IRCTC).
  • Current focus: Auditing GeM (Government e‑Marketplace) and expanding to cyber‑security audits for two flagship applications, recognizing the rising threat landscape.

5. Cyber‑Security Audit Pilot

  • Rationale: Traditional IT audits confirm functional correctness but do not assess cyber‑readiness.
  • Pilot scope: Two major applications will undergo a dedicated cyber‑security audit, evaluating threat‑modeling, vulnerability management, and resilience controls.

6. AI‑Enabled Audit Tools & Methodologies

  • OCR & NLP pipelines: Automate extraction of data from PDFs, vouchers, and scanned documents, reducing manual transcription.
  • Sovereign LLM (under development): A large‑language model tailored for audit‑specific tasks such as policy interpretation, anomaly detection, and drafting of audit reports.
  • Machine‑learning‑driven procurement analytics: Uses association‑rule (Apriori) algorithms and graph analysis to examine the entire procurement dataset, selecting audit units on risk‑derived metrics rather than arbitrary thresholds.
  • Computer vision & high‑resolution imagery: Analyses satellite/drone imagery to verify on‑ground construction progress for schemes like “7‑Street databases.”
  • Data lake architecture: Centralised repository for structured and unstructured data (text, images, video, GIS) enabling cross‑domain analytics.
  • Center of Excellence (CoE) – Financial Autodesk & Healing Law: Provides specialised support for staff to develop AI‑driven audit solutions and standardise best practices.
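The risk‑based selection of audit units described above can be sketched in a few lines. This is a minimal illustration, not the CAG's actual methodology: the risk indicators, weights, and unit data below are all hypothetical assumptions.

```python
# Illustrative sketch: risk-scored selection of procurement audit units.
# Field names and weights are hypothetical assumptions, not CAG metrics.

def risk_score(unit):
    """Combine simple procurement risk indicators into one score."""
    return (
        0.5 * unit["single_bid_ratio"]            # tenders with one bidder
        + 0.3 * unit["emergency_purchase_ratio"]  # purchases bypassing open tender
        + 0.2 * unit["vendor_concentration"]      # spend share of the top vendor
    )

def select_audit_units(units, top_k=2):
    """Rank every unit by risk score and pick the top k,
    instead of applying an arbitrary monetary threshold."""
    return sorted(units, key=risk_score, reverse=True)[:top_k]

units = [
    {"name": "Unit A", "single_bid_ratio": 0.10,
     "emergency_purchase_ratio": 0.05, "vendor_concentration": 0.20},
    {"name": "Unit B", "single_bid_ratio": 0.60,
     "emergency_purchase_ratio": 0.40, "vendor_concentration": 0.70},
    {"name": "Unit C", "single_bid_ratio": 0.30,
     "emergency_purchase_ratio": 0.20, "vendor_concentration": 0.50},
]

selected = select_audit_units(units)
print([u["name"] for u in selected])  # → ['Unit B', 'Unit C']
```

The point of the sketch is the design choice the presentation emphasised: the whole dataset is scored and ranked, so selection is driven by derived risk rather than a fixed cut‑off.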

7. Capacity‑Building Initiatives

  • Workforce size: 45 000 employees across 140+ offices.
  • Training goals:
    • Current cohort: ~700–800 officers enrolled in data‑science, ML, AI, and cyber‑security courses.
    • Target: Upskill 5 000 officers in the next year; eventually reach 90 % of audit staff proficient in AI‑augmented methods within three years.
  • Emphasised that coding per se is less critical; the focus is on problem framing, data understanding, and metric definition.

8. Governance, Ethics & Responsible AI

  • Stressed that audit of AI systems (pillar 2) is essential to ensure that governmental AI applications themselves comply with ethical, security, and fairness standards.
  • Highlighted the need for multi‑stakeholder collaboration (academia, industry, other government agencies) to co‑create AI governance frameworks.

9. Transition to Panel Discussion

  • The presenter thanked the audience and introduced the upcoming panel, stating that the discussion would explore data readiness, governance safeguards, infrastructure requirements, skill transformation, and responsible scaling of AI in public audit.

Panel Discussion – “Use of AI in Public Audit”

Moderator: Ms. Priyanka Sharma (KPMG)

Panelists:

  • Dr. Sanjeev Kumar (Wadhwani Foundation) – AI technologist & entrepreneur
  • Shri Naveen Singhvi (CAG) – Principal Director, Commercial
  • Prof. Agam Gupta (IIT Delhi) – Researcher on technology and societal impact
  • Mr. Srinath Chakravarthy (National Institute of Smart Governance) – Senior Vice‑President

10. Introduction of Panelists (Moderator)

  • Brief bios were given, underscoring each panelist’s expertise in AI, audit, and public‑sector governance.

11. Key Question – Data Quality & Completeness

Moderator to Dr. Sanjeev Kumar:

“Given the CAG’s extensive interactions with multiple ministries, how prepared is the institution in terms of data quality, completeness, and overall data‑readiness for AI‑driven audit?”

11.1. Dr. Kumar’s Response – The Big‑Data Challenge

  • Volume, Velocity, Variety: CAG faces one of the world’s most complex big‑data problems.
    • Structured data (e.g., transaction tables) is only a fraction; the bulk consists of unstructured sources—PDFs, images, vouchers, speech recordings, GIS layers, and possibly video.
  • Language & Regional Diversity: Multilingual documents across states add a layer of difficulty for OCR/NLP pipelines.
  • Data Lake Strategy:
    • Proposes building a robust government‑wide data lake that ingests raw data (both structured and unstructured) and later curates it into structured “data marts” for specific audit use‑cases.
    • Emphasises the need to “glean” structured information from unstructured sources using pre‑trained models and custom extraction rules.
  • Reuse of Existing Components:
    • Points to GSTN and Centers of Excellence that already host mature AI components (e.g., tax‑invoice parsing, fraud detection) which can be repurposed for audit.
    • Suggests a component‑based architecture to avoid reinventing the wheel, allowing quicker integration.
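The "gleaning" step Dr. Kumar describes — pulling structured fields out of unstructured documents for a curated data mart — can be illustrated with a minimal rule‑based extractor. A production pipeline would sit on OCR output and pre‑trained models; the field names and patterns below are assumptions for illustration only.

```python
import re

# Minimal sketch: curate unstructured (e.g., OCR'd) voucher text into a
# structured "data mart" row. Field patterns are hypothetical examples.
VOUCHER_FIELDS = {
    "voucher_no": re.compile(r"Voucher\s*No\.?\s*[:#]?\s*(\S+)", re.I),
    "amount": re.compile(
        r"Amount\s*[:#]?\s*(?:Rs\.?|INR)?\s*([\d,]+(?:\.\d+)?)", re.I),
    "date": re.compile(r"Date\s*[:#]?\s*([\d/-]+)", re.I),
}

def glean(raw_text):
    """Extract structured fields from raw voucher text; None if absent."""
    row = {}
    for field, pattern in VOUCHER_FIELDS.items():
        match = pattern.search(raw_text)
        row[field] = match.group(1) if match else None
    return row

sample = "Voucher No: PV-2023/114  Date: 12-04-2023  Amount: Rs. 1,25,000.00"
print(glean(sample))
```

In the data‑lake architecture discussed above, extractors like this would run over raw ingested documents, with the resulting rows landing in use‑case‑specific data marts for audit analytics.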

11.2. Panel Consensus

  • Agreement that the data‑lake approach is viable, but stresses the necessity of strong data governance (metadata standards, lineage, access controls).
  • Acknowledgement of Gaps: Current data ingestion pipelines lack uniformity; a coordinated effort is needed to standardise formats across ministries.

12. Follow‑Up Themes (Briefly Covered)

  • Ethical Safeguards: Need for transparent audit logs of AI decisions; explainability must be built into LLM‑driven analyses.
  • Skill Transformation: Continuous up‑skilling of auditors to understand AI outputs; “human‑in‑the‑loop” remains crucial.
  • Collaboration Call: The panel reiterated the earlier invitation to academia, industry, and other government bodies to co‑develop tools, share datasets, and co‑fund research.
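The panel's call for transparent audit logs of AI decisions, with a human in the loop, could take a shape like the following. The record structure is a hypothetical illustration: it logs the model's output and explanation alongside the auditor's final call, so the decision trail stays reviewable and overrules are visible.

```python
import json
from datetime import datetime, timezone

# Hypothetical sketch: a human-in-the-loop log entry for an AI-assisted
# audit decision. All field names are illustrative assumptions.

def log_ai_decision(model_name, inputs_ref, model_output, explanation,
                    reviewer, final_decision):
    """Serialise one AI-assisted decision as a reviewable log record."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "inputs_ref": inputs_ref,          # pointer to the data examined
        "model_output": model_output,      # what the model flagged
        "explanation": explanation,        # why, in human-reviewable terms
        "reviewer": reviewer,              # the human in the loop
        "final_decision": final_decision,  # the auditor may overrule
        "overruled": model_output != final_decision,
    }
    return json.dumps(entry)

record = log_ai_decision(
    model_name="procurement-anomaly-v1",
    inputs_ref="tender_batch_2023_q4",
    model_output="flag",
    explanation="single bidder on 8 of 10 tenders",
    reviewer="auditor_042",
    final_decision="no_action",
)
print(record)
```

Keeping the explanation and the overrule flag in every record is what makes LLM‑driven analyses auditable after the fact, which is the explainability requirement the panel raised.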

13. Closing Remarks

  • Moderator thanked the panelists and the audience, noting that the technical presentation laid a concrete roadmap, while the panel highlighted the pragmatic challenges of data readiness and governance.
  • An invitation was extended for ongoing dialogue and joint pilots to accelerate AI adoption in public audit.

Key Takeaways

  • Four‑Pillar AI Strategy: The CAG’s framework (embed AI, audit AI, capacity building, R&D) provides a systematic roadmap for AI integration in public audit.
  • Cyber‑Security Audits: Piloting cyber‑security audits for major applications marks a shift from pure functional assurance to risk‑based, resilience‑focused oversight.
  • AI‑Enabled Tools: OCR/NLP pipelines, sovereign LLMs, machine‑learning‑driven procurement analytics, and computer‑vision‑based field verification are already in use or under development.
  • Data Lake Imperative: Centralising structured and unstructured data is essential to unlock AI potential; existing government AI components (e.g., GSTN) can be repurposed.
  • Capacity Building at Scale: Targeting 5 000 officers initially, with a long‑term goal of up‑skilling 90 % of audit staff, signals a serious commitment to human‑capital transformation.
  • Responsibility & Governance: Auditing AI systems themselves is a cornerstone of responsible AI adoption, ensuring ethical, secure, and transparent outcomes.
  • Multi‑Stakeholder Collaboration: Success hinges on partnerships with academia, industry, and inter‑departmental bodies; the CAG has explicitly solicited such cooperation.
  • Challenges Remain: Data heterogeneity, multilingual content, and the sheer volume of information present a formidable big‑data problem that requires robust data‑governance and reuse of proven AI components.
  • Future Outlook: Within three years, the CAG aims to have a sovereign LLM built largely by its own officers, a mature data‑lake ecosystem, and a workforce proficient in AI‑augmented audit techniques.

Prepared from the verbatim transcript of the “Leveraging Artificial Intelligence in Public Audit for greater Transparency and Accountability” session at the AI Conference, Delhi (2026).
