प्रragya Lab

While Responsible AI has become the dominant paradigm in discussions of ethical AI development, our research team has introduced the concept of CIVILIZING AI. This framework seeks to move beyond aspirational principles toward quantifiable, actionable standards for AI alignment and safety. At its core, CIVILIZING AI proposes four key metrics: (i) the AI Detectability Index (ADI) to assess how transparently and eloquently AI communicates its artificial nature; (ii) the Hallucination Vulnerability Index (HVI) to measure susceptibility to factual errors and ungrounded responses; (iii) the Adversarial Attack Vulnerability Index (AAVI) to evaluate resilience against prompt injection and other adversarial exploits; and (iv) the Carbon Emission Index (CarbonAI) to quantify the environmental impact of AI inference and training. Together, these metrics provide a balanced view of AI systems, ensuring usability, trust, and environmental responsibility while safeguarding against vulnerabilities. Our vision charts a path toward three generations of CIVILIZED AI systems, each progressively advancing in transparency, robustness, and policy compliance, and ultimately supporting AI that aligns with constitutional, cultural, and societal principles.

प्रragya is a vision to civilize future-generation machines, guiding them beyond raw computational capability toward wisdom, responsibility, and ethical discernment. At the heart of this vision lies our pioneering concept of Neural DNA (nDNA), which offers a profound lens through which to interpret the life cycle of foundation models—not as static artifacts of training, but as evolving semantic organisms. Through the nDNA framework, we illuminate how these models inherit, mutate, and transmit latent beliefs, cultural priors, and alignment traits across their developmental stages. This perspective transforms the study of AI from engineering isolated systems to cultivating living semantic entities that can harmonize with human values, societal norms, and constitutional principles—hallmarks of the CIVILIZING AI philosophy championed by प्रragya.

Media Coverage

	Honestly, I love when AI hallucinates - Josh Tyrangiel
	This USC Professor from Kolkata is on a Journey to Civilise AI
	Inside Amitava Das’s quest to ‘civilise’ Artificial Intelligence
	AI hallucinations can’t be stopped — but these techniques can limit their damage

Highlights

EMNLP 2023 Outstanding Paper Award

-What If Cross-Cultural LLMs Married?

The Latent Geometry of Inherited Culture in Their Neural Offspring

Latest News

Jun 01, 2025	3 papers accepted in ACL: 2 Findings, 1 in SRW

Selected publications

ACL 25

SEPSIS: I Can Catch Your Lies – A New Paradigm for Deception Detection

Anku Rani, Dwip Dalal, Shreya Gautam, and 5 more authors

2023

Abs

Deception is the intentional practice of twisting information. This research explores three forms: lies of omission, lies of commission, and lies of influence, focusing primarily on lies of omission. We introduce a novel NLP framework supported by an annotated dataset of 876,784 samples from merged fake-news sources and Times of India headlines. Each sample is labeled for omission type (e.g., speculation, bias), “color” of lie (black/white/gray/red), intent (influence, prestige), and topic (political, educational, religious, etc.). Our multi-task learning pipeline, which merges fine-tuned language models, attains an F1 score of 0.87 across all annotation layers. We also examine correlations between omission-based lies and propaganda techniques—e.g., loaded language often signals opinion—and will release both model and dataset under an open license.
ACL 25

DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

Amitava Das, Sameer Trivedy, Deepansh Khanna, and 8 more authors

In Proceedings of ACL 2025, 2025

A*

Abs PDF

We present DPO Kernels, combining kernel density estimation with divergence-based objectives for direct preference optimization in large language models. By incorporating semantics via custom kernel functions, DPO Kernels significantly improve the alignment between model outputs and human preferences. Experiments on public preference datasets show gains of 3-5% over existing DPO approaches, demonstrating improved generation quality and user satisfaction.
ACL 25

YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment

Amitava Das, Yogesh Narsupalli, Gautam Singh, and 5 more authors

In Proceedings of ACL 2025, 2025

A*

Abs PDF

YINYANG-ALIGN is a novel benchmark and training framework for text-to-image alignment that concurrently optimizes for coherence and diversity, conceptually contradictory objectives. We propose multi-objective DPO training that balances these via Pareto optimality. Evaluated on standard T2I datasets, YINYANG-ALIGN achieves higher user preference and diversity metrics without sacrificing fidelity.
COLING 25

Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting

T. Wijesiriwardene, R. Wickramarachchi, S. R. Vennam, and 5 more authors

In Proceedings of COLING 2025, 2025

B

Abs PDF

This work investigates LLMs’ capability on analogical reasoning (e.g., A:B :: C:? queries) with knowledge-enhanced prompts. We design knowledge retrieval modules that inject relevant facts into prompts, enhancing model performance by 12% on proportional analogy datasets. Analyses reveal improvements when symbolic knowledge complements contextual embeddings.
EMNLP 24

Counter Turing Test (CT2): Investigating AI-Generated Text Detection for Hindi

Isha Kavathekar, Anku Rani, Ananya Chamoli, and 3 more authors

In Findings of EMNLP 2024, 2024

Abs PDF

We introduce the Hindi AI Detectability Index (ADIhi), a benchmark suite and evaluation metric for detecting AI-generated Hindi text. We compile a corpus of 20K human-written and LLM-generated texts and evaluate detectors including token-level, syntactic, and stylometric features. Results indicate existing detectors struggle on code-mixed content and slang; using ADIhi, we rank LLMs and propose improvements tailored for Hindi.
EMNLP 23

FACTIFY3M: A benchmark for multimodal fact verification with explainability through 5W Question-Answering

Vipula Rawte and Amitava Das

In Proceedings of EMNLP 2023, 2023

Oral, A*

Abs

We present FACTIFY3M, a multimodal factual verification benchmark that combines images, text, and explanations grounded in 5W QA. The dataset includes 50K multimodal claims with annotated 5WQA pairs and supporting sources. Baseline models using joint vision-language transformers achieve 74.5% accuracy, but leave room for improvement; the 5W explanations improve interpretability and diagnostic evaluation.
EMNLP 23

The Troubling Emergence of Hallucination in Large Language Models - An Extensive Definition, Quantification, and Prescriptive Remediations

Vipula Rawte, S. Chakraborty, A. Pathak, and 5 more authors

In Proceedings of EMNLP 2023, 2023

Oral, A*

Abs DOI PDF

This paper introduces a formal definition and taxonomy for LLM hallucinations, proposes automatic metrics for hallucination quantification (e.g., hallucination density, factual error scores), and presents remediations such as knowledge grounding, self-critique modules, and multi-step prompting. In experiments across GPT-3, BERT, and open-source LLMs, these countermeasures reduce hallucination by up to 35%.
EMNLP 23

Counter Turing Test (CT2): AI-Generated Text Detection is Not as Easy as You May Think

Shreya Chakraborty, Aman Chadha, Aishwarya Sheth, and 1 more author

In EMNLP 2023, 2023

A*

Abs PDF

This paper demonstrates that standard token- and syntax-based detectors poorly generalize to AI-generated text in Hindi and code-mixed domains. We propose CT2 detection models that leverage stylometric and structural features, improving cross-domain robustness by 12%. We release CT2 dataset and models for future research.