LoRA-Based Fine-Tuning and Deployment of a Domain-Specific Legal Question Answering Model
Authors:
Boni Poojitha, Kadali Niharika Chaparla Pavani, Chitikela Prasanna Kumari
Page No: 14-33
Abstract:
Access to legal information remains a significant societal challenge due to the complexity of statutory language and the limited availability of affordable expert guidance. This paper presents a domain-specific Legal Question Answering (QA) system for Indian constitutional and statutory law, built by fine-tuning a 3B-parameter large language model using Low-Rank Adaptation (LoRA) and augmenting it with a Retrieval-Augmented Generation (RAG) pipeline. LoRA reduces the number of trainable parameters to 0.28% of the base model while preserving performance, enabling efficient training on a single consumer GPU. The system is trained on a semi-automatically curated dataset of 12,593 question–answer pairs derived from the Constitution of India and the Indian Penal Code, with explicit leakage control and duplication analysis. A FAISS-based retrieval module grounded in sentence-transformer embeddings injects relevant legal context at inference time. Controlled ablation experiments demonstrate that RAG reduces the hallucination rate from 8.3% to 3.1% and improves ROUGE-L by 0.039. The proposed system achieves BLEU-4 = 0.7028, ROUGE-L = 0.8512, Exact Match = 0.6745, and BERTScore-F1 = 0.8841, outperforming a fully fine-tuned baseline on semantic and recall-oriented metrics. Bootstrap-based confidence interval analysis confirms the statistical robustness of these results. Qualitative evaluation identifies residual failure modes in cross-provision reasoning and omission of judicial precedents. The system constitutes a research-grade prototype for Indian legal QA; safe deployment would require expert validation, expanded corpus coverage, and continuous updates to reflect evolving law.
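The abstract refers to bootstrap-based confidence intervals without stating the exact procedure. As a rough illustration only, a percentile bootstrap over per-example metric scores could look like the sketch below. The function name `bootstrap_ci`, the number of resamples, the 95% level, and the use of the mean as the aggregate are all assumptions made for this example, not details taken from the paper.

```python
import random
from statistics import mean


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-example scores.

    Hypothetical helper: `scores` holds one metric value per test
    question (for example 0/1 exact-match flags). Returns a tuple
    (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(scores)
    # Resample the test set with replacement and record each resample's mean.
    resampled_means = sorted(
        mean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    # Read the interval bounds off the sorted bootstrap distribution.
    lower = resampled_means[int((alpha / 2) * n_resamples)]
    upper = resampled_means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(scores), lower, upper
```

Called on a list of per-question scores, the function returns the observed mean together with the lower and upper bounds of the interval; a narrow interval indicates that the reported metric is not sensitive to which test questions happened to be sampled.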
Volume & Issue
Volume-15, Issue-4