ARTICLE
23 June 2025

Legal Reasoning Still A Struggle For LLMs

Foley & Lardner

Contributor

Foley & Lardner LLP looks beyond the law to focus on the constantly evolving demands facing our clients and their industries. With over 1,100 lawyers in 24 offices across the United States, Mexico, Europe and Asia, Foley approaches client service by first understanding our clients’ priorities, objectives and challenges. We work hard to understand our clients’ issues and forge long-term relationships with them to help achieve successful outcomes and solve their legal issues through practical business advice and cutting-edge legal insight. Our clients view us as trusted business advisors because we understand that great legal service is only valuable if it is relevant, practical and beneficial to their businesses.

The authors of this paper created a benchmark comprising long-form, open-ended questions and multiple-choice questions to evaluate the legal-reasoning performance of a number of different LLMs. Legal reasoning requires applying deductive and inductive logic to complex scenarios, often with undefined parameters. Their results show that these models still "struggle with open questions that require structured, multi-step legal reasoning."

Legal reasoning is a critical frontier for large language models (LLMs) specifically and artificial intelligence (AI) at large, requiring specialized domain knowledge and advanced reasoning abilities such as precedent interpretation, statutory analysis, and legal inference. Despite progress in general reasoning, legal reasoning remains difficult and under-assessed in NLP research. Moreover, the legal domain is inherently high-stakes, and a failure to thoroughly examine the capabilities and limitations of models could lead to serious real-world consequences ...

Our analysis reveals substantial variability and limitations in LLM capabilities in addressing MCQs, and especially complex open questions; notably, increasing the number of MCQ options consistently reduces model accuracy. Our evaluation framework offers a scalable approach for assessing legal reasoning quality beyond simple accuracy metrics, thereby facilitating future research aimed at enhancing the reliability and robustness of LLMs on challenging legal tasks.


The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
