ARTICLE
28 April 2026

Putting AI To The Test: How 5 Tools Performed Across 3 Financial Models

Ankura Consulting Group LLC

Contributor

Ankura Consulting Group, LLC is an independent global expert services and advisory firm that delivers services and end-to-end solutions to help clients at critical inflection points related to conflict, crisis, performance, risk, strategy, and transformation. Ankura has more than 2,000 professionals serving 3,000+ clients across 55 countries. Collaborative lateral thinking, hard-earned experience, and multidisciplinary capabilities drive results, and Ankura is unrivalled in its ability to assist clients to Protect, Create, and Recover Value™. For more information, please visit ankura.com.
United States Technology

Every finance team is being asked the same question: Should we be using AI to build our models? The Ankura OCFO® team ran identical prompts through five tools: Claude for Excel, M365 Copilot (Agent Mode), ChatGPT in Excel, Shortcut (Free Tier – Open Source Model), and Shortcut (Pro Tier). Each tool was tested against the same fictional company (Fieldstone Retail Holdings, approximately $180 million in revenue, three entities).

The three test scenarios (a Long-Range Plan, a 13-Week Cash Flow Forecast, and a Customer Profitability Analysis) were selected because they mirror common financial modeling exercises in PE-backed finance and operations environments. AI was used to generate the fictional company data and a prompt of more than 500 words for each scenario.

The goal was to establish a baseline comparison of AI financial modeling tools available on the market. What did we find? All five tools produced structurally complete models in 7 to 10 minutes, but none produced a production-ready model. Every tool had fundamental formula mechanics wrong or an incomplete, inconsistent structure. The value today is in scaffolding and structure; the formulas still require meaningful human rework. Critically, each prompt was delivered as a single pass with no iteration or follow-up, and identical prompts were copied across all five tools to isolate baseline capability. If a tool asked a follow-up question, the default was to accept its recommendation. With iterative prompting that uses currently available features, results would likely improve.

Read more in our detailed report. 


The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.

