Framework · Tool Evaluation · Ethics

Four Factors Decide Whether an AI Tool Belongs in Your Firm. Most Reviews Cover Two.

Most “AI tool reviews” written for law firms cover features and cost. They skip the two factors that actually decide whether the tool can touch a client matter. Here is the framework I use, what it surfaces across sixty-four vendors, and where I want your pushback.

The conversation about AI tools is louder than it has ever been, and most of it is useless to a working lawyer. A typical “best AI for lawyers” article in 2026 covers two questions: what can the tool do, and what does it cost. Both questions matter. Neither is sufficient.

When I started evaluating AI tools for small Florida firms, the gap I kept hitting was the distance between a vendor's marketing site and what a Bar audit would actually require. The marketing site tells you the tool generates briefs. It does not tell you whether the tool's training defaults expose firm data, whether the underlying API is subject to a preservation order, whether the vendor sits in a jurisdiction the Florida Bar would treat as a confidentiality problem, or whether the published pricing prices out a 1-10 attorney firm before the first month closes.

I needed a structured way to answer those questions before recommending any tool to any firm. That work became the Four-Factor Framework, and the answers became the JDAI Tool Index that lives on this site.

Here is what the framework does, factor by factor, with the patterns it surfaces across sixty-four real vendors.

Factor 1: Data Confidentiality

The first factor is also the only one that can stop an evaluation cold. If the tool's data handling fails this factor for a given firm, nothing else matters. The other three factors do not get scored.

Factor 1 asks a specific question: under what terms does firm data flow into and out of this tool, and what is each entity in the data path doing with the information?

This is not “is the tool secure.” That framing is too narrow. Factor 1 covers training-on-customer-data defaults, retention windows, sub-processor relationships, DPA availability, BAA availability where protected health information is in scope, and active litigation that affects the data path. Most AI tools fail Factor 1 at consumer tiers and pass conditionally at business or enterprise tiers with executed DPAs. Some fail at every tier because of vendor jurisdiction or active litigation.

Concrete examples from the index: Otter.ai fails Factor 1 across all tiers in mid-2026 because of the consolidated class action in the Northern District of California (In re Otter.AI Privacy Litigation, 5:25-cv-06911) plus vendor terms updates. ChatGPT Free, Plus, and Pro fail Factor 1 because of consumer training defaults plus the NYT v. OpenAI preservation order that retained chat logs from May through September 2025. Self-hosted Mike OSS with an enterprise zero-retention API passes Factor 1, while the same Mike OSS run through the public demo site fails it. The framework distinguishes these.

Factor 1 is the factor most reviews skip. Generic privacy framing (“the tool is SOC 2 certified”) does not answer it.
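For readers who think in structures, here is a minimal sketch of how the Factor 1 gate works in the evaluation flow. The field names, the pass conditions, and the shape of the output below are illustrative placeholders, not the actual schema or data behind the Tool Index.

```python
from dataclasses import dataclass

@dataclass
class Factor1Assessment:
    """Illustrative record of a tool's data-handling posture at one pricing tier."""
    trains_on_customer_data: bool          # training-on-customer-data default
    dpa_available: bool                    # an executed DPA is possible at this tier
    active_litigation_on_data_path: bool   # e.g., a preservation order covering logs
    jurisdiction_acceptable: bool          # vendor jurisdiction the Bar would accept

def passes_factor_1(a: Factor1Assessment) -> bool:
    # A conditional pass requires: no training on firm data, a DPA on offer,
    # an acceptable vendor jurisdiction, and no litigation touching the data path.
    return (not a.trains_on_customer_data
            and a.dpa_available
            and a.jurisdiction_acceptable
            and not a.active_litigation_on_data_path)

def evaluate(tool_name: str, tier: str, factor_1: Factor1Assessment) -> dict:
    # Factor 1 is the gate: if it fails, Factors 2 through 4 are never scored.
    if not passes_factor_1(factor_1):
        return {"tool": tool_name, "tier": tier,
                "result": "fail", "stopped_at": "Factor 1"}
    # Placeholder for Factors 2-4, which only run once the gate is cleared.
    return {"tool": tool_name, "tier": tier, "result": "continue to Factors 2-4"}
```

The point of the sketch is the early return: a consumer tier that trains on firm data never reaches the feature or cost conversation at all.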

Factor 2: Ethics Compliance

The second factor asks a different question: does the tool's design and use comply with the bar rules that govern the lawyer using it?

Florida Bar Rule 4-1.6 governs confidentiality. Rule 4-1.1 governs competence, which Florida Bar Ethics Opinion 24-1 (2024) extended explicitly to AI use. Rule 4-3.3 governs candor to the tribunal, which has become the most-cited rule in the 2024-2026 AI sanctions cases. Rule 4-5.3 governs supervision of nonlawyer assistants, which the Bar treats as covering AI agents. Rule 4-7 governs lawyer advertising, which matters the moment AI output reaches a marketing channel. ABA Formal Opinion 512 (2024) overlays all of this at the national level.

Factor 2 catches risks that Factor 1 misses. A tool can pass Factor 1 contractually and still fail Factor 2 because it captures conversations in a way that violates Florida's two-party consent law (FL Stat. § 934.03). Zoom AI Companion and Otter.ai both have to navigate this directly. A tool can pass Factor 1 and still fail Factor 2 because the agentic execution model creates supervision exposure under Rule 4-5.3 that the firm cannot reasonably oversee in real time. Telldonna.space sits in this category. A tool can pass Factor 1 and still fail Factor 2 because the output reaches a client or court without the lawyer verifying it under the lawyer's own name.

Factor 2 is the second factor most reviews skip. Generic ethics framing (“don't rely on AI for legal research”) does not answer it.

Factor 3: Practice Area Fit

The third factor asks whether the tool is built for what the firm actually does.

This factor is where most reviews actually do work, because it is the factor closest to features. But the relevant question is sharper than feature parity. Factor 3 asks: does this tool's design match the practice area, the matter type, and the workflow the firm runs?

Harvey AI is excellent at large-document review and M&A diligence. For a small plaintiff personal injury firm, Harvey's strengths are not in the workflow. EvenUp's are. For a 50-state regulatory practice, NotebookLM is structurally weak because it does not maintain statutory currency. Westlaw Precision with CoCounsel is structurally strong for the same reason. For a Word-first transactional practice, Spellbook and Donna are native to the workflow; a browser-based tool, no matter how capable, will not be used consistently.

Factor 3 also surfaces the difference between “feature exists” and “feature is trustworthy.” Paxton AI publishes a 94% non-hallucination rate on the Stanford Legal Hallucination Benchmark. A 2024 Stanford study found a 34% citation error rate on Westlaw AI versus 17% on Lexis+ AI. These are not interchangeable. The framework counts the difference.

Factor 4: Cost Relative to Firm Size

The fourth factor asks whether the tool is reachable for a 1-10 attorney firm.

This factor is the one most reviews oversimplify. Sticker price is not cost. Total cost of ownership for a small firm includes the seat license, training time, the IT capacity required to deploy or self-host, the time cost of verification and supervision, the cost of running the wrong tool for six months, and the opportunity cost of not running the right one.
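To see why the distinction matters, here is a back-of-the-envelope total-cost-of-ownership sketch. Every number in it is an assumed placeholder for a hypothetical five-attorney firm, not a quoted rate for any vendor in the index.

```python
# Hypothetical inputs for a five-attorney firm -- all values are assumptions.
seats = 5
seat_license_monthly = 30           # $/seat/month sticker price
onboarding_hours_per_attorney = 4   # one-time training and setup time
verification_hours_per_month = 2    # per attorney, reviewing and supervising AI output
blended_hourly_cost = 150           # $/hour internal cost of attorney time

annual_license = seats * seat_license_monthly * 12                               # 1,800
onboarding = seats * onboarding_hours_per_attorney * blended_hourly_cost         # 3,000
verification = seats * verification_hours_per_month * 12 * blended_hourly_cost   # 18,000

total_first_year = annual_license + onboarding + verification                    # 22,800

print(f"Sticker price, year one: ${annual_license:,}")
print(f"Total cost of ownership, year one: ${total_first_year:,}")
```

Even with conservative assumptions, the time cost of verification and supervision dwarfs the license line, which is exactly what a sticker-price comparison hides.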

Factor 4 disqualifies tools that pass the first three factors but cannot reach a small firm in any sensible deployment. Harvey AI at $1,000-$1,200 per lawyer per month with a 20-seat minimum fails Factor 4 for the small-firm market by construction, even though Harvey passes Factor 1 with executed enterprise terms. Claude Enterprise at a 70-seat minimum, Luminance, and Relativity fail Factor 4 the same way.

Factor 4 also rescues tools that look unaffordable but are not. Microsoft 365 Copilot at $30 per seat is in range for a small firm if the firm is already on M365 Business Premium. Adobe Acrobat AI Assistant is essentially free at the margin if the firm is already on Acrobat. Gemini for Workspace is bundled with Workspace Business at no additional charge. The framework counts these correctly.

What the framework reveals across sixty-four tools

When all 64 tools currently in the JDAI Tool Index get scored against the four factors, three patterns emerge that no single-factor review surfaces.

Consumer tiers fail Factor 1 almost universally, regardless of vendor. Every consumer ChatGPT, Claude, Gemini, Grok, and Mistral tier in the index fails Factor 1 by default for client work. Training-on-customer-data is the default on every consumer tier. The opt-outs are real but inconsistent across vendors, and they are not the default state any firm member walks into.

Enterprise tiers pass Factor 1 but often fail Factor 4 for small firms. Harvey, Claude Enterprise, Microsoft Azure OpenAI ZDR (EA contract required), Relativity, and Luminance all sit in this bucket. The strongest contractual confidentiality posture is also the one a 1-10 attorney firm cannot reach directly.

The tools that actually fit small firms cluster at a specific intersection: business-tier subscriptions that include a DPA, are priced under $50 per seat per month, and ship a Word add-in or a Workspace integration the firm is already paying for. Microsoft 365 Copilot, Gemini for Workspace Business, Spellbook, Clio Manage AI, Smokeball Archie, and Adobe Acrobat AI Assistant cluster here. So do the legal-specific tools at the next price tier up: Lexis+ with Protege, Westlaw Precision with CoCounsel, Paxton Pro.
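Expressed as a filter, that intersection looks something like the sketch below. The three example rows are generic stand-ins for the patterns described above, not rows exported from the index; the threshold values are the ones named in this section.

```python
# Illustrative rows -- generic stand-ins, not the live index data.
tools = [
    {"name": "Enterprise-only platform", "dpa": True,  "price_per_seat": 1000, "native_integration": False},
    {"name": "Consumer chatbot tier",    "dpa": False, "price_per_seat": 20,   "native_integration": False},
    {"name": "Business-tier assistant",  "dpa": True,  "price_per_seat": 30,   "native_integration": True},
]

def fits_small_firm(tool: dict) -> bool:
    # The intersection described above: DPA included, under $50/seat/month,
    # and a Word add-in or Workspace integration the firm is already paying for.
    return tool["dpa"] and tool["price_per_seat"] < 50 and tool["native_integration"]

shortlist = [t["name"] for t in tools if fits_small_firm(t)]
print(shortlist)  # ['Business-tier assistant']
```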

The framework is not magic. It is a way to keep the conversation honest about what actually decides whether a tool belongs in a firm.


The framework is a draft

This is the part I want you to read carefully.

Frameworks improve when they meet practice. The version of the Four-Factor Framework you see in the Tool Index is the version I have today. It is not the version I want six months from now. The way to get there is to put it in front of working lawyers who are running real firms and let them tell me what is missing or wrong.

If you read through the index and think I have gotten a tool wrong on any factor, I want to know. I will not get defensive. I will look at the evidence, update the row if you are right, and credit you in the next quarterly audit notes if you would like the credit.

If there is a tool you use that does not appear in the index, I want to know. The list is sixty-four entries today. It should probably be a hundred. Tell me which tool you would add and why.

If you think the framework is missing a factor, I want to know. I considered a fifth factor on jurisdictional fit (whether the tool suits a specific state's procedural rules) and rejected it because it collapses into Factor 3 in practice. I considered a fifth factor on vendor maturity (whether the vendor will exist in two years) and I am still on the fence. If you have a fifth factor I have not thought through, send it.

And if your reaction is that you would do this differently, that is the most useful kind of feedback. Tell me where you would change it.

How to engage

Three paths, all low friction.

Email me directly at [email protected] with whatever you have. Subject line “Tool Index feedback” will get my attention. Corrections, additions, framework critique, your firm's stack, anything. I read every email and I respond.

Open the Tool Index if you want to see how the framework lands across 64 vendors. Live at /tool-index.html. One-time email registration to see the full per-tool detail. The email address you register there is also a channel I will use to send the quarterly audit update and any material vendor changes between cycles.

Schedule a Phase 1 Discovery walkthrough if you want me to run the framework against your firm's actual stack. Thirty minutes, no charge for the first call. The walkthrough maps which tools your firm and staff are actually running, identifies the Factor 1 gaps that put you at risk under Rule 4-1.6, and produces a prioritized remediation list.

Frameworks get better through use. This one gets better when you tell me where it is wrong.

This article is for informational purposes only and does not constitute legal advice. The Tool Index is research, not legal advice; it does not establish an attorney-client relationship and JDAI Consultants does not endorse any vendor listed. Consult qualified counsel for guidance specific to your situation.


