How to Catch AI-Generated Problems in Salesforce Code Reviews

A significant chunk of what now lands in your PR queue was probably written with AI assistance. Cursor, Copilot, Claude, Agentforce — the output is getting harder to tell apart from human-written code at a glance.

But AI tools weren’t built with Salesforce’s constraints in mind, and the gaps show up in ways that aren’t always obvious. This is a guide to the Salesforce-specific failures in AI-generated code, and how to use code review tooling to make sure they’re caught every time. 

The failure patterns to know

SOQL inside loops

This governor limit violation is easy to miss in a long PR, because the query and the loop are sometimes separated by enough lines that the relationship isn’t obvious.

AI generates this pattern frequently because it’s intuitive — for each record, get its related records. On a handful of records in a sandbox it works fine; against real data volumes it’ll hit the 100 SOQL query limit quickly. The correct pattern is to bulkify: query everything upfront, outside the loop, and look up results inside it.
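A minimal sketch of the anti-pattern and its bulkified fix, using illustrative Account/Contact objects (not from the original article):

```apex
// Anti-pattern: one query per Account — hits the 100-query limit past 100 iterations
for (Account acc : accounts) {
    List<Contact> related = [SELECT Id FROM Contact WHERE AccountId = :acc.Id];
    // ... process related contacts
}

// Bulkified: a single query up front, results looked up from a map inside the loop
Map<Id, List<Contact>> contactsByAccount = new Map<Id, List<Contact>>();
for (Contact c : [SELECT Id, AccountId FROM Contact
                  WHERE AccountId IN :accounts]) {
    if (!contactsByAccount.containsKey(c.AccountId)) {
        contactsByAccount.put(c.AccountId, new List<Contact>());
    }
    contactsByAccount.get(c.AccountId).add(c);
}
for (Account acc : accounts) {
    List<Contact> related = contactsByAccount.get(acc.Id);
    // ... process related contacts, no queries consumed here
}
```

The bulkified version consumes one SOQL query regardless of how many accounts are in the list.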

Hardcoded IDs and environment-specific values

AI tools are trained to be helpful and specific. When they reference a record or resource, they’ll often use the actual ID from your dev org. Record IDs, record type IDs, org-specific URLs — these don’t exist in UAT or production, and the failure is usually quiet: logic that just doesn’t behave as expected, with no clear error pointing back to the cause.

The correct approach is dynamic resolution — Custom Metadata Types, Custom Settings, or schema methods — so values resolve correctly in every environment.
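A short sketch of the difference, where `Enterprise_Customer` and `Integration_Setting__mdt` are hypothetical names for illustration:

```apex
// Brittle: this ID only exists in the org it was copied from
// Id recTypeId = Id.valueOf('012000000000XyZ');

// Portable: resolve the record type by developer name through the schema
Id recTypeId = Schema.SObjectType.Account
    .getRecordTypeInfosByDeveloperName()
    .get('Enterprise_Customer')
    .getRecordTypeId();

// Portable: environment-specific values stored in a Custom Metadata Type record
Integration_Setting__mdt setting = Integration_Setting__mdt.getInstance('Default');
String endpoint = setting.Endpoint_URL__c;
```

Both resolutions work identically in dev, UAT, and production, because the lookup happens at runtime against whatever that org contains.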

Test classes that look thorough but aren’t

AI-generated test classes tend to produce high coverage numbers while skipping the scenarios that actually stress the code.

The happy path gets tested well. Everything else — bulk data volumes, missing or invalid inputs, permission restrictions, boundary conditions, error handling — tends to get skipped. The result is coverage percentages that look fine in a report while leaving real failure scenarios untested.

This matters more in Salesforce than in most platforms because so many real failures happen at volume, at permission boundaries, or in error conditions that never occur in a dev sandbox with a handful of test records.
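As one hedged example of what the missing scenarios look like in practice — class and field choices here are illustrative, not from the article — a bulk test and a negative test for an Opportunity automation might read:

```apex
@IsTest
private class OpportunityAutomationTest {
    @IsTest
    static void handlesBulkVolume() {
        // 200 records forces a full trigger chunk, surfacing any per-record SOQL or DML
        List<Opportunity> opps = new List<Opportunity>();
        for (Integer i = 0; i < 200; i++) {
            opps.add(new Opportunity(
                Name = 'Opp ' + i,
                StageName = 'Prospecting',
                CloseDate = Date.today().addDays(30)
            ));
        }
        Test.startTest();
        insert opps;
        Test.stopTest();
        System.assertEquals(200, [SELECT COUNT() FROM Opportunity]);
    }

    @IsTest
    static void rejectsInvalidInput() {
        // Negative path: assert the failure mode, not just the happy path
        Opportunity bad = new Opportunity(Name = 'Bad', StageName = 'Prospecting');
        try {
            insert bad;
            System.assert(false, 'Expected a DmlException for missing CloseDate');
        } catch (DmlException e) {
            System.assert(e.getMessage().contains('REQUIRED_FIELD_MISSING'));
        }
    }
}
```

Coverage reports treat these the same as a single-record happy-path test, but only tests like these exercise the conditions under which Salesforce code actually fails.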

Missing or incorrect sharing declarations

AI often generates classes without explicit sharing declarations, or uses the wrong one for the use case.

Without a deliberate with sharing, without sharing, or inherited sharing declaration, behaviour depends on context and defaults — which may not be what’s intended, and may not be consistent across different calling contexts. In some cases this exposes records to users who shouldn’t see them. In others it silently restricts access in ways that break functionality without an obvious error.

This is easy to miss in a review because an absent declaration doesn’t look wrong — it just looks like nothing is there.
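For reference, the three explicit options and the reviewable gap, with hypothetical class names:

```apex
// Explicit: enforces the running user's record access
public with sharing class OpportunityReportService { /* ... */ }

// Explicit: deliberately bypasses sharing — should be rare and justified in review
public without sharing class AuditLogWriter { /* ... */ }

// Inherits the sharing mode of the calling context — a sensible default for utilities
public inherited sharing class SelectorBase { /* ... */ }

// No declaration: behaviour depends on the caller — this is the line to flag
public class InvoiceService { /* ... */ }
```

A review rule that requires one of the first three on every class turns the invisible omission into a visible failure.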

Conflicts with existing org logic

This one’s harder to spot in a single file, but worth building into your review habit: code that duplicates or conflicts with logic that already exists in your org.

AI generates code based on what you’ve asked for in the current session. How much of your codebase it can actually see depends on the tool and how much context you’ve given it, but it’s unlikely to have full awareness of your live org configuration, including triggers, Flows, and processes running across it. It may not know you already have a trigger handler on the same object, a Flow running after the same record update, or a process firing on the same condition. 

The result is often duplicate logic, competing updates to the same records, or unexpected execution order — none of which shows up as an error in the diff. This is particularly problematic for teams with large or long-established orgs, where years of accumulated logic make conflicts harder to spot. 
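One common mitigation, sketched here as an assumption rather than a prescription from the article, is the one-trigger-per-object pattern: with all trigger automation for an object routed through a single handler, new AI-generated logic that touches the same object has an obvious place to collide visibly rather than silently.

```apex
// Single trigger per object, delegating to a handler class (name is hypothetical).
// Any new automation on Account has to land here, making duplication visible in review.
trigger AccountTrigger on Account (before insert, before update, after update) {
    AccountTriggerHandler handler = new AccountTriggerHandler();
    if (Trigger.isBefore && Trigger.isInsert) {
        handler.beforeInsert(Trigger.new);
    } else if (Trigger.isBefore && Trigger.isUpdate) {
        handler.beforeUpdate(Trigger.new, Trigger.oldMap);
    } else if (Trigger.isAfter && Trigger.isUpdate) {
        handler.afterUpdate(Trigger.new, Trigger.oldMap);
    }
}
```

This doesn't catch conflicts with Flows or processes, which is why the org-level tooling discussed later still matters.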

Why these patterns keep slipping through

None of these patterns are obscure — most experienced Salesforce developers know them. But knowing what to look for and consistently catching it across every PR are very different things.

AI has dramatically increased the volume of code being written, which means more PRs, reviewed faster, with less scrutiny per change. Add to that the inherent trust people place in AI output — it looks confident, it’s coherent, it usually works in dev — and the conditions are right for these patterns to slip through regularly. Not because developers are careless, but because the volume and the trust make it easy to miss.

The answer isn’t asking developers to be more careful. It’s choosing tooling that catches these things automatically, so they can’t be missed regardless of who’s reviewing or how many PRs are in the queue.

The must-haves for any tool reviewing AI code

Here’s what to look for when evaluating a code review tool for AI-generated Salesforce code:

Deterministic rules, not AI-assisted suggestions. The patterns above are binary — SOQL is either inside a loop or it isn’t, a sharing declaration is either there or it isn’t. You want a tool that applies the same rules the same way every time and returns a hard pass or fail, not one that offers a probabilistic assessment that might vary between runs. Using AI to review AI-generated code compounds the problem rather than solving it.

Org-level context, not just file-level. Catching duplicate trigger logic or conflicting Flow interactions requires understanding your full org configuration — Apex, Flows, permissions, sharing rules — not just the file in the diff. A tool that only reads one file at a time has the same blind spot as the AI that generated the code.

Coverage of the full metadata stack. Hardcoded IDs don’t only appear in Apex — they appear in Flows too. A tool that only scans code-based metadata is only checking part of what can go wrong.

Only flags what your change introduced. If there are pre-existing violations sitting in the codebase, they’re not your problem to fix every time you open a PR. The right tool isolates the issues introduced by the current change only — keeping feedback focused on what’s actually in front of you rather than drowning you in historical debt.

Metrics over time. If the same violation keeps appearing across multiple PRs, that’s a signal worth acting on — whether it means refining your AI prompts, updating your conventions, or addressing a knowledge gap. A tool that gives you visibility across the team over time turns the review stage into a feedback loop, not just a gate.

If you want to compare the leading Salesforce code review tools, check out this article. One of the few tools that covers all of these criteria natively for Salesforce is Gearset's Code Reviews — deterministic, org-aware checks across the full metadata stack, applied automatically to every change.

The cost of not keeping up

As the volume and velocity of AI-assisted development increases, making sure your review process can actually keep up is more important than ever. The risk is a gradual erosion of standards, rubber-stamped reviews, and time spent on rework for issues being caught too late.

Apex Hours

Salesforce Apex Hours is a program of the community, for the community, and led by the community. It is a space where Salesforce experts across the globe share their expertise in various arenas with an intent to help the Ohana thrive! Join us and learn about the apex hours team.
