

getQueryLocator vs. Iterator in Apex
When you write a Batch class in Salesforce, the start method asks you a fundamental question: “How should I get the data?”
You have two options: Database.getQueryLocator or Iterable<sObject>.
While 90% of developers default to getQueryLocator, understanding when and why to use an Iterator distinguishes a senior developer from a junior one. Let’s break down the differences with real-world use cases and technical examples.
1. getQueryLocator
Imagine you need to fill a swimming pool. You connect a hose/pipe directly to the main water supply (the Database). The water flows continuously and efficiently. You don’t need to carry the water yourself; the pressure from the main supply does the work.
- In Apex: You pass a SOQL query string directly to Salesforce. The platform handles the heavy lifting, streaming up to 50 million records automatically.
2. Iterator
Now imagine you need to fill that pool, but the water isn’t coming from a single tap. Some water is from a well, some is from bottled water, and some is filtered rain. You have to manually collect it in buckets, check the quality, maybe mix it, and then pour it into the pool.
- In Apex: You write custom logic to gather data. This data might come from complex calculations, external APIs, or a mix of multiple objects. You hand-feed the batch job to the final list.
Now that we understand the real purpose of getQueryLocator and Iterator, let’s dig into real examples.
Understanding Database.getQueryLocator
getQueryLocator is the simplest way to return a massive SOQL result set in a Batch Apex job.
The single most important reason to use getQueryLocator is the Governor Limit bypass.
Standard SOQL Limit: 50,000 records per transaction. If you query 50,001 records into a List, your code throws a “Too many query rows” limit exception.
QueryLocator Limit: 50,000,000 records. Because it streams data, Salesforce allows you to touch up to 50 million records in a single batch job.
You can implement getQueryLocator in two ways. Both work, but they have subtle differences.
1. The Inline SOQL: This is the modern, preferred approach if your query is static.
global Database.QueryLocator start(Database.BatchableContext BC) {
// Salesforce validates this query at compile time, when you save the file.
// If a referenced field were misspelled, the class would not save.
return Database.getQueryLocator([SELECT Id, Name FROM Account]);
}
2. The Dynamic String: Use this if you need to change the query based on variables (e.g., passing a date filter into the batch class constructor).
global Database.QueryLocator start(Database.BatchableContext BC) {
// 'dateVariable' must be in scope here (e.g., an instance variable
// set in the batch class constructor).
String query = 'SELECT Id, Name FROM Account WHERE CreatedDate = :dateVariable';
// Salesforce does NOT validate this query until the code actually runs.
return Database.getQueryLocator(query);
}
Scenario: We will build a Batch class designed to delete old “Log” records (or any custom object) that are older than a specific number of days. This is a perfect use case because Log tables often grow into the millions, making standard queries impossible.
You have a custom object System_Log__c. You have 2 million records, and you want to delete everything older than 90 days to save storage space.
global class BatchLogCleaner implements Database.Batchable<sObject>, Database.Stateful {
    global Integer totalRecordsDeleted = 0;

    global Database.QueryLocator start(Database.BatchableContext bc) {
        // Records OLDER than 90 days: CreatedDate before the last-90-days window
        return Database.getQueryLocator([
            SELECT Id FROM System_Log__c WHERE CreatedDate < LAST_N_DAYS:90
        ]);
    }

    global void execute(Database.BatchableContext bc, List<System_Log__c> scope) {
        delete scope; // Deleting records in a batch
        totalRecordsDeleted += scope.size();
    }

    global void finish(Database.BatchableContext bc) {
        System.debug('Batch Job Complete. Total Logs Deleted: ' + totalRecordsDeleted);
    }
}
How to Run It
You would execute this from the Developer Console > Anonymous Window or a scheduled job.
BatchLogCleaner bc = new BatchLogCleaner();
// Execute with a batch size of 200 (the default; the maximum allowed is 2,000)
Database.executeBatch(bc, 200);
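For the scheduled-job route, you could wrap the batch in a Schedulable class. A sketch; the class name and cron expression are illustrative:

```apex
global class ScheduleLogCleaner implements Schedulable {
    global void execute(SchedulableContext sc) {
        Database.executeBatch(new BatchLogCleaner(), 200);
    }
}

// Register it once from Anonymous Apex: runs daily at 2 AM
// System.schedule('Nightly Log Cleanup', '0 0 2 * * ?', new ScheduleLogCleaner());
```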
Limitations about Database.getQueryLocator:
Even though it is powerful, getQueryLocator has rules:
- No Aggregate Queries: You typically cannot use GROUP BY or aggregate functions (like SUM, COUNT) in a QueryLocator. It is designed for retrieving raw records (sObjects), not summarized data.
- Subquery Fetch Limits: While the main query supports 50 million records, subqueries (e.g., SELECT Id, (SELECT Id FROM Contacts) FROM Account) can sometimes hit fetch limits if the child relationships are too deep or massive.
- Ordering Is Not Guaranteed: If you use ORDER BY in your query, Salesforce attempts to honor it, but for massive datasets (millions of rows) sorting can degrade performance or be overridden by internal chunking optimizations.
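The aggregate restriction has a common workaround: run the GROUP BY as a plain query in start and return the result as an Iterable, since a List of AggregateResult implements Iterable. A sketch, subject to the standard 50,000-row SOQL limit (class name and aliases are illustrative):

```apex
global class AccountCountBatch implements Database.Batchable<AggregateResult> {
    global Iterable<AggregateResult> start(Database.BatchableContext bc) {
        // Not allowed inside getQueryLocator, but fine as a plain query.
        // Subject to the normal 50,000-record SOQL limit.
        return [SELECT Industry ind, COUNT(Id) cnt
                FROM Account
                GROUP BY Industry];
    }

    global void execute(Database.BatchableContext bc, List<AggregateResult> scope) {
        for (AggregateResult ar : scope) {
            System.debug(ar.get('ind') + ': ' + ar.get('cnt'));
        }
    }

    global void finish(Database.BatchableContext bc) {}
}
```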
Understanding Iterator:
While getQueryLocator is the automatic, high-speed highway for Salesforce data, the Iterator is the off-road vehicle. It lets you go where standard SOQL cannot: complex lists, external data, or data that requires heavy calculation before the job even starts.
Scenario: You are running a “Weekly Sales Leaderboard” batch. You need to rank Sales Reps based on a complex formula that involves:
- Closed Opportunities.
- Customer Satisfaction Scores (stored in a different object).
- Number of phone calls made (stored in Tasks).
A single SOQL query cannot join all these tables efficiently or perform the complex math required to “Rank” them.
The Solution: We perform the heavy math in the start method, build a list of Custom Wrapper Objects, and then pass that list to the batch for processing.
global class WeeklyScoreBatch implements Database.Batchable<WeeklyScoreBatch.RepScore> {

    // 1. Define a Wrapper Class to hold our complex data
    // This is NOT a Salesforce object; it exists only in memory.
    global class RepScore {
        public Id userId;
        public String userName;
        public Decimal totalScore;

        public RepScore(Id uid, String name, Decimal score) {
            this.userId = uid;
            this.userName = name;
            this.totalScore = score;
        }
    }

    // 2. START: The Iterator
    // Notice the return type is Iterable<RepScore>, not QueryLocator
    global Iterable<RepScore> start(Database.BatchableContext BC) {
        List<RepScore> scorecard = new List<RepScore>();

        // --- Complex Logic Starts Here ---
        // Imagine we have complex logic that fetches Users,
        // loops through their Opps and Tasks, and calculates a score.
        // (Simplified for brevity)
        List<User> users = [SELECT Id, Name FROM User WHERE IsActive = TRUE LIMIT 1000];
        for (User u : users) {
            // Perform math that SOQL can't do
            Decimal mathScore = (Math.random() * 100);
            // Add to our custom list
            scorecard.add(new RepScore(u.Id, u.Name, mathScore));
        }

        // Return the simple List. Lists implement Iterable automatically!
        return scorecard;
    }

    // 3. EXECUTE: Processing the Wrappers
    // Notice the scope is List<RepScore>, not List<sObject>
    global void execute(Database.BatchableContext BC, List<RepScore> scope) {
        for (RepScore rep : scope) {
            // Now we process our custom object
            System.debug('Processing Score for: ' + rep.userName + ' Score: ' + rep.totalScore);
            // Example: Create a record based on this wrapper
            // Performance_Log__c log = new Performance_Log__c();
            // log.User__c = rep.userId;
            // log.Score__c = rep.totalScore;
            // insert log;
        }
    }

    global void finish(Database.BatchableContext BC) {
        System.debug('Leaderboard calculation complete.');
    }
}
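Returning a List from start is the simplest option because Lists implement Iterable automatically, but you can also build a true custom Iterator when you need element-by-element control. A minimal sketch reusing the RepScore wrapper above; the class names are illustrative:

```apex
global class RepScoreIterable implements Iterable<WeeklyScoreBatch.RepScore> {
    private List<WeeklyScoreBatch.RepScore> items;

    global RepScoreIterable(List<WeeklyScoreBatch.RepScore> items) {
        this.items = items;
    }

    global Iterator<WeeklyScoreBatch.RepScore> iterator() {
        return new RepScoreIterator(this.items);
    }

    // Inner iterator walks the list one element at a time
    global class RepScoreIterator implements Iterator<WeeklyScoreBatch.RepScore> {
        private List<WeeklyScoreBatch.RepScore> items;
        private Integer index = 0;

        global RepScoreIterator(List<WeeklyScoreBatch.RepScore> items) {
            this.items = items;
        }

        global Boolean hasNext() {
            return index < items.size();
        }

        global WeeklyScoreBatch.RepScore next() {
            return items[index++];
        }
    }
}
```

The batch's start method would then return `new RepScoreIterable(scorecard)` instead of the raw list.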
Limitations of Iterator in Salesforce Batch Apex
Iterator gives you flexibility, but it’s not perfect. Here’s what you need to be aware of before using it in production.
- Database.getQueryLocator can handle up to 50 million records; an Iterator cannot. An Iterator loads its data into Apex memory (a List, Set, or custom structure), so the moment your collection grows too large, you hit heap size limits.
- QueryLocator fetches chunks of data lazily as needed. An Iterator doesn’t: you are responsible for loading and preparing all the data upfront inside the start method.
- If a batch fails mid-execution, QueryLocator batches resume from the next chunk; Iterator batches do not automatically recover.
- Serialization between transactions: between the start, execute, and finish methods, Salesforce serializes (saves) your batch instance and its data, then deserializes (reloads) them for the next transaction.
If your Iterator class holds references to things that cannot be serialized, such as HttpResponse objects, open JSONParser streams, or Savepoint variables, the batch job will crash with a serialization error.
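Apex’s transient keyword is the usual escape hatch: transient instance variables are skipped during serialization, so a non-serializable object can still be used safely within a single execute call. A sketch with an illustrative callout class (the endpoint is a placeholder):

```apex
global class CalloutBatch implements Database.Batchable<sObject>,
                                     Database.Stateful, Database.AllowsCallouts {
    // Serialized between chunks: stick to simple, serializable types
    global Integer successCount = 0;

    // NOT serialized between chunks; rebuilt fresh in each execute.
    // Without 'transient', holding an HttpResponse here could trigger
    // a serialization failure between batch transactions.
    transient HttpResponse lastResponse;

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator([SELECT Id FROM Account]);
    }

    global void execute(Database.BatchableContext bc, List<Account> scope) {
        HttpRequest req = new HttpRequest();
        req.setEndpoint('https://example.com/api'); // placeholder endpoint
        req.setMethod('GET');
        lastResponse = new Http().send(req);
        if (lastResponse.getStatusCode() == 200) {
            successCount += scope.size();
        }
    }

    global void finish(Database.BatchableContext bc) {
        System.debug('Successful chunks covered ' + successCount + ' records.');
    }
}
```

Note the Database.AllowsCallouts interface, which Batch Apex requires before any HTTP callout is allowed.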






