Developer guide
When to use Bulk Operations instead of pagination in Shopify apps
A decision guide for Shopify developers choosing between synchronous GraphQL pagination and Bulk Operations, with practical thresholds based on workload shape rather than generic 'large dataset' advice.
What this decision is really about
Most teams frame this as a size question. That framing is too shallow.
The real question is whether your query is serving a person or serving a pipeline. If a merchant is waiting on a table, picker, dashboard, or search result, pagination is usually the native shape of the problem. If your app is trying to ingest, export, reconcile, backfill, or resync most of a dataset, then you are no longer doing UI work. You are doing systems work, and systems work should stop pretending it is a next-page button.
Shopify’s GraphQL Admin API is cursor-paginated, with PageInfo and a maximum
of 250 resources per page. It is also cost-based, which means every page request consumes
query cost and competes with the rest of your app’s traffic on that app-store bucket.
Shopify explicitly recommends bulk operations for querying and fetching large amounts of
data rather than trying to stretch single queries forever.
“To query and fetch large amounts of data, you should use bulk operations instead of single queries.”
The working model
Use pagination when the query serves a UI. Use bulk when the query serves a pipeline. Record count matters, but workload shape matters more.
Put differently, the deciding signal is not “is this catalog kind of big?” The deciding signal is “am I making page-by-page requests only because I need the whole thing anyway?” Once the answer becomes yes, pagination turns into polite technical debt. It still works. It is still valid. It is also increasingly ridiculous.
Use pagination for bounded interactive work
Pagination is the right default when a human is waiting and the list is naturally bounded. Merchant-facing tables, search results, setup pickers, admin dashboards, and “show me the latest 20 things” screens should almost always stay paginated.
That is not just because pagination is familiar. It is because it aligns with the user
experience. The first page arrives quickly. The query shape is narrow. The app only pays
cost for data it is actually showing. Cursor-based navigation is stable, and Shopify’s
PageInfo model exists specifically for this kind of incremental traversal.
Pagination is also a strong fit for bounded operational jobs, especially when you are not traversing the whole connection. A common example is “pull recently updated orders every few minutes” or “walk a recent time window until there are no more changes.” If your job normally touches tens or a few hundreds of records, stores a durable checkpoint, and can resume from the last cursor or timestamp, bulk may be unnecessary ceremony.
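For that incremental shape, the durable checkpoint can be as small as a single timestamp. Here is a minimal sketch of building the search filter from it — pure Ruby, with the checkpoint storage and API client left out; in a real app the checkpoint would live in a database column and advance only after a page is safely persisted:

```ruby
require "time"

# Build the Shopify search filter for "orders updated since the last
# checkpoint". Shopify's search syntax supports updated_at range
# filters; the Time value here stands in for a durable checkpoint.
def orders_updated_since(checkpoint)
  "updated_at:>'#{checkpoint.utc.iso8601}'"
end

checkpoint = Time.utc(2026, 3, 12, 9, 0, 0)
puts orders_updated_since(checkpoint)
# => updated_at:>'2026-03-12T09:00:00Z'
```

The filter string is then passed as the `query` argument of the `orders` connection, and the job pages forward only until `hasNextPage` is false for that window.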
When pagination is still the grown-up choice
- A person needs the first screenful now, not a file later.
- You only need a filtered working set, not the entire connection.
- Your sync logic is incremental, narrow, and resumable.
- Your query cost is predictable and low enough that throttling is rare.
- Your nested data needs are shallow enough to keep orchestration simple.
The hidden superpower of pagination is not that it scales forever. It does not. Its superpower is that it is honest about partial retrieval. If the user needs page one, fetching page one is not a compromise. It is the job.
# Good pagination use case:
# render a merchant-facing "recent orders" table
query RecentOrders($after: String) {
orders(first: 50, after: $after, sortKey: CREATED_AT, reverse: true) {
nodes {
id
name
displayFinancialStatus
createdAt
currentTotalPriceSet {
shopMoney {
amount
currencyCode
}
}
}
pageInfo {
hasNextPage
endCursor
}
}
}

Nobody wants to click “Orders” and then wait for your app to launch an asynchronous export
pipeline, write a JSONL file, parse it, hydrate a staging table, and finally admit that
yes, order #1042 still exists. That is not architecture. That is performance art.
Use bulk for system-driven large reads
Bulk Operations are for work where the system needs most or all of a connection and where asynchronous delivery is acceptable. Initial imports, full resyncs, catalog exports, historical backfills, one-off migrations, denormalized warehouse feeds, and “rebuild the whole local mirror” jobs are classic bulk workloads.
Shopify’s bulk query flow exists precisely for this pattern. You submit a
bulkOperationRunQuery, Shopify executes the query asynchronously, and the
result is made available as JSONL. Shopify recommends webhooks over polling to detect
completion, recommends offline access tokens because long-running jobs can outlive online
tokens, and documents JSONL streaming because bulk results are intentionally designed for
large-file processing rather than nested in-memory responses.
Bulk also changes the economic shape of the job. You stop paying the coordination tax of repeated page fetches, repeated throttle checks, repeated retry edges, and repeated “where was I?” bookkeeping across a long traversal. The work still exists, but it moves from client-side orchestration into Shopify’s asynchronous execution model.
“Subscribing to the webhook topic is recommended over polling.”
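Even with the webhook in place, a status check is useful on recovery paths, for example after a missed delivery or a redeployed worker. A sketch of that check using the `currentBulkOperation` field (field names follow the Admin API schema; verify against your API version):

```graphql
query {
  currentBulkOperation(type: QUERY) {
    id
    status
    errorCode
    objectCount
    url
  }
}
```

Treat this as a fallback, not the primary completion signal — the quoted guidance above exists because polling at scale is wasted request budget.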
Bulk is usually the right call when
- The app needs nearly all records, not just the first page or two.
- The job is background work and no human is waiting on it.
- You are repeatedly traversing an entire connection with pagination.
- You need nested data where page-by-page fan-out becomes brittle.
- The workload repeats across many shops, so operational efficiency matters.
Shopify’s current bulk-query guidance matters here. In API versions 2026-01
and higher, Shopify documents support for up to five concurrent bulk query operations per
shop. Bulk query results remain available for seven days after completion. The bulk query
must include at least one connection field, supports up to five connections, and has a
maximum nesting depth of two levels. If the query does not complete within 10 days,
Shopify marks it as failed.
That combination tells you what bulk is and what it is not. It is a powerful export mechanism for large reads. It is not a magical replacement for all query design, and it is not an excuse to submit a monster query assembled by a caffeinated raccoon.
mutation RunBulkProductsExport {
bulkOperationRunQuery(
query: """
{
products(query: "status:active") {
edges {
node {
id
title
updatedAt
vendor
variants {
edges {
node {
id
sku
price
updatedAt
}
}
}
}
}
}
}
"""
) {
bulkOperation {
id
status
}
userErrors {
field
message
}
}
}

If that workload were implemented as paginated reads, you would be coordinating page traversal for products, coping with cost and throttling over time, and probably adding extra logic for nested variant hydration anyway. That is the moment to stop fighting the platform and let the bulk pipeline do its job.
Decision signals that matter more than record count
Teams love arguing about whether 5,000 records is “large.” That argument is fun in the same way a sand-filled shoe is fun. It produces motion, but not progress.
Record count matters, but it is not the best first discriminator. The better signals are about interaction model, retrieval coverage, orchestration cost, and repeatability.
| Signal | Leans pagination | Leans bulk | Why it matters |
|---|---|---|---|
| Who is waiting? | A person in a UI | The system in the background | Async file generation is wrong for interactive paths and excellent for pipelines. |
| Coverage needed | Small filtered subset | Most or all records | Whole-dataset work magnifies pagination coordination cost. |
| Retry shape | Cheap to retry a page | Cheaper to rerun a durable background job | Bulk turns long traversals into one job lifecycle instead of hundreds of request lifecycles. |
| Nested data | Shallow and bounded | Deep enough to cause fan-out pain | Nested pagination orchestration gets fragile fast. |
| Frequency across shops | Occasional or ad hoc | Repeated fleet-wide workload | Small inefficiencies become real infrastructure costs when multiplied across merchants. |
| Throttle pressure | Rare and manageable | Constant companion, like a sad little metronome | Heavy repeated page traversal burns cost budget that bulk avoids. |
Heuristics that work well in real app teams
These are not Shopify hard limits. They are practical operator heuristics:
Stay with pagination if the job usually completes in a few pages and exists primarily to support a UI or a narrow incremental sync.
Strongly consider bulk if your job is expected to walk dozens of pages or more, especially when it does so just to end up with a local full copy anyway.
Switch to bulk early when the job needs parent and child records at scale, such as products plus variants or orders plus line items.
Switch to bulk when “resume from the next cursor” has become a mini-subsystem with throttling, retry, idempotency, and checkpoint logic that exists only because you are still paginating the universe.
Shopify’s documented limits reinforce this. Single GraphQL queries are still bounded by requested cost, including a hard maximum single-query cost of 1,000 points, while bulk operations are specifically designed for large reads and are documented as not being subject to those single-query max cost limits or the usual rate-limit model for single queries.
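The response extensions make that pressure measurable rather than anecdotal. Here is a sketch of turning the documented `throttleStatus` shape into a backoff decision — the hash below is a stubbed response for illustration; in a real app it would come from the response's `extensions` key:

```ruby
# Given the throttleStatus Shopify returns with GraphQL responses,
# estimate how long to wait before the next query is affordable.
def seconds_until_affordable(extensions, next_query_cost)
  throttle = extensions.fetch("cost").fetch("throttleStatus")
  deficit = next_query_cost - throttle.fetch("currentlyAvailable")
  return 0.0 if deficit <= 0

  # Points restore at a fixed rate per second.
  deficit / throttle.fetch("restoreRate")
end

# Stubbed response metadata (shape follows Shopify's documented cost
# extension; the numbers are invented).
extensions = {
  "cost" => {
    "requestedQueryCost" => 101,
    "actualQueryCost" => 46,
    "throttleStatus" => {
      "maximumAvailable" => 1000.0,
      "currentlyAvailable" => 110,
      "restoreRate" => 50.0
    }
  }
}

puts seconds_until_affordable(extensions, 460) # => 7.0
```

If a paginated sync logs non-zero waits on most pages, that is the "constant companion" row of the table above, and a strong bulk signal.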
# Smell test:
# if this loop exists only because the app needs "everything",
# it is often a bulk candidate.
cursor = nil
loop do
result = client.query(query: PRODUCTS_PAGE_QUERY, variables: { after: cursor })
nodes = result.data.products.nodes
break if nodes.empty?
nodes.each { |product| upsert_product(product) }
page_info = result.data.products.pageInfo
break unless page_info.hasNextPage
cursor = page_info.endCursor
end

There is nothing wrong with this loop when the workload is genuinely page-shaped. There is everything wrong with it when the app runs it across every shop every night and then acts surprised that the queue smells like throttle debt.
Concrete workloads and the right choice
Abstract rules help. Concrete examples help more.
| Workload | Recommendation | Why |
|---|---|---|
| Merchant UI showing the latest 25 orders | Pagination | The user needs the first result immediately and only a bounded subset is required. |
| Typeahead product picker during setup | Pagination | It is interactive, filtered, and should retrieve only what the user can act on. |
| Initial catalog import for a newly installed app | Bulk | The app needs most of the dataset and the job is background-oriented. |
| Nightly rebuild of local product mirror | Bulk | Whole-dataset synchronization is pipeline work, not UI work. |
| Incremental sync of orders updated in the last 10 minutes | Usually pagination | A bounded recent window with durable checkpoints often does not need bulk. |
| One-time historical backfill of all orders since 2022 | Bulk | Page-by-page traversal adds coordination cost without user-facing benefit. |
| Export of products and variants to a warehouse | Bulk | Nested, large, and asynchronous by nature. |
| Admin screen with a filtered list of failed jobs | Pagination | Bounded list, immediate feedback, simple retrieval path. |
Notice what is missing from that table: a magic record-count cutoff. That is deliberate. A shop with only 2,000 products can still justify bulk if you need a complete daily export with nested variants. Meanwhile, a shop with 200,000 orders can still justify pagination for a screen that only shows the latest 20.
This is why “use bulk for large datasets” is technically true but strategically lazy. The actual decision is about job shape and operational economics.
A useful threshold ladder
UI-first path: default to pagination until you have very strong evidence otherwise.
Incremental background sync: start with pagination if the changed set is naturally small and resumable.
Whole-dataset or nested export: start with bulk instead of first building a paginated crawler you will later regret.
Fleet-wide recurring traversal: bias toward bulk earlier, because per-shop inefficiency multiplies hard.
Rails patterns for both paths
The implementation pattern should mirror the workload. Do not hide an architectural mismatch inside a service object with a confident name and a thousand-yard stare.
Pattern A: paginated service for bounded reads
class Shopify::RecentOrdersPage
QUERY = <<~GRAPHQL
query($after: String) {
orders(first: 50, after: $after, sortKey: CREATED_AT, reverse: true) {
nodes {
id
name
createdAt
displayFinancialStatus
}
pageInfo {
hasNextPage
endCursor
}
}
}
GRAPHQL
def initialize(shop:)
@shop = shop
@client = ShopifyClient.for(shop)
end
def call(after: nil)
response = @client.query(query: QUERY, variables: { after: after })
{
orders: response.data.orders.nodes,
page_info: response.data.orders.pageInfo,
cost: response.extensions&.dig("cost")
}
end
end

This kind of service is great for screens, bounded jobs, and incremental windows. Keep it simple. Surface cursor information and cost metadata. Do not quietly make it iterate through 800 pages “for convenience.” Convenience is how good abstractions become crimes.
Pattern B: bulk kickoff plus durable ingestion pipeline
class Shopify::StartProductsBulkExport
MUTATION = <<~GRAPHQL
mutation($query: String!) {
bulkOperationRunQuery(query: $query) {
bulkOperation {
id
status
}
userErrors {
field
message
}
}
}
GRAPHQL
BULK_QUERY = <<~GRAPHQL
{
products(query: "status:active") {
edges {
node {
id
title
updatedAt
vendor
variants {
edges {
node {
id
sku
price
updatedAt
}
}
}
}
}
}
}
GRAPHQL
def initialize(shop:)
@shop = shop
@client = ShopifyClient.for(shop) # use offline token
end
def call!
response = @client.query(query: MUTATION, variables: { query: BULK_QUERY })
payload = response.data.bulkOperationRunQuery
raise payload.userErrors.map(&:message).join(", ") if payload.userErrors.any?
BulkSyncRun.create!(
shop: @shop,
shopify_bulk_operation_id: payload.bulkOperation.id,
status: payload.bulkOperation.status.downcase
)
end
end

The important thing here is not the mutation itself. It is the boundary. Starting a bulk operation should create a durable local run record. Treat the Shopify bulk operation as an external async job with your own lifecycle, not as a floating promise you hope to remember.
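One cheap way to keep that run record truthful is to constrain its state transitions. A small sketch, in plain Ruby — the remote-looking states mirror Shopify's bulk operation statuses, while `ingested` is a purely local state meaning the JSONL file was processed; all names here are illustrative:

```ruby
# Allowed lifecycle transitions for a local bulk-run record.
# Anything not listed (e.g. completed -> running) is rejected,
# which catches duplicate webhooks and out-of-order updates.
ALLOWED_TRANSITIONS = {
  "created"   => %w[running completed failed canceled],
  "running"   => %w[completed failed canceled],
  "completed" => %w[ingested]
}.freeze

def valid_transition?(from, to)
  ALLOWED_TRANSITIONS.fetch(from, []).include?(to)
end

puts valid_transition?("completed", "ingested") # => true
puts valid_transition?("failed", "ingested")    # => false
```

In Rails this guard would typically live in a model validation or an `aasm`/state-machine layer; the point is that the run record, not the webhook handler, owns the lifecycle.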
Pattern C: finish webhook plus streaming JSONL import
require "json"
require "open-uri"
class Webhooks::Shopify::BulkOperationsFinishController < ApplicationController
skip_before_action :verify_authenticity_token
def create
payload = JSON.parse(request.raw_post)
run = BulkSyncRun.find_by!(
shop_id: shop.id,
shopify_bulk_operation_id: payload.fetch("admin_graphql_api_id")
)
if payload["status"] == "completed"
# The finish webhook identifies the operation but does not include
# the result URL, so fetch it from the bulk operation node first.
ProcessBulkJsonlJob.perform_later(run.id, bulk_result_url(run))
run.update!(status: "completed")
else
run.update!(
status: payload["status"],
error_code: payload["error_code"]
)
end
head :ok
end
private
def shop
@shop ||= Shop.find_by!(shopify_domain: request.headers["X-Shopify-Shop-Domain"])
end
def bulk_result_url(run)
response = ShopifyClient.for(run.shop).query(
query: "query($id: ID!) { node(id: $id) { ... on BulkOperation { url } } }",
variables: { id: run.shopify_bulk_operation_id }
)
response.data.node.url
end
end
class ProcessBulkJsonlJob < ApplicationJob
queue_as :default
def perform(run_id, url)
run = BulkSyncRun.find(run_id)
URI.open(url) do |io|
io.each_line do |line|
row = JSON.parse(line)
case row["id"]
when /\Agid:\/\/shopify\/Product\//
upsert_product!(run.shop, row)
when /\Agid:\/\/shopify\/ProductVariant\//
upsert_variant!(run.shop, row)
end
end
end
run.update!(ingested_at: Time.current, status: "ingested")
end
end

This is the bulk mental model in Rails: submit, persist local run state, react to finish,
stream JSONL, and make ingestion idempotent. Shopify documents JSONL specifically so clients
can parse line by line instead of loading the entire file into memory. For nested
connections, the JSONL output includes __parentId so children can be related
back to their parent records during reconstruction.
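A concrete picture helps here. Below is a tiny, self-contained sketch of that reconstruction over a few invented JSONL lines — the data is fabricated for illustration, but the flattening shape (children as separate lines carrying __parentId) matches how bulk results arrive:

```ruby
require "json"

# Bulk JSONL flattens nested connections: each variant is its own
# line, pointing back at its product through __parentId.
jsonl = <<~JSONL
  {"id":"gid://shopify/Product/1","title":"Mug"}
  {"id":"gid://shopify/ProductVariant/11","sku":"MUG-S","__parentId":"gid://shopify/Product/1"}
  {"id":"gid://shopify/ProductVariant/12","sku":"MUG-L","__parentId":"gid://shopify/Product/1"}
  {"id":"gid://shopify/Product/2","title":"Hat"}
JSONL

products = {}
jsonl.each_line do |line|
  row = JSON.parse(line)
  if row["__parentId"]
    # Children always appear after their parent in the file,
    # so the parent is already present in the hash.
    products.fetch(row["__parentId"])["variants"] << row
  else
    products[row["id"]] = row.merge("variants" => [])
  end
end

products.each_value do |product|
  puts "#{product["title"]}: #{product["variants"].map { |v| v["sku"] }.join(", ")}"
end
# => Mug: MUG-S, MUG-L
# => Hat:
```

The same grouping works streaming line by line over the real file, because parents precede their children in the output.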
The backend rule
Bulk operations deserve first-class job orchestration. Do not tuck them inside a controller action that returns 200 and a prayer.
Common mistakes that waste weeks
Using bulk for interactive screens. If a user is waiting, the file-based async model is usually the wrong UX shape.
Using pagination for full imports because it was easier to prototype. Prototype debt becomes queue debt very quickly.
Ignoring query cost metadata. Shopify returns cost and throttle information with GraphQL responses. If your paginated sync is constantly near the limit, the system is telling you something.
Polling forever. Shopify recommends subscribing to the bulk_operations/finish webhook instead of leaning on redundant status checks.
Using online tokens for long-running bulk jobs. Shopify explicitly advises offline access tokens because online tokens can expire before the operation completes.
Reading the whole JSONL file into memory. The docs practically wave a large red flag at this. Stream it line by line.
Skipping normal-query validation before bulk. Shopify notes that query errors are easier to understand when you test the query normally first.
Treating nested bulk output as if it were normal nested JSON. It is not. Parent and child nodes are flattened into JSONL lines, with parent linkage handled through
__parentId.
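On the validation point above: a low-ceremony way to test first is to run the same selection as a normal paginated query with tiny page sizes before wrapping it in bulkOperationRunQuery. A sketch — `first: 1` keeps the cost negligible, and note that normal queries require the `first` argument that bulk queries omit:

```graphql
{
  products(first: 1, query: "status:active") {
    edges {
      node {
        id
        title
        variants(first: 1) {
          edges {
            node {
              id
              sku
            }
          }
        }
      }
    }
  }
}
```

If this version returns schema or selection errors, you just saved yourself a failed asynchronous run and a confused webhook handler.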
The most expensive mistake is not a syntax error. It is building the wrong retrieval model for the job and then compensating with retries, caches, checkpoints, throttling logic, side queues, and increasingly aggressive optimism.
A practical rule of thumb for app teams
Use pagination when the query is serving a UI. Use bulk when the query is serving a pipeline.
That rule is not mathematically perfect, but it is operationally strong. It prevents the two most common bad decisions:
- making users wait on an async export system, and
- making background jobs behave like next-page navigation.
A second rule helps when the first one feels too abstract:
If your Rails job paginates through an entire connection, throttles repeatedly, persists most or all records locally, and then repeats that behavior across many shops, the job is already asking for bulk. It may not be asking politely, but it is asking.
Shopify gives you both tools because they solve different shapes of work. Pagination is for bounded retrieval with immediate feedback. Bulk is for asynchronous, large-scale extraction. Mature Shopify apps use both, and they do not confuse them.
Sources and further reading
Shopify Dev: API limits
Shopify Dev: Paginating results with GraphQL
Shopify Dev: Perform bulk operations with the GraphQL Admin API
Shopify Dev: bulkOperationRunQuery
Shopify Dev: Migrate to GraphQL from REST
Sources checked on March 12, 2026. Bulk-query concurrency guidance changed in API version
2026-01, so older articles and code samples may still repeat the previous
one-query-at-a-time rule.
FAQ
Should I switch to bulk just because a connection has many records?
Not automatically. Switch when the app needs most or all of the dataset as background work. A large connection shown interactively can still belong on pagination.
Is bulk faster than pagination?
For whole-dataset and nested export workloads, usually yes in total system effort. For the first result a user sees on screen, no. Bulk is asynchronous and therefore wrong for human-waiting paths.
Can bulk replace incremental syncs?
Not always. Incremental syncs over a small recent window can still fit pagination well, especially when you have a durable cursor or timestamp checkpoint.
What usually forces the switch in production?
Repeated full traversals, high query cost, constant throttling, brittle nested pagination loops, and jobs that exist for the system rather than for a person sitting in front of a screen.
Related resources
Keep exploring the playbook
Shopify Admin GraphQL patterns in Rails
Production patterns for using the Shopify Admin GraphQL API from Rails, including service boundaries, pagination strategy, throttling, partial failure handling, and when to switch to bulk operations.
Shopify Bulk Operations in Rails
A Rails implementation guide for Shopify Bulk Operations covering job orchestration, JSONL downloads, polling versus webhooks, and the service boundaries that make large syncs maintainable.
Calling a Rails API from a Shopify customer account extension
A practical guide to calling Rails endpoints from a Shopify Customer Account UI Extension, including session-token verification, endpoint design, and the requests that should not go through your backend at all.