Indexing Ethereum Events with The Graph: From Subgraph Development to Production Query
Querying Ethereum event history with eth_getLogs can cost $500/month in RPC calls and take 30 seconds per query. The Graph makes complex queries free and instant. Your dapp's user is tapping their foot, waiting for their transaction history to load, while your backend silently hemorrhages cash on RPC calls. You're polling a chain processing 1.2M+ transactions/day with a 12-second block time (Etherscan, Q1 2026), and every eth_getLogs call is a direct debit from your infrastructure budget. This isn't a scaling problem—it's an architectural failure. We're going to fix it by moving from reactive polling to event-driven indexing with The Graph, turning slow, expensive queries into free, instant GraphQL calls.
Why eth_getLogs Will Bankrupt Your Backend
Let's autopsy the standard approach. You need "all Transfer events for this ERC-20 token from block 18,000,000." You write a script that calls eth_getLogs through a JSON-RPC provider. For a token with moderate activity, that's 50,000 events. Reading logs costs no gas (gas is for writes), but your RPC provider meters every request in compute units. Alchemy's top tier runs roughly $0.00001 per compute unit, and that "simple" historical scan might burn 50M compute units. Do it a few times an hour for different users and you've just invented a $500/month subscription to your own data.
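To make the arithmetic concrete, here's a sketch of the polling pattern being replaced: chunking a historical scan into provider-friendly block windows and estimating the monthly bill. The numbers (compute units per call, price per unit, calls per day) are illustrative assumptions, not provider quotes.

```typescript
// Split a historical block range into windows, since providers cap
// eth_getLogs spans (commonly ~10k blocks per request).
function chunkBlockRanges(fromBlock: number, toBlock: number, maxSpan: number): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  for (let start = fromBlock; start <= toBlock; start += maxSpan) {
    ranges.push([start, Math.min(start + maxSpan - 1, toBlock)]);
  }
  return ranges;
}

// Back-of-envelope monthly cost for repeated scans (all inputs are assumptions).
function estimateMonthlyCostUSD(callsPerDay: number, cuPerCall: number, usdPerCU: number): number {
  return callsPerDay * 30 * cuPerCall * usdPerCU;
}

// Scanning 2M blocks in 10k-block windows = 200 eth_getLogs calls per full scan.
const windows = chunkBlockRanges(18_000_000, 19_999_999, 10_000);
console.log(windows.length); // 200
// Running that scan hourly: ~108 USD/month under these assumed rates.
console.log(estimateMonthlyCostUSD(200 * 24, 75, 0.00001));
```

The point isn't the exact dollar figure; it's that cost scales linearly with users and history depth, which is exactly what indexing eliminates.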
The Graph inverts this model. Instead of you querying the chain, you define a subgraph—a manifest that says "watch these contracts, index these events." The Graph's decentralized indexers (or the hosted service) spin up, ingest the chain from your specified startBlock, and process every relevant event once. They store the derived data in a queryable Postgres database. Your dapp then queries this indexed dataset with GraphQL. The query hits a database, not an Ethereum node. Latency drops from 30 seconds to 200ms. Cost drops from dollars per day to zero (hosted service) or a predictable query fee (decentralized network).
The real win isn't cost; it's complexity. Try this with eth_getLogs: "Show me the net flow of ETH between these 500 addresses on Arbitrum over the last month, grouped by week." You'd need to fetch every Transfer event for 500 addresses, manage pagination across 2 million blocks, and aggregate in memory. With a subgraph, it's a single GraphQL query. This is why Ethereum L2 TVL sits at $45B total (L2Beat, Jan 2026)—applications need this complexity, and indexing is the scaffold that makes it possible.
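The "single GraphQL query" works because the heavy lifting moves into the subgraph's handlers, which pre-aggregate as events arrive. A sketch of what the client side could look like, assuming a hypothetical `WeeklyFlow` entity (with `weekStart`, `account`, and `netFlowETH` fields) maintained by the mappings:

```graphql
# Sketch: assumes the subgraph's handlers maintain a WeeklyFlow entity,
# one row per address per week. The client query then becomes trivial.
query NetFlows($start: BigInt!) {
  weeklyFlows(where: { weekStart_gte: $start }, orderBy: weekStart) {
    weekStart
    account
    netFlowETH
  }
}
```

The pagination, aggregation, and address filtering that would take thousands of RPC calls all happened once, at index time.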
Crafting Your Subgraph Manifest: The Blueprint
The subgraph manifest (subgraph.yaml) is your contract with The Graph. It defines what to index. Get this wrong, and your indexer either misses data or grinds to a halt.
Here’s a minimal manifest for indexing a Uniswap V3 Pool on Arbitrum One, where L2 transaction fees average $0.01–0.05 (Etherscan, Jan 2026), making event volume high and indexing essential.
specVersion: 0.0.5
schema:
file: ./schema.graphql
dataSources:
- kind: ethereum
name: UniswapV3Pool
network: arbitrum-one
source:
address: "0xC36442b4a4522E871399CD717aBDD847Ab11FE88" # Replace with your target pool's address (this example address is the Positions NFT manager, not a pool)
abi: UniswapV3Pool
startBlock: 22283296 # First Arbitrum block for this contract
mapping:
kind: ethereum/events
apiVersion: 0.0.7
language: wasm/assemblyscript
entities:
- Swap
- Pool
abis:
- name: UniswapV3Pool
file: ./abis/UniswapV3Pool.json
eventHandlers:
- event: Swap(indexed address,indexed address,int256,int256,uint160,uint128,int24)
handler: handleSwap
file: ./src/mapping.ts
Critical Configuration:
- `network`: Must match a supported network identifier. Use `arbitrum-one`, `mainnet`, `base`, etc.
- `startBlock`: This is your most important optimization. Don't start from block 0. Find the contract creation block on Etherscan. Starting from block 0 for a popular contract like USDC will make your subgraph take days to sync.
- Real Error Fix: `Event log not found` on L2 — This often happens when you use a mainnet RPC to query an L2 event. Ensure your `network` and source `address` are correct for the chain. For L2s, verify the chain ID matches (Arbitrum = 42161, Base = 8453).
- `abi`: This must match the contract exactly. Generate it from the verified source on Etherscan or compile it yourself with Foundry: `forge inspect <contract> abi > abis/MyContract.json`.
Schema Design: Modeling On-Chain Relationships
Your GraphQL schema (schema.graphql) defines how indexed data is stored and related. Think of it as your database schema. This is where you move from raw events to meaningful application data.
For a DEX, you don't just want Swap events; you want a Pool entity that aggregates volume, a Token entity with current price, and a Swap entity that ties them together.
type Pool @entity {
id: ID! # Contract address
token0: Token!
token1: Token!
feeTier: BigInt!
totalValueLockedUSD: BigDecimal!
totalVolumeUSD: BigDecimal!
swaps: [Swap!]! @derivedFrom(field: "pool")
createdAtBlock: BigInt!
}
type Token @entity {
id: ID! # Contract address
symbol: String!
decimals: Int!
pools: [Pool!]! @derivedFrom(field: "token0")
}
type Swap @entity {
id: ID! # `${transaction.hash}-${logIndex}`
pool: Pool!
sender: Bytes!
amount0: BigDecimal!
amount1: BigDecimal!
sqrtPriceX96: BigInt!
transaction: Transaction! # Derived field
timestamp: BigInt!
}
type Transaction @entity(immutable: true) {
id: ID! # Transaction hash
blockNumber: BigInt!
gasUsed: BigInt!
swaps: [Swap!]! @derivedFrom(field: "transaction")
}
Key Design Patterns:
- `@derivedFrom`: This creates a virtual relationship. The `Pool.swaps` field is not stored in the database; it's derived by querying all `Swap` entities where `Swap.pool == this Pool.id`. This keeps your data normalized.
- Immutable Entities: Use `@entity(immutable: true)` for data that never changes, like `Transaction`. This gives the indexer a significant performance boost.
- ID Strategy: The `id` field is the primary key. For a `Swap`, a composite ID using the transaction hash and log index ensures uniqueness. For a `Pool`, the contract address is the natural ID.
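The ID strategy is easy to get subtly wrong (mixed address casing is a classic source of failed lookups). A minimal sketch of the two patterns in plain TypeScript; the AssemblyScript version inside a mapping is analogous, and the example values are hypothetical:

```typescript
// Composite ID: unique per event, since logIndex is unique within a transaction.
function swapId(txHash: string, logIndex: number): string {
  return `${txHash}-${logIndex}`;
}

// Natural ID: normalize address case so every load/save uses the same key.
function poolId(contractAddress: string): string {
  return contractAddress.toLowerCase();
}

console.log(swapId("0xabc123", 7)); // "0xabc123-7"
console.log(poolId("0xC36442b4a4522E871399CD717aBDD847Ab11FE88")); // all lowercase
```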
Writing Event Handlers: From Solidity to AssemblyScript
Mappings (mapping.ts) are written in AssemblyScript (a TypeScript subset). This is your business logic. An event handler takes a raw Ethereum event and translates it into updates to your defined entities.
Here's a handleSwap handler that updates the Pool's aggregate volumes and creates a new Swap entity.
import { Swap as SwapEvent } from "../generated/UniswapV3Pool/UniswapV3Pool";
import { Pool, Swap, Token, Transaction } from "../generated/schema";
import { BigDecimal, BigInt } from "@graphprotocol/graph-ts";
export function handleSwap(event: SwapEvent): void {
// 1. Load or create the Pool entity
let pool = Pool.load(event.address.toHexString());
if (pool == null) {
pool = new Pool(event.address.toHexString());
pool.token0 = Token.load("0x...")!.id; // Placeholder: load or create the real token0 entity here
pool.token1 = Token.load("0x...")!.id; // Placeholder: load or create the real token1 entity here
pool.feeTier = BigInt.fromI32(3000); // 0.3%
pool.totalValueLockedUSD = BigDecimal.zero();
pool.totalVolumeUSD = BigDecimal.zero();
pool.createdAtBlock = event.block.number;
}
// 2. Calculate USD values (simplified - needs price oracle)
let amount0USD = event.params.amount0.toBigDecimal().times(getTokenPrice(pool.token0));
let amount1USD = event.params.amount1.toBigDecimal().times(getTokenPrice(pool.token1));
let swapVolumeUSD = amount0USD.abs().plus(amount1USD.abs());
// 3. Update the Pool's aggregate fields
pool.totalVolumeUSD = pool.totalVolumeUSD.plus(swapVolumeUSD);
pool.save();
// 4. Create the Swap entity
let swapId = event.transaction.hash.toHexString() + "-" + event.logIndex.toString();
let swap = new Swap(swapId);
swap.pool = pool.id;
swap.sender = event.params.sender;
swap.amount0 = event.params.amount0.toBigDecimal();
swap.amount1 = event.params.amount1.toBigDecimal();
swap.sqrtPriceX96 = event.params.sqrtPriceX96;
swap.timestamp = event.block.timestamp;
// 5. Link to a Transaction entity
let tx = Transaction.load(event.transaction.hash.toHexString());
if (tx == null) {
tx = new Transaction(event.transaction.hash.toHexString());
tx.blockNumber = event.block.number;
tx.gasUsed = event.transaction.gasUsed;
tx.save();
}
swap.transaction = tx.id;
swap.save();
}
Critical Note: AssemblyScript is not Node.js. You cannot make external API calls (e.g., to a price oracle) inside a handler. To get token prices, you must index an on-chain price oracle (like Chainlink) in the same subgraph. All data must originate from on-chain events.
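One derived value you can compute entirely from the event itself is price: Uniswap V3 encodes the pool price in `sqrtPriceX96`. A sketch of the conversion in plain TypeScript (the same arithmetic ports to AssemblyScript with `BigInt`/`BigDecimal`; the `Number` cast here trades precision for brevity and is fine for display values):

```typescript
// sqrtPriceX96 is sqrt(token1/token0 price) scaled by 2^96.
const Q96 = 2 ** 96; // exactly representable as a double (power of two)

function sqrtPriceX96ToPrice(sqrtPriceX96: bigint, decimals0: number, decimals1: number): number {
  const ratio = Number(sqrtPriceX96) / Q96; // precision loss acceptable for display
  // Square the ratio, then adjust for the tokens' decimal difference.
  return ratio * ratio * 10 ** (decimals0 - decimals1);
}

// Sanity check: at sqrtPriceX96 = 2^96 the raw price is exactly 1.
console.log(sqrtPriceX96ToPrice(2n ** 96n, 18, 18)); // 1
```

This gives you a pool-relative price; converting it to USD still requires an indexed oracle or a known stablecoin pair, as the note above explains.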
Deploying and Querying: From Studio to Production
Once your subgraph is written, deploy it to The Graph Studio.
- Build: `graph codegen && graph build`
- Authenticate: `graph auth --studio <your-deploy-key>`
- Deploy: `graph deploy --studio <your-subgraph-name>`
After deployment, the hosted service will begin syncing. This can take minutes to days, depending on your startBlock and event volume. Use the GraphQL Playground in Studio to test queries.
Here's a production query your frontend can run. A subgraph endpoint speaks plain GraphQL over HTTP, so a `fetch` POST is all you need here (keep viem for live on-chain reads). This fetches the top pools by volume among pools created after a given block:
const SUBGRAPH_URL = 'https://api.studio.thegraph.com/query/.../your-subgraph/v0.0.1';
const query = `
query TopPools($since: BigInt!) {
pools(
first: 10,
orderBy: totalVolumeUSD,
orderDirection: desc,
where: { createdAtBlock_gt: $since }
) {
id
token0 { symbol }
token1 { symbol }
totalValueLockedUSD
totalVolumeUSD
}
}
`;
const variables = { since: "22283296" }; // Block ~24h ago; BigInt variables are passed as strings
const res = await fetch(SUBGRAPH_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ query, variables }),
});
const { data } = await res.json();
Query Optimization: Avoiding the N+1 Trap
The GraphQL API is powerful, but inefficient queries will throttle your app. The classic pitfall is the N+1 problem. Imagine fetching 10 pools, then making a separate query for each pool's swaps. That's 11 round trips.
Bad (N+1):
query BadQuery {
pools(first: 10) {
id
# ... then frontend loops and calls...
# swaps(where: {pool: $poolId}) {...}
}
}
Good (Single Query):
query OptimizedQuery($since: BigInt!) {
pools(first: 10, orderBy: totalVolumeUSD) {
id
token0 { symbol }
swaps(first: 5, orderBy: timestamp, orderDirection: desc, where: {timestamp_gt: $since}) {
amount0
amount1
transaction { id }
}
}
}
Use filtering heavily. Want swaps in the last hour? Calculate the block timestamp and use timestamp_gt. Need a specific sender? Use sender: "0x...". Leverage the indexed fields you defined in your schema.
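One more optimization worth knowing: for deep result sets, `skip`-based pagination hits a ceiling (The Graph caps `skip`, commonly at a few thousand), so page with an `id_gt` cursor instead. A sketch of building such paged queries; the entity and field names follow the schema above, and the example IDs are hypothetical:

```typescript
// Build a cursor-paged swaps query: instead of skip/offset, filter on
// id_gt with the last ID seen, ordered by id, so every page is cheap.
function buildSwapPage(lastId: string | null, pageSize: number): string {
  const where = lastId ? `where: { id_gt: "${lastId}" }, ` : "";
  return `{
  swaps(first: ${pageSize}, ${where}orderBy: id) {
    id
    amount0
    amount1
  }
}`;
}

const firstPage = buildSwapPage(null, 1000);       // no cursor on the first page
const nextPage = buildSwapPage("0xabc-5", 1000);   // resume after the last seen ID
console.log(nextPage.includes('id_gt: "0xabc-5"')); // true
```

Each page is a single indexed lookup, so walking an entire entity table stays fast no matter how deep you go.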
The L2 and Future-Proofing Landscape
Your indexing strategy must account for the multi-chain reality. EIP-4844 (proto-danksharding) reduced L2 transaction fees by 90% since March 2024 (L2Beat gas tracker), accelerating adoption. You're not just indexing mainnet anymore.
| Chain | Avg Tx Cost | Confirmation Time | Indexing Consideration |
|---|---|---|---|
| Ethereum Mainnet | $3.50 | 12s | High gas, less event spam. startBlock critical. |
| Arbitrum One | $0.02 | 250ms | High throughput. Indexer must handle volume. |
| Base | $0.01 | ~2s | Growing rapidly. Ensure subgraph supports chain ID 8453. |
| zkSync Era | $0.05 | ~1 hour (ZK proof) | Different finality. Confirm block inclusion before indexing. |
Deploy the same subgraph to multiple networks by defining multiple dataSources in your manifest or managing separate deployments. The Graph's decentralized network is also evolving; for production applications with strict SLA requirements, evaluate the cost and reliability of the hosted service versus running your own indexer on the decentralized network.
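Graph CLI supports this multi-network workflow with a networks config file: keep one manifest and swap per-chain addresses and start blocks at build time (`graph build --network arbitrum-one`). A sketch of the file; the addresses and start blocks are placeholders you'd replace with real deployments:

```json
{
  "arbitrum-one": {
    "UniswapV3Pool": { "address": "0x0000000000000000000000000000000000000000", "startBlock": 22283296 }
  },
  "base": {
    "UniswapV3Pool": { "address": "0x0000000000000000000000000000000000000000", "startBlock": 1000000 }
  }
}
```

The data source name must match the `name` field in your manifest so the CLI knows which entry to rewrite.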
Real Error Fix: Transaction stuck: gasPrice too low — While not a subgraph error (indexing only reads the chain), any companion service in your stack that broadcasts transactions can hit this. Use EIP-1559 fees: set maxFeePerGas ≈ baseFee * 1.5 + maxPriorityFeePerGas so the transaction survives base-fee spikes. Never use legacy gasPrice on mainnet or modern L2s.
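That fee heuristic is easy to express in code. A minimal sketch in integer wei arithmetic (the 1.5× buffer matches the rule of thumb above; it is a heuristic, not a protocol requirement):

```typescript
// EIP-1559 fee sketch: buffer the current base fee by 1.5x, then add the tip.
// All values are in wei; bigint avoids floating-point rounding on fee math.
function computeMaxFeePerGas(baseFeeWei: bigint, maxPriorityFeePerGasWei: bigint): bigint {
  return (baseFeeWei * 3n) / 2n + maxPriorityFeePerGasWei; // baseFee * 1.5 + tip
}

// Example: 20 gwei base fee, 2 gwei tip -> 32 gwei max fee.
console.log(computeMaxFeePerGas(20_000_000_000n, 2_000_000_000n)); // 32000000000n
```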
Next Steps: From Prototype to Production
You've now got a subgraph indexing Uniswap V3 events. The path to production involves hardening. Add error handling in your mappings—what happens if a Token entity isn't found? Implement comprehensive testing using The Graph's test environment with mocked events. Set up a CI/CD pipeline to auto-deploy your subgraph on changes to your contract's ABI.
Monitor your subgraph's health in The Graph Studio. Is it 100% synced? What's the lag time (latest indexed block vs. chain head)? For mission-critical data, consider a fallback. Perhaps your dapp queries the subgraph first, but has a read-only contract call via viem as a backup for the most recent data not yet indexed.
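That fallback can be a small composable helper. A sketch with both data sources injected as functions, so the subgraph fetcher and the viem contract read are interchangeable (both fetchers here are hypothetical interfaces, not a specific library API):

```typescript
// Prefer indexed data; fall back to a live on-chain read if the subgraph
// errors out or returns nothing (e.g. the block isn't indexed yet).
async function getBalance(
  fromSubgraph: () => Promise<bigint | null>,
  fromChain: () => Promise<bigint>
): Promise<bigint> {
  try {
    const indexed = await fromSubgraph();
    if (indexed !== null) return indexed;
  } catch {
    // subgraph unreachable or stale; fall through to the live read
  }
  return fromChain();
}

// Usage: indexed value present, so the live read is never made.
getBalance(async () => 5n, async () => 9n).then((v) => console.log(v)); // 5n
```

In production you'd wire `fromSubgraph` to your GraphQL endpoint and `fromChain` to a viem `readContract` call, and possibly add a staleness check on the subgraph's latest indexed block.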
Finally, remember that The Graph is a tool, not a magic wand. It excels at complex historical queries and aggregate data. For real-time, sub-second state reads (e.g., "what is this user's balance right now?"), a direct contract call via an RPC provider like Alchemy (with its 95ms p50 latency) is still the right choice. The art is in combining both: the indexed history from The Graph for the UI, and the live call for the instant update. That's how you build an app that feels instant, even on a chain with over 1M active validators staking 32M+ ETH (BeaconScan, Q1 2026) securing every single event you just indexed.