The Architecture You Are Actually Auditing

To avoid Ethereum consensus changes, ERC-4337 does not attempt to create new transaction types for account-abstracted transactions. Instead, users package up the action they want their smart contract account to take in a struct named UserOperation, which is sent to a dedicated UserOperation mempool.

That design choice has profound security implications. Every component that handles the UserOperation before it reaches the chain is a trust boundary, and every trust boundary is an attack surface.

The five core actors are:

  • EntryPoint — The core of ERC-4337. It acts as the universal gateway for all smart contract wallet interactions, coordinating validation, execution, sponsorship, and bundling logic.
  • UserOperation — Transaction intent objects that describe what you want to do. Unlike regular transactions, they are sent to a separate mempool, can include custom authentication, and support gas sponsorship through Paymasters.
  • Bundler — Monitors the alternative mempool, collects multiple UserOperations, and submits them to the blockchain in a single transaction. Bundlers are critical because all Ethereum transactions ultimately need to originate from an EOA. In the ERC-4337 ecosystem, bundlers are the only participants that need EOAs.
  • Paymaster — A contract that can sponsor gas fees for users. This allows dApps to deliver seamless, “gasless” experiences where users don’t have to hold ETH to transact.
  • Account (Smart Wallet) — Replaces traditional externally owned accounts with programmable smart contract wallets. Instead of a single private key controlling an account, the account’s validation logic is defined in code, enabling features like multi-sig, social recovery, spending limits, and gasless transactions.

Understanding how these five actors interact under adversarial conditions is the entire job of the ERC-4337 security auditor.


The UserOperation Struct

Before discussing attacks, internalize the data structure that every attack targets:

struct UserOperation {
    address sender;
    uint256 nonce;
    bytes   initCode;        // factory + calldata for first-time deployment
    bytes   callData;        // what the account executes
    uint256 callGasLimit;
    uint256 verificationGasLimit;
    uint256 preVerificationGas;
    uint256 maxFeePerGas;
    uint256 maxPriorityFeePerGas;
    bytes   paymasterAndData; // paymaster address + its calldata
    bytes   signature;
}

Two gas fields depend on the actual bytes of your UserOperation: preVerificationGas covers unmetered overhead like EntryPoint calldata gas cost and parts of bundler execution — this cost changes with the signature and paymasterAndData length and content. verificationGasLimit covers on-chain verification work such as parsing and verifying signatures or paymaster checks.

Both of these fields are supplied by the user, which means they are attacker-controlled inputs. This matters enormously for bundler economics and DoS surface, as we will see.
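
Because the sender picks these numbers, the sender also picks the worst-case charge. The following is a minimal sketch of how that upper bound can be derived, assuming the v0.6 EntryPoint rule in which verificationGasLimit is counted three times when a paymaster is present. The library and function names are illustrative and not part of any published interface.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Illustrative helper (hypothetical name); not part of the EntryPoint ABI.
library PrefundSketch {
    /// Upper bound, in wei, that the account or paymaster must be able to cover.
    /// Assumption: v0.6 semantics, where verificationGasLimit is multiplied by 3
    /// with a paymaster because the same limit also bounds postOp, which may
    /// be called twice.
    function requiredPrefund(
        uint256 callGasLimit,
        uint256 verificationGasLimit,
        uint256 preVerificationGas,
        uint256 maxFeePerGas,
        bool hasPaymaster
    ) internal pure returns (uint256) {
        uint256 mul = hasPaymaster ? 3 : 1;
        uint256 requiredGas =
            callGasLimit + verificationGasLimit * mul + preVerificationGas;
        return requiredGas * maxFeePerGas;
    }
}
```

An auditor can use a helper like this in tests to check that a paymaster's maxCost checks still hold when every gas field is set to an extreme value.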


Validation vs. Execution Phase Separation

The most fundamental security property of ERC-4337 is the separation between validation and execution. The EntryPoint runs all validations first across the entire bundle, then executes.

handleOps(UserOperation[] ops)
  └─ Phase 1: Validate all ops
       ├─ account.validateUserOp(op, userOpHash, missingFunds)
       └─ paymaster.validatePaymasterUserOp(op, userOpHash, maxCost)
  └─ Phase 2: Execute all ops
       ├─ account.execute(callData)
       └─ paymaster.postOp(mode, context, actualGasCost)

This separation exists for economic security. If validation happened inside execution, a malicious account could pass validation cheaply and consume unlimited gas during execution at the bundler’s expense. By separating phases, the EntryPoint can charge for gas before execution runs.

The separation also creates a class of bugs that do not exist in EOA systems. Signing a UserOperation does not fully control the circumstances under which that action runs. The signature locks in what the account was asked to do (which contract to call, what data to pass, and the gas limits), but it does not guarantee that the call will be executed as a clean, standalone action.

This gap between “what was signed” and “how execution context looks at runtime” is the root cause of the griefing vulnerability discovered in EntryPoints before v0.9, discussed in detail later.

What Validation MUST NOT Do

The ERC-4337 specification imposes strict constraints on what code may execute during the validation phase. Bundlers simulate validation before inclusion. If validation accesses banned opcodes or out-of-scope storage, a bundler must reject the UserOperation.

Banned opcodes during validation include: BALANCE, ORIGIN, GASPRICE, BLOCKHASH, COINBASE, TIMESTAMP, NUMBER, PREVRANDAO, GASLIMIT, SELFBALANCE, BASEFEE, and CREATE2 (when called from anything other than the factory).

These restrictions exist because a bundler simulates validation off-chain and only later includes the op in a real transaction, typically in a later block. If validation reads block.timestamp, the simulated result and the on-chain result may diverge: an op that was valid in simulation can become invalid on-chain, and the bundler pays for the failed transaction without being reimbursed.

// UNSAFE: reads timestamp in validation
function validateUserOp(
    UserOperation calldata userOp,
    bytes32 userOpHash,
    uint256 missingAccountFunds
) external override returns (uint256 validationData) {
    // VIOLATION: TIMESTAMP opcode is banned in validation
    require(block.timestamp >= validFrom, "not yet valid");
    // ...
}

// CORRECT: encode time range in the return value
function validateUserOp(
    UserOperation calldata userOp,
    bytes32 userOpHash,
    uint256 missingAccountFunds
) external override returns (uint256 validationData) {
    bool sigValid = _validateSignature(userOp, userOpHash);
    // Pack validAfter and validUntil into validationData
    // EntryPoint enforces time ranges natively
    return _packValidationData(!sigValid, validUntil, validAfter);
}

Time-range validity should be expressed through the validationData return value, not through opcode reads. The low 160 bits carry the signature result or an aggregator address, and the upper 96 bits encode validUntil and validAfter as two 48-bit timestamps.
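
To make the layout concrete, here is a small decoding sketch. It assumes the v0.6 layout (authorizer in the low 160 bits, then validUntil, then validAfter) and the convention that a validUntil of zero means "no expiry". The library and struct names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Illustrative decoder (hypothetical name) for the packed validationData word.
library ValidationDataSketch {
    struct Parsed {
        address authorizer; // 0 = signature ok, 1 = signature failed, else aggregator
        uint48 validUntil;  // last timestamp at which the op is valid
        uint48 validAfter;  // first timestamp at which the op is valid
    }

    function parse(uint256 validationData) internal pure returns (Parsed memory p) {
        p.authorizer = address(uint160(validationData));      // bits 0..159
        p.validUntil = uint48(validationData >> 160);         // bits 160..207
        p.validAfter = uint48(validationData >> (160 + 48));  // bits 208..255
        if (p.validUntil == 0) {
            // Assumed convention: zero means the op never expires.
            p.validUntil = type(uint48).max;
        }
    }
}
```

Round-tripping an account's return value through a decoder like this in unit tests is a cheap way to catch swapped validUntil and validAfter arguments.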


UserOp Validation Logic and What Makes It Exploitable

ERC-4337 replaces the EOA signature model with a validateUserOp function on the account contract. That function is now the trust boundary. Every assumption downstream — that the caller is authorized, that the nonce is valid, that the paymaster isn’t being abused — flows from whatever logic is written there.

This is the core problem: a system that was previously secured by cryptography is now secured by code. And code has bugs.

Returning Instead of Reverting

The spec requires that on a signature mismatch, validateUserOp returns SIG_VALIDATION_FAILED (a specific sentinel value) rather than reverting. If the account does not support signature aggregation, it MUST validate the signature is a valid signature of the userOpHash, and SHOULD return SIG_VALIDATION_FAILED (and not revert) on signature mismatch.

A revert during simulation is treated differently from SIG_VALIDATION_FAILED. If your validation unconditionally reverts on bad signatures, some bundler implementations may misclassify the failure mode, leading to incorrect handling. The correct pattern:

uint256 constant SIG_VALIDATION_FAILED = 1;

function validateUserOp(
    UserOperation calldata userOp,
    bytes32 userOpHash,
    uint256 missingAccountFunds
) external override onlyEntryPoint returns (uint256) {
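    bytes32 hash = userOpHash.toEthSignedMessageHash();
    // Caveat: OpenZeppelin's ECDSA.recover itself reverts on malformed
    // signatures (bad length, high s); tryRecover avoids that revert path.
    address recovered = hash.recover(userOp.signature);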

    if (recovered != owner) {
        return SIG_VALIDATION_FAILED; // return, do not revert
    }

    _payPrefund(missingAccountFunds);
    return 0; // success
}

Missing onlyEntryPoint Guard

validateUserOp must only be callable by the EntryPoint. Without this guard, an attacker can call it directly to probe signature validity, drain the prefund, or manipulate validation state.

modifier onlyEntryPoint() {
    require(msg.sender == address(entryPoint), "not entrypoint");
    _;
}

Failing to Verify userOpHash

In EntryPoint v0.6, userOpHash = keccak256(abi.encode(hash(userOp), entryPointAddress, chainId)), where hash(userOp) covers every field except the signature. To prevent replay attacks, either cross-chain or with multiple EntryPoint contract versions, the signature MUST depend on chainid and the EntryPoint address.

Teams that roll their own hash computation and omit chainId or the EntryPoint address open the door to cross-chain replay.

Storage Access Violations in Validation

During validation, the account may only access its own storage and storage in other contracts that is associated with its own address (for example, its own entry in a token’s balance mapping). UserOperation storage access rules prevent operations from interfering with each other. But “global” entities, meaning paymasters and factories, are accessed by multiple UserOperations, and thus might invalidate multiple previously valid UserOperations.

A validation function that reads unrelated external state, say a flag in a shared allowlist contract that is not keyed by the account’s address, will be rejected by compliant bundlers, silently making the account unusable with the standard mempool.


Paymaster Sponsorship Abuse and Griefing Attacks

Missteps in paymaster design can not only break gas sponsorship flows, but also expose their deposited ETH in the EntryPoint to exploitation or griefing.

The Paymaster interface involves two hooks:

interface IPaymaster {
    function validatePaymasterUserOp(
        UserOperation calldata userOp,
        bytes32 userOpHash,
        uint256 maxCost
    ) external returns (bytes memory context, uint256 validationData);

    function postOp(
        PostOpMode mode,
        bytes calldata context,
        uint256 actualGasCost
    ) external;
}

The Stake Draining Attack

If a sponsored op passes validation but then reverts during execution, the Paymaster still pays the gas. It is therefore critical to simulate carefully and enforce strict checks in validatePaymasterUserOp.

An attacker who can produce UserOps that pass validatePaymasterUserOp but consistently fail during execution can drain the Paymaster’s deposit at zero cost to themselves. The anatomy of a vulnerable Paymaster:

// VULNERABLE: no binding between signature and callData
function validatePaymasterUserOp(
    UserOperation calldata userOp,
    bytes32 userOpHash,
    uint256 maxCost
) external override returns (bytes memory context, uint256 validationData) {
    // Only checks that the user is in a whitelist
    require(whitelist[userOp.sender], "not whitelisted");
    return ("", 0); // passes; execution can still fail arbitrarily
}

A whitelisted address can manufacture UserOps with callData that always reverts, burning the Paymaster’s deposit. Mitigations:

  1. Bind the Paymaster’s signature to the op’s sender, nonce, callData, and gas limits. It cannot sign userOpHash itself, because that hash commits to paymasterAndData, which contains the signature.
  2. Encode a max gas cost in the off-chain signature so the Paymaster cannot be used for unexpectedly expensive operations.
  3. Use rate limiting keyed to userOp.sender.

// SAFER: signature binds the op's fields and a cost cap
function validatePaymasterUserOp(
    UserOperation calldata userOp,
    bytes32 /* userOpHash */,
    uint256 maxCost
) external override returns (bytes memory context, uint256 validationData) {
    (uint256 allowedMaxCost, bytes memory sig) =
        abi.decode(userOp.paymasterAndData[20:], (uint256, bytes));

    require(maxCost <= allowedMaxCost, "gas limit exceeded");

    // userOpHash is unusable as the signed message: it commits to
    // paymasterAndData, which contains this signature. _hashOpFields is a
    // hypothetical helper that hashes sender, nonce, initCode, callData,
    // and every gas field.
    bytes32 hash = keccak256(abi.encode(
        _hashOpFields(userOp), allowedMaxCost, block.chainid, address(this)
    )).toEthSignedMessageHash();

    address signer = hash.recover(sig);
    require(signer == paymasterSigner, "invalid paymaster sig");

    return (abi.encode(userOp.sender, allowedMaxCost), 0);
}

postOp Reentrancy and Double Charging

The postOp hook runs after execution. If postOp reverts, the EntryPoint calls it a second time with PostOpMode.postOpReverted. A Paymaster that does an ERC-20 transfer in postOp can be targeted: if the transfer reverts due to a re-entrancy guard or insufficient balance, the double postOp can leave accounting in an inconsistent state.

A revert in the initial postOp() call can revert the entire bundle, even though a second postOp() call is expected to be made.

The safest pattern is to collect the full maximum payment during validatePaymasterUserOp, so that the Paymaster is whole before execution begins, and to use postOp only to refund any excess.

The Reputation and Staking System

The ERC-4337 specification describes a paymaster reputation scoring and throttling system. Because a paymaster’s storage is shared across all the operations in a bundle that use that paymaster, the actions of one validatePaymasterUserOp could potentially cause validation to fail for numerous other user ops in the same bundle — a Denial-of-Service attack.

To prevent a malicious paymaster from creating multiple instances of itself (a Sybil attack), paymasters are required to stake ETH. Paymaster stakes are never slashed and can be withdrawn once the unstake delay has elapsed; the stake exists to make a potential attacker lock up a non-trivial amount of capital, which deters malicious behavior.


Bundler DoS via Gas Estimation Manipulation

Bundlers assume the gas cost risk and are reimbursed by user accounts or Paymasters. This creates an economic attack vector: if a UserOperation passes off-chain simulation but fails or consumes far more gas on-chain, the bundler loses money.

The Simulation / Execution Divergence Attack

Bundlers must pre-simulate using debug_traceCall before including any UserOp. An attacker can craft a validation function that behaves differently under simulation versus real execution by reading state that changes between the two:

// ADVERSARIAL account that passes simulation but fails on-chain
function validateUserOp(
    UserOperation calldata userOp,
    bytes32 userOpHash,
    uint256 missingAccountFunds
) external override returns (uint256) {
    // Reads a mapping the attacker can mutate between simulation and inclusion
    require(permissions[msg.sender], "not permitted");
    // attacker calls permissions[entryPoint] = false in a front-run tx
    return 0;
}

To prevent abuse, the spec throttles down (or completely bans for a period of time) any entity that causes invalidation of a large number of UserOperations in the mempool. To prevent Sybil attacks, entities are required to stake with the system, making such DoS attacks very expensive.
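
The throttling decision can be pictured as a simple function of how many ops a bundler has seen from an entity versus how many were actually included. The sketch below is written in Solidity only for consistency with the rest of this article; real bundlers track this off-chain. The three constants are assumptions chosen for illustration (they follow the style of the ERC-7562 reputation rules), and the library name is hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Illustrative reputation check (hypothetical name and constants).
library ReputationSketch {
    enum Status { OK, THROTTLED, BANNED }

    // Assumed illustrative values; check the current ERC-7562 text before relying on them.
    uint256 internal constant MIN_INCLUSION_RATE_DENOMINATOR = 10;
    uint256 internal constant THROTTLING_SLACK = 10;
    uint256 internal constant BAN_SLACK = 50;

    /// opsSeen: ops referencing the entity that entered the mempool.
    /// opsIncluded: ops referencing the entity that made it on-chain.
    function status(uint256 opsSeen, uint256 opsIncluded)
        internal
        pure
        returns (Status)
    {
        uint256 maxSeen = opsSeen / MIN_INCLUSION_RATE_DENOMINATOR;
        if (maxSeen <= opsIncluded + THROTTLING_SLACK) return Status.OK;
        if (maxSeen <= opsIncluded + BAN_SLACK) return Status.THROTTLED;
        return Status.BANNED;
    }
}
```

Under these assumed constants, an entity with many ops seen and almost none included moves from OK to THROTTLED to BANNED as the gap widens, which is the behavior the adversarial account above would trigger.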

preVerificationGas Underestimation

The preVerificationGas field covers unmetered overhead like EntryPoint calldata gas cost and parts of bundler execution. This cost changes with the signature and paymasterAndData length and content.

An attacker can submit a UserOp with a preVerificationGas value that is accurate at simulation time but underestimates real on-chain cost — for example, by using a shorter dummy paymasterAndData during estimation than will be used on-chain. This leaves the bundler partially unreimbursed.

To mitigate this risk, bundlers should ensure that at least one of the enforceable gas limits is not completely consumed during simulation so there is enough buffer to cover the overhead. In practice, they should choose the user’s verificationGasLimit to guarantee that the simulated buffer amount is reproduced on-chain.
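
To see why the content and length of signature and paymasterAndData matter, consider the calldata component alone. This is a rough sketch assuming the EIP-2028 prices of 4 gas per zero byte and 16 gas per non-zero byte; it ignores every other part of preVerificationGas and any later calldata repricing. The library name is hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Illustrative calldata cost estimator (hypothetical name).
library CalldataCostSketch {
    uint256 internal constant ZERO_BYTE_GAS = 4;     // assumed EIP-2028 price
    uint256 internal constant NONZERO_BYTE_GAS = 16; // assumed EIP-2028 price

    /// Sums the per-byte calldata gas for an arbitrary byte string.
    function calldataGas(bytes memory data) internal pure returns (uint256 gasCost) {
        for (uint256 i = 0; i < data.length; i++) {
            gasCost += (data[i] == bytes1(0)) ? ZERO_BYTE_GAS : NONZERO_BYTE_GAS;
        }
    }
}
```

Comparing calldataGas for the dummy paymasterAndData used during estimation against the value actually submitted shows the size of the shortfall a bundler would absorb.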


Signature Aggregation Security

Signature aggregation is a widely discussed feature of ERC-4337: it can reduce calldata costs on L2s through signature compression, and it amortizes the gas cost of one aggregated validation check across a bundle of operations.

The aggregation extension introduces a new entity, an aggregator, that is called during validation to validate multiple UserOperations at once. This enables UserOperations to share validation inputs, saving gas and guaranteeing atomicity of the bundle.

Why Aggregators Are High-Risk

The aggregator contracts are among the most trusted contracts in the entire ecosystem. They can authorize transactions on behalf of accounts, and they can invalidate large numbers of transactions with a simple storage change. Both account developers and block builders should be extremely careful with the selection of aggregator contracts they are willing to support.

The trust model is stark: if the aggregator is compromised, it can construct a valid aggregate signature over arbitrary UserOperations, authorizing fund transfers from every account that uses that aggregator.

BLS Aggregation Pitfalls

BLS signature aggregation is the primary use case. Its critical vulnerability class is the rogue key attack: a malicious participant in a multi-party BLS signing session can supply a public key chosen to cancel out other participants’ contributions.

interface IAggregator {
    // Called on-chain by the EntryPoint (from handleAggregatedOps) to validate
    // the aggregate signature for a batch; MUST revert if it does not match
    function validateSignatures(
        UserOperation[] calldata userOps,
        bytes calldata signature
    ) external view;

    // Called off-chain by the bundler, per op, to validate that op's own
    // signature before it is aggregated
    function validateUserOpSignature(
        UserOperation calldata userOp
    ) external view returns (bytes memory sigForUserOp);
}

Security requirements for aggregator implementations:

  1. Proof of possession must be verified for each public key before it participates in aggregation.
  2. The aggregate must bind to the entire userOpHash array — not a subset.
  3. The bundler should first verify the aggregator is not throttled or banned according to ERC-7562 rules before calling validateUserOpSignature() to validate the UserOperation signature.

Unmetered Aggregator Calls

In the current EntryPoint, the call to the signature aggregator’s validate function is unmetered. This means the bundler is required to find a means to charge aggregated UserOperations for this gas, similar to the L2 calldata cost problem. The only reasonable way to do this is to increase preVerificationGas.

A malicious aggregator that performs unbounded computation in validateSignatures can thus exhaust a bundler’s gas budget without the cost appearing in any user-visible gas field.


Session Keys and Guardians: Programmable Auth as Attack Surface

Session keys and social recovery guardians are the two features most likely to introduce critical vulnerabilities in smart account implementations.

Session Key Risks

Session key enforcement happens in validateUserOp() via wallet-specific logic such as a plugin or auth module; it is not natively supported by ERC-4337.

A session key is a delegated signing key with constrained permissions — for example, “can call swap() on Uniswap with at most 0.1 ETH, for the next 24 hours.” The security of session keys depends entirely on how tightly those constraints are encoded and enforced.

Common session key vulnerabilities:

Scope creep via callData parsing: If the session key module parses callData to check the target and function selector but does not validate the full ABI-decoded arguments, an attacker with a session key authorized for transfer(alice, 100) may craft callData that passes the selector check while encoding a different recipient or amount.

// VULNERABLE: only checks selector, not arguments
function _validateSessionKeyOp(
    UserOperation calldata userOp,
    SessionKey memory key
) internal view returns (bool) {
    bytes4 selector = bytes4(userOp.callData[:4]);
    require(selector == key.allowedSelector, "wrong selector");
    // Does NOT validate the arguments — attacker can pass arbitrary to/amount
    return true;
}

// SAFER: decode and validate full calldata
function _validateSessionKeyOp(
    UserOperation calldata userOp,
    SessionKey memory key
) internal view returns (bool) {
    bytes4 selector = bytes4(userOp.callData[:4]);
    require(selector == TRANSFER_SELECTOR, "wrong selector");

    (address to, uint256 amount) =
        abi.decode(userOp.callData[4:], (address, uint256));

    require(to == key.allowedRecipient, "wrong recipient");
    require(amount <= key.maxAmount, "amount too high");
    // Do NOT check expiry with block.timestamp here: TIMESTAMP is banned in
    // validation. Return key.expiry as validUntil in validationData instead.
    return true;
}

Missing revocation propagation: Session keys stored in off-chain mappings or module storage must be revocable. If a key is revoked in one module but the account’s validateUserOp still routes to an old module, the revocation is ineffective.

Incorrect operation type checks: Incomplete validation of operation types in session verification has been reported as a real-world finding — specifically, a session key authorized for one operation type being accepted for a different type by a validation function that checks presence but not type equality.

Guardian Recovery Risks

Social recovery guardians introduce a time-delayed privileged operation. The canonical attack is guardian griefing:

If the recovery module uses a recoveryInitiatedAt timestamp with a fixed window, a single malicious guardian can prevent recovery indefinitely at the cost of gas — by repeatedly triggering and then cancelling recovery, resetting the timer each time.

Mitigations include:

  • Using a monotonically incrementing recovery nonce so cancellations do not reset a timer that an honest requester must re-wait.
  • Requiring a guardian supermajority to cancel, not just any single guardian.
  • Storing the initiating guardian set in the recovery request so substitution attacks are impossible.

struct RecoveryRequest {
    bytes32   guardianSetHash; // hash of the guardian set at initiation
    address   newOwner;
    uint256   initiatedAt;
    uint256   recoveryNonce;  // prevents reset griefing
}

Replay Protection in UserOperations

Nonce Structure

In the Ethereum protocol, the sequential transaction nonce is used as a replay protection method and to determine valid transaction ordering. However, requiring a single sequential nonce value is limiting to the senders’ ability to define their custom logic regarding transaction ordering and replay protection.

ERC-4337 addresses this with a 2D nonce: nonce = (key << 64) | sequence. The key is a 192-bit namespace; the sequence is a 64-bit counter within that namespace. Different keys can advance independently, enabling parallel operation lanes and batch-cancel patterns.
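
The bit layout can be sketched as a pair of pure helpers. The library and function names are hypothetical; only the 192-bit key and 64-bit sequence split comes from the text above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Illustrative helpers (hypothetical names) for the 2D nonce layout.
library NonceSketch {
    /// Combines a 192-bit key and a 64-bit sequence into one uint256 nonce.
    function pack(uint192 key, uint64 sequence) internal pure returns (uint256) {
        return (uint256(key) << 64) | uint256(sequence);
    }

    /// Splits a nonce back into its key (high 192 bits) and sequence (low 64 bits).
    function unpack(uint256 nonce)
        internal
        pure
        returns (uint192 key, uint64 sequence)
    {
        key = uint192(nonce >> 64);
        sequence = uint64(nonce); // explicit conversion keeps the low 64 bits
    }
}
```

When auditing a session-key or module design that assigns its own keys, it is worth checking that two features never share a key, since they would then share one sequence counter and block each other.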

A critical mistake is implementing validation without using the EntryPoint-provided userOpHash:

// WRONG: rolls own hash, omits chainId
function validateUserOp(...) external override returns (uint256) {
    bytes32 hash = keccak256(abi.encodePacked(
        userOp.sender,
        userOp.nonce,
        userOp.callData
    )); // no chainId, no entryPoint address
    // ...
}

Cross-Chain Replay

If an account signs a hash that omits the chain ID, an attacker who captures a valid UserOperation on Arbitrum can replay it on mainnet whenever the same account is deployed at the same address, which is common with CREATE2. The EntryPoint’s nonce check does not help here: nonces are tracked per EntryPoint deployment on each chain, so a nonce consumed on Arbitrum is still unused on mainnet.

The fix is using the EntryPoint-provided userOpHash, which includes chain ID and EntryPoint address.

Cross-Account Replay

In most naive implementations of isValidSignature() — for example, one that checks that some owner has signed a bytes32 hash — signatures can be replayed across accounts with shared owners. The hash needs to be altered by the account using EIP-712 so that a signature is only valid for a specific wallet.

The correct approach:

function _validateSignature(
    UserOperation calldata userOp,
    bytes32 userOpHash
) internal view returns (bool) {
    // userOpHash is already bound to: sender, nonce, callData,
    // chainId, and entryPoint address.
    // Do NOT re-hash it with a stripped-down version.
    bytes32 hash = userOpHash.toEthSignedMessageHash();
    return hash.recover(userOp.signature) == owner;
}

Transient Storage Hazards

Contracts using EIP-1153 transient storage must take into account that ERC-4337 allows multiple UserOperations from different unrelated sender addresses to be included in the same underlying transaction. The transient storage MUST be cleaned up manually if it contains sensitive information or is used for access control.

A module that uses transient storage as a reentrancy lock may leave that lock set across UserOperations in the same bundle, unintentionally blocking a second user’s operation.
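
One way to avoid a stale lock is to clear the transient slot explicitly before the guarded call returns. The following is a minimal sketch, assuming Solidity 0.8.24 or later with a Cancun EVM target (required for tload and tstore in inline assembly). The contract name, modifier name, and slot label are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

/// Illustrative transient reentrancy lock (hypothetical names).
abstract contract TransientLockSketch {
    // Hypothetical slot label; any collision-free constant would do.
    bytes32 private constant LOCK_SLOT =
        keccak256("sketch.transient.reentrancy.lock");

    modifier transientLock() {
        // Copy to a local: inline assembly cannot read this constant directly.
        bytes32 slot = LOCK_SLOT;
        uint256 locked;
        assembly {
            locked := tload(slot)
        }
        require(locked == 0, "locked");
        assembly {
            tstore(slot, 1)
        }
        _;
        // Clear explicitly. Transient storage otherwise lives until the end of
        // the whole handleOps transaction, not the end of this UserOperation.
        assembly {
            tstore(slot, 0)
        }
    }
}
```

The pattern to look for in review is the opposite one: a flag set in one call (for example during validation) and cleared in another, where a failure in between leaves the flag set for the next sender in the bundle.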


The EntryPoint v0.9 Griefing Vulnerability

This real-world vulnerability illustrates how the gap between validation and execution can be exploited even when signatures are perfect.

EntryPoint versions before v0.9 allow a griefing attack where an attacker creates a temporary “must revert” condition in a target contract — for example, by tripping a reentrancy guard — then forces a victim’s valid UserOperation to execute while that condition is still active.

The UserOperation is correctly signed and validates normally, but it is executed at the wrong moment, so the inner call reverts and the victim (or their paymaster) still pays the gas.

For the issue to be exploitable, the attacker must be able to observe the UserOperation before it executes: either while UserOperations are propagated through the ERC-4337 off-chain mempool, or when a bundler submits a handleOps transaction containing the full UserOperation payload to the public Ethereum mempool.

The mitigation was implemented in EntryPoint v0.9, which enforces that handleOps and handleAggregatedOps can only be invoked by externally owned accounts in a top-level transaction context. This change prevents UserOperations from being executed within attacker-controlled call frames, removing the conditions required for the described griefing scenario.

The broader lesson: the ERC-4337 execution model creates ordering and context dependencies that do not exist in direct EOA transactions. It enables smart accounts, paymasters, batching, and flexible validation — but it also means there is now a path between the user signing intent and the target contract call actually running, and that path has security properties of its own.


How Auditing an ERC-4337 System Differs from Auditing a Standard Contract

A standard DeFi audit reviews contract logic in isolation. An ERC-4337 audit must cover the full UserOperation lifecycle — validation logic, return value packing, storage access restrictions during simulation, paymaster economics, module installation access control, and recovery flow griefing vectors. The bundler simulation model adds constraints that do not exist in traditional contract execution.

Specifically, an ERC-4337 audit must reason about:

1. Off-chain / on-chain divergence. The bundler simulates before including. Any state that changes between simulation and inclusion — including block properties, external contract storage, and oracle prices — is a potential divergence point.

2. Return value semantics. In standard Solidity, a function either succeeds or reverts. In ERC-4337 validation, return values carry packed semantic meaning (sigFailed, validUntil, validAfter). A validation function that returns 1 is telling the EntryPoint “invalid signature” not “success with exit code 1.” Misunderstanding this packing is a recurring bug class.

// Packing validation return value correctly
function _packValidationData(
    bool sigFailed,
    uint48 validUntil,
    uint48 validAfter
) internal pure returns (uint256) {
    return (sigFailed ? 1 : 0) |
           (uint256(validUntil) << 160) |
           (uint256(validAfter) << (160 + 48));
}

3. Module system isolation. Accounts using plugin or module architectures (ERC-6900, ERC-7579) must isolate module storage and execution. A malicious module must not be able to hijack validateUserOp on the root account, access sibling module storage, or escalate its own permissions through callback patterns.

4. Upgrade safety. Most ERC-4337 smart contract accounts are expected to be upgradeable, either via on-chain delegate proxy contracts or via EIP-7702. When changing the underlying implementation, all accounts must ensure there are no conflicts in the storage layout of the two contracts. One common approach is “diamond storage” as described in ERC-7201.

5. Economic simulation. The auditor must evaluate the Paymaster’s economics under adversarial conditions: what does it cost to drain the deposit, and is that cost greater than the deposit itself?

6. Bundler interaction model. Standard contract audits do not touch infrastructure software. In ERC-4337 audits, the bundler’s simulation rules, reputation system, and gas reimbursement assumptions must all be verified against the implementation.


ERC-4337 Security Audit Checklist

EntryPoint Interaction

  • validateUserOp is gated with onlyEntryPoint
  • validateUserOp returns `SIG_VALIDATION