Stateless Executor Benchmark Guide

This guide explains how to run benchmarks with the stateless-executor guest program. The stateless-executor is the primary benchmarking workload: it runs Ethereum stateless block execution inside zkVM environments, without the pre- and post-execution validation overhead.


TLDR: Quick Start

For those who want to get started immediately with default settings:

# 1. Clone and navigate to the repository
git clone https://github.com/NethermindEth/zkevm-benchmark-workload.git
cd zkevm-benchmark-workload
git checkout master-se-docs-03  # Use this branch until merged to master
 
# 2. Download EEST test fixtures
./scripts/download-and-extract-fixtures.sh benchmark@v0.0.4
 
# 3. Generate gas-categorized witness files
RAYON_NUM_THREADS=4 ./scripts/generate-gas-categorized-fixtures.sh
 
# 4. Run benchmarks (defaults: risc0 zkVM, reth execution client, GPU)
./scripts/run-gas-categorized-benchmarks.sh
Default Configuration:
  • zkVM: RISC0
  • Execution Client: Reth
  • Resource: GPU
  • Gas Categories: 1M, 10M, 30M, 45M, 60M, 100M, 150M
  • Action: prove

Prerequisites

Before running benchmarks, ensure you have the following installed:

Required Software

| Software | Minimum Version | Purpose |
|----------|-----------------|---------|
| Docker | 20.10+ | Required for EreDockerized zkVM compilation |
| Rust | 1.70+ | Building the benchmark runner |
| Git | 2.0+ | Cloning the repository |
| Python | 3.8+ | Running analysis scripts |
| jq | 1.6+ | JSON processing in shell scripts |
| curl | 7.0+ | Downloading fixtures |

Hardware Requirements

| Resource | Minimum | Recommended |
|----------|---------|-------------|
| CPU | 8 cores | 16+ cores |
| RAM | 16 GB | 32+ GB |
| GPU (optional) | NVIDIA with CUDA | RTX 3080+ or A100 |
| Disk Space | 50 GB | 100+ GB |

Verify Prerequisites

# Check Docker
docker --version
docker info  # Ensure Docker daemon is running
 
# Check Rust
rustc --version
cargo --version
 
# Check Python
python3 --version
 
# Check other tools
jq --version
curl --version
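
To run all of these checks in one pass, here is a minimal shell sketch (the tool list mirrors the table above); it exits on the first missing binary:

# Fail fast if any required tool is missing (minimal sketch)
for tool in docker rustc cargo python3 jq curl git; do
  command -v "$tool" >/dev/null 2>&1 || { echo "Missing required tool: $tool"; exit 1; }
done
echo "All required tools found"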

Complete Workflow

The benchmarking process consists of four main stages:

┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  1. Download     │───▶│  2. Generate     │───▶│  3. Execute      │───▶│  4. Analyze      │
│     Fixtures     │    │     Witnesses    │    │     Benchmarks   │    │     Results      │
└──────────────────┘    └──────────────────┘    └──────────────────┘    └──────────────────┘

Stage 1: Download EEST Fixtures

Download the Ethereum Execution Spec Test (EEST) fixtures that serve as the base test data.

# Download default fixtures (benchmark@v0.0.6)
./scripts/download-and-extract-fixtures.sh
 
# Download latest official release
./scripts/download-and-extract-fixtures.sh latest
 
# Download specific version
./scripts/download-and-extract-fixtures.sh v5.0.0
 
# Download to custom directory
./scripts/download-and-extract-fixtures.sh latest ./my-fixtures

Output: Creates ./zkevm-fixtures/ directory containing blockchain test fixtures.
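
A quick sanity check of the download, assuming the layout shown in the Output Structure section at the end of this guide:

# Confirm fixtures extracted where expected
ls ./zkevm-fixtures/fixtures/blockchain_tests/benchmark | head
find ./zkevm-fixtures -name '*.json' | wc -l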

Stage 2: Generate Gas-Categorized Fixtures

Transform the EEST fixtures into BlockAndWitness JSON files organized by gas categories.

# Generate with default gas categories (1M, 10M, 30M, 45M, 60M, 100M, 150M)
./scripts/generate-gas-categorized-fixtures.sh
 
# Generate with custom gas categories
./scripts/generate-gas-categorized-fixtures.sh -g 1M,10M,30M
 
# Use local EEST fixtures instead of downloading
./scripts/generate-gas-categorized-fixtures.sh -e ./my-local-eest-fixtures
 
# Preview commands without executing
./scripts/generate-gas-categorized-fixtures.sh --dry-run

Output: Creates directories like ./zkevm-fixtures-input-1M/, ./zkevm-fixtures-input-10M/, etc.
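
To confirm the generation step produced witnesses for every category, a small sketch that counts files per directory:

# Count generated witness files per gas category
for dir in ./zkevm-fixtures-input-*; do
  echo "$dir: $(find "$dir" -name '*.json' | wc -l) files"
done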

Stage 3: Execute Benchmarks

Run the stateless-executor benchmarks across your chosen configuration.

# Run all gas categories with defaults
./scripts/run-gas-categorized-benchmarks.sh
 
# Run with specific zkVM
./scripts/run-gas-categorized-benchmarks.sh -z sp1
./scripts/run-gas-categorized-benchmarks.sh -z risc0
./scripts/run-gas-categorized-benchmarks.sh -z openvm
 
# Run with specific execution client
./scripts/run-gas-categorized-benchmarks.sh -e ethrex
 
# Run on CPU instead of GPU
./scripts/run-gas-categorized-benchmarks.sh -r cpu
 
# Run only specific gas categories
./scripts/run-gas-categorized-benchmarks.sh -c 10M
./scripts/run-gas-categorized-benchmarks.sh -c 1M,10M,30M
 
# Enable memory tracking
./scripts/run-gas-categorized-benchmarks.sh -m
 
# Force rerun (bypass cached results)
./scripts/run-gas-categorized-benchmarks.sh -f
 
# Execute only (skip proving)
./scripts/run-gas-categorized-benchmarks.sh -a execute
 
# Custom input directory (default: ./zkevm-fixtures-input)
./scripts/run-gas-categorized-benchmarks.sh -i ./my-fixtures
 
# Custom output directory (default: ./zkevm-metrics)
./scripts/run-gas-categorized-benchmarks.sh -o ./my-metrics
 
# Custom input and output directories
./scripts/run-gas-categorized-benchmarks.sh -i ./my-fixtures -o ./my-metrics
 
# Preview commands without executing
./scripts/run-gas-categorized-benchmarks.sh -n

Output: Creates directories like ./zkevm-metrics-risc0-1M/, ./zkevm-metrics-sp1-10M/, etc.
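
Before analyzing, you can spot-check which test cases produced a successful proving record. This sketch assumes the metrics schema shown under Metrics JSON Structure below; the directory name is only an example:

# List test cases whose proving succeeded
for f in ./zkevm-metrics-risc0-1M/*.json; do
  jq -r 'select(.proving.success != null) | .name' "$f"
done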

Stage 4: Analyze Results

Generate reports and compare benchmark results.

# Compare execution metrics between two runs
python3 scripts/compare_executions.py zkevm-metrics-risc0-1M zkevm-metrics-sp1-1M
 
# Compare proving times across all zkVMs (uses human-readable names by default)
python3 scripts/compare_proving_times.py
 
# Compare proving times with raw benchmark names (disable formatting)
python3 scripts/compare_proving_times.py --no-format
 
# Filter by zkVM or benchmark name
python3 scripts/compare_proving_times.py --zkvm sp1
python3 scripts/compare_proving_times.py --benchmark create
 
# Export proving times comparison to markdown
python3 scripts/compare_proving_times.py -o results/
 
# Convert metrics to markdown (uses human-readable names by default)
python3 scripts/convert-metrics-to-markdown.py zkevm-metrics-sp1-1M
 
# Convert multiple metrics directories
python3 scripts/convert-metrics-to-markdown.py zkevm-metrics-*
 
# Specify output directory for markdown files
python3 scripts/convert-metrics-to-markdown.py -o markdown-reports zkevm-metrics-sp1-1M

Scripts Reference

Setup Scripts

download-and-extract-fixtures.sh

Downloads and extracts Ethereum Execution Spec Test fixtures.

| Option | Description |
|--------|-------------|
| [TAG] | EEST release tag (e.g., v5.0.0, latest) |
| [DEST_DIR] | Destination directory (default: ./zkevm-fixtures) |
Environment Variables:
  • GITHUB_TOKEN: Optional. Provides authenticated API access to avoid rate limits.
# With authentication (recommended for CI/CD)
export GITHUB_TOKEN=ghp_xxxxxxxxxxxx
./scripts/download-and-extract-fixtures.sh latest

generate-gas-categorized-fixtures.sh

Generates witness files organized by gas categories.

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| --gas | -g | Comma-separated gas categories | 1M,10M,30M,45M,60M,100M,150M |
| --eest-fixtures-path | -e | Path to local EEST fixtures | - |
| --dry-run | - | Preview commands only | false |
| --help | -h | Show help message | - |

Gas Category Format: Use decimal values with an M suffix (e.g., 0.5M, 1M, 2.5M, 10M).

# Examples
./scripts/generate-gas-categorized-fixtures.sh -g 0.5M,1M,2.5M
./scripts/generate-gas-categorized-fixtures.sh -e ./eest-fixtures -g 10M,30M

Benchmark Execution Scripts

run-gas-categorized-benchmarks.sh

Main script for running stateless-executor benchmarks.

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| --dry-run | -n | Preview commands only | false |
| --force-rerun | -f | Force rerun of benchmarks | false |
| --action | -a | Action: prove or execute | prove |
| --resource | -r | Resource: gpu or cpu | gpu |
| --guest | -g | Guest program type | stateless-executor |
| --zkvm | -z | zkVM implementation | risc0 |
| --execution-client | -e | Execution client | reth |
| --input-dir | -i | Base input directory | ./zkevm-fixtures-input |
| --output-dir | -o | Base output directory | ./zkevm-metrics |
| --gas-categories | -c | Gas categories to run | 1M,10M,30M,45M,60M,100M,150M |
| --memory-tracking | -m | Enable memory tracking | false |
Supported zkVMs:
  • risc0 - RISC0 zkVM (default)
  • sp1 - SP1 zkVM
  • openvm - OpenVM zkVM
  • pico - Pico zkVM
  • zisk - Zisk zkVM
  • airbender - Airbender zkVM
Supported Execution Clients:
  • reth - Reth (default)
  • ethrex - Ethrex
# Full example with all options
./scripts/run-gas-categorized-benchmarks.sh \
  -z sp1 \
  -e reth \
  -r gpu \
  -a prove \
  -c 1M,10M \
  -m \
  -f

Analysis Scripts

compare_executions.py

Compares execution cycle counts between two benchmark runs.

python3 scripts/compare_executions.py <baseline_folder> <optimized_folder>
 
# Example
python3 scripts/compare_executions.py zkevm-metrics-risc0-1M zkevm-metrics-sp1-1M
Output includes:
  • Speedup table by region (verify_witness, post_state_compute, validation, etc.)
  • Total cycle counts comparison
  • Statistical analysis with best/worst performers
  • Key findings summary

compare_provings.py

Compares proving time metrics between two runs.

python3 scripts/compare_provings.py <baseline_folder> <optimized_folder>
 
# Example
python3 scripts/compare_provings.py zkevm-metrics-risc0-10M zkevm-metrics-sp1-10M
Output includes:
  • Proving time speedup analysis
  • Time savings calculations (in seconds)
  • Efficiency gain reporting
  • Top performers identification

analyze_opcode_traces.py

Analyzes bytecode and opcode execution traces.

python3 scripts/analyze_opcode_traces.py [OPTIONS]
 
# Options
--fixtures-dir <DIR>   Directory with test fixtures
--output <DIR>         Output directory for reports
--test-case <NAME>     Specific test case to analyze
 
# Example
python3 scripts/analyze_opcode_traces.py --fixtures-dir ./zkevm-fixtures --output ./opcode-analysis

Troubleshooting

Stage 1: Download Fixtures

| Problem | Symptoms | Solution |
|---------|----------|----------|
| GitHub rate limit | API rate limit exceeded error | Set GITHUB_TOKEN environment variable |
| Network timeout | Download stalls or fails | Retry with curl --retry 3 or use a VPN |
| Asset not found | Asset not found in release error | Check the release tag exists on GitHub |
| Disk space | Extraction fails | Ensure 10+ GB free space |
# Fix rate limit
export GITHUB_TOKEN=ghp_your_token_here
./scripts/download-and-extract-fixtures.sh
 
# Check disk space
df -h .

Stage 2: Generate Fixtures

| Problem | Symptoms | Solution |
|---------|----------|----------|
| Missing EEST fixtures | Fixtures directory not found | Run download script first |
| Invalid gas format | Invalid gas value format error | Use format like 1M, 0.5M, 10M |
| Build failure | Cargo compilation errors | Check Rust version, run rustup update |
| Memory exhaustion | Process killed | Reduce parallel jobs or increase RAM |
# Verify fixtures exist
ls -la ./zkevm-fixtures/
 
# Check Rust installation
rustup show
rustup update
 
# Build with verbose output
cargo build --release --bin witness-generator-cli -v

Stage 3: Execute Benchmarks

| Problem | Symptoms | Solution |
|---------|----------|----------|
| Docker not running | Cannot connect to Docker daemon | Start Docker: sudo systemctl start docker |
| GPU not detected | CUDA errors or fallback to CPU | Check nvidia-smi, use the -r cpu flag |
| Input not found | Input directory not found | Run fixture generation first |
| Out of memory | Process killed, OOM errors | Use smaller gas categories or more RAM |
| Build failure | EreDockerized compilation errors | Check Docker disk space, restart Docker |
# Check Docker
docker info
docker ps
 
# Check GPU
nvidia-smi
 
# Check input directories
ls -la zkevm-fixtures-input-*
 
# Run with CPU if GPU issues
./scripts/run-gas-categorized-benchmarks.sh -r cpu
 
# Run smaller gas category first
./scripts/run-gas-categorized-benchmarks.sh -c 1M

Stage 4: Analysis

| Problem | Symptoms | Solution |
|---------|----------|----------|
| No common files | No common files found | Ensure both folders have matching test names |
| Import error | ModuleNotFoundError | Run from scripts directory or fix PYTHONPATH |
| Empty results | No data in output | Check JSON files exist in metrics folders |
# Check metrics files exist
ls -la zkevm-metrics-*/
 
# Run from correct directory
cd /path/to/zkevm-benchmark-workload
python3 scripts/compare_executions.py ...
 
# Verify JSON structure
cat zkevm-metrics-risc0-1M/some-test.json | jq .

Automation & Best Practices

Environment Configuration

Create a .env file for consistent configuration:

# .env
export GITHUB_TOKEN=ghp_xxxxxxxxxxxx
export ZKVM=risc0
export EXECUTION_CLIENT=reth
export GAS_CATEGORIES=1M,10M,30M
export RESOURCE=gpu

Load before running:

source .env
./scripts/run-gas-categorized-benchmarks.sh -z $ZKVM -e $EXECUTION_CLIENT -c $GAS_CATEGORIES -r $RESOURCE

Batch Processing Multiple zkVMs

#!/bin/bash
# run-all-zkvms.sh
 
ZKVMS=("risc0" "sp1")
GAS_CATEGORIES="1M,10M"
 
for zkvm in "${ZKVMS[@]}"; do
    echo "Running benchmarks for $zkvm..."
    ./scripts/run-gas-categorized-benchmarks.sh \
        -z "$zkvm" \
        -c "$GAS_CATEGORIES" \
        -f
 
    if [ $? -ne 0 ]; then
        echo "Failed: $zkvm"
        exit 1
    fi
done
 
echo "All benchmarks completed!"
 
# Generate comparison reports
for gas in 1M 10M; do
    echo "Comparing results for ${gas}..."
    python3 scripts/compare_executions.py \
        "zkevm-metrics-risc0-${gas}" \
        "zkevm-metrics-sp1-${gas}" \
        > "comparison-risc0-vs-sp1-${gas}.txt"
done

Memory Tracking Best Practices

When using memory tracking (-m flag):

  1. Start with smaller gas categories to establish baseline memory usage
  2. Monitor system memory during execution: watch -n 1 free -h
  3. Allocate 2x expected memory as headroom for peak usage
  4. Review memory metrics in output JSON for optimization opportunities
# Run with memory tracking on small category first
./scripts/run-gas-categorized-benchmarks.sh -c 1M -m
 
# Check memory metrics
jq '.proving.success.peak_memory_usage_bytes' zkevm-metrics-risc0-1M/*.json

Performance Optimization Tips

  1. Use GPU when available - Proving is 5-10x faster with GPU
  2. Start small - Test with 1M gas category before running larger ones
  3. Use dry-run first - Preview commands with -n flag
  4. Parallelize zkVM runs - Run different zkVMs on different machines
  5. Monitor disk space - Metrics can consume significant space for large runs (see the disk-usage sketch after this list)
  6. Cache fixtures - Reuse generated fixtures across multiple benchmark runs
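
A quick disk-usage check over the directories named in this guide (covering tips 5 and 6):

# Report space used by fixtures, generated witnesses, and metrics
du -sh ./zkevm-fixtures ./zkevm-fixtures-input-* ./zkevm-metrics-* 2>/dev/null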

Execution Time Estimates

| Gas Category | GPU (prove) | CPU (prove) | GPU (execute) | CPU (execute) |
|--------------|-------------|-------------|---------------|---------------|
| 1M | 5-15 min | 30-60 min | 1-2 min | 5-10 min |
| 10M | 15-30 min | 1-2 hr | 5-10 min | 20-40 min |
| 30M | 30-60 min | 2-4 hr | 10-20 min | 40-80 min |
| 45M | 45-90 min | 3-6 hr | 15-30 min | 1-2 hr |
| 60M | 60-120 min | 4-8 hr | 20-40 min | 1.5-3 hr |
| 100M | 90-180 min | 6-12 hr | 30-60 min | 2-4 hr |
| 150M | 120-240 min | 8-16 hr | 45-90 min | 3-6 hr |

Times are estimates and vary based on hardware and zkVM implementation.


Output Structure

After running benchmarks, your directory structure will look like:

zkevm-benchmark-workload/
├── zkevm-fixtures/                    # Downloaded EEST fixtures
│   └── fixtures/
│       └── blockchain_tests/
│           └── benchmark/
├── zkevm-fixtures-input-1M/           # Generated witnesses (1M gas)
│   ├── test-case-1.json
│   ├── test-case-2.json
│   └── ...
├── zkevm-fixtures-input-10M/          # Generated witnesses (10M gas)
├── zkevm-metrics-risc0-1M/            # RISC0 results (1M gas)
│   ├── test-case-1.json               # Contains execution & proving metrics
│   └── ...
├── zkevm-metrics-sp1-1M/              # SP1 results (1M gas)
├── index.html                         # Generated HTML report
└── comparison-*.txt                   # Comparison reports

Metrics JSON Structure

Each metrics file contains:

{
  "name": "test-case-name",
  "execution": {
    "success": {
      "total_num_cycles": 12345678,
      "region_cycles": {
        "verify_witness": 1000000,
        "post_state_compute": 2000000,
        "validation": 500000
      },
      "execution_duration": {
        "secs": 45,
        "nanos": 123456789
      }
    }
  },
  "proving": {
    "success": {
      "proving_time_ms": 180000,
      "proof_size": 1024,
      "peak_memory_usage_bytes": 8589934592
    }
  }
}
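
Given that schema, a minimal jq sketch can pull the headline numbers out of one results directory (the directory name is only an example; missing fields print as n/a):

# Summarize name, total cycles, and proving time per test case
for f in ./zkevm-metrics-risc0-1M/*.json; do
  jq -r '[.name, (.execution.success.total_num_cycles // "n/a"), (.proving.success.proving_time_ms // "n/a")] | @tsv' "$f"
done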

For more detailed information on specific topics: