How Manifold's volume field actually aggregates — and why your scraper is probably wrong

The volume field on a Manifold market does not equal SUM(bet.amount). It also doesn't equal SUM(abs(amount)), except on the markets where it does. The truth lives in the per-bet fills[] array, and most public scrapers never look there. Here's the reproduction, the drift quantified, and the workaround.

The setup

We were building a cross-venue prediction-market dataset and reconciliation harness. For every market, we compare SUM(captured_trades.size) against the volume field reported by the venue. Drift below 1% on ≥95% of markets = pass. Anything else gets flagged in the recon log.
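The pass rule above is simple enough to sketch in a few lines. This is a minimal illustration, not our production harness: the function names, the input shape (market id mapped to captured sum and reported volume), and the zero-volume handling are all assumptions made for the example.

```python
from decimal import Decimal

DRIFT_THRESHOLD = Decimal("0.01")  # per-market drift must stay below 1%
PASS_FRACTION = Decimal("0.95")    # at least 95% of markets must pass


def drift(captured: Decimal, reported: Decimal) -> Decimal:
    """Relative drift of the captured sum against the venue-reported volume."""
    if reported == 0:
        # Assumption: a zero-volume market passes only if we captured nothing.
        return Decimal(0) if captured == 0 else Decimal("Infinity")
    return abs(captured - reported) / reported


def reconcile(markets: dict[str, tuple[Decimal, Decimal]]) -> tuple[bool, list[str]]:
    """markets maps market id -> (captured_sum, reported_volume).

    Returns (overall_pass, market ids to flag in the recon log).
    """
    flagged = [
        market_id
        for market_id, (captured, reported) in markets.items()
        if drift(captured, reported) >= DRIFT_THRESHOLD
    ]
    if not markets:
        return True, flagged
    passed = len(markets) - len(flagged)
    return Decimal(passed) / Decimal(len(markets)) >= PASS_FRACTION, flagged
```

Using Decimal throughout avoids float rounding deciding a borderline pass or fail.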

On a 50-market Manifold sample, the first run failed catastrophically — pass rate 25%. The first instinct was a pagination bug; we walked the cursor exhaustively. Same result. Second instinct: maybe Manifold's amount field is signed. It is.

Discovery 1: amount is signed

A bet on Manifold has a single amount field at the top level. We assumed positive = buy of that outcome, and summed naïvely:

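# wrong: keeps the sign, so opposite-signed bets net out when summed
from decimal import Decimal

def normalize_trade(bet):
    return CanonicalTrade(
        size_native=Decimal(str(bet["amount"])),
        # ... remaining fields elided
    )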

But Manifold uses signed amounts. A NO-side fill records as a negative amount. So our naïve sum collapsed bets that should have added up — buys and sells cancelled each other out.
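To make the cancellation concrete, here is a toy illustration. The amounts are hypothetical and chosen only to show the effect; they are not taken from any real market.

```python
from decimal import Decimal

# Hypothetical signed amounts for one market (not real Manifold data).
amounts = [Decimal("10"), Decimal("-4"), Decimal("-5"), Decimal("2.5")]

# Signed sum: opposite-signed bets net against each other.
signed_total = sum(amounts, Decimal(0))

# Absolute sum: every bet contributes its full size.
absolute_total = sum((abs(a) for a in amounts), Decimal(0))

print(signed_total)    # 3.5
print(absolute_total)  # 21.5
```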

The first market we drilled into ("Free Lottery Kamiokande") had:

  • Reported volume: 23.1745 mana
  • Our naïve sum (signed): 1.00 mana
  • Sum of abs(amount): 23.17 mana

Match. So Manifold's volume = sum(abs(amount)). Easy fix:

def normalize_trade(bet):
    amount = Decimal(str(bet["amount"]))
    return CanonicalTrade(
        size_native=abs(amount),  # magnitude only; direction goes in taker_side
        taker_side="sell" if amount < 0 else "buy",
        # ... remaining fields elided
    )

Re-ran the recon. Pass rate jumped from 25% to 77.5%, with 30 of 40 sampled markets reconciling to 0.0% drift. Big win. But what about the remaining 22.5% still drifting?

Discovery 2: the residual drift goes one direction

Of the 9 markets still failing, every single one was over-counting: our captured sum was higher than Manifold's reported volume. Same direction, every time. That's a structural pattern, not noise.

The worst offender — a market called 9EE6Uz60lN — had:

  • Reported volume: 2127.37 mana
  • Our captured (sum of abs): 2401.65 mana → +12.9%

We tried every flag combination: exclude redemptions, exclude cancelled, exclude ante/LP-provision. None hit the target:

Filter               Sum (mana)   vs. target
all_abs                 2401.65       +12.9%
excl_redemption         2364.99       +11.2%
excl_cancelled          1977.97        −7.0%
excl_red_cancelled      1941.30        −8.7%

The target sat between cancelled-included and cancelled-excluded. Neither flag alone explains it — meaning the answer isn't in the top-level bet fields at all.
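A sweep like the one in the table can be sketched as follows. The flag names isRedemption and isCancelled are assumptions for illustration; check them against the bet payloads you actually receive. The helper name sweep and the result shape are also hypothetical.

```python
from decimal import Decimal

# Assumed flag names; verify against real bet payloads before relying on them.
FILTERS = {
    "all_abs": lambda b: True,
    "excl_redemption": lambda b: not b.get("isRedemption", False),
    "excl_cancelled": lambda b: not b.get("isCancelled", False),
    "excl_red_cancelled": lambda b: not (
        b.get("isRedemption", False) or b.get("isCancelled", False)
    ),
}


def sweep(bets, target):
    """Return {filter_name: (abs_sum, percent_vs_target)} for each filter."""
    results = {}
    for name, keep in FILTERS.items():
        total = sum(
            (abs(Decimal(str(b["amount"]))) for b in bets if keep(b)),
            Decimal(0),
        )
        pct = (total - target) / target * 100
        results[name] = (total, pct.quantize(Decimal("0.1")))
    return results
```

If no row lands within tolerance of the target, as happened here, that is a hint the discrepancy is not explained by these bet-level flags.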

Where the truth lives: fills[]

A Manifold bet has a top-level amount, but it also has a fills array — sub-objects representing how the bet was actually matched against existing orders. A bet might have orderAmount: 100 with fills: [{amount: 60}, {amount: 35}], meaning 95 mana actually executed and 5 was unfilled (or refunded, or matched against market-maker liquidity that doesn't surface as bet rows).

Manifold's reported volume aggregates over the fills array, not the top-level amount. The public /v0/bets endpoint does surface fills, but most scrapers (us included, initially) treat the top-level amount as ground truth. Hence the drift.

The proper aggregation:

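def reconciled_size(bet):
    # Prefer the executed fills; fall back to the top-level amount
    # when the bet has no fills (missing key or empty list).
    fills = bet.get("fills")
    if fills:
        return sum(
            (abs(Decimal(str(f["amount"]))) for f in fills),
            Decimal(0),  # keep the result a Decimal
        )
    return abs(Decimal(str(bet["amount"])))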

We haven't shipped this yet — the per-bet pagination cost roughly doubles, and for our Phase-0 sample the documented ~10% drift on active markets is already disclosed. But it's the right answer if you need audit-grade Manifold reconciliation. We'll fold it in for Phase 1.
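Before paying the extra pagination cost, one way to size the problem is to check how often the top-level amount and the fills disagree. The sketch below is a rough diagnostic, not shipped code: fill_gap and divergent_bets are hypothetical helper names, the tolerance is arbitrary, and the sample records are invented (the first mirrors the 60 + 35 example above, with a top-level amount of 100 assumed purely for illustration).

```python
from decimal import Decimal


def fill_gap(bet):
    """Top-level |amount| minus the summed |fill amounts| for one bet.

    Returns None when the bet carries no fills, so there is nothing to compare.
    """
    fills = bet.get("fills") or []
    if not fills:
        return None
    top_level = abs(Decimal(str(bet["amount"])))
    filled = sum((abs(Decimal(str(f["amount"]))) for f in fills), Decimal(0))
    return top_level - filled


def divergent_bets(bets, tolerance=Decimal("0.0001")):
    """Yield (bet_id, gap) for bets whose top-level amount disagrees with fills."""
    for bet in bets:
        gap = fill_gap(bet)
        if gap is not None and abs(gap) > tolerance:
            yield bet.get("id"), gap


# Hypothetical records for illustration only.
sample = [
    {"id": "a", "amount": 100, "fills": [{"amount": 60}, {"amount": 35}]},
    {"id": "b", "amount": -40, "fills": [{"amount": -40}]},
    {"id": "c", "amount": 10},
]
print(list(divergent_bets(sample)))  # [('a', Decimal('5'))]
```

Summing the gaps per market gives a quick estimate of how much of the residual drift a fills-aware aggregator would remove.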

Why this matters

Manifold is widely cited as a calibration baseline (it's play-money, but research-grade empirics show solid calibration on aggregate). If you're doing cross-venue calibration, comparing the same event across Manifold, Kalshi, and Polymarket, you almost certainly want volume weighting. If some of your weights are 10% off because you summed the wrong field, every volume-weighted statistic you publish inherits that error.
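A small sketch shows why the direction and unevenness of the drift matter. All numbers here are hypothetical: two markets with made-up per-market scores and volumes. Note that scaling every weight by the same factor leaves a weighted mean unchanged; the bias appears when only some markets are over-counted, which is the pattern described above.

```python
def weighted_mean(values, weights):
    """Volume-weighted mean of per-market values."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical per-market scores (e.g. Brier scores) and volumes.
scores = [0.10, 0.30]
true_volume = [1000.0, 1000.0]
uneven_overcount = [1000.0, 1100.0]   # only the second market is 10% high
uniform_overcount = [1100.0, 1100.0]  # every market is 10% high

baseline = weighted_mean(scores, true_volume)
skewed = weighted_mean(scores, uneven_overcount)
rescaled = weighted_mean(scores, uniform_overcount)

print(baseline, skewed, rescaled)
```

In this toy case the uneven over-count pulls the weighted mean toward the over-counted market, while the uniform one cancels out.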

Worse: the failure mode is silent. Your scraper returns a number; the number is in the right ballpark; you ship the analysis. Nobody flags it because nobody else knows. The whole space ends up calibrating against a slightly wrong reference.

Takeaways

  1. If you're scraping Manifold for volume, sum abs(bet.amount), not the signed value. That alone reconciles roughly three quarters of markets (77.5% in our sample).
  2. For audit-grade reconciliation, aggregate over fills[]. Don't trust top-level amount on active markets.
  3. Run a reconciliation harness, period. Drift you don't measure is drift you ship to customers.

In the pred-markets dataset we currently document Manifold as calibration-grade (≤10% drift on active markets, <0.01% on resolved). Phase 1 closes the gap by adding the fills-aware aggregator. Coverage matrix here; email if you want the full Phase-0 reconciliation log for replication.