How does a market making bot manage inventory risk?

OpenMM tracks the base/quote ratio against a target (e.g. 50/50). When inventory skews, it shades its quotes: it sells the overweight asset more aggressively and buys it less aggressively. The shading magnitude scales with distance from target, keeping the portfolio close to neutral.

What is the most important risk management rule in market making?

When in doubt, do nothing. It is always better to miss a trade than to make a bad one. Position limits act as circuit breakers — if inventory exceeds the configured max, all orders are cancelled automatically until rebalance completes.


Inside Our Market Making Stack: From Signal to Execution

March 27, 2026 • 10 min read • Aggelos Kappos (Founder @ QBT Labs)

Market Making, OpenMM, Trading, Architecture, Risk Management

Most market making systems are black boxes. You see the bid and ask on the order book — you don't see what happens in the 50 milliseconds between a price signal arriving and an order landing on the exchange. This post opens that black box for the OpenMM stack.

This is not a theoretical overview. This is exactly how we process a signal, build an order, manage risk, and execute — from the first tick to the filled trade.

The Five Layers

Every market making operation, from the most sophisticated quant fund to an individual running a grid bot, needs to solve the same five problems:

Signal — What does the market look like right now, and where is it going?
Strategy — Where do I place my bids and asks given that signal?
Risk — How do I protect myself from adverse inventory and fat-finger errors?
Execution — How do I get my orders to the exchange as fast as possible?
Settlement — How do I account for P&L and manage positions across venues?

OpenMM addresses all five. Here's how.

Layer 1 — Signal

A market making strategy is only as good as its view of the market. Our signal layer aggregates three types of data:

Order book data: Real-time L2 order book from each exchange. We track best bid, best ask, mid-price, and book depth at multiple price levels. Spread compression at the top of the book is often an early indicator of incoming volume.
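
As a rough illustration (not OpenMM's actual code), the metrics above can be derived from an L2 snapshot as follows. The `Level` type, the `book_metrics` function, and the `depth_levels` parameter are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    price: float
    size: float


def book_metrics(bids: list, asks: list, depth_levels: int = 5) -> dict:
    """Summarize an L2 snapshot.

    Assumes bids are sorted best (highest) first and asks best (lowest) first.
    """
    if not bids or not asks:
        raise ValueError("need at least one level on each side")
    best_bid, best_ask = bids[0].price, asks[0].price
    mid = (best_bid + best_ask) / 2
    return {
        "best_bid": best_bid,
        "best_ask": best_ask,
        "mid": mid,
        # Spread in basis points of mid, so it is comparable across pairs.
        "spread_bps": (best_ask - best_bid) / mid * 10_000,
        # Cumulative size over the first `depth_levels` levels on each side.
        "bid_depth": sum(level.size for level in bids[:depth_levels]),
        "ask_depth": sum(level.size for level in asks[:depth_levels]),
    }
```

Tracking `spread_bps` over time is one simple way to notice the spread compression mentioned above.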

Trade flow: Recent trade history tells you which side is aggressing. A sequence of large market buys signals short-term upward pressure. This feeds directly into the inventory skew calculation: if we think price is going up, we want to hold more base asset, so we shade both our bids and our asks higher.
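
One simple way to quantify "who is aggressing" is a signed taker-volume imbalance. This is a sketch under our own assumptions: the `flow_imbalance` name and the `(taker_side, size)` trade format are hypothetical, and the post does not specify how trade flow is actually scored.

```python
def flow_imbalance(trades: list) -> float:
    """Return taker-volume imbalance in [-1, 1].

    `trades` is a list of (taker_side, size) tuples, where taker_side is
    "buy" or "sell". +1 means all recent volume was market buys,
    -1 means all of it was market sells.
    """
    buy_volume = sum(size for side, size in trades if side == "buy")
    sell_volume = sum(size for side, size in trades if side == "sell")
    total = buy_volume + sell_volume
    if total == 0:
        return 0.0  # no recent trades: treat flow as neutral
    return (buy_volume - sell_volume) / total
```

A persistently positive value corresponds to the "sequence of large market buys" case described above.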

Cross-venue price discovery: For pairs quoted on multiple exchanges simultaneously, we maintain a reference price derived from a weighted average of mid-prices across venues. This is what we use as the "true" mid when placing orders, rather than the single-exchange mid, which can diverge temporarily due to venue-specific flow.
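
A weighted reference mid can be sketched in a few lines. The weighting scheme is not specified in this post, so the `weights` argument below is an assumption (it could be recent volume or top-of-book depth per venue), and `reference_mid` is a hypothetical name.

```python
def reference_mid(venue_mids: dict, weights: dict) -> float:
    """Weighted average of per-venue mid-prices.

    `venue_mids` maps venue name -> mid-price.
    `weights` maps venue name -> non-negative weight (assumed input).
    """
    total_weight = sum(weights[venue] for venue in venue_mids)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(venue_mids[venue] * weights[venue] for venue in venue_mids)
    return weighted_sum / total_weight
```

For example, mids of 100 and 102 with weights 3 and 1 give a reference mid of 100.5, so a temporary dislocation on the lighter venue moves the reference only slightly.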

What we don't do: We don't use prediction models or ML signals in the core strategy loop. Market making is not about predicting price direction — it's about capturing spread while keeping inventory neutral. Prediction models introduce their own risk and complexity. The signal layer stays simple and fast.

Layer 2 — Strategy

With a signal in hand, the strategy layer answers: where do I quote?

Grid strategy (default): The simplest viable approach. Place N orders above and below the mid-price, evenly spaced by a configurable spread increment. When an order fills, immediately replace it. Profits come from the spread captured on each round trip.

Parameters:

spread: distance between the bot's innermost bid/ask and mid (e.g. 0.1%)
order_amount: size per grid level
num_orders: number of levels on each side
rebalance_threshold: inventory skew at which rebalancing is triggered
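
To make the parameters concrete, here is a minimal sketch of grid construction. It assumes each level sits one further `spread` increment away from mid, which is one reading of "evenly spaced"; the `Quote` type and `build_grid` function are hypothetical, and `rebalance_threshold` is left out because it belongs to inventory handling rather than price placement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    side: str      # "buy" or "sell"
    price: float
    amount: float


def build_grid(mid: float, spread: float, order_amount: float, num_orders: int) -> list:
    """Build a symmetric grid around `mid`.

    `spread` is a fraction (0.001 = 0.1%). Level k is placed at
    mid * (1 -/+ k * spread), for k = 1..num_orders.
    """
    quotes = []
    for level in range(1, num_orders + 1):
        offset = spread * level
        quotes.append(Quote("buy", mid * (1 - offset), order_amount))
        quotes.append(Quote("sell", mid * (1 + offset), order_amount))
    return quotes
```

Calling `build_grid(100.0, 0.001, 1.0, 2)` produces bids near 99.9 and 99.8 and asks near 100.1 and 100.2.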

Inventory management: This is where most grid bots fail. Pure grid strategies ignore inventory. If price trends downward while your bot is running, you accumulate base asset (your bids keep filling but your asks don't), and you end up holding a large long position at a loss. In an uptrend the mirror image happens: your asks fill, your bids don't, and you sell out of base asset too early.

OpenMM's inventory management works like this:

Track current base/quote ratio vs target (e.g. 50/50)
If base > target: shade asks slightly lower (more aggressive selling), shade bids slightly lower (less aggressive buying)
If quote > target: reverse
The shading magnitude scales with how far from target you are

This keeps the portfolio close to neutral without requiring active position hedging.
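
The shading idea can be sketched as a single price shift applied to both sides of the book. This is an illustration under stated assumptions, not OpenMM's implementation: the linear scaling, the `max_shift` cap, and the function names are all hypothetical.

```python
def inventory_skew(base_value: float, quote_value: float,
                   target_base_ratio: float = 0.5,
                   max_shift: float = 0.002) -> float:
    """Return a fractional price shift to apply to both bids and asks.

    Values are expressed in the quote currency. A negative result moves
    quotes down (sell more eagerly, buy less eagerly); a positive result
    moves them up. The shift grows linearly with distance from target.
    """
    total = base_value + quote_value
    if total <= 0:
        return 0.0
    deviation = base_value / total - target_base_ratio  # > 0: overweight base
    # Normalize so that holding 100% of one asset maps to at most +/-1.
    scale = max(target_base_ratio, 1 - target_base_ratio)
    return -(deviation / scale) * max_shift


def shade(bid: float, ask: float, shift: float) -> tuple:
    """Apply the same fractional shift to a bid/ask pair."""
    return bid * (1 + shift), ask * (1 + shift)
```

With 75% of portfolio value in base and a 50/50 target, the shift is -0.1%: both quotes move down, so asks become easier to hit and bids harder to fill.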

Spread adaptation: When volatility increases, widen the spread. When book depth drops, widen the spread. When you're far from target inventory, widen the spread. The underlying principle: in uncertain conditions, trade less and be more selective. A tight spread in a volatile market is a reliable way to get picked off.
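
The three widening rules can be combined into one multiplier on the base spread. The multiplicative form, the baselines, and the cap below are assumptions made for this sketch; the post does not state the actual formula, and `adaptive_spread` is a hypothetical name.

```python
def adaptive_spread(base_spread: float,
                    volatility: float, baseline_volatility: float,
                    depth: float, baseline_depth: float,
                    inventory_deviation: float,
                    max_multiplier: float = 5.0) -> float:
    """Widen `base_spread` when conditions get worse; never tighten it.

    `inventory_deviation` is the absolute distance from target ratio
    (0.25 means 25 percentage points away from target).
    """
    if baseline_volatility <= 0 or baseline_depth <= 0:
        raise ValueError("baselines must be positive")
    # Each factor is >= 1, so the spread only ever widens.
    vol_factor = max(1.0, volatility / baseline_volatility)
    depth_factor = max(1.0, baseline_depth / depth) if depth > 0 else max_multiplier
    inventory_factor = 1.0 + abs(inventory_deviation)
    multiplier = min(max_multiplier, vol_factor * depth_factor * inventory_factor)
    return base_spread * multiplier
```

The cap keeps the bot from quoting absurdly wide in extreme conditions; at that point the risk layer, not the spread, should be deciding whether to quote at all.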

Layer 3 — Risk

Risk is the layer that keeps you alive. Every market maker eventually faces an adverse event — a flash crash, an exchange outage, a sudden manipulation spike. Risk management is what determines whether that event is a bad day or a fatal one.

Per-order limits

Max order size: hard cap on any single order, regardless of strategy parameters
Max notional exposure: total open order quantity × price cannot exceed the configured limit
Slippage protection: if a fill price deviates more than X% from the expected price, cancel the remaining orders and re-evaluate
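
A pre-trade check for these limits might look like the sketch below. The function names, argument names, and the choice to return a list of violations are assumptions for illustration only.

```python
def check_order(size: float, price: float, open_notional: float,
                max_order_size: float, max_notional: float) -> list:
    """Return the names of any per-order limits this order would breach.

    An empty list means the order may be sent.
    """
    violations = []
    if size > max_order_size:
        violations.append("max_order_size")
    # Notional of existing open orders plus this one must stay under the cap.
    if open_notional + size * price > max_notional:
        violations.append("max_notional")
    return violations


def slippage_exceeded(expected_price: float, fill_price: float,
                      max_slippage: float) -> bool:
    """True if the fill deviates from the expected price by more than
    `max_slippage` (a fraction, e.g. 0.005 for 0.5%)."""
    return abs(fill_price - expected_price) / expected_price > max_slippage
```

Because the checks take limits as plain arguments, they can run regardless of which strategy produced the order.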

Position limits

Max inventory in base asset (long or short)
If position exceeds limit: cancel all orders, wait for rebalance before resuming
This is the circuit breaker — it fires automatically, no human needed

Drawdown protection

Track cumulative P&L since session start
If drawdown exceeds threshold (e.g. -2%): halt all trading, alert operator
Resume only after manual confirmation or configured cooldown
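
The position and drawdown circuit breakers can be expressed as a small state decision. This sketch assumes drawdown is measured as session P&L relative to starting equity; `RiskState` and `evaluate_risk` are hypothetical names.

```python
from enum import Enum


class RiskState(Enum):
    ACTIVE = "active"            # quoting normally
    REBALANCING = "rebalancing"  # orders cancelled, waiting for inventory to normalize
    HALTED = "halted"            # trading stopped, operator alerted


def evaluate_risk(position: float, max_position: float,
                  session_pnl: float, starting_equity: float,
                  max_drawdown: float) -> RiskState:
    """Decide whether the bot may keep quoting.

    `position` is signed base inventory (negative = short).
    `max_drawdown` is a positive fraction, e.g. 0.02 for a -2% limit.
    """
    # Drawdown is checked first: it is the more severe condition.
    if session_pnl / starting_equity < -max_drawdown:
        return RiskState.HALTED
    if abs(position) > max_position:
        return RiskState.REBALANCING
    return RiskState.ACTIVE
```

Leaving `HALTED` is deliberately not handled here, since the rule above is that resuming requires manual confirmation or a configured cooldown.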

Exchange-specific safeguards

Order rate limiting: never exceed exchange's API rate limit (prevents bans)
Duplicate order detection: never place the same order twice (common source of loss)
Stale order cleanup: cancel orders that haven't filled within configurable timeout
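
Rate limiting and duplicate detection are both small pieces of bookkeeping. The sketch below uses a token bucket, which is a common technique and our assumption rather than something this post specifies; the class names and the `(symbol, side, price, amount)` duplicate key are also hypothetical.

```python
import time


class TokenBucket:
    """Allow at most `rate_per_sec` requests on average, with bursts up to `capacity`."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class DuplicateGuard:
    """Reject an order identical to one that is still live."""

    def __init__(self):
        self._live = set()

    def register(self, symbol: str, side: str, price: float, amount: float) -> bool:
        key = (symbol, side, price, amount)
        if key in self._live:
            return False  # same order already placed
        self._live.add(key)
        return True

    def release(self, symbol: str, side: str, price: float, amount: float) -> None:
        # Call when the order is filled or cancelled.
        self._live.discard((symbol, side, price, amount))
```

An order would be sent only if both `try_acquire()` and `register(...)` return `True`.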

The most important risk rule: When in doubt, do nothing. It is always better to miss a trade than to make a bad one. The risk layer is biased toward inaction, not action.

Layer 4 — Execution

Execution is where milliseconds matter. The difference between a 10ms and 100ms order placement round-trip can be the difference between a fill at the intended price and a fill at a worse price, or no fill at all.

Connection management: OpenMM maintains persistent WebSocket connections to each exchange rather than opening a new connection per request. This eliminates the TCP handshake overhead (typically 20-50 ms) on every order.

Order batching: When multiple orders need to be placed simultaneously (e.g. after a rebalance), they are submitted as a batch where the exchange supports it. MEXC, Gate.io, and others support batch order placement: a single API call for multiple orders rather than N calls for N orders.

Optimistic placement: We update internal state as soon as an order is sent, without waiting for the exchange's confirmation. If an order is rejected, we handle the exception and reconcile. This improves throughput significantly at the cost of more complex state management.

Order lifecycle tracking: Every order moves through a lifecycle of pending, open, partially filled, filled, or cancelled. We track every state transition. This is how we know when to replace, when to cancel, and when something unexpected has happened.
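
The lifecycle can be modeled as a small state machine that rejects impossible transitions. The five states come from the text above; the transition table and the choice to model an exchange rejection as pending to cancelled are our assumptions for this sketch.

```python
from enum import Enum


class OrderState(Enum):
    PENDING = "pending"
    OPEN = "open"
    PARTIALLY_FILLED = "partially_filled"
    FILLED = "filled"
    CANCELLED = "cancelled"


# Which states may follow each state. FILLED and CANCELLED are terminal.
ALLOWED = {
    OrderState.PENDING: {OrderState.OPEN, OrderState.PARTIALLY_FILLED,
                         OrderState.FILLED, OrderState.CANCELLED},
    OrderState.OPEN: {OrderState.PARTIALLY_FILLED, OrderState.FILLED,
                      OrderState.CANCELLED},
    OrderState.PARTIALLY_FILLED: {OrderState.PARTIALLY_FILLED, OrderState.FILLED,
                                  OrderState.CANCELLED},
    OrderState.FILLED: set(),
    OrderState.CANCELLED: set(),
}


class TrackedOrder:
    def __init__(self, order_id: str):
        self.order_id = order_id
        # Optimistic placement: the order exists locally before the exchange confirms.
        self.state = OrderState.PENDING
        self.history = [OrderState.PENDING]

    def transition(self, new_state: OrderState) -> None:
        if new_state not in ALLOWED[self.state]:
            # An illegal transition means local state and exchange state diverged.
            raise ValueError(f"{self.order_id}: {self.state.value} -> {new_state.value}")
        self.state = new_state
        self.history.append(new_state)
```

An exception from `transition` is the "something unexpected happened" signal: it is a natural trigger for reconciliation against the exchange.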

Exchange connectors: OpenMM uses a unified connector interface. Each exchange (MEXC, Gate.io, Bitget, Kraken) implements the same interface: placeOrder, cancelOrder, getOrderBook, getBalance. Strategies don't care which exchange they're running on; the connector abstracts that away.
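
The shape of such an interface can be sketched as an abstract base class. The four method names are taken from the text above; the argument lists, return types, and the `PaperConnector` test double are assumptions for illustration and do not reflect the real OpenMM SDK signatures.

```python
from abc import ABC, abstractmethod
from itertools import count


class ExchangeConnector(ABC):
    """Method names follow the post; the signatures are assumed."""

    @abstractmethod
    def placeOrder(self, symbol: str, side: str, price: float, amount: float) -> str: ...

    @abstractmethod
    def cancelOrder(self, order_id: str) -> bool: ...

    @abstractmethod
    def getOrderBook(self, symbol: str) -> dict: ...

    @abstractmethod
    def getBalance(self, asset: str) -> float: ...


class PaperConnector(ExchangeConnector):
    """In-memory stand-in for a real exchange (hypothetical, for testing)."""

    def __init__(self, balances: dict, book: dict):
        self._balances = dict(balances)
        self._book = book
        self._ids = count(1)
        self.open_orders = {}

    def placeOrder(self, symbol, side, price, amount):
        order_id = f"paper-{next(self._ids)}"
        self.open_orders[order_id] = (symbol, side, price, amount)
        return order_id

    def cancelOrder(self, order_id):
        return self.open_orders.pop(order_id, None) is not None

    def getOrderBook(self, symbol):
        return self._book

    def getBalance(self, asset):
        return self._balances.get(asset, 0.0)


def quote_both_sides(connector: ExchangeConnector, symbol: str,
                     spread: float, amount: float) -> list:
    """Strategy code depends only on the interface, never on a concrete exchange."""
    book = connector.getOrderBook(symbol)
    mid = (book["bids"][0][0] + book["asks"][0][0]) / 2
    return [
        connector.placeOrder(symbol, "buy", mid * (1 - spread), amount),
        connector.placeOrder(symbol, "sell", mid * (1 + spread), amount),
    ]
```

Swapping `PaperConnector` for a real exchange connector would leave `quote_both_sides` unchanged, which is the point of the abstraction.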

Layer 5 — Settlement

After the trade comes the accounting.

P&L tracking: Every fill updates the running P&L calculation. We track:

Realized P&L: from completed round-trips (bid fill + ask fill)
Unrealized P&L: from open positions at current mid-price
Fees: exchange fees, gas fees (for on-chain venues), any other costs
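
A running P&L calculation can be sketched with average-cost accounting. Average cost is our assumption here (FIFO matching of round trips is another common choice), and `PnLTracker` is a hypothetical name.

```python
class PnLTracker:
    """Track realized P&L, unrealized P&L, and fees from a stream of fills."""

    def __init__(self):
        self.position = 0.0   # signed base inventory (negative = short)
        self.avg_cost = 0.0   # average entry price of the open position
        self.realized = 0.0
        self.fees = 0.0

    def on_fill(self, side: str, price: float, amount: float, fee: float = 0.0) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.fees += fee
        signed = amount if side == "buy" else -amount
        if self.position == 0 or (self.position > 0) == (signed > 0):
            # Opening or adding to a position: blend into the average cost.
            new_position = self.position + signed
            self.avg_cost = (self.avg_cost * abs(self.position) + price * amount) / abs(new_position)
            self.position = new_position
            return
        # Reducing, closing, or flipping: realize P&L on the closed quantity.
        closing = min(amount, abs(self.position))
        direction = 1 if self.position > 0 else -1
        self.realized += direction * (price - self.avg_cost) * closing
        self.position += signed
        if closing < amount:
            self.avg_cost = price   # flipped: the remainder opens at the fill price
        elif self.position == 0:
            self.avg_cost = 0.0

    def unrealized(self, mid: float) -> float:
        return (mid - self.avg_cost) * self.position

    def net(self, mid: float) -> float:
        return self.realized + self.unrealized(mid) - self.fees
```

A bid fill at 99.9 followed by an ask fill at 100.1 for the same size realizes roughly 0.2 per unit before fees, which is the round-trip spread capture the grid strategy relies on.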

Position reconciliation: Every N seconds, we reconcile our internal position tracking against the exchange's reported balances. Discrepancies trigger an alert. If the discrepancy exceeds a threshold, we halt and wait for manual review.
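
The reconciliation step reduces to comparing two balance maps. In this sketch the `tolerance` for rounding noise and the single shared `halt_threshold` are simplifying assumptions; in practice thresholds would likely be set per asset. The `reconcile` name is hypothetical.

```python
def reconcile(internal: dict, reported: dict,
              tolerance: float, halt_threshold: float) -> dict:
    """Compare internal balances with exchange-reported balances.

    Returns asset -> "ok", "alert", or "halt".
    """
    actions = {}
    for asset in sorted(set(internal) | set(reported)):
        # An asset missing on either side is treated as a zero balance.
        diff = abs(internal.get(asset, 0.0) - reported.get(asset, 0.0))
        if diff > halt_threshold:
            actions[asset] = "halt"
        elif diff > tolerance:
            actions[asset] = "alert"
        else:
            actions[asset] = "ok"
    return actions
```

If any asset comes back as `"halt"`, the bot would stop quoting and wait for manual review, as described above.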

Session reporting: At the end of each session (or on demand), OpenMM generates a session report: total volume, realized P&L, average spread captured, fill rate, inventory delta. This is what you use to tune strategy parameters.

AI Integration — What It Does and Doesn't Change

OpenMM integrates with Claude and other LLMs via the MCP (Model Context Protocol). This lets an AI agent issue commands like "start a grid on BTC/USDC with 1% spread" or "what's my current P&L?" — and have those commands execute in real-time on the exchange.

What AI adds:

Natural language strategy configuration (no need to know parameter names)
Real-time market commentary ("the spread on this pair has compressed 40% in the last hour")
Portfolio-level questions across multiple bots and venues

What AI doesn't replace:

The risk layer (always runs independently of AI commands)
Execution speed (AI commands go through the same pathway as manual commands)
Signal quality (AI doesn't generate better signals; it helps interpret existing ones)

The AI is a control plane, not a strategy engine. The strategy engine runs deterministically.

Open Source

The full OpenMM stack is open source under the MIT license:

OpenMM Core SDK — strategy engine, risk management, execution layer: github.com/QBT-Labs/openmm
OpenMM MCP Server — AI integration layer: mcp.openmm.io

We believe infrastructure this fundamental should be open. The edge in market making comes from signal quality, execution speed, and operational excellence — not from hiding the strategy framework.

