Idempotency in execution systems
In most software a retry is a convenience. In an execution system it's a decision about money. If sending the same order twice can fill twice, then every dropped connection is a chance to double your position by accident. Idempotency is what takes that chance off the table, and it isn't something you get to treat as optional.
The scenario is boring, and it happens all the time. You submit an order. The network hiccups. The acknowledgement never comes back. Did it land? You've no idea. So you retry, because not retrying might mean a missed fill. But if the first order did go through, that retry just doubled your exposure on a position you meant to take once. By its own logic the system behaved perfectly. It still ended up long twice.
What idempotency actually means
Idempotency means doing the same operation more than once leaves you exactly where doing it once would have. Submit the order, submit it again, fire it off ten more times — you still end up with one order, not eleven. Every attempt after the first gets recognized as the same intent and folds into a no-op.
This isn't deduplication bolted on afterward, and it definitely isn't "usually fine." The guarantee lives inside the operation itself. The system can be unsure whether something happened, retry as much as it likes to settle that uncertainty, and never pay for the retry in duplicated effect.
The mechanism: identity before transport
It all starts before the order ever leaves your system. You stamp every order with a client-side identifier the moment the intent exists — not the exchange, you. That ID rides along with the order and survives every retry of that same intent.
Now the retry is safe because of how it's built, not because you got lucky. The venue, the gateway, or your own outbound layer sees the same ID land twice and knows the second one is a repeat rather than a fresh instruction. First attempt, fifth attempt — same identity, so they can only ever add up to one order. That identity is what turns "did it land?" from a coin flip into a question you can actually answer.
When the venue gives you client order IDs, you lean on them. When it doesn't, you build the dedup boundary yourself: an outbound ledger that remembers which intents have already gone out and flatly refuses to send the same one twice.
The boundaries that need it
There's no single switch you flip. Every boundary in the system that moves value has to hold this property on its own:
- Order submission — a retried submit can't be allowed to create a second order.
- Fill application — when the same fill event shows up twice, the position should move once.
- Cancellation — cancelling an already-cancelled order should quietly do nothing, not throw an error that papers over the real state.
- State updates from callbacks — a replayed webhook or a reconnection can't re-apply effects that already landed.
Miss any one of these and you've got a system that's idempotent in the demo and lossy in production. The network will eventually deliver every message you assumed would arrive exactly once — that's just the normal weather of running infrastructure.
Why "exactly once" is a lie you engineer around
Exactly-once delivery doesn't exist in a distributed system. Messages come at least once or at most once, and both are the wrong tool for moving money. At-most-once drops orders. At-least-once duplicates them. The honest position, the one I keep coming back to, is to assume at-least-once delivery and make the processing idempotent so the duplicates simply don't matter.
That's the whole trick, really. You don't try to make the network perfect — you'll lose that fight. You make your system indifferent to how imperfect the network is. Duplicates show up, and they cost you nothing, because the second copy does nothing.
A system that cannot safely retry is a system that cannot safely fail.
Underneath, every reliable execution system is really just a pile of operations that are safe to repeat. That's what lets it retry hard, claw its way out of ambiguity, and reconnect after an outage without anyone holding their breath — because however many times a message gets replayed, the position still only moves once. You don't tack that property on at the end. It's the assumption everything else gets built on top of.
Ignacio Montoya is a systems architect specializing in algorithmic trading infrastructure, financial systems, and digital asset platforms. He designs execution systems where retries are safe by construction and duplicate orders never reach the market.
If your system retries and you are not certain what a retry costs, the conversation starts here.
See engagement model