Everything is a Stream: how AlgoX2 collapses the modern streaming stack into a single abstraction

In 1971, Ritchie and Thompson made a strange decision: in Unix, everything is a file. A disk, a network socket, a keyboard, a running process — all the same thing, reached through the same four verbs: open, read, write, close.

The genius of Unix wasn't what its designers put in. It was what they left out. They could have shipped a hundred specialized primitives, each one complex. Instead, they chose one simple abstraction, the file, from which everything else could be built. Fifty-five years later, we are all still building on it.

Streaming is overdue for the same move. Today, running real-time data at scale means assembling a stack of complex systems, each with its own cluster, failure modes, and operational burden. AlgoX2 applies the same idea to streaming: choose one abstraction, and most of that separate stack collapses into it.

One primitive

In AlgoX2, everything is a stream: an append-only log with two operations, append and read.
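
To make the two operations concrete, here is a minimal in-memory sketch. It is not AlgoX2's API: the class name, the method signatures, and the use of integer offsets are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


class Stream:
    """Hypothetical in-memory stand-in for an append-only log."""

    def __init__(self) -> None:
        self._log: List[bytes] = []

    def append(self, message: bytes) -> int:
        """Add a message at the end and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset: int = 0) -> Iterator[Tuple[int, bytes]]:
        """Yield (offset, message) pairs from `offset` to the current end."""
        for i in range(offset, len(self._log)):
            yield i, self._log[i]


# Nothing is ever updated or deleted in place; readers just pick a starting offset.
commands = Stream()
commands.append(b"add-node n1")
commands.append(b"add-node n2")
tail = list(commands.read(offset=1))
```

There is deliberately no update and no delete in the interface; everything else in this post is built from those two calls.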

Customer data is a stream. Cluster control commands are a stream. Replication between nodes is a stream. Schemas are streams. Processor state is a stream. Metrics are streams. The cluster even boots itself by reading its own command stream.

Compare that to a typical real-time stack: a transport layer, a stream processor, a schema registry, a connector framework, a coordination service, an observability stack, and a control plane on top. Seven separate systems, each deployed, scaled, and operated on its own.

Of course, calling everything a stream means nothing on its own. It only matters if one primitive can carry control, state, and replication as reliably as customer data. It can, and that is what the rest of this post is about. Once it does, those seven systems are no longer separate. They become applications on one substrate.

Small parts on a bus

The stream is the data model. The interesting part is what it lets the system become.

Picture a system-on-a-chip: a handful of simple components arranged around a shared bus, passing messages. No component reaches into another's memory. Each one puts a message on the bus and reads messages off it. AlgoX2 is built exactly that way, and the bus is the stream.

Each component is a single-threaded process pinned to one CPU core. No locks, no shared mutable state, nothing to coordinate. It runs flat out doing one job. To scale, you don't grow a process; you add more of them.

This is the whole point. Because processes share nothing and talk only through streams, they compose like hardware. A matching engine reads orders and writes fills. A risk engine reads fills and writes limits. Surveillance reads both. None of them call each other; they just read and write streams.
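
The matching, risk, and surveillance example can be sketched in a few lines. This is a toy under stated assumptions: streams are plain Python lists used append-only, the Component class and the handler functions are hypothetical names, and the "matching" rule simply fills every order in full.

```python
from typing import Any, Callable, Dict, List, Optional

Message = Dict[str, Any]
Stream = List[Message]  # stand-in: a Python list that is only ever appended to


class Component:
    """Hypothetical component: reads its input streams, appends to one output stream."""

    def __init__(self, inputs: List[Stream], output: Stream,
                 handler: Callable[[Message], Optional[Message]]) -> None:
        self.inputs = inputs
        self.output = output
        self.handler = handler
        self.positions = [0] * len(inputs)  # one read cursor per input stream

    def step(self) -> None:
        """Consume everything new on each input and append any results."""
        for i, stream in enumerate(self.inputs):
            while self.positions[i] < len(stream):
                message = stream[self.positions[i]]
                self.positions[i] += 1
                result = self.handler(message)
                if result is not None:
                    self.output.append(result)


orders: Stream = []
fills: Stream = []
limits: Stream = []
audit: Stream = []
exposure: Dict[str, int] = {}  # state written by the risk handler only


def match(order: Message) -> Message:
    # Toy rule: every order fills in full. A real engine matches against a book.
    return {"kind": "fill", "account": order["account"], "qty": order["qty"]}


def check_risk(fill: Message) -> Optional[Message]:
    account = fill["account"]
    exposure[account] = exposure.get(account, 0) + fill["qty"]
    if exposure[account] > 100:
        return {"kind": "limit", "account": account, "exposure": exposure[account]}
    return None


def observe(message: Message) -> Message:
    return {"kind": "seen", "source": message["kind"]}


matching = Component([orders], fills, match)
risk = Component([fills], limits, check_risk)
surveillance = Component([orders, fills], audit, observe)

orders.append({"kind": "order", "account": "a1", "qty": 60})
orders.append({"kind": "order", "account": "a1", "qty": 60})
for component in (matching, risk, surveillance):
    component.step()
```

None of the three components holds a reference to another. Adding a fourth consumer of fills means constructing one more Component, with no change to the existing ones.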

One order

A bus full of independent components is useless if they disagree about what happened first. So the one genuinely hard problem, agreeing on order, is reduced to a single job with a single owner: the sequencer. It takes any unordered input and collapses it into a single deterministic sequence.

This is the move the whole architecture rests on: order once, execute everywhere. Once the order is fixed, any number of processes can run against the stream in parallel and arrive at exactly the same state without ever talking to each other.
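
A small sketch shows why the fixed order matters. The Sequencer class and the add/mul operations below are hypothetical, chosen only because they do not commute: apply them in a different order and you get a different answer.

```python
from typing import List, Tuple

Op = Tuple[str, int]        # e.g. ("add", 5) or ("mul", 2)
Sequenced = Tuple[int, Op]  # (sequence number, operation)


class Sequencer:
    """Hypothetical single owner of ordering: stamps each input with the next number."""

    def __init__(self) -> None:
        self.log: List[Sequenced] = []

    def submit(self, op: Op) -> int:
        seq = len(self.log)
        self.log.append((seq, op))
        return seq


def execute(log: List[Sequenced]) -> int:
    """Deterministic state machine: the same log in always gives the same state out."""
    state = 0
    for _, (name, value) in log:
        if name == "add":
            state += value
        elif name == "mul":
            state *= value
    return state


sequencer = Sequencer()
# Two producers race; whichever reaches the sequencer first gets the lower number.
sequencer.submit(("add", 5))
sequencer.submit(("mul", 2))

# Two replicas run independently against the same sequenced stream.
replica_a = execute(sequencer.log)
replica_b = execute(list(sequencer.log))

# Without a fixed order, a replica could see the same operations the other way round.
other_order = execute([(0, ("mul", 2)), (1, ("add", 5))])
```

Both replicas end at 10, while the reversed order ends at 5. The replicas never exchange a message; agreement comes entirely from reading the same sequence.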

It also redefines state. No process is the authoritative home of its data; the stream is. The log is append-only, every message is uniquely attributed, and nothing is ever rewritten, so a process's state is simply a function of the messages it has read. It is git for a running system: the state is the append-only history, and you reconstruct it by replaying. A component that crashes does not reconcile or repair. It restarts, replays the stream from its last position, and lands in the identical state. Recovery is replay.
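
Here is recovery by replay in miniature. The Balances processor is a hypothetical example, and for simplicity the restarted instance replays from offset zero; resuming from a saved position would be the same loop with a different starting offset.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, int]  # (account, amount)


class Balances:
    """Hypothetical processor whose state is a pure function of the messages it has read."""

    def __init__(self) -> None:
        self.position = 0
        self.totals: Dict[str, int] = {}

    def apply(self, message: Message) -> None:
        account, amount = message
        self.totals[account] = self.totals.get(account, 0) + amount
        self.position += 1

    def catch_up(self, stream: List[Message]) -> None:
        """Apply every message from the current position to the end of the stream."""
        for message in stream[self.position:]:
            self.apply(message)


stream = [("alice", 10), ("bob", 5), ("alice", -3)]

live = Balances()
live.catch_up(stream)
before_crash = dict(live.totals)

del live  # simulate a crash: all in-memory state is gone

restarted = Balances()
restarted.catch_up(stream)  # no repair step, just read the stream again
```

The restarted instance holds exactly the totals the crashed one had, because nothing about the state lived anywhere except the stream.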

How a small team maintains 1.6 million lines of code

AlgoX2 holds to a few unforgiving rules: one thread per process, no locks, no shared mutable state, every piece of state owned by a single writer, every interaction through a stream rather than a direct call. These are what make the model work — processes that compose like hardware, state you can rebuild by replay, parallelism without race conditions.

But rules this strict don't survive contact with a real codebase if humans have to remember them. So at AlgoX2, engineers don't write the code by hand. They write specifications, around 80,000 lines of them, and a compiler turns those specs into the 1.6 million lines of C++ that actually run.

That compiler is OpenACR, a formal-methods toolchain we built and open-sourced. You describe the data and the rules; it generates correct C++ code to match them. So single-threaded, single-writer, message-only are not review guidelines here. They are properties of the compiler: it cannot emit code that breaks them, the same way Unix won't let you open something that isn't a file.

We have been refining OpenACR for fifteen years. With it, our team built the world's fastest matching engine at our previous company, AlgoTechnologies (acquired by ICE), and rebuilt the core of NYSE. AlgoX2 is what happens when you aim it at streaming, and it is why a small team can build and maintain a codebase this large.

What one primitive buys you

Because the primitive is simple and uniform, the hard parts get simple too.

Storage. Existing storage engines assume mutable data, so we built our own. A stream is append-only, and our engine keeps it append-only all the way to disk: messages are written once, in arrival order, and never merged or rewritten. Engines like RocksDB or Cassandra continually rewrite data in the background (compaction) to keep reads fast; our engine never does. Stream paths are just directories on disk, so the file system itself is the index. The payoff: the engine adds no write amplification, replication is just copying bytes, and deleting old data is just deleting a file (rm, the Unix delete command).
Processing. No separate stream-processing cluster. A job is a small Unix process that reads one stream on stdin and writes another on stdout. AlgoX2 runs it, supervises it, restarts it.
Control plane. Operating the cluster (adding a node, rebalancing, deploying) means publishing to a stream. Every action is automatically ordered, audited, and replayable — and delivered at the same speed as your data.
Schemas and observability. Both are just streams, so they inherit ordering, durability, replication, and audit for free. No schema registry to run, no observability product to buy.
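
The storage item above can be sketched with nothing but the file system. This is an illustration, not AlgoX2's on-disk format: the DirectoryStream class, the segment file naming, and the newline-delimited messages are all assumptions.

```python
import os
import tempfile
from typing import List


class DirectoryStream:
    """Hypothetical on-disk stream: one directory of numbered, append-only segment files."""

    def __init__(self, root: str, path: str, segment_size: int = 2) -> None:
        self.dir = os.path.join(root, path)  # the stream path is the directory path
        os.makedirs(self.dir, exist_ok=True)
        self.segment_size = segment_size     # messages per segment, tiny for the demo
        self.count = 0

    def append(self, message: str) -> None:
        """Write the message once, at the end of the current segment."""
        segment = self.count // self.segment_size
        name = os.path.join(self.dir, f"{segment:08d}.log")
        with open(name, "a", encoding="utf-8") as f:  # append mode: nothing is rewritten
            f.write(message + "\n")                   # assumes messages contain no newline
        self.count += 1

    def segments(self) -> List[str]:
        """Zero-padded names sort in write order, so the directory listing is the index."""
        return sorted(os.listdir(self.dir))

    def read(self) -> List[str]:
        messages: List[str] = []
        for name in self.segments():
            with open(os.path.join(self.dir, name), encoding="utf-8") as f:
                messages.extend(line.rstrip("\n") for line in f)
        return messages

    def drop_oldest(self) -> None:
        """Retention: delete the oldest segment file, as `rm` would."""
        os.remove(os.path.join(self.dir, self.segments()[0]))


root = tempfile.mkdtemp()
stream = DirectoryStream(root, "customers/orders")
for text in ["m0", "m1", "m2", "m3", "m4"]:
    stream.append(text)

all_segments = stream.segments()
stream.drop_oldest()
remaining = stream.read()
```

Five messages land in three segment files. Dropping the oldest removes m0 and m1 with a single file delete, and no surviving byte is touched.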

There is a hardware payoff too. Inside the cluster, AlgoX2 moves data over the fastest transport the hardware offers: plain TCP or UDP in a cloud VPC (on AWS, for example), Ethernet multicast in a datacenter, RDMA on a low-latency fabric. This is entirely internal: clients connect over standard protocols such as Kafka, NATS, or MQTT and never see it, never configure it, never deal with multicast at all. The result is an architecture that gets faster as the network does: on plain TCP in an ordinary AWS VPC, it already delivers 10 to 30 times the throughput of legacy streaming, and on an RDMA fabric, up to 1000 times.

Not a protocol, an operating system

Kafka, Redis, and NATS each lead with a protocol. The protocol is the product: adopt it, and you inherit its features and its limits.

AlgoX2 inverts that. Underneath sits a substrate that does the hard distributed work: ordering, replication, durability, failover, fan-out. Protocols plug into it the way Ethernet plugs into a computer — important and replaceable, but not foundational. You would never call a machine an Ethernet computer. So why is your shop a Kafka shop?

AlgoX2 speaks Kafka, NATS, MQTT, and Redis at its edges because those are the protocols people use today. But the core is none of them. The core is the substrate underneath: a single abstraction, the stream, ordered once and executed everywhere. It is the same bet Unix made in 1971.

One abstraction, everything built from it. Everything is a stream.
