Most edge WAFs end up in one of two failure modes. The heavyweight ones try to do everything locally: deep structural parsing, schema learning, behavioral analysis, the works. They fall apart under load. The cloud-offloaded ones ship every decision to a central service and wait for the network on every request. They fall apart on latency.
Both are correct answers to two fundamentally different problems forced into a single architecture.
I spent years building a traditional attacker-centric WAF sensor in the previous generation of this space. When I started over with Synapse, the question I wanted to answer wasn't "how do we split the work between edge and cloud?" It was: why are we splitting the work at all?
This article is about the answer I landed on. Every Synapse sensor is a complete detection engine. Pattern matching, behavioral analysis, risk scoring, schema learning, campaign correlation, DLP, fingerprinting, session tracking, credential stuffing, bot detection, interrogator challenges. All local. All in one 25MB binary. All behind a ~75μs WAF decision and a ~450μs full-stack request budget. No cloud backend. No network round-trip. No serialization boundaries.
That's the claim. Here's why, and how.
The split nobody questions
For the last decade, the dominant WAF architecture has been thin sensor + fat cloud. The sensor at the edge is a forwarder. It looks at the request, maybe runs a handful of cheap signature matches, and ships the interesting stuff to a vendor cloud service. The "real" detection lives in the cloud: deep parsing, rule evaluation, behavioral analysis, actor scoring. The sensor waits for a verdict and enforces it.
That design made sense when compute at the edge was expensive and compute in the cloud was cheap. It also made sense for vendors, who wanted the data flowing through their infrastructure for commercial reasons. Most of the WAF market is built on it.
The problem is that it's architecturally a single point of failure dressed up as a two-tier system. If the cloud is slow, your "protection" is slow. If the cloud is unreachable, your "protection" degrades or fails open. If you need to deploy in an air-gapped environment, you can't. The sensor is not actually a sensor. It's a pipe to the thing that does the work.
I wanted to build the opposite. A sensor that doesn't need the cloud. An edge location that can make every decision it needs to make, locally, in microseconds, without calling anything.
What Synapse actually does, locally, in under a millisecond
Synapse is a full-stack WAF proxy built on Cloudflare's Pingora framework. It's implemented in pure Rust with in-process detection. No FFI, no IPC, no serialization boundaries. Every capability below runs inside the same Pingora worker thread that's handling the request.
The request hits the sensor and goes through an 8-phase detection pipeline:
(…request-shape flags: has_args, has_json, is_static.)

That's the whole detection cycle. Every phase runs locally. Every phase runs on the request's own thread. There are no shortcuts, no sampling, no deferred decisions. The benchmark says ~75μs for the WAF phase and ~450μs for the full pipeline including proxy overhead and DLP. That's roughly 2,200–4,400x faster than the same request going through a cloud-backed pipeline, which is the headline number that falls out of removing the network hop.
If you're keeping score, that's behavioral analysis, schema violation detection, actor risk scoring, rule evaluation, credential stuffing detection, and verdict generation. All in under a millisecond. All without leaving the process.
A single request traveling the pipeline in real time: the risk score climbs as SQLi, schema violations, and campaign-member signals accumulate, then lands on BLOCK at stage 5.
Four decisions that make this work at the edge
Once you commit to "the sensor does everything locally," a lot of downstream architecture stops being a judgment call. It becomes a consequence.
The sensor doesn't need to know the JSON structure of the body. It needs to find suspicious byte patterns quickly. So Synapse treats the request body as a flat stream of bytes and runs an Aho-Corasick multi-pattern matcher plus a RegexSet over it in a single pass. It doesn't parse the JSON at all.
Finding a SQL injection in an 8MB payload with flat streaming is faster than a recursive parser could walk a 10KB nested body. The structural context you lose at the edge (the "found at /user/profile/payment_methods[0]" level of detail) is recomputed later for analyst workflows, not during the request.
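As a rough illustration of the flat-streaming idea, here is a std-only sketch: one left-to-right walk over the raw bytes, testing literal patterns at each offset. The naive O(n·p) inner loop stands in for the real Aho-Corasick automaton plus RegexSet; the function name and patterns are mine, not Synapse's.

```rust
// Flat byte-stream scan: no JSON parsing, just a single pass over the body.
// Returns (pattern index, byte offset) pairs — note there is no JSON path.
fn scan_flat(body: &[u8], patterns: &[&[u8]]) -> Vec<(usize, usize)> {
    let mut hits = Vec::new();
    for i in 0..body.len() {
        for (p_idx, pat) in patterns.iter().enumerate() {
            // Naive per-offset test; an Aho-Corasick automaton does all
            // patterns simultaneously in a single state-machine pass.
            if body[i..].starts_with(pat) {
                hits.push((p_idx, i));
            }
        }
    }
    hits
}

fn main() {
    let body = br#"{"q":"1 UNION SELECT password FROM users--"}"#;
    let patterns: &[&[u8]] = &[b"UNION SELECT", b"<script", b"../.."];
    let hits = scan_flat(body, patterns);
    // The scanner reports a byte offset, not "found at /q".
    assert_eq!(hits, vec![(0, 8)]);
    println!("pattern 0 at byte offset {}", hits[0].1);
}
```

The point of the sketch is the shape of the work: the scanner never allocates a tree, never recurses, and touches each byte a constant number of times regardless of how deeply nested the JSON is.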
A naive rule engine evaluates every rule against every request. Synapse builds a Rule Index: a bitmask representation of which rules apply to which method/header/URL combinations, plus an LRU candidate cache that remembers which ~35 rules out of 248 are worth evaluating for a request matching this fingerprint.
Repeat traffic (which is most traffic) skips the "which rules apply" phase entirely. Amortized complexity is O(1) against the full rule set. In practice the sensor skips the vast majority of rules on every request.
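A minimal sketch of the bitmask idea (the struct, fields, and rule bits here are illustrative — Synapse's real index also carries header features and the LRU candidate cache): each rule owns one bit, each request feature maps to a mask, and the candidate set is the AND of the masks.

```rust
use std::collections::HashMap;

// Toy rule index: rule i lives at bit i. A request's candidate rules are
// the intersection of the masks for its method and its path prefix.
struct RuleIndex {
    by_method: HashMap<&'static str, u64>,
    by_path_prefix: HashMap<&'static str, u64>,
}

impl RuleIndex {
    fn candidates(&self, method: &str, path: &str) -> u64 {
        let m = self.by_method.get(method).copied().unwrap_or(0);
        let mut p = 0u64;
        for (prefix, bits) in &self.by_path_prefix {
            if path.starts_with(prefix) {
                p |= bits;
            }
        }
        m & p // only rules indexed under BOTH features survive
    }
}

fn main() {
    // Three rules: 0 = POST /api, 1 = GET /api, 2 = POST /login.
    let idx = RuleIndex {
        by_method: HashMap::from([("POST", 0b101u64), ("GET", 0b010)]),
        by_path_prefix: HashMap::from([("/api", 0b011u64), ("/login", 0b100)]),
    };
    let mask = idx.candidates("POST", "/api/users");
    assert_eq!(mask, 0b001); // only rule 0 needs full evaluation
    println!("candidate mask: {:#b}", mask);
}
```

Two hash lookups and one AND replace a linear scan over the rule set; caching that result per request fingerprint is what makes the amortized cost O(1).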
Traditional attacker-centric DLP is regex-only, which is both slow and false-positive-prone. A random 16-digit number isn't a credit card.
Synapse pairs the Aho-Corasick literal scan with zero-allocation validators: Luhn for credit cards, Mod-97 for IBANs, area-code and service-code checks for phone numbers, checksum rules for SSNs that filter out advertising numbers. A match only fires if the literal pattern is found and the validator accepts. The sensor scans 22 built-in patterns in a single pass with a hard 8KB soft-truncation cap, which keeps DLP sub-millisecond even on multi-megabyte payloads.
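Std-only sketches of two of the validators named above, Luhn and Mod-97 (simplified for clarity — the real validators run zero-allocation over the scan buffer):

```rust
// Luhn check: double every second digit from the right; a candidate card
// number only counts as a DLP hit if the checksum passes.
fn luhn_valid(digits: &str) -> bool {
    let mut sum = 0u32;
    let mut double = false;
    for c in digits.chars().rev() {
        let Some(d) = c.to_digit(10) else { return false };
        let d = if double {
            let d2 = d * 2;
            if d2 > 9 { d2 - 9 } else { d2 }
        } else {
            d
        };
        sum += d;
        double = !double;
    }
    sum % 10 == 0
}

// Mod-97 check for IBANs: move the first four chars to the end, map
// letters to 10..35, and the resulting number must be ≡ 1 (mod 97).
fn iban_valid(iban: &str) -> bool {
    let rearranged: String = iban.chars().skip(4).chain(iban.chars().take(4)).collect();
    let mut rem = 0u64;
    for c in rearranged.chars() {
        let v = if c.is_ascii_digit() {
            c as u64 - '0' as u64
        } else if c.is_ascii_uppercase() {
            c as u64 - 'A' as u64 + 10
        } else {
            return false;
        };
        // Letters contribute two decimal digits, digits contribute one.
        rem = if v < 10 { (rem * 10 + v) % 97 } else { (rem * 100 + v) % 97 };
    }
    rem == 1
}

fn main() {
    // A random 16-digit number fails Luhn; a standard test card passes.
    assert!(!luhn_valid("1234567812345678"));
    assert!(luhn_valid("4111111111111111"));
    assert!(iban_valid("GB82WEST12345698765432"));
    println!("validators ok");
}
```

This is the false-positive filter in miniature: the literal scan finds the candidate, and a few integer operations decide whether it is actually a card or an IBAN before anything fires.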
Most heavyweight WAFs use a Policy Decision Point model: rules are evaluated asynchronously, in a background loop, against recent activity pulled from a database. The background loop decides "this actor should be blocked" and pushes that decision back to the sensor, which enforces it on subsequent requests.
This works fine for a lot of security policies. But it is structurally incapable of blocking the first malicious request: by the time the background loop sees it, the request has already gone through.
Synapse is a Policy Enforcement Point. Every blocking decision (rule match, schema violation, fingerprint velocity, risk score threshold) happens inline in the request hotpath, during the request, before the upstream server ever sees it. There is no background loop. There is no "we'll catch this on the next one." The sensor decides, the sensor enforces, and it does both in under a millisecond.
Beyond the WAF core
The 8-phase pipeline above is the WAF decision core. It runs on every request, in the Pingora request_filter hook, and produces the risk score and the pass/challenge/block verdict. But the sensor does substantially more than the WAF core, and all of it runs inline in the same Pingora worker thread during the same request:
- JA4 and JA4H fingerprinting. JA4 is computed from the TLS handshake and JA4H from the HTTP request headers, before the request body is even parsed. Both feed an O(1) fingerprint-to-IP inverted index used by the campaign correlation layer for bot detection and velocity scoring.
- Per-site rate limiting. Token bucket, evaluated per request, inline. Separate from per-endpoint rate tracking, which the profiler maintains for anomaly detection.
- Session tracking with hijack detection. Over 1,500 lines of dedicated session management. Detects IP rotation, cookie inconsistencies, and fingerprint changes mid-session. Not a passive log, an active signal source for the risk score.
- Multi-IP actor correlation. Actors aren't just IPs. The sensor builds composite identities across IP changes using fingerprint continuity and behavioral signals, so a rotating-IP attacker shows up as one actor instead of a hundred.
- Endpoint profiling and header baselining. Per-endpoint statistical profiles using Welford's algorithm for streaming means and variance, entropy calculation for anomaly detection, pattern detection for field types (UUID, email, ISO date, API key), and per-endpoint header baselines. All updated during the request, all streaming.
- Schema learning and validation. Auto-learned API schema validation at ~5μs per request. The sensor learns what each endpoint normally accepts and flags deviations. Schema violations generate risk points directly, no rule match required.
- Honeypot trap detection. If the request hits a honeypot endpoint, the trap matcher assigns instant maximum risk (100) and the actor is blocked for all future requests. Handled entirely in Rust proxy logic.
- Tarpitting. Progressive response delays for slow-drip defense. The sensor can deliberately slow its responses to exhaust attacker resources without dropping the connection.
- Interrogator challenge progression. A 5-level system covering HMAC-signed cookie challenges, JavaScript proof-of-work, and headless browser detection. The sensor decides when to escalate, when to de-escalate, and when to emit a challenge page, all inline.
- Shadow mirroring. Asynchronous traffic mirroring to honeypot infrastructure. Runs in parallel with normal request handling, rate-limited to prevent overload, never blocking the hotpath.
- Response-phase inspection. The eval path doesn't end when the upstream responds. Synapse runs a response_filter phase that inspects the response body and status code, catches DLP leaks in outbound data, and can block responses that would leak sensitive data even if the request itself was allowed.
- Campaign correlation, running in-process. Eight trait-based detectors (attack sequence, auth token, behavioral similarity, graph correlation, JA4 rotation, network proximity, shared fingerprint, timing correlation) evaluated against live request state. The sensor detects coordinated attacks on its own, without waiting for Signal Horizon to aggregate anything.
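The endpoint-profiling item above leans on Welford's algorithm, which is worth a quick sketch because it shows why streaming profiles fit a per-request hotpath: one pass, O(1) state per metric, no stored samples. The struct and field names here are illustrative, not Synapse's.

```rust
// Welford's online algorithm: running mean and variance updated in O(1)
// per observation, numerically stable, with no history kept.
#[derive(Default)]
struct StreamingStats {
    count: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the running mean
}

impl StreamingStats {
    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        let delta2 = x - self.mean;
        self.m2 += delta * delta2;
    }

    fn variance(&self) -> f64 {
        if self.count < 2 { 0.0 } else { self.m2 / (self.count - 1) as f64 }
    }
}

fn main() {
    // Feed per-request body sizes for one endpoint; a later request far
    // outside mean ± k·stddev would be flagged as an anomaly.
    let mut s = StreamingStats::default();
    for size in [512.0, 540.0, 498.0, 530.0, 505.0] {
        s.update(size);
    }
    assert!((s.mean - 517.0).abs() < 1e-9);
    println!("mean={:.1} var={:.1}", s.mean, s.variance());
}
```

Because the update is a handful of float operations, the profiler can refresh its baseline during the request itself rather than in a batch job.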
None of those are async jobs ticking over in the background on another thread. They're in the same process, on the same Pingora worker, during the same request. The 8-phase WAF core is the part that produces the pass/fail verdict. Everything else on that list is what turns a WAF into an actual security sensor.
The brain is the sensor
Here's the part most people get wrong when they first look at the architecture. They assume there's a central brain somewhere and the edge is just the enforcement arm of it. They ask, "what does Signal Horizon do?" expecting the answer to be "it does the detection and the sensor enforces."
Signal Horizon is not the brain.
Every Synapse sensor is already a complete brain. Every capability in the previous section runs inside a single 25MB binary, with every worker thread sharing a single learning state through Arc<RwLock<Synapse>>, so the whole process contributes to one global brain instead of maintaining isolated per-thread fragments. Schema profiles, actor identities, fingerprint indices, session state, campaign state, and rate-limit counters are all shared across threads, persisted to disk, and loaded back on restart.
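The shared-state pattern is worth a minimal sketch: every worker thread clones one Arc and takes short read/write locks, so updates from any request are immediately visible to all others. The LearningState struct below is a stand-in for the real Synapse state, not its actual definition.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Stand-in for the sensor's shared learning state.
#[derive(Default)]
struct LearningState {
    requests_seen: u64,
}

fn main() {
    // One state, wrapped once; every worker holds a cheap Arc clone.
    let state = Arc::new(RwLock::new(LearningState::default()));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // Short write-lock per "request", released immediately.
                    state.write().unwrap().requests_seen += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    // All threads contributed to one global count, not per-thread fragments.
    assert_eq!(state.read().unwrap().requests_seen, 4000);
    println!("requests_seen = {}", state.read().unwrap().requests_seen);
}
```

The design choice embedded here: RwLock lets the many read-mostly phases (rule lookup, profile checks) proceed in parallel, while the occasional learning update takes a brief exclusive lock.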
Each sensor also ships with its own admin interfaces: a terminal UI for operators who live in a shell, and a lightweight web UI for everyone else. Both surface the same things (live request stats, rule match counts, actor and session state, schema profiles, campaign state, current config) directly from the running sensor's memory. No external database. No separate control plane. A single-sensor deployment is fully manageable through the binary itself.
Signal Horizon is the SOC tooling and fleet operations plane that sits above many sensors. It's a TypeScript/Node.js application, self-hostable in any private network, with a specific, bounded set of responsibilities:
- Fleet operations. Heartbeat monitoring, central config push, rule distribution, one-click fleet deploys, onboarding. It's the fleet-scale version of what each sensor already exposes through its own admin TUI and web UI, aggregated across every sensor in one view instead of one at a time.
- Cross-sensor correlation. Every sensor runs its own campaign detectors. Signal Horizon aggregates the output across many sensors so you can see when the same attacker is hitting five customers at once. A single sensor can't see that pattern. Signal Horizon can.
- Analyst tooling. Live threat map, campaign graph visualization, War Rooms for collaborative incident response, threat hunting with Sigma rules, SOC utilities. This is where humans look at what the sensors produced. It's the workbench, not the engine.
- Aggregated analytics. Fleet-wide dashboards, API catalogs, trend retention, historical search. These exist because SOC teams need to understand the fleet. The sensors don't need any of it to make decisions.
Nothing in that list is on the request hotpath. Signal Horizon can't block a request. Only the sensor can.
If every Synapse sensor went offline tomorrow, Signal Horizon would have nothing to do. If Signal Horizon went offline tomorrow, every Synapse sensor would keep protecting. That asymmetry is the whole architecture in one sentence.
Every sensor is a brain. Signal Horizon is how you run many of them together.
The deployability payoff
Because every sensor is a complete engine, Signal Horizon is genuinely optional. Not "detachable." Not "degrades gracefully." Optional. You deploy Synapse without it and you lose the dashboard and the cross-sensor correlation, but you don't lose the detection, because the detection was never centralized in the first place.
That unlocks deployments split-architecture vendors can't touch:
- Air-gapped networks. No egress, no cloud call-out, no dependency on reaching the internet. The sensor runs standalone behind the air gap and enforces every decision locally.
- Classified environments. A 25MB static binary with zero external dependencies is a lot easier to get through an accreditation process than a sensor that calls home.
- Edge locations with unreliable connectivity. Ships, oil rigs, retail branches, remote datacenters. The sensor doesn't care whether the link to HQ is up. It makes its decisions locally.
- Customers who need full on-prem sovereignty. No "send us your traffic for analysis." The data never leaves their infrastructure.
- Private fleet intelligence. Signal Horizon itself is self-hostable. A SOC team can run it on internal infrastructure, correlate signals across every sensor in the company, and get fleet-wide threat intelligence without sending anything to an external vendor. Fleet visibility and data sovereignty aren't a tradeoff.
- Small deployments that don't need a fleet plane at all. Run one sensor in front of one API, skip Signal Horizon entirely. Manage it through the built-in admin TUI or web UI. Config is a YAML file, rules are a JSON file, and the sensor shows you what it's seeing in real time. You're done.
I didn't set out to make air-gap deployability the headline feature of Synapse. I set out to make a WAF that didn't need the cloud. The deployments fell out of that decision, not the other way around.
The lesson worth taking away: when you remove the architectural dependency on a central service, you also remove every operational constraint that comes with one. Not because the split was bad, but because the split was never necessary.
What we gave up
This isn't free. The biggest thing we lost at the edge was structural path context. A recursive tree parser can tell you "this credit card was found at /user/profile/payment_methods[0]/card_number." Synapse's flat-streaming scanner can only tell you "a credit card pattern was found at byte offset 1847." That's a real loss for forensic analysis, debugging, and rule authoring.
The fix was to add that context back in Signal Horizon for analyst workflows, not at the edge. Signal Horizon receives a copy of flagged requests as telemetry, does the recursive structural parse on its own time, and records the exact JSON path in the signal report. Analysts looking at an incident in the War Room get the full context. The sensor never had to compute it during the request.
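The offline path recovery is conceptually simple: re-parse the flagged body into a tree and walk it recursively, carrying the path. A sketch, written in Rust for consistency with the rest of this article even though Signal Horizon itself is TypeScript; the tiny Value enum stands in for a real JSON parser.

```rust
// Minimal JSON-like tree, enough to demonstrate path recovery.
enum Value {
    Str(String),
    Arr(Vec<Value>),
    Obj(Vec<(String, Value)>),
}

// Recursive walk: record the path of every string containing the flagged
// literal. This runs on the analyst plane, never on the request hotpath.
fn find_path(v: &Value, needle: &str, path: String, out: &mut Vec<String>) {
    match v {
        Value::Str(s) if s.contains(needle) => out.push(path),
        Value::Str(_) => {}
        Value::Arr(items) => {
            for (i, item) in items.iter().enumerate() {
                find_path(item, needle, format!("{path}[{i}]"), out);
            }
        }
        Value::Obj(fields) => {
            for (k, val) in fields {
                find_path(val, needle, format!("{path}/{k}"), out);
            }
        }
    }
}

fn main() {
    let doc = Value::Obj(vec![(
        "payment_methods".into(),
        Value::Arr(vec![Value::Obj(vec![(
            "card_number".into(),
            Value::Str("4111111111111111".into()),
        )])]),
    )]);
    let mut hits = Vec::new();
    find_path(&doc, "4111111111111111", String::new(), &mut hits);
    assert_eq!(hits, vec!["/payment_methods[0]/card_number".to_string()]);
    println!("{:?}", hits);
}
```

Same detection, different deadline: the sensor answers "is this bad?" in microseconds, and the analyst plane answers "where exactly?" whenever it gets around to it.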
We also gave up the illusion that a central brain is a feature. It's not. It's a constraint dressed up as an architecture.
The generalization
If your edge depends on the cloud, your system has a single point of failure your architecture diagram doesn't show. You can hide it behind CDN terminology and multi-region failover and edge caching, but the fact remains: every request waits for a network hop to a thing that might be slow, unreachable, or contractually unavailable in the environment your customer actually operates in.
The pattern isn't specific to WAFs. Any system with the structure "real-time decision + periodic learning from aggregated data" benefits from putting the decision loop at the edge and making the aggregation optional:
- Rate limiters should enforce locally and share state across peers asynchronously.
- Recommendation systems can serve pre-trained models locally; only retraining needs to aggregate.
- Anomaly detectors should classify against thresholds that update periodically, not on every request.
- Fraud engines can score inline against feature vectors that are computed in batch.
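The first item in that list fits in a few lines. A token bucket that decides inline, with no call to any central service — parameters and names are illustrative:

```rust
use std::time::Instant;

// Local-enforcement token bucket: refills continuously, answers instantly.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last: Instant::now() }
    }

    // Called inline on the request path; the verdict never waits on a network.
    fn allow(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(3.0, 1.0); // burst of 3, 1 req/s refill
    let verdicts: Vec<bool> = (0..5).map(|_| bucket.allow()).collect();
    // Back-to-back: the first three ride the burst, the rest are throttled.
    assert_eq!(verdicts, vec![true, true, true, false, false]);
    println!("{:?}", verdicts);
}
```

Sharing bucket state with peers asynchronously (gossip, CRDT counters, periodic sync) improves fairness across a fleet, but enforcement never depends on that sync arriving.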
If you're building one of these, ask yourself: what happens if the central service is unreachable? If the answer is "enforcement stops," you're building a system that can't be deployed in half the environments its customers actually live in.
The novel claim isn't that we made the edge faster. It's that we made the edge complete. Every sensor is a brain. Signal Horizon is how you run many of them together.
For a zoom-in on one piece of the detection loop, Line-rate DLP covers the two-phase literal-first scanner that runs data loss prevention inline without slowing the request path. The Platform Architecture infographic shows both the autonomous sensor and the Signal Horizon command plane in one view. The WAF Rule Pipeline walks through the detection loop in detail, and the Deployment Topology covers how sensors run standalone or under fleet management.