SynapseEdge Defense

Schema Learning Lifecycle

How Synapse infers per-endpoint JSON schemas from live traffic — and why training is deferred until after the body-phase WAF verdict. Microsecond validation. No manifests. Poison-resistant baselines.

~5μs · Learn Per Request

~3μs · Validate Per Request

1.5× · Length Tolerance

LRU · Schema Eviction

Request Path — Validate Now, Train Later

STAGE 01

Parse & Template

JSON body parsed. Path normalized to an endpoint template (e.g. /api/users/:id).

OBSERVE

STAGE 02

Validate Against Schema

Existing schema (if any) checks field types, patterns, lengths, bounds. Emits ValidationResult.

INFER

STAGE 03

Body-Phase WAF

Rules consume the validation result via the schema_violation match kind, alongside DLP and SQLi checks.

WAF

STAGE 04

Verdict Gate

Allow or block. Training payload stashed in pending_learn remains untouched until this gate passes.

GATE

STAGE 05

Train If Allowed

On allow: consume pending_learn, update field types & bounds. On block: drop it — baseline untouched.

APPLY

→ ALLOW PATH

SCHEMA_LEARNER.learn_from_request(template, body) fires. The sample is folded into the endpoint's baseline and its sample count increments.

→ BLOCK PATH

pending_learn is dropped without consumption. Attack payloads never enter the baseline.
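The allow and block paths above can be sketched in a few lines of Rust. This is a minimal illustration, not Synapse's code: `Verdict`, `RequestCtx`, `Learner`, and `finish_request` are hypothetical names, and the learner here only counts samples. Only `pending_learn` and `learn_from_request` come from the text above.

```rust
use std::collections::HashMap;

/// Outcome of the body-phase WAF for one request.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Verdict {
    Allow,
    Block,
}

/// Hypothetical stand-in for the learner: counts samples per endpoint template.
#[derive(Default)]
struct Learner {
    samples: HashMap<String, u64>,
}

impl Learner {
    fn learn_from_request(&mut self, template: &str, _body: &str) {
        *self.samples.entry(template.to_string()).or_insert(0) += 1;
    }
}

/// Per-request state; `pending_learn` holds (template, body) until the verdict.
#[derive(Default)]
struct RequestCtx {
    pending_learn: Option<(String, String)>,
}

/// Stage 05: train only when the verdict is Allow; otherwise drop the payload.
fn finish_request(ctx: &mut RequestCtx, verdict: Verdict, learner: &mut Learner) {
    // `take()` empties the slot in both cases, so nothing leaks across requests.
    let pending = ctx.pending_learn.take();
    if verdict == Verdict::Allow {
        if let Some((template, body)) = pending {
            learner.learn_from_request(&template, &body);
        }
    }
    // On Block, `pending` is dropped here without ever touching the learner.
}
```

Holding the payload in an `Option` and calling `take()` makes the "consume or drop" choice explicit: there is no code path on which a blocked body can reach the learner.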

What the Learner Captures

Field Types

PRIMITIVE CLASSIFICATION

string · number · boolean
null · object · array
mixed (ambiguity flag)

Promotes to mixed on divergence
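One way to read "promotes to mixed on divergence" is as a merge function over the learned type and the newly observed type. The sketch below assumes that any disagreement promotes to mixed and that mixed never demotes. `FieldType` and `merge_type` are hypothetical names, and how the real learner treats null against other types is not stated here.

```rust
/// Primitive classes listed on the card, plus the `Mixed` ambiguity flag.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum FieldType {
    String,
    Number,
    Boolean,
    Null,
    Object,
    Array,
    Mixed,
}

/// Merge the type learned so far with the type seen in a new sample.
/// `None` means the field has not been observed yet.
fn merge_type(learned: Option<FieldType>, observed: FieldType) -> FieldType {
    match learned {
        None => observed,
        Some(t) if t == observed => t,
        // Any disagreement promotes to Mixed; Mixed never demotes (assumption).
        Some(_) => FieldType::Mixed,
    }
}
```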

Value Patterns

SEMANTIC INFERENCE

UUID · email · ISO date
URL · IP · hex digest
regex-detected formats

Via profiler::patterns::detect_pattern
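To make semantic inference concrete, here is a hand-rolled sketch of three of the formats listed above. It is not the code behind `profiler::patterns::detect_pattern`, whose signature and rules are not shown in this document. `Pattern`, `detect`, and the helper functions are hypothetical, and the checks are shape-only (no calendar or DNS validation).

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Pattern {
    Uuid,
    Email,
    IsoDate,
}

/// True for the canonical 8-4-4-4-12 hex layout (36 characters).
fn is_uuid(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 36
        && b.iter().enumerate().all(|(i, c)| match i {
            8 | 13 | 18 | 23 => *c == b'-',
            _ => c.is_ascii_hexdigit(),
        })
}

/// True for a plain `YYYY-MM-DD` shape; does not check calendar validity.
fn is_iso_date(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 10
        && b.iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        })
}

/// Deliberately loose email shape: one '@', non-empty local part, dotted domain.
fn is_email(s: &str) -> bool {
    match s.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && domain.contains('.')
                && !domain.starts_with('.')
                && !domain.ends_with('.')
        }
        None => false,
    }
}

/// Try the strictest shapes first so a UUID is never reported as something looser.
fn detect(s: &str) -> Option<Pattern> {
    if is_uuid(s) {
        Some(Pattern::Uuid)
    } else if is_iso_date(s) {
        Some(Pattern::IsoDate)
    } else if is_email(s) {
        Some(Pattern::Email)
    } else {
        None
    }
}
```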

Length & Value Bounds

NUMERIC CONSTRAINTS

min / max string length
min / max numeric value
tolerance multiplier applied

Default tolerance: 1.5× learned max
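The tolerance multiplier is simple arithmetic: with a learned maximum length of 48 and the default 1.5× tolerance, values up to 72 characters pass. A minimal sketch, assuming the multiplier applies only to the upper bound and the result is rounded down (`LengthBounds` and its methods are hypothetical names):

```rust
/// Learned string-length bounds for one field.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LengthBounds {
    min: usize,
    max: usize,
}

impl LengthBounds {
    fn new(first_len: usize) -> Self {
        LengthBounds { min: first_len, max: first_len }
    }

    /// Widen the bounds to include a newly accepted sample.
    fn observe(&mut self, len: usize) {
        self.min = self.min.min(len);
        self.max = self.max.max(len);
    }

    /// Upper limit enforced at validation time: learned max times tolerance.
    fn allowed_max(&self, tolerance: f64) -> usize {
        (self.max as f64 * tolerance).floor() as usize
    }

    /// Only the upper bound is checked here; the lower bound is tracked but
    /// not enforced in this sketch (assumption).
    fn accepts(&self, len: usize, tolerance: f64) -> bool {
        len <= self.allowed_max(tolerance)
    }
}
```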

Sample Count

TRAINING MATURITY

Per-endpoint sample counter
min_samples_for_validation
Silent until matured

Prevents false positives while learning
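The maturity gate can be pictured as a single comparison in front of validation. In this sketch `EndpointSchema`, `Validation`, and `validate` are hypothetical. Only `min_samples_for_validation` is named in the text, and whether the real comparison is strict is an assumption.

```rust
/// Hypothetical per-endpoint record; only the sample counter matters here.
struct EndpointSchema {
    sample_count: u64,
}

#[derive(Debug, PartialEq)]
enum Validation {
    /// Schema is still learning; no findings are emitted.
    Skipped,
    /// Schema has matured; carries the number of violations found.
    Checked { violations: usize },
}

/// Stay silent until the endpoint has seen enough accepted samples.
fn validate(
    schema: &EndpointSchema,
    min_samples_for_validation: u64,
    violations: usize,
) -> Validation {
    if schema.sample_count < min_samples_for_validation {
        Validation::Skipped
    } else {
        Validation::Checked { violations }
    }
}
```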

Nested Structure

OBJECT TREES

Recursive field maps
Array element schemas
Depth-limited traversal

max_nesting_depth guards memory
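Depth-limited traversal might look like the recursive walk below. The `Json` enum is a stand-in for whatever parser type the learner actually uses, and `collect_paths` and the dotted-path notation (with `[]` for array elements) are illustrative assumptions. Only `max_nesting_depth` is taken from the text.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Minimal JSON value for illustration; a real parser type would replace this.
#[allow(dead_code)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Collect field paths, descending at most `max_nesting_depth` levels.
fn collect_paths(
    value: &Json,
    prefix: &str,
    depth: usize,
    max_nesting_depth: usize,
    out: &mut BTreeSet<String>,
) {
    if depth >= max_nesting_depth {
        return; // Deeper structure is ignored rather than learned.
    }
    match value {
        Json::Object(fields) => {
            for (name, child) in fields {
                let path = if prefix.is_empty() {
                    name.clone()
                } else {
                    format!("{}.{}", prefix, name)
                };
                out.insert(path.clone());
                collect_paths(child, &path, depth + 1, max_nesting_depth, out);
            }
        }
        Json::Array(items) => {
            // All elements of an array share one "[]" schema slot.
            let path = format!("{}[]", prefix);
            for item in items {
                collect_paths(item, &path, depth + 1, max_nesting_depth, out);
            }
        }
        _ => {} // Scalars have no children to visit.
    }
}
```

Because the depth check runs before any allocation, a hostile body nested thousands of levels deep costs only `max_nesting_depth` stack frames.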

Endpoint Identity

TEMPLATE KEY

Keyed by normalized path
LRU eviction past max_schemas
DashMap thread-safe

Generation queue for O(1) eviction
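A generation queue is one common way to get amortized O(1) LRU eviction without a linked list: every touch appends a (key, generation) record, and eviction pops records from the front, skipping any whose generation is stale. The sketch below is a guess at that shape, not the actual implementation. It uses a single-threaded `HashMap` where the text names DashMap, and `SchemaStore` and its methods are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Schema store with lazy LRU eviction driven by a generation queue.
struct SchemaStore<V> {
    max_schemas: usize,
    next_gen: u64,
    entries: HashMap<String, (V, u64)>, // value + generation of last touch
    queue: VecDeque<(String, u64)>,     // touch log, oldest first
}

impl<V> SchemaStore<V> {
    fn new(max_schemas: usize) -> Self {
        SchemaStore {
            max_schemas,
            next_gen: 0,
            entries: HashMap::new(),
            queue: VecDeque::new(),
        }
    }

    /// Insert or refresh a schema, then evict until within capacity.
    fn put(&mut self, template: &str, value: V) {
        let stamp = self.bump(template);
        self.entries.insert(template.to_string(), (value, stamp));
        self.evict();
    }

    /// Look up a schema and mark it as recently used.
    fn get(&mut self, template: &str) -> Option<&V> {
        if !self.entries.contains_key(template) {
            return None;
        }
        let stamp = self.bump(template);
        let entry = self.entries.get_mut(template)?;
        entry.1 = stamp;
        Some(&entry.0)
    }

    /// Record a touch and return its generation number.
    fn bump(&mut self, template: &str) -> u64 {
        self.next_gen += 1;
        self.queue.push_back((template.to_string(), self.next_gen));
        self.next_gen
    }

    fn evict(&mut self) {
        while self.entries.len() > self.max_schemas {
            match self.queue.pop_front() {
                Some((key, stamp)) => {
                    // Only a record matching the entry's latest generation is live;
                    // older records for the same key are stale and skipped.
                    if self.entries.get(&key).map(|e| e.1) == Some(stamp) {
                        self.entries.remove(&key);
                    }
                }
                None => break,
            }
        }
    }
}
```

In this simplified form the touch log grows on every `get` and is only trimmed during eviction, so a production version would need periodic compaction.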

Example — POST /api/users

OBSERVED REQUEST BODY

{
  "id": "3f1a9c7e-2b4d",
  "email": "user@example.com",
  "age": 34,
  "verified": true,
  "joined": "2026-04-12"
}

INFERRED FIELD SCHEMA

// endpoint: /api/users · samples: 42
id       → uuid    (len 36)
email    → email   (len ≤ 48)
age      → number  (0–120)
verified → boolean
joined   → iso-date

Downstream Consumers of the Schema

WAF · schema_violation

BODY-PHASE RULE KIND

Rules match on total_score via compare_threshold (gte/gt/eq/lte/lt). Evaluated synchronously in the body phase — same pass that produces the verdict.
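The threshold comparison can be sketched as an enum over the five operators. `Compare` and `SchemaViolationRule` are hypothetical shapes, and the integer score type is an assumption. The document names only `total_score`, `compare_threshold`, and the operator keywords.

```rust
/// Comparison operators named on the card: gte / gt / eq / lte / lt.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Compare {
    Gte,
    Gt,
    Eq,
    Lte,
    Lt,
}

/// Hypothetical rule shape: fires when total_score compares true against threshold.
struct SchemaViolationRule {
    compare: Compare,
    threshold: u32,
}

impl SchemaViolationRule {
    fn matches(&self, total_score: u32) -> bool {
        match self.compare {
            Compare::Gte => total_score >= self.threshold,
            Compare::Gt => total_score > self.threshold,
            Compare::Eq => total_score == self.threshold,
            Compare::Lte => total_score <= self.threshold,
            Compare::Lt => total_score < self.threshold,
        }
    }
}
```

A rule with `compare: Compare::Gte` and `threshold: 10` would match a request whose validation produced a total score of 10 or more.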

DLP Scanner

POST-VERDICT HAND-OFF

Learned field patterns inform which fields are candidates for PII / secret scanning. Runs after the WAF verdict on allowed bodies.

Anomaly Detection

STRUCTURAL DRIFT

Unexpected fields, missing required fields, and type promotions to mixed surface as anomalies — feeding risk scores and Horizon telemetry.
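Unexpected and missing fields reduce to two set differences between the learned field names and the observed ones. The sketch below assumes the learner tracks which fields count as required, which this document does not describe. `Drift` and `structural_drift` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Structural differences between a request and the learned schema.
#[derive(Debug, PartialEq)]
struct Drift {
    unexpected: Vec<String>,
    missing: Vec<String>,
}

/// `learned_all`: every field ever accepted for this endpoint.
/// `learned_required`: the subset treated as required (assumption).
fn structural_drift(
    learned_all: &BTreeSet<String>,
    learned_required: &BTreeSet<String>,
    observed: &BTreeSet<String>,
) -> Drift {
    Drift {
        // Present in the request but never seen during learning.
        unexpected: observed.difference(learned_all).cloned().collect(),
        // Required by the baseline but absent from the request.
        missing: learned_required.difference(observed).cloned().collect(),
    }
}
```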

Anti-poisoning by design

The body-phase WAF runs before the learner trains. If a request is blocked (SQLi, schema violation, DLP hit), its payload is dropped and never reaches the learner, so the schema is built from accepted requests only.