SynapseEdge Defense

Schema Learning Lifecycle

How Synapse infers per-endpoint JSON schemas from live traffic — and why training is deferred until after the body-phase WAF verdict. Microsecond validation. No manifests. Poison-resistant baselines.

~5μs · Learn Per Request
~3μs · Validate Per Request
1.5× · Length Tolerance
LRU · Schema Eviction
Request Path — Validate Now, Train Later
STAGE 01
Parse & Template
JSON body parsed. Path normalized to an endpoint template (e.g. /api/users/:id).
OBSERVE
STAGE 02
Validate Against Schema
Existing schema (if any) checks field types, patterns, lengths, bounds. Emits ValidationResult.
INFER
STAGE 03
Body-Phase WAF
Rules consume the validation result via the schema_violation match kind, alongside DLP & SQLi checks.
WAF
STAGE 04
Verdict Gate
Allow or block. Training payload stashed in pending_learn remains untouched until this gate passes.
GATE
STAGE 05
Train If Allowed
On allow: consume pending_learn, update field types & bounds. On block: drop it — baseline untouched.
APPLY
→ ALLOW PATH
SCHEMA_LEARNER.learn_from_request(template, body) fires. The sample is folded into the endpoint's baseline and increments its sample count.
→ BLOCK PATH
pending_learn is dropped without consumption. Attack payloads never enter the baseline.
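The gate can be sketched in a few lines of Rust. This is a minimal illustration, not Synapse's actual code: Verdict, PendingLearn, RequestCtx, Learner and finish_request are hypothetical names, and the learner here only counts samples. Only pending_learn and learn_from_request come from the description above.

```rust
use std::collections::HashMap;

/// Outcome of the body-phase WAF (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Allow,
    Block,
}

/// Training payload stashed during parsing, not yet applied.
struct PendingLearn {
    template: String,
    body: String,
}

/// Stand-in learner: it only counts samples per endpoint template.
#[derive(Default)]
struct Learner {
    samples: HashMap<String, u64>,
}

impl Learner {
    fn learn_from_request(&mut self, template: &str, _body: &str) {
        *self.samples.entry(template.to_string()).or_insert(0) += 1;
    }
}

/// Per-request context holding the deferred training payload.
struct RequestCtx {
    pending_learn: Option<PendingLearn>,
}

/// Stage 05: consume the stash on allow, drop it on block.
/// Returns true when the sample was applied to the baseline.
fn finish_request(ctx: &mut RequestCtx, verdict: Verdict, learner: &mut Learner) -> bool {
    // take() empties the slot either way, so a payload can never be reused.
    match (verdict, ctx.pending_learn.take()) {
        (Verdict::Allow, Some(p)) => {
            learner.learn_from_request(&p.template, &p.body);
            true
        }
        // Blocked, or nothing was stashed: the payload is dropped here.
        _ => false,
    }
}
```

Taking the payload out of an Option makes the block path a plain drop: no cleanup call is needed, and the learner is never reached.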
What the Learner Captures
Field Types
PRIMITIVE CLASSIFICATION
string · number · boolean
null · object · array
mixed (ambiguity flag)
Promotes to mixed on divergence
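A small sketch of the promotion rule, assuming a field keeps its type while samples agree and flips to mixed on the first disagreement. FieldType and merge_type are hypothetical names, and whether the real learner treats null as a divergence is not stated in the source.

```rust
/// Primitive classes from the card above, plus the `Mixed` ambiguity flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FieldType {
    String,
    Number,
    Boolean,
    Null,
    Object,
    Array,
    Mixed,
}

/// Merge the type learned so far with the type seen in a new sample.
fn merge_type(learned: Option<FieldType>, observed: FieldType) -> FieldType {
    match learned {
        // The first sample sets the type.
        None => observed,
        // Agreement keeps the learned type.
        Some(t) if t == observed => t,
        // Any divergence promotes to mixed, and mixed never demotes.
        Some(_) => FieldType::Mixed,
    }
}
```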
Value Patterns
SEMANTIC INFERENCE
UUID · email · ISO date
URL · IP · hex digest
regex-detected formats
Via profiler::patterns::detect_pattern
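To make the idea concrete, here is a dependency-free stand-in for format detection. It is not profiler::patterns::detect_pattern, whose signature and regexes are not shown in the source. Pattern, detect_pattern_sketch and the helper functions are hypothetical, and the checks are deliberately loose shape tests.

```rust
#[derive(Debug, PartialEq)]
enum Pattern {
    Uuid,
    IsoDate,
    Email,
}

/// True when `s` has the canonical 8-4-4-4-12 hex layout (36 characters).
fn is_uuid(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 36
        && b.iter().enumerate().all(|(i, c)| match i {
            8 | 13 | 18 | 23 => *c == b'-',
            _ => c.is_ascii_hexdigit(),
        })
}

/// True for a YYYY-MM-DD shape. Calendar validity is not checked.
fn is_iso_date(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 10
        && b.iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        })
}

/// Loose email shape: one '@', a non-empty local part, a dotted domain.
fn is_email(s: &str) -> bool {
    match s.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && domain.split('.').count() >= 2
                && domain.split('.').all(|part| !part.is_empty())
        }
        None => false,
    }
}

/// Return the first format that matches, or None for a plain string.
fn detect_pattern_sketch(s: &str) -> Option<Pattern> {
    if is_uuid(s) {
        Some(Pattern::Uuid)
    } else if is_iso_date(s) {
        Some(Pattern::IsoDate)
    } else if is_email(s) {
        Some(Pattern::Email)
    } else {
        None
    }
}
```

Checking the strictest shape first keeps a value from being claimed by a looser pattern.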
Length & Value Bounds
NUMERIC CONSTRAINTS
min / max string length
min / max numeric value
tolerance multiplier applied
Default tolerance: 1.5× learned max
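A sketch of how learned length bounds and the tolerance multiplier could interact, assuming the multiplier is applied only to the learned maximum as the card states. LengthBounds and its methods are hypothetical, and how the lower bound is enforced is not specified in the source.

```rust
/// Learned string-length bounds for one field.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LengthBounds {
    min: usize,
    max: usize,
}

impl LengthBounds {
    /// Start the bounds from the first observed length.
    fn from_first(len: usize) -> Self {
        LengthBounds { min: len, max: len }
    }

    /// Widen the bounds to cover a newly observed length.
    fn observe(&mut self, len: usize) {
        self.min = self.min.min(len);
        self.max = self.max.max(len);
    }

    /// Accept a length up to `tolerance` times the learned max.
    fn allows(&self, len: usize, tolerance: f64) -> bool {
        (len as f64) <= (self.max as f64) * tolerance
    }
}
```

With a learned max of 48 and the default tolerance of 1.5, lengths up to 72 pass and 73 is flagged.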
Sample Count
TRAINING MATURITY
Per-endpoint sample counter
min_samples_for_validation
Silent until matured
Prevents false positives while learning
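The maturity gate might look like the sketch below. EndpointSchema and report_violations are hypothetical, the violations vector stands in for whatever the validator found, and the comparison (at least min_samples_for_validation samples) is an assumption, since the source does not say whether the threshold is inclusive.

```rust
/// Per-endpoint training state; only the counter matters for this sketch.
struct EndpointSchema {
    sample_count: u64,
}

/// Gate the validator's findings on training maturity.
fn report_violations(
    schema: &EndpointSchema,
    min_samples_for_validation: u64,
    violations: Vec<String>,
) -> Vec<String> {
    if schema.sample_count < min_samples_for_validation {
        // Still learning: stay silent rather than flag normal traffic.
        Vec::new()
    } else {
        violations
    }
}
```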
Nested Structure
OBJECT TREES
Recursive field maps
Array element schemas
Depth-limited traversal
max_nesting_depth guards memory
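Depth-limited traversal can be sketched with a tiny JSON value type. Json and collect_paths are hypothetical names, the dotted-path and "[]" naming is an assumption, and a real implementation would walk its parser's own value type. Only max_nesting_depth comes from the source.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Minimal JSON value for this sketch.
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Collect dotted field paths. `depth` counts the containers enclosing
/// `value`; anything deeper than `max_nesting_depth` is not visited.
fn collect_paths(
    value: &Json,
    prefix: &str,
    depth: usize,
    max_nesting_depth: usize,
    out: &mut BTreeSet<String>,
) {
    if depth > max_nesting_depth {
        return;
    }
    match value {
        Json::Object(fields) => {
            for (name, child) in fields {
                let path = if prefix.is_empty() {
                    name.clone()
                } else {
                    format!("{}.{}", prefix, name)
                };
                out.insert(path.clone());
                collect_paths(child, &path, depth + 1, max_nesting_depth, out);
            }
        }
        Json::Array(items) => {
            // All elements share one element schema under "prefix[]".
            let path = format!("{}[]", prefix);
            for item in items {
                collect_paths(item, &path, depth + 1, max_nesting_depth, out);
            }
        }
        // Scalars have no children to descend into.
        Json::Null | Json::Bool(_) | Json::Number(_) | Json::Str(_) => {}
    }
}
```

Because the check happens before any allocation for the deeper level, a pathologically nested body costs at most max_nesting_depth levels of work.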
Endpoint Identity
TEMPLATE KEY
Keyed by normalized path
LRU eviction past max_schemas
DashMap thread-safe
Generation queue for O(1) eviction
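One way a generation queue can give constant-time LRU eviction is sketched below. It is single-threaded and uses a std HashMap where the source names DashMap, so it shows the eviction idea only. SchemaStore, touch and evict_oldest are hypothetical names, and the stored value is just a sample count.

```rust
use std::collections::{HashMap, VecDeque};

struct SchemaStore {
    max_schemas: usize,
    next_generation: u64,
    /// template -> (generation of latest use, sample count)
    schemas: HashMap<String, (u64, u64)>,
    /// (template, generation) in use order; stale pairs are skipped lazily.
    queue: VecDeque<(String, u64)>,
}

impl SchemaStore {
    fn new(max_schemas: usize) -> Self {
        SchemaStore {
            max_schemas,
            next_generation: 0,
            schemas: HashMap::new(),
            queue: VecDeque::new(),
        }
    }

    /// Record a use of `template`, then evict while over capacity.
    fn touch(&mut self, template: &str) {
        self.next_generation += 1;
        let generation = self.next_generation;
        let entry = self
            .schemas
            .entry(template.to_string())
            .or_insert((generation, 0));
        entry.0 = generation;
        entry.1 += 1;
        self.queue.push_back((template.to_string(), generation));
        while self.schemas.len() > self.max_schemas {
            self.evict_oldest();
        }
    }

    /// Pop queue entries until one still carries its key's current
    /// generation; that key is the least recently used.
    fn evict_oldest(&mut self) {
        while let Some((template, generation)) = self.queue.pop_front() {
            let current = self.schemas.get(&template).map(|e| e.0);
            if current == Some(generation) {
                self.schemas.remove(&template);
                return;
            }
        }
    }
}
```

Re-using a template never searches the queue; it only appends a newer pair and lets the older one go stale, which keeps each operation amortized O(1). This sketch lets stale pairs accumulate until eviction runs, so a production version would need to compact the queue periodically.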
Example — POST /api/users
OBSERVED REQUEST BODY
{
  "id": "3f1a9c7e-2b4d-4c8a-9e1f-6d2b7a5c0e34",
  "email": "user@example.com",
  "age": 34,
  "verified": true,
  "joined": "2026-04-12"
}
INFERRED FIELD SCHEMA
// endpoint: /api/users · samples: 42
id       → uuid    (len 36)
email    → email   (len ≤ 48)
age      → number  (0–120)
verified → boolean
joined   → iso-date
Downstream Consumers of the Schema
WAF · schema_violation
BODY-PHASE RULE KIND
Rules match on total_score via compare_threshold (gte/gt/eq/lte/lt). Evaluated synchronously in the body phase — same pass that produces the verdict.
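The threshold comparison could be modeled as below. The five operator names come from the text above; the CompareThreshold enum, its methods, and the use of u32 for total_score are assumptions made for illustration.

```rust
/// The five comparison operators a schema_violation rule may use.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CompareThreshold {
    Gte,
    Gt,
    Eq,
    Lte,
    Lt,
}

impl CompareThreshold {
    /// Parse the operator name as it would appear in a rule.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "gte" => Some(CompareThreshold::Gte),
            "gt" => Some(CompareThreshold::Gt),
            "eq" => Some(CompareThreshold::Eq),
            "lte" => Some(CompareThreshold::Lte),
            "lt" => Some(CompareThreshold::Lt),
            _ => None,
        }
    }

    /// True when `total_score` satisfies the operator against `threshold`.
    fn matches(self, total_score: u32, threshold: u32) -> bool {
        match self {
            CompareThreshold::Gte => total_score >= threshold,
            CompareThreshold::Gt => total_score > threshold,
            CompareThreshold::Eq => total_score == threshold,
            CompareThreshold::Lte => total_score <= threshold,
            CompareThreshold::Lt => total_score < threshold,
        }
    }
}
```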
DLP Scanner
POST-VERDICT HAND-OFF
Learned field patterns inform which fields are candidates for PII / secret scanning. Runs after the WAF verdict on allowed bodies.
Anomaly Detection
STRUCTURAL DRIFT
Unexpected fields, missing required fields, and type promotions to mixed surface as anomalies — feeding risk scores and Horizon telemetry.
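Structural drift reduces to two set differences, sketched here. Anomaly and structural_drift are hypothetical names, and how Synapse decides that a field is required is not described in the source, so the required set is simply passed in.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
enum Anomaly {
    UnexpectedField(String),
    MissingField(String),
}

/// Compare one request's field names against the learned schema.
/// `known` is every field seen in training, `required` the subset
/// assumed to be mandatory.
fn structural_drift(
    known: &BTreeSet<&str>,
    required: &BTreeSet<&str>,
    observed: &BTreeSet<&str>,
) -> Vec<Anomaly> {
    let mut out = Vec::new();
    // Fields the baseline has never seen.
    for field in observed.difference(known) {
        out.push(Anomaly::UnexpectedField(field.to_string()));
    }
    // Mandatory fields absent from this request.
    for field in required.difference(observed) {
        out.push(Anomaly::MissingField(field.to_string()));
    }
    out
}
```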
Anti-poisoning guarantee
The body-phase WAF runs before the learner trains. If a request is blocked (SQLi, schema violation, DLP hit), its payload is dropped without entering the baseline. Blocked traffic therefore cannot teach Synapse to trust the shape of an attack, even when it is sustained, and schemas stay clean of every payload the WAF rejected.