Time-Travel Network Debugging (Pro)

Network state is ephemeral: when something went wrong 10 minutes ago, the evidence is often gone. Lattice's Time-Travel Debugging automatically captures a snapshot of your entire network state at every meaningful event, so you can inspect, diff, and debug with AI assistance any point in the past.

Pro feature. Available in the Pro edition.

Snapshot Model

Each snapshot captures the complete network state at a point in time:

| Field | Description |
| --- | --- |
| peers | All LatticePeer resources (JSON) |
| policies | All LatticePolicy resources (JSON) |
| networks | All LatticeNetwork resources (JSON) |
| presence | Peer online/offline status map |
| trigger_type | What caused the snapshot |
| captured_at | Timestamp |

Individual snapshots are under 10 KB: they store only semantic JSON, with managedFields stripped.
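To make the snapshot shape concrete, here is a minimal sketch of assembling one and checking its serialized size. The field names come from the table above; the helper names, the resource layout (a `metadata` object holding `managedFields`), and the presence values are assumptions for illustration, not Lattice's actual implementation.

```python
import json
from typing import Any


def strip_managed_fields(resource: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a resource without metadata.managedFields.

    Assumes a Kubernetes-style layout where managedFields lives under
    "metadata". The input resource is left unmodified.
    """
    cleaned = dict(resource)
    if "metadata" in cleaned:
        metadata = dict(cleaned["metadata"])
        metadata.pop("managedFields", None)
        cleaned["metadata"] = metadata
    return cleaned


def build_snapshot(
    peers: list[dict[str, Any]],
    policies: list[dict[str, Any]],
    networks: list[dict[str, Any]],
    presence: dict[str, str],
    trigger_type: str,
    captured_at: str,
) -> dict[str, Any]:
    """Assemble a snapshot dict using the documented field names."""
    return {
        "peers": [strip_managed_fields(p) for p in peers],
        "policies": [strip_managed_fields(p) for p in policies],
        "networks": [strip_managed_fields(n) for n in networks],
        "presence": presence,          # hypothetical shape: peer name -> status
        "trigger_type": trigger_type,
        "captured_at": captured_at,
    }


def snapshot_size_bytes(snapshot: dict[str, Any]) -> int:
    """Size of the snapshot when serialized as compact UTF-8 JSON."""
    compact = json.dumps(snapshot, separators=(",", ":"), sort_keys=True)
    return len(compact.encode("utf-8"))
```

Calling `snapshot_size_bytes(build_snapshot(...))` on your own resources is a quick way to see how much `managedFields` contributes to payload size.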

Trigger Events

| Event | Trigger Type |
| --- | --- |
| LatticePolicy created/updated/deleted | policy_change |
| LatticePeer comes online | peer_online |
| LatticePeer goes offline | peer_offline |
| WorkflowService execution completes | workflow_executed |
| Manual API call | manual |
| Daily at 2 AM | scheduled |

The snapshot controller (snapshot_controller.go) watches CRD change events and debounces them, so a burst of related changes does not produce a snapshot for each individual event.
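The debouncing idea can be sketched as follows. This is not the code in snapshot_controller.go; it is a small illustration in which the `Debouncer` class, the per-key grouping, and the 2-second quiet period are all assumptions. The clock is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Debouncer:
    """Collapse bursts of events into one action per key.

    An action becomes due only after `quiet_period` seconds have passed
    without a new event for the same key.
    """

    quiet_period: float
    # key -> timestamp of the most recent event seen for that key
    _last_seen: dict[str, float] = field(default_factory=dict, init=False)

    def notify(self, key: str, now: float) -> None:
        """Record an event; this restarts the quiet period for the key."""
        self._last_seen[key] = now

    def due(self, now: float) -> list[str]:
        """Return keys whose quiet period has elapsed, and forget them."""
        ready = [
            key
            for key, last in self._last_seen.items()
            if now - last >= self.quiet_period
        ]
        for key in ready:
            del self._last_seen[key]
        return ready


# Three policy edits within 1.5 seconds lead to a single snapshot.
debouncer = Debouncer(quiet_period=2.0)  # hypothetical quiet period
for timestamp in (0.0, 1.0, 1.5):
    debouncer.notify("policy_change", now=timestamp)

still_waiting = debouncer.due(now=2.0)   # only 0.5 s since the last edit
ready_to_snapshot = debouncer.due(now=3.5)
```

Here `still_waiting` is empty and `ready_to_snapshot` contains `"policy_change"` exactly once, which is the point of debouncing: one snapshot for the whole burst.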

  • Retention: 90 days (Pro: 1 year)

API Endpoints

http
GET  /api/v1/workspaces/:id/snapshots
GET  /api/v1/workspaces/:id/snapshots/:snapshotId
GET  /api/v1/workspaces/:id/snapshots/diff?from=:id1&to=:id2
POST /api/v1/ai/debug
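A minimal client-side sketch for the three GET endpoints, using only the Python standard library. The paths and the `from`/`to` query parameters come from the list above; the host name, the bearer-token header, and the function names are assumptions, so adjust them to your deployment.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://lattice.example.com"  # hypothetical host


def snapshots_url(workspace_id: str) -> str:
    """URL that lists the snapshots of a workspace."""
    return f"{BASE_URL}/api/v1/workspaces/{quote(workspace_id, safe='')}/snapshots"


def snapshot_url(workspace_id: str, snapshot_id: str) -> str:
    """URL of a single snapshot."""
    return f"{snapshots_url(workspace_id)}/{quote(snapshot_id, safe='')}"


def diff_url(workspace_id: str, from_id: str, to_id: str) -> str:
    """URL that diffs two snapshots of the same workspace."""
    query = urlencode({"from": from_id, "to": to_id})
    return f"{snapshots_url(workspace_id)}/diff?{query}"


def get_json(url: str, token: str) -> object:
    """GET a URL and decode the JSON body.

    The Authorization scheme is an assumption; the source does not say
    how requests are authenticated.
    """
    request = Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, `get_json(diff_url("ws-prod", first_id, second_id), token)` would fetch the diff between two snapshots, where the IDs are taken from a previous call to the list endpoint.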

AI Debug

The POST /api/v1/ai/debug endpoint accepts natural-language questions about past network state and streams its answer back as Server-Sent Events (SSE):

json
{
  "workspace_id": "ws-prod",
  "question": "what caused the connectivity loss between api-server and db at 2:30pm?",
  "time_range": {
    "from": "2026-05-06T14:00:00Z",
    "to": "2026-05-06T15:00:00Z"
  }
}

The AI can access the snapshots in the time range, diff them, and run connectivity simulations against historical state to identify root causes.
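Because the response is streamed, a client has to read it incrementally. The sketch below builds the request body shown above and parses a generic SSE stream into event payloads, following the standard SSE framing (`data:` lines, blank line ends an event, lines starting with `:` are comments). What Lattice puts inside each `data:` payload is not documented here, so the parser returns payloads as plain strings; the function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def build_debug_request(
    workspace_id: str, question: str, time_from: str, time_to: str
) -> bytes:
    """Encode the JSON body for POST /api/v1/ai/debug."""
    body = {
        "workspace_id": workspace_id,
        "question": question,
        "time_range": {"from": time_from, "to": time_to},
    }
    return json.dumps(body).encode("utf-8")


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream.

    Multiple data lines in one event are joined with newlines. An event
    that is not terminated by a blank line is discarded, as the SSE
    specification requires.
    """
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                yield "\n".join(data_lines)
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if name == "data":
            data_lines.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.
```

With a streaming HTTP client, you would decode the response line by line and feed those lines to `iter_sse_data`, printing each payload as it arrives.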

MCP Debug Tools

When connected via MCP, these additional tools are available:

| Tool | Description |
| --- | --- |
| list_snapshots | List snapshots within a time range |
| get_snapshot | Get complete state at a specific point in time |
| diff_snapshots | Compare two snapshots and show changes |
| check_connectivity_at | Simulate connectivity check against historical state |
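MCP clients normally issue these calls for you, but it can help to see what goes over the wire. MCP invokes tools with a JSON-RPC 2.0 `tools/call` request that carries the tool name and an arguments object. In the sketch below the tool names come from the table above, while the argument names (`workspace_id`, `source`, `target`, `at`) are hypothetical; check the tool schemas reported by the server for the real ones.

```python
import itertools
import json
from typing import Any

_request_ids = itertools.count(1)  # JSON-RPC request IDs, unique per session


def tool_call(name: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument names, for illustration only.
request = tool_call(
    "check_connectivity_at",
    {
        "workspace_id": "ws-prod",
        "source": "api-server",
        "target": "db",
        "at": "2026-05-06T14:30:00Z",
    },
)
```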

Use Cases

  • "The database became unreachable 20 minutes ago — what changed?"
  • "Show me all policy changes between 2pm and 3pm yesterday."
  • "Was there a peer that went offline right before the incident?"
  • "Compare the network state before and after the last deployment."
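Most of these questions reduce to "what differs between two snapshots?". As a rough illustration of that comparison, the sketch below summarizes added, removed, and modified resources between two snapshot dicts. It assumes that `peers`, `policies`, and `networks` are lists of resources identified by `metadata.name`; that layout and the function name are assumptions, and the real diff endpoint may report changes differently.

```python
from typing import Any

RESOURCE_KINDS = ("peers", "policies", "networks")


def _by_name(resources: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Index resources by metadata.name (assumed to be unique per kind)."""
    return {resource["metadata"]["name"]: resource for resource in resources}


def summarize_changes(
    before: dict[str, Any], after: dict[str, Any]
) -> dict[str, dict[str, list[str]]]:
    """List the names of added, removed, and modified resources per kind."""
    changes: dict[str, dict[str, list[str]]] = {}
    for kind in RESOURCE_KINDS:
        old = _by_name(before.get(kind, []))
        new = _by_name(after.get(kind, []))
        changes[kind] = {
            "added": sorted(new.keys() - old.keys()),
            "removed": sorted(old.keys() - new.keys()),
            "modified": sorted(
                name for name in old.keys() & new.keys() if old[name] != new[name]
            ),
        }
    return changes
```

Running this on the snapshot taken just before an incident and the one taken just after narrows the search to the handful of resources that actually changed.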