Bundle format

This page documents the capture bundle format for contributors and advanced users who want to understand or manipulate bundles directly.

Overview

A capture bundle contains network traces, WebSocket data, UI context events, and a timeline that ties everything together. The Chrome extension exports bundles as ZIP files; internally, managed storage keeps them as flat directories with the same layout. The format was designed specifically for Spectral because existing formats such as HAR have significant limitations:

  • HAR is JSON/UTF-8 only, with no native binary support (binary bodies would need base64 encoding, adding about 33% overhead)
  • HAR has no standard WebSocket support (Chrome uses the non-standard _webSocketMessages field)
  • HAR has no concept of UI context or unique trace IDs for cross-referencing
  • HAR's request/response pair model doesn't fit WebSocket's asynchronous, full-duplex messaging

Bundle structure

Path                        Contents
manifest.json               Session metadata: capture ID, timestamps, app info, browser info, statistics
traces/t_NNNN.json          HTTP trace metadata: method, URL, headers, status, timing, initiator, context refs
traces/t_NNNN_request.bin   Raw request body (binary-safe, may be empty)
traces/t_NNNN_response.bin  Raw response body (binary-safe, may be empty)
ws/ws_NNNN.json             WebSocket connection metadata: URL, handshake ref, protocols, message count
ws/ws_NNNN_mNNN.bin         WebSocket message payload (binary-safe)
ws/ws_NNNN_mNNN.json        WebSocket message metadata: direction, opcode, timestamp, connection ref
contexts/c_NNNN.json        UI context event: action type, element info, page URL, page content snapshot
timeline.json               Ordered list of all events with timestamps and cross-references
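
The layout above can be checked mechanically. The following is a minimal sketch, not Spectral's own code: the names classify_path and summarize_entries are hypothetical, and the digit widths (four for NNNN, three for the message counter) are an assumption inferred from example IDs such as t_0001 and ws_0001_m001.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# One pattern per row of the path table. Digit widths are assumed from
# the example IDs in this page (t_0001, ws_0001_m001), not from a spec.
BUNDLE_PATTERNS = [
    ("manifest", re.compile(r"manifest\.json")),
    ("timeline", re.compile(r"timeline\.json")),
    ("trace_meta", re.compile(r"traces/t_\d{4}\.json")),
    ("trace_body", re.compile(r"traces/t_\d{4}_(request|response)\.bin")),
    ("ws_connection", re.compile(r"ws/ws_\d{4}\.json")),
    ("ws_message_meta", re.compile(r"ws/ws_\d{4}_m\d{3}\.json")),
    ("ws_message_payload", re.compile(r"ws/ws_\d{4}_m\d{3}\.bin")),
    ("context", re.compile(r"contexts/c_\d{4}\.json")),
]


def classify_path(path: str) -> Optional[str]:
    """Return the kind of bundle entry a relative path names, or None."""
    for kind, pattern in BUNDLE_PATTERNS:
        if pattern.fullmatch(path):
            return kind
    return None


def summarize_entries(paths: Iterable[str]) -> Counter:
    """Count entries by kind; unrecognized paths are counted as 'unknown'."""
    return Counter(classify_path(p) or "unknown" for p in paths)
```

For an exported ZIP, the paths could come from zipfile.ZipFile(...).namelist(), assuming the entries sit at the archive root; for a flat directory, from paths relative to the bundle directory.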

Manifest

The manifest contains session-level metadata:

Field              Type     Description
format_version     string   Bundle format version (currently 1.0.0)
capture_id         string   Unique session identifier (UUID)
created_at         string   ISO 8601 timestamp
app.name           string   Application name
app.base_url       string   Application base URL
app.title          string   Page title at capture start
browser.name       string   Browser name
browser.version    string   Browser version
extension_version  string   Extension version
capture_method     string   How the capture was produced: chrome_extension, proxy, or merged
duration_ms        integer  Capture duration in milliseconds
stats              object   Counts: trace_count, ws_connection_count, ws_message_count, context_count
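
As an illustration of how the fields above might be sanity-checked after parsing manifest.json, here is a hedged sketch. The function name manifest_problems is hypothetical, and two assumptions go beyond the table: dotted names such as app.name are taken to mean nested objects, and every listed field is treated as mandatory.

```python
# Expected JSON types per top-level field, taken from the manifest table.
# Assumption: "app" and "browser" are nested objects, and all fields
# are required; the format itself may be more lenient.
FIELD_TYPES = {
    "format_version": str,
    "capture_id": str,
    "created_at": str,
    "app": dict,
    "browser": dict,
    "extension_version": str,
    "capture_method": str,
    "duration_ms": int,
    "stats": dict,
}
CAPTURE_METHODS = {"chrome_extension", "proxy", "merged"}
STAT_KEYS = ("trace_count", "ws_connection_count", "ws_message_count", "context_count")


def manifest_problems(manifest: dict) -> list:
    """Return human-readable problems found in a parsed manifest."""
    problems = []
    for field, expected in FIELD_TYPES.items():
        if field not in manifest:
            problems.append(f"{field}: missing")
        elif not isinstance(manifest[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    method = manifest.get("capture_method")
    if isinstance(method, str) and method not in CAPTURE_METHODS:
        problems.append("capture_method: unknown value")
    stats = manifest.get("stats")
    if isinstance(stats, dict):
        for key in STAT_KEYS:
            if not isinstance(stats.get(key), int):
                problems.append(f"stats.{key}: expected int")
    return problems
```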

Trace metadata

Each trace has a stable string ID (t_NNNN) used for cross-referencing from contexts, timeline, and analysis output.

Field                   Description
id                      Trace identifier
timestamp               Epoch milliseconds
type                    Always http
request.method          HTTP method
request.url             Full URL
request.headers         Array of {name, value} objects (a list rather than a name-keyed map, because HTTP allows duplicate header names)
request.body_file       Path to the companion .bin file
request.body_size       Body size in bytes
request.body_encoding   Body encoding if applicable (e.g., base64), or null
response.status         HTTP status code
response.status_text    HTTP status text (e.g., OK, Not Found)
response.headers        Array of {name, value} objects
response.body_file      Path to the companion .bin file
response.body_size      Body size in bytes
response.body_encoding  Body encoding if applicable, or null
timing                  Breakdown: dns_ms, connect_ms, tls_ms, send_ms, wait_ms, receive_ms, total_ms
initiator               What triggered the request: type (script, parser, etc.), URL, line number
context_refs            Array of context IDs active when this trace was captured

Bodies are stored as separate binary files rather than inline JSON to avoid encoding overhead and preserve binary fidelity.
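
To make the header representation and the companion body files concrete, the sketch below reads one trace from a flat bundle directory. The helper names header_values and load_trace are hypothetical. Two assumptions: body files are located by the naming convention from the bundle structure table (rather than by resolving body_file, whose base directory is not specified here), and a missing .bin file is treated as an empty body.

```python
import json
from pathlib import Path


def header_values(headers, name):
    """Return every value for a header name, in order.

    Header names are compared case-insensitively, and duplicates such
    as repeated Set-Cookie headers are all preserved.
    """
    wanted = name.lower()
    return [h["value"] for h in headers if h["name"].lower() == wanted]


def _read_body(path):
    # Assumption: an absent companion file means an empty body.
    return path.read_bytes() if path.exists() else b""


def load_trace(bundle_dir, trace_id):
    """Load trace metadata plus raw request and response bodies."""
    traces = Path(bundle_dir) / "traces"
    meta = json.loads((traces / f"{trace_id}.json").read_text(encoding="utf-8"))
    request_body = _read_body(traces / f"{trace_id}_request.bin")
    response_body = _read_body(traces / f"{trace_id}_response.bin")
    return meta, request_body, response_body
```

Because bodies are returned as bytes, binary payloads survive unchanged; decoding is left to the caller.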

WebSocket data

WebSocket connections have a metadata file with the connection URL, handshake trace reference, negotiated protocols, message count, and context refs. Each message has its own metadata file (direction, opcode, timestamp, context refs) and a binary payload file.

Opcode  Meaning
text    Text frame (UTF-8)
binary  Binary frame
ping    Ping control frame
pong    Pong control frame
close   Connection close frame

Direction is send (client to server) or receive (server to client).
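
The opcode and direction values above suggest how a reader might interpret a message payload. This is a sketch only: decode_ws_payload and format_ws_message are hypothetical helpers, and the metadata keys "direction" and "opcode" are assumed to match the field names used in this page.

```python
OPCODES = {"text", "binary", "ping", "pong", "close"}
DIRECTIONS = {"send", "receive"}


def decode_ws_payload(opcode, payload):
    """Decode text frames as UTF-8; return all other frames as raw bytes."""
    if opcode not in OPCODES:
        raise ValueError(f"unknown opcode: {opcode!r}")
    return payload.decode("utf-8") if opcode == "text" else payload


def format_ws_message(meta, payload):
    """Render one message as a single line for quick inspection.

    meta is the parsed message metadata; the key names used here are
    assumptions, not confirmed by the format description.
    """
    direction = meta["direction"]
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    arrow = "->" if direction == "send" else "<-"  # send = client to server
    body = decode_ws_payload(meta["opcode"], payload)
    return f"{arrow} {meta['opcode']} {body!r}"
```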

UI context events

Each context event captures what the user did and the state of the page at that moment.

Action    What is recorded
click     Element details (tag, text, attributes, CSS selector, XPath), page URL, page content
input     Field identity only (not the typed value, for privacy)
submit    Form target element
scroll    Scroll position change
navigate  New URL (SPA navigation via pushState/replaceState/popstate)

Each context event also includes viewport information (width, height, scroll position).

The page content snapshot includes visible headings (up to 10), navigation links (up to 15), main text content (up to 500 characters), forms with field identifiers (up to 5), table headers (up to 5), and alerts/notifications (up to 5).

Timeline

The timeline is a flat ordered list of all events across traces, WebSocket activity, and context events. Each entry has a timestamp, an event type, and a reference to the corresponding item.

Event type  Reference
context     Context ID (e.g., c_0001)
trace       Trace ID (e.g., t_0001)
ws_open     WebSocket connection ID (e.g., ws_0001)
ws_message  WebSocket message ID (e.g., ws_0001_m001)

The flat timeline makes correlation straightforward: to find which API calls relate to a UI action, scan forward from the context event within a time window.
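
The forward scan described above might look like the following sketch. It is not Spectral's correlation logic: the function name traces_after, the 2000 ms default window, and the entry keys "timestamp", "type", and "ref" are all assumptions, since the exact JSON shape of timeline entries is not given here. The entries are assumed to be sorted by ascending timestamp.

```python
def traces_after(timeline, context_ref, window_ms=2000):
    """Return trace refs that follow a context event within window_ms.

    timeline: list of dicts with assumed keys "timestamp" (epoch ms),
    "type" (context, trace, ws_open, ws_message) and "ref".
    """
    start = None
    related = []
    for entry in timeline:
        if start is None:
            # Skip everything until the context event is found.
            if entry["type"] == "context" and entry["ref"] == context_ref:
                start = entry["timestamp"]
            continue
        if entry["timestamp"] - start > window_ms:
            break  # the timeline is ordered, so nothing later can match
        if entry["type"] == "trace":
            related.append(entry["ref"])
    return related
```

A fixed window is the simplest heuristic; requests that a click triggers after a long delay would fall outside it, so the window size is a tuning choice.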

Timestamps

The Chrome extension converts Chrome DevTools Protocol timestamps (monotonic seconds measured from an arbitrary point in the past, not from the Unix epoch) to epoch milliseconds. It computes an offset from the first event (Date.now() - chromeTimestamp * 1000) and applies that same offset to every subsequent event. The MITM proxy uses wall-clock timestamps directly.
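
The offset scheme can be sketched in a few lines. The extension itself runs in JavaScript; this Python port is illustrative only, and the class name ChromeTimestampConverter and the now_ms parameter (which lets the wall clock be injected for testing) are hypothetical.

```python
import time
from typing import Optional


class ChromeTimestampConverter:
    """Maps monotonic timestamps in seconds to epoch milliseconds."""

    def __init__(self) -> None:
        self._offset_ms: Optional[float] = None  # fixed by the first event

    def to_epoch_ms(self, chrome_timestamp: float, now_ms: Optional[float] = None) -> int:
        if self._offset_ms is None:
            # Python equivalent of Date.now() - chromeTimestamp * 1000.
            wall_clock_ms = time.time() * 1000 if now_ms is None else now_ms
            self._offset_ms = wall_clock_ms - chrome_timestamp * 1000
        # Every later event reuses the same offset, so intervals between
        # events are preserved exactly as the monotonic clock measured them.
        return round(chrome_timestamp * 1000 + self._offset_ms)
```

Computing the offset once, rather than per event, keeps relative timing consistent even if the wall clock is adjusted during a capture.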