Lesson 10
Measuring the Mesh
Every active peer link runs an instance of the Metrics Measurement Protocol, MMP. Its job is to measure what the link is doing (latency, loss, jitter, goodput, congestion) and turn those measurements into something the rest of the stack can act on. The spanning tree reads MMP's per-link cost to decide who to use as a parent. The operator reads MMP's periodic reports to understand what the mesh is doing.
MMP runs in three modes, picked by node.mmp.mode:
- full (default). SenderReport and ReceiverReport both flow. All metrics available: RTT, loss, jitter, goodput, OWD trend.
- lightweight. ReceiverReport only. Loss from counter gaps, jitter, OWD trend. No RTT.
- minimal. No reports at all. Only the spin bit and the CE congestion flag ride on regular traffic.
What MMP measures
Every FMP frame carries a monotonic counter and a session-relative timestamp in its header. MMP derives everything from those fields plus periodic report messages.
- SRTT. Smoothed round-trip time, Jacobson-style with α = 1/8. Samples come only from ReceiverReport timestamp echoes, after the receiver's dwell time has been subtracted.
- Loss. Bidirectional loss inferred from counter gaps: any missing counter between interval_start_counter and interval_end_counter is a dropped frame in that direction. Tracked as both a per-interval rate and a long-term EWMA.
- Jitter. RFC 3550 interarrival jitter in microseconds.
- Goodput. Payload bytes per second. MMP excludes its own reports so the number reflects useful traffic.
- OWD trend. One-way delay trend in microseconds per second, signed. Latency rising before loss appears is the earliest hint of queue buildup.
- ETX. Expected Transmission Count, derived from bidirectional delivery ratios. A perfect link has ETX close to 1. A link with 20% loss in each direction has ETX close to 1 / (0.8 × 0.8) ≈ 1.56.
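The SRTT smoothing and the ETX derivation above are both simple arithmetic. A minimal sketch in Python (function and variable names are illustrative, not from the FIPS codebase):

```python
ALPHA = 1 / 8  # Jacobson smoothing gain for SRTT

def update_srtt(srtt, rtt_sample):
    """EWMA of RTT samples; the first sample seeds the estimate."""
    if srtt is None:
        return rtt_sample
    return (1 - ALPHA) * srtt + ALPHA * rtt_sample

def etx(fwd_delivery_ratio, rev_delivery_ratio):
    """Expected Transmission Count from bidirectional delivery ratios."""
    return 1.0 / (fwd_delivery_ratio * rev_delivery_ratio)

# 20% loss in each direction -> ETX ~ 1.56, matching the example above.
print(round(etx(0.8, 0.8), 2))   # 1.56
```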
Report scheduling
Reports ride at an RTT-adaptive interval: clamp(2 × SRTT, 100ms, 2000ms). A cold-start interval of 500ms applies until SRTT converges. The clamp keeps chatter bounded on both ends: fast links do not flood the CPU, and slow links still produce enough samples for the EWMA to stay current.
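The schedule fits in one hypothetical helper, assuming SRTT is tracked in milliseconds and is None until the first sample arrives:

```python
def report_interval_ms(srtt_ms):
    """RTT-adaptive report interval: clamp(2 * SRTT, 100ms, 2000ms),
    with the 500ms cold-start value until SRTT has converged."""
    if srtt_ms is None:  # cold start: no SRTT estimate yet
        return 500.0
    return min(max(2 * srtt_ms, 100.0), 2000.0)

print(report_interval_ms(None))     # 500.0  (cold start)
print(report_interval_ms(2.3))      # 100.0  (fast link hits the floor)
print(report_interval_ms(800.0))    # 1600.0 (inside the clamp window)
print(report_interval_ms(1500.0))   # 2000.0 (slow link hits the ceiling)
```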
Timestamp echo with dwell compensation
SRTT comes from the timestamp_echo and dwell_time fields in ReceiverReport. The sender stamps each frame with its session-relative timestamp. The receiver stores the latest observed timestamp and, when it emits its next ReceiverReport, echoes that value along with the number of milliseconds it sat on the value before sending. The original sender subtracts the dwell time to get a clean RTT sample.
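The subtraction itself is one line. A hypothetical sketch, assuming all timestamps are session-relative milliseconds:

```python
def rtt_sample_ms(now_ms, timestamp_echo_ms, dwell_time_ms):
    """Dwell-compensated RTT: elapsed time since the echoed stamp,
    minus how long the receiver held the value before reporting."""
    return (now_ms - timestamp_echo_ms) - dwell_time_ms

# Frame stamped at t=1000ms, ReceiverReport arrives back at t=1025ms,
# receiver dwelled 15ms on the echo -> a clean 10ms RTT sample.
print(rtt_sample_ms(1025, 1000, 15))   # 10
```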
The spin bit is the other well-known RTT primitive (QUIC uses it). FIPS implements the TX-side reflection state machine, but RTT samples from the spin bit itself are discarded. In a mesh where frames are driven by separate timers (tree announces, bloom filters, MMP reports), inter-frame processing delays inflate spin-bit RTT unpredictably. Timestamp-echo is left as the sole source of SRTT.
From metrics to link cost
The spanning tree uses one scalar summary of link quality:
link_cost = ETX × (1 + SRTT_ms / 100)
effective_depth = peer.depth + link_cost
ETX covers loss. The SRTT term keeps ETX from treating a clean-but-slow link (think LoRa at
500ms RTT, 0% loss) the same as fiber. Each node picks the parent with the lowest effective
depth, with hysteresis: a new candidate has to beat the current parent by at least
parent_hysteresis (default 20%) before the parent actually switches.
That stops pointless flapping when two peers are within noise of each other.
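Putting the formula and the hysteresis rule together, a sketch of parent selection. One plausible reading of "beat by 20%" is that a challenger's effective depth must be at least 20% below the current parent's; names here are illustrative, not from the FIPS codebase:

```python
def link_cost(etx, srtt_ms):
    return etx * (1 + srtt_ms / 100)

def effective_depth(peer_depth, etx, srtt_ms):
    return peer_depth + link_cost(etx, srtt_ms)

def pick_parent(current, candidates, parent_hysteresis=0.2):
    """candidates maps peer name -> (peer_depth, etx, srtt_ms).
    The current parent is kept unless a challenger's effective
    depth beats it by at least parent_hysteresis (relative)."""
    depth = {n: effective_depth(*c) for n, c in candidates.items()}
    best = min(depth, key=depth.get)
    if current is None or best == current:
        return best
    if depth[best] < depth[current] * (1 - parent_hysteresis):
        return best          # clear winner: switch parents
    return current           # within noise: no flapping

# Fiber at depth 2 (1ms SRTT) vs long-range radio at depth 1 (499ms SRTT).
peers = {"fiber": (2, 1.00, 1.0), "radio": (1, 1.11, 499.0)}
print(pick_parent("fiber", peers))   # fiber (eff 3.01 vs 7.65)
```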
One caveat from the spec: link_cost is used only in parent selection so far. Data forwarding via find_next_hop() still ranks candidates by tree distance first. Link cost is already measured and already on the wire, but the forwarder has not yet been wired up to use it as a primary key.
Worked example: pick a parent
Consider two candidate parents, fiber versus long-range radio, with the radio one hop closer to the root. Plugging the numbers into the formula shows how ETX and SRTT interact, and why depth alone is not enough:

Fiber (peer depth 2):
- ETX 1.00
- link_cost 1.01
- eff. depth 3.01

Long-range radio (peer depth 1):
- ETX 1.11
- link_cost 6.65
- eff. depth 7.65
Fiber is already the current parent with effective depth 3.01. No switch needed.
ECN: pushing congestion back to the source
The CE flag in the FMP flags byte is hop-by-hop congestion feedback.
Transit nodes set CE on a forwarded packet when any of three conditions trip:
- Outgoing link MMP loss rate at or above node.ecn.loss_threshold (default 5%).
- Outgoing link ETX at or above node.ecn.etx_threshold (default 3.0).
- Kernel receive buffer drops detected on any local UDP socket (SO_RXQ_OVFL).
Once CE is set, it stays set for every subsequent hop. At the final destination, the IPv6
adapter marks the Traffic Class ECN bits to
CE (0b11) before handing the packet to the TUN, but only if the packet
was already ECN-capable (ECT(0) or ECT(1)). Not-ECT packets are never marked, per RFC 3168. The guest TCP stack then echoes ECE in
its ACKs, and standard TCP congestion control shrinks the window. The mesh gets end-to-end
congestion response without having to speak TCP anywhere in its own stack.
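The delivery-side marking rule can be sketched as follows; the codepoint values are the standard two-bit ECN field from RFC 3168, and the function name is hypothetical:

```python
# Two-bit ECN codepoints in the IPv6 Traffic Class (RFC 3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def egress_ecn(inner_ecn, mesh_ce_flag):
    """At the exit node: mark CE only if the mesh carried congestion
    feedback AND the inner packet was ECN-capable. Not-ECT packets
    pass through unmarked, per RFC 3168."""
    if mesh_ce_flag and inner_ecn in (ECT0, ECT1):
        return CE
    return inner_ecn

print(egress_ecn(ECT0, True) == CE)      # True: capable and congested
print(egress_ecn(NOT_ECT, True) == CE)   # False: never mark Not-ECT
```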
What ends up in the operator log
With default settings MMP emits one info-level line per link every 30 seconds:
MMP link metrics peer=node-b rtt=2.3ms loss=0.2% jitter=0.1ms goodput=76.0MB/s tx_pkts=1234 rx_pkts=5678
Teardown emits a final summary with SRTT, loss rate, jitter, ETX, goodput, and the
cumulative tx and rx counters. Between those two, everything else is available live from
fipsctl show routing.
Lesson quiz
1. In full mode, what two MMP messages does each peer link exchange?
2. What does MMP actually use to compute SRTT?
3. Why does FIPS implement the spin bit's reflection state machine but discard its RTT samples?
4. A peer has ETX = 2.0 and SRTT = 50ms. What is its link_cost?
5. What does parent_hysteresis (default 0.2) protect against?
6. When a transit node sets the CE flag, what eventually makes a guest TCP stack slow down?