Introduction
patchbay builds realistic network topologies out of Linux network namespaces
and lets you run real code against them. You describe routers, devices, NAT
policies, firewalls, and link conditions through a Rust builder API. The
library creates a namespace per node, wires them with veth pairs, installs
nftables rules for NAT and firewalling, and applies tc netem shaping for
loss, latency, jitter, and rate limits. Each device gets its own kernel
network stack, so code running inside a namespace sees exactly what it would
see on a separate machine. Everything runs unprivileged and cleans up when
the Lab is dropped.
How this book is organized
The Guide section walks through patchbay’s concepts in the order you are likely to need them. It starts with the motivation behind the project and progresses through setting up a lab, building topologies, configuring NAT and firewalls, running code inside namespaces, and running labs in a QEMU VM on non-Linux hosts. Each chapter builds on the previous one and includes runnable examples.
The Reference section covers specialized topics in depth. It documents real-world IPv6 deployment patterns and how to simulate them, recipes for common network scenarios like WiFi handoff and VPN tunnels, the internals of NAT traversal and hole-punching as implemented in nftables, and the TOML simulation file format used by the patchbay runner.
The Limitations page documents known boundaries of the current model. Read it before relying on packet-level control-plane behavior, OS-specific network-stack quirks, or low-level timing fidelity.
A built-in devtools server (patchbay serve) provides an interactive web
UI for inspecting lab runs: topology graphs, event timelines,
per-namespace structured logs, and performance results. Set
PATCHBAY_OUTDIR when running tests or simulations to capture output,
then serve it in the browser.
Limitations
patchbay models real Linux networking with high fidelity, but it has boundaries. Understanding them helps you decide when patchbay is a good fit and where to expect differences from production systems.
IPv6 limitations
RA and RS are modeled, not packet-emulated
In Ipv6ProvisioningMode::RaDriven, patchbay models Router
Advertisement (RA) and Router Solicitation (RS) behavior through route
updates and structured tracing events. It does not send raw ICMPv6 RA
or RS packets on virtual links. Application-level routing behavior is
close to production, but packet-capture workflows that expect real RA/RS
frames will not see them.
SLAAC behavior is partial
patchbay models default-route and address behavior needed for routing tests, but it does not implement a full Stateless Address Autoconfiguration (SLAAC) state machine with all timing transitions. Connectivity and route-selection tests work well. Detailed host autoconfiguration timing studies are out of scope.
Neighbor Discovery timing is not fully emulated
Neighbor Discovery (ND) address and router behavior is represented in route and interface state, but exact kernel-level timing of ND probes, retries, and expiration is not emulated. Most application tests are unaffected. Low-level protocol timing analysis should use a dedicated packet-level setup.
DHCPv6 prefix delegation is not implemented
patchbay does not implement a DHCPv6 Prefix Delegation server or client flow. Use static /64 allocation in topologies instead. Prefix-based routing and NAT64 scenarios work with static setup, but residential-prefix churn workflows are not represented.
General platform and model limitations
Linux-only execution model
patchbay uses Linux network namespaces, nftables, and tc — it requires a Linux kernel. macOS and Windows host stacks are not emulated. For non-Linux development machines, patchbay-vm wraps simulations in a QEMU Linux VM.
Requires kernel features and host tooling
patchbay depends on unprivileged user namespaces and the nft and tc
userspace tools. If these capabilities are unavailable or restricted —
as in some CI containers or hardened environments — labs cannot run. See
Getting Started for the kernel sysctl
settings that may need adjustment.
No wireless or cellular radio-layer simulation
patchbay models link effects with tc parameters: latency, jitter,
loss, and rate limits. It does not model WiFi or cellular PHY/MAC
behavior such as radio scheduling, channel contention, or handover
signaling. The link condition presets (Wifi, Mobile4G, etc.) apply
realistic impairment at the IP layer, which is sufficient for transport
and application resilience testing but not for radio-layer research.
Dynamic routing protocols are not built in
patchbay focuses on static topology wiring, NAT, firewalling, and route management through its API. It does not include built-in BGP, OSPF, or RIP control-plane implementations. You can run routing daemons inside namespaces yourself — the namespaces are real Linux network stacks — but protocol orchestration is user-managed, not first-class.
Time and clock behavior are not virtualized
patchbay uses the host kernel clock and scheduler. It does not virtualize per-node clocks or provide deterministic virtual time. Most integration tests work as expected, but time-sensitive distributed-system tests that depend on precise clock relationships between nodes may need additional controls.
Motivation and Scope
The problem
Networking code is notoriously hard to test. Unit tests can verify serialization and state machines, but they cannot tell you whether your connection logic survives a home NAT, whether your hole-punching strategy works through carrier-grade NAT, or whether your reconnect path handles a WiFi-to-cellular handoff without dropping state. Those questions require actual network stacks with actual packet processing, and the only way most teams answer them today is by deploying to staging and hoping for the best.
Tools like Docker Compose, Mininet, and custom iptables scripts can help,
but each comes with trade-offs around privilege requirements, cleanup
reliability, and how easily you can parameterize topologies from a test
harness. patchbay was built to make this kind of testing ergonomic for Rust
projects: no root, no manual cleanup, and a builder API that fits naturally into
#[tokio::test] functions.
What patchbay does
patchbay builds realistic network topologies out of Linux network namespaces and lets you run real code against them. You describe routers, devices, NAT policies, firewalls, and link conditions through a Rust builder API. The library creates a namespace per node, wires them together with veth pairs, installs nftables rules for NAT and firewalling, and applies tc netem/tbf shaping for loss, latency, jitter, and rate limits. Each device gets its own kernel network stack, so code running inside a namespace sees exactly what it would see on a separate machine.
Everything runs unprivileged. The library enters an unprivileged user
namespace at startup, so no root access is needed at any point. When the
Lab value is dropped, all namespaces, interfaces, and rules disappear
automatically.
Where it fits
patchbay is a testing and development tool, designed for three primary use cases:
Integration tests. Write #[tokio::test] functions that build a
topology, run your networking code inside it, and assert on outcomes. Each
test gets an isolated lab with its own address space, so tests can run in
parallel without interfering with each other or with the host.
Performance and regression testing. Apply link conditions to simulate constrained networks (3G, satellite, lossy WiFi) and measure throughput, latency, or reconnection time under controlled impairment. Because tc netem operates at the kernel level, the shaping is realistic enough for comparative benchmarks, though absolute numbers will differ from hardware links due to scheduling overhead and the absence of real radio or cable physics.
Interactive experimentation. Build a topology in a binary or script, attach to device namespaces with shell commands, and observe how traffic flows. This is useful for understanding NAT behavior, debugging connectivity issues, or validating protocol assumptions before writing tests.
patchbay operates at the kernel namespace level with real TCP/IP stacks, not at the packet simulation level. This means the fidelity is high (you are testing against real Linux networking), but the scale is limited to what a single machine can support (typically dozens of namespaces, not thousands).
Getting Started
This chapter walks through building your first patchbay lab: a home router with NAT, a datacenter router, and two devices that communicate across them. By the end you will have a working topology with a ping traversing a NAT and an async TCP exchange between two isolated network stacks.
System requirements
patchbay needs a Linux environment. A bare-metal machine, a VM, or a CI container all work. You need two userspace tools in your PATH:
- tc from the iproute2 package, used for link condition shaping.
- nft from the nftables package, used for NAT and firewall rules.
You also need unprivileged user namespaces, which are enabled by default on most distributions. You can verify this with:
sysctl kernel.unprivileged_userns_clone
If the value is 0, enable it with sudo sysctl -w kernel.unprivileged_userns_clone=1.
On Ubuntu 24.04 and later, AppArmor restricts unprivileged user namespaces
separately:
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
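Note that sysctl -w changes do not survive a reboot. To make both settings persistent, a standard sysctl.d drop-in works (the file name here is arbitrary, and the second key only exists on AppArmor-enabled distributions):

```shell
# Persist the user-namespace sysctls across reboots.
printf '%s\n' \
  'kernel.unprivileged_userns_clone=1' \
  'kernel.apparmor_restrict_unprivileged_userns=0' \
  | sudo tee /etc/sysctl.d/99-patchbay.conf

# Reload all sysctl configuration files.
sudo sysctl --system
```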
No root access is needed at runtime. patchbay enters an unprivileged user namespace at startup that grants it the capabilities needed to create network namespaces, veth pairs, and nftables rules.
Adding patchbay to your project
Add patchbay and its runtime dependencies to your Cargo.toml. You need
tokio with at least the rt and macros features, since patchbay is async
internally:
[dependencies]
patchbay = "0.1"
tokio = { version = "1", features = ["rt", "macros"] }
anyhow = "1"
Entering the user namespace
Before any threads are spawned, your program must call init_userns() to
enter the unprivileged user namespace. This has to happen before tokio
starts its thread pool, because unshare(2) only works in a
single-threaded process. The standard pattern splits main into a sync
entry point and an async body:
fn main() -> anyhow::Result<()> {
patchbay::init_userns().expect("failed to enter user namespace");
async_main()
}
#[tokio::main]
async fn async_main() -> anyhow::Result<()> {
// All lab code goes here.
Ok(())
}
If you skip this call, Lab::new() will fail because the process lacks
the network namespace capabilities it needs.
In integration tests, you can avoid the main / async_main split by
using a #[ctor] initializer that runs before any test thread is spawned:
#![allow(unused)]
fn main() {
#[cfg(test)]
#[ctor::ctor]
fn init() {
patchbay::init_userns().expect("failed to enter user namespace");
}
#[tokio::test]
async fn my_test() -> anyhow::Result<()> {
let lab = patchbay::Lab::new().await?;
// ...
Ok(())
}
}
The ctor crate runs the function at load time, before main or the
test harness starts. This keeps your test functions clean and avoids
repeating the namespace setup in every binary.
Creating a lab
A Lab is the top-level container for a topology. When you create one, it
sets up a root network namespace with an internet exchange (IX) bridge.
Every top-level router connects to this bridge, which provides the
backbone for inter-router connectivity.
#![allow(unused)]
fn main() {
let lab = patchbay::Lab::new().await?;
}
Adding routers and devices
Routers connect to the IX bridge and provide network access to downstream
devices. A router without any NAT configuration gives its devices public
IP addresses, like a datacenter. Adding .nat(Nat::Home) places a NAT in
front of the router’s downstream, assigning devices private addresses and
masquerading their traffic, like a typical home WiFi router.
#![allow(unused)]
fn main() {
use patchbay::{Nat, LinkCondition};
// A datacenter router whose devices get public IPs.
let dc = lab.add_router("dc").build().await?;
// A home router whose devices sit behind NAT.
let home = lab.add_router("home").nat(Nat::Home).build().await?;
}
Devices attach to routers through named network interfaces. Each interface is a veth pair connecting the device’s namespace to the router’s namespace. You can optionally apply a link condition to the interface to simulate real-world impairment like packet loss, latency, and jitter.
#![allow(unused)]
fn main() {
// A server in the datacenter, with a clean link.
let server = lab
.add_device("server")
.iface("eth0", dc.id(), None)
.build()
.await?;
// A laptop behind the home router, over a lossy WiFi link.
let laptop = lab
.add_device("laptop")
.iface("eth0", home.id(), Some(LinkCondition::Wifi))
.build()
.await?;
}
At this point you have five network namespaces — the IX root, two routers (dc and home), and two devices (server and laptop) — wired together with veth pairs. The laptop has a private IP behind the home router’s NAT, and the server has a public IP on the datacenter router’s subnet.
Running a ping across the NAT
Every device handle can spawn OS commands inside its network namespace. To verify connectivity, ping the server from the laptop:
#![allow(unused)]
fn main() {
let mut child = laptop.spawn_command_sync({
let mut cmd = std::process::Command::new("ping");
cmd.args(["-c1", &server.ip().unwrap().to_string()]);
cmd
})?;
let status = tokio::task::spawn_blocking(move || child.wait()).await??;
assert!(status.success());
}
The ICMP echo request travels from the laptop’s namespace through the home router, where nftables masquerade translates the source address. The packet then crosses the IX bridge, enters the datacenter router’s namespace, and arrives at the server. The reply follows the reverse path. All of this happens in real kernel network stacks, fully isolated from your host.
Running async code in a namespace
For anything beyond shell commands, you will want to run async Rust code
inside a namespace. The spawn method runs an async closure on the
device’s single-threaded tokio runtime, giving you access to the full
tokio networking stack (TCP, UDP, listeners, timeouts) within that
namespace’s isolated network:
#![allow(unused)]
fn main() {
use std::net::SocketAddr;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
let addr = SocketAddr::from((server.ip().unwrap(), 8080));
// Start a TCP listener on the server.
let server_task = server.spawn(async move |_dev| {
let listener = tokio::net::TcpListener::bind(addr).await?;
let (mut stream, _peer) = listener.accept().await?;
let mut buf = vec![0u8; 64];
let n = stream.read(&mut buf).await?;
assert_eq!(&buf[..n], b"hello");
anyhow::Ok(())
})?;
// Connect from the laptop. Traffic is NATed through the home router.
let client_task = laptop.spawn(async move |_dev| {
let mut stream = tokio::net::TcpStream::connect(addr).await?;
stream.write_all(b"hello").await?;
anyhow::Ok(())
})?;
client_task.await??;
server_task.await??;
}
Both tasks run in separate network namespaces with completely isolated stacks. The tokio primitives behave exactly as they would in a normal application, but all traffic flows through the simulated topology. The Running Code in Namespaces chapter covers all execution methods in detail.
Cleanup
When the Lab goes out of scope, it shuts down all namespace workers and
closes the namespace file descriptors. The kernel automatically removes
veth pairs, routes, and nftables rules when the last reference to a
namespace disappears. No cleanup code is needed and no leftover state
pollutes the host.
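Because teardown is tied to Drop, you can scope a Lab to make its lifetime explicit. A minimal sketch (the inner block is where topology building and assertions would go):

```rust
async fn scoped_lab() -> anyhow::Result<()> {
    {
        let _lab = patchbay::Lab::new().await?;
        // Build the topology and run assertions here.
    } // Lab dropped: namespaces, veth pairs, routes, and nftables rules vanish.

    // The host network stack is untouched at this point.
    Ok(())
}
```

Calling drop(lab) explicitly works the same way if you want teardown before the end of a scope.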
Viewing results in the browser
patchbay can write structured output to disk, including topology events,
per-namespace tracing logs, and extracted custom events, and serve them
in an interactive web UI. Set the PATCHBAY_OUTDIR environment variable
to enable this:
PATCHBAY_OUTDIR=/tmp/pb cargo test my_test
Each Lab creates a timestamped subdirectory under the outdir. You can
optionally label it for easier identification:
#![allow(unused)]
fn main() {
let lab = Lab::with_opts(LabOpts::default().label("my-test")).await?;
}
After the test completes, serve the output directory:
patchbay serve /tmp/pb --open
This opens the devtools UI in your browser with tabs for topology, events, logs, timeline, and performance results. Multiple runs accumulate in the same outdir and appear in the run selector dropdown.
You can also emit custom events to the timeline using the _events::
tracing target convention:
#![allow(unused)]
fn main() {
tracing::info!(target: "myapp::_events::PeerConnected", addr = %peer_addr);
}
The per-namespace tracing subscriber extracts these into .events.jsonl
files, which the timeline tab renders automatically.
The next chapters — Building Topologies, NAT and Firewalls, and Running Code in Namespaces — cover routers, devices, regions, link conditions, all NAT and firewall modes, the execution model, and dynamic topology operations in depth.
Building Topologies
A patchbay topology is built from three kinds of objects: routers that provide network connectivity, devices that run your code, and regions that introduce latency between groups of routers. This chapter explains how to compose them into realistic network layouts.
Routers
Every router connects to the lab’s internet exchange (IX) bridge and receives a public IP address on that link. Downstream devices connect to the router through veth pairs and receive addresses from the router’s address pool. The simplest router has no NAT and no firewall — devices behind it get public IPs, like a datacenter switch:
#![allow(unused)]
fn main() {
let dc = lab.add_router("dc").build().await?;
}
To model different real-world environments, you configure NAT, firewalls, IP support, and address pools on the router builder. The NAT and Firewalls chapter covers those options in detail.
Chaining routers
Routers can be chained behind other routers using the .upstream() method.
Instead of connecting directly to the IX, the downstream router receives
its address from the parent router’s pool. This is how you build
multi-layer topologies like ISP + home or corporate gateway + branch
office:
#![allow(unused)]
fn main() {
let isp = lab.add_router("isp").nat(Nat::Cgnat).build().await?;
let home = lab
.add_router("home")
.upstream(isp.id())
.nat(Nat::Home)
.build()
.await?;
}
In this example, the home router sits behind the ISP. Devices behind
home are double-NATed: their traffic passes through home NAT first, then
through carrier-grade NAT at the ISP. This is a common topology for
testing P2P connectivity where both peers sit behind multiple layers of
NAT.
Router presets
For common deployment patterns, RouterPreset configures NAT, firewall,
IP support, and address pool in a single call. This avoids repeating the
same combinations across tests:
#![allow(unused)]
fn main() {
use patchbay::RouterPreset;
let home = lab.add_router("home").preset(RouterPreset::Home).build().await?;
let dc = lab.add_router("dc").preset(RouterPreset::Public).build().await?;
let corp = lab.add_router("corp").preset(RouterPreset::Corporate).build().await?;
}
The following table lists all available presets. Each row shows the NAT mode, firewall policy, IP address family, and downstream address pool that the preset configures:
| Preset | NAT | Firewall | IP support | Pool |
|---|---|---|---|---|
| Home | Home (EIM+APDF) | BlockInbound | DualStack | Private |
| Public | None | None | DualStack | Public |
| PublicV4 | None | None | V4Only | Public |
| IspCgnat | Cgnat (EIM+EIF) | None | DualStack | Private |
| IspV6 | None (v4) / Nat64 (v6) | BlockInbound | V6Only | Public |
| Corporate | Corporate (sym) | Corporate | DualStack | Private |
| Hotel | Corporate (sym) | CaptivePortal | V4Only | Private |
| Cloud | CloudNat (sym) | None | DualStack | Private |
Methods called after .preset() override the preset’s defaults, so you
can use a preset as a starting point and customize individual settings.
For example, RouterPreset::Home with .nat(Nat::FullCone) gives you a
home-style topology with fullcone NAT instead of the default
endpoint-dependent filtering.
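That override looks like this in the builder (the router name is illustrative, and lab is assumed to be in scope as in the earlier examples):

```rust
use patchbay::{Nat, RouterPreset};

// Start from the Home preset, then swap in fullcone NAT.
// Builder calls made after .preset() win over the preset's defaults.
let gaming = lab
    .add_router("gaming")
    .preset(RouterPreset::Home)
    .nat(Nat::FullCone)
    .build()
    .await?;
```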
Address families
By default, routers run dual-stack (both IPv4 and IPv6). You can restrict
a router to a single address family with .ip_support():
#![allow(unused)]
fn main() {
use patchbay::IpSupport;
let v6_only = lab.add_router("carrier")
.ip_support(IpSupport::V6Only)
.build().await?;
}
The three options are V4Only, V6Only, and DualStack. Devices behind
a V6Only router will only receive IPv6 addresses. If the router also has
NAT64 enabled, those devices can still reach IPv4 destinations through the
NAT64 prefix; see the NAT and Firewalls chapter
for details.
Devices
Devices are the endpoints where your code runs. Each device gets its own network namespace with one or more interfaces, each connected to a router. IP addresses are assigned automatically from the router’s pool.
#![allow(unused)]
fn main() {
let server = lab
.add_device("server")
.iface("eth0", dc.id(), None)
.build()
.await?;
}
You can read a device’s assigned addresses through the handle:
#![allow(unused)]
fn main() {
let v4: Option<Ipv4Addr> = server.ip();
let v6: Option<Ipv6Addr> = server.ip6();
let ll: Option<Ipv6Addr> = server.default_iface().and_then(|i| i.ll6());
}
For router-side address inspection, use router.interfaces() or
router.iface("ix") / router.iface("wan") and read ip6() plus ll6()
from RouterIface.
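Putting those accessors together (a sketch: the method names come from the paragraph above, but the Option-returning shapes are assumed to mirror the device-side API):

```rust
use std::net::Ipv6Addr;

// Inspect the router's IX-facing interface. `home` is a router handle
// from earlier examples; return types here are assumptions.
if let Some(ix) = home.iface("ix") {
    let gua: Option<Ipv6Addr> = ix.ip6(); // global unicast, if assigned
    let ll: Option<Ipv6Addr> = ix.ll6();  // link-local address
    println!("ix: gua={gua:?} ll={ll:?}");
}
```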
Multi-homed devices
A device can have multiple interfaces, each connected to a different router. This models machines with both WiFi and Ethernet, phones with WiFi and cellular, or VPN scenarios where a tunnel interface coexists with the physical link:
#![allow(unused)]
fn main() {
let phone = lab
.add_device("phone")
.iface("wlan0", home.id(), Some(LinkCondition::Wifi))
.iface("cell0", carrier.id(), Some(LinkCondition::Mobile4G))
.default_via("wlan0")
.build()
.await?;
}
The .default_via("wlan0") call sets which interface carries the default
route. At runtime, you can switch the default route to a different
interface to simulate a handoff:
#![allow(unused)]
fn main() {
phone.set_default_route("cell0").await?;
}
Link conditions
Link conditions simulate real-world network impairment. Under the hood,
patchbay uses tc netem for loss, latency, and jitter, and tc tbf for
rate limiting. You can apply conditions at build time through interface
presets, through custom parameters, or dynamically at runtime.
Presets
The built-in presets model common access technologies:
| Preset | Loss | Latency | Jitter | Rate |
|---|---|---|---|---|
| Wifi | 2% | 5 ms | 1 ms | 54 Mbit/s |
| Mobile4G | 1% | 30 ms | 10 ms | 50 Mbit/s |
| Mobile3G | 3% | 100 ms | 30 ms | 2 Mbit/s |
| Satellite | 0.5% | 600 ms | 50 ms | 10 Mbit/s |
Apply a preset when building the interface:
#![allow(unused)]
fn main() {
let dev = lab.add_device("laptop")
.iface("eth0", home.id(), Some(LinkCondition::Wifi))
.build().await?;
}
Custom parameters
When the presets do not match your scenario, build a LinkLimits struct
directly:
#![allow(unused)]
fn main() {
use patchbay::{LinkCondition, LinkLimits};
let degraded = LinkCondition::Manual(LinkLimits {
rate_kbit: 1000, // 1 Mbit/s
loss_pct: 10.0, // 10% packet loss
latency_ms: 50, // 50 ms one-way delay
jitter_ms: 20, // 20 ms jitter
..Default::default()
});
let dev = lab.add_device("laptop")
.iface("eth0", home.id(), Some(degraded))
.build().await?;
}
Runtime changes
You can change or remove link conditions at any point after the topology is built. This is useful for simulating network degradation during a test, for example switching from WiFi to a congested 3G link and verifying that your application adapts:
#![allow(unused)]
fn main() {
dev.set_link_condition("eth0", Some(LinkCondition::Mobile3G)).await?;
// Later, restore a clean link.
dev.set_link_condition("eth0", None).await?;
}
Regions
Regions model geographic distance between groups of routers. When you
assign routers to different regions and link those regions, traffic between
them passes through per-region router namespaces that apply configurable
latency via tc netem. This gives you realistic cross-continent delays on
top of any per-link conditions.
#![allow(unused)]
fn main() {
let eu = lab.add_region("eu").await?;
let us = lab.add_region("us").await?;
lab.link_regions(&eu, &us, RegionLink::good(80)).await?;
let dc_eu = lab.add_router("dc-eu").region(&eu).build().await?;
let dc_us = lab.add_router("dc-us").region(&us).build().await?;
}
In this topology, traffic between dc_eu and dc_us carries 80 ms of
added round-trip latency. Routers within the same region communicate
without the region penalty.
Fault injection with region links
You can break and restore region links at runtime to simulate network partitions. This is valuable for testing how your application handles split-brain scenarios, failover logic, and reconnection:
#![allow(unused)]
fn main() {
lab.break_region_link(&eu, &us).await?;
// All traffic between EU and US routers is now blackholed.
// ... run your partition test ...
lab.restore_region_link(&eu, &us).await?;
// Connectivity is restored.
}
The break is immediate: packets in flight are dropped, and no new packets can cross the link until it is restored.
NAT and Firewalls
patchbay implements NAT and firewalls using nftables rules injected into router namespaces. Because these are real kernel-level packet processing rules, they behave identically to their counterparts on physical hardware. This chapter covers all available NAT modes, firewall presets, custom configurations, and runtime mutation.
IPv4 NAT
NAT controls how a router translates addresses for traffic flowing between its downstream (private) and upstream (public) interfaces. Two independent properties define how a NAT behaves, both specified by RFC 4787.
Mapping determines how the router assigns external ports. With endpoint-independent mapping, the router reuses the same external port for all destinations. A device that binds port 40000 and sends to a STUN server gets mapped to, say, external port 40000. When it then sends to a different host, the mapping stays the same. This is what makes UDP hole-punching possible: a peer learns the mapped address via STUN, shares it with another peer, and the mapping holds regardless of who sends to it. With endpoint-dependent mapping, each new destination gets a different external port, so the address learned from STUN is useless for other peers.
Filtering determines which inbound packets the router forwards. Endpoint-independent filtering (fullcone) accepts packets from any external host, as long as a mapping exists. Endpoint-dependent filtering only forwards packets from hosts the internal device has already contacted — unsolicited packets from unknown hosts are dropped even if the port is mapped.
You configure NAT on the router builder with .nat():
#![allow(unused)]
fn main() {
use patchbay::Nat;
let home = lab.add_router("home").nat(Nat::Home).build().await?;
}
Each preset combines a mapping and filtering mode to match a real-world device class:
| Mode | Mapping | Filtering | Real-world model |
|---|---|---|---|
| None | n/a | n/a | Datacenter, public IPs |
| Home | Endpoint-independent | Endpoint-dependent | Home WiFi router |
| Corporate | Endpoint-independent | Endpoint-dependent | Enterprise gateway |
| FullCone | Endpoint-independent | Endpoint-independent | Gaming router, fullcone VPN |
| CloudNat | Endpoint-dependent | Endpoint-dependent | AWS/GCP cloud NAT |
| Cgnat | Endpoint-dependent | Endpoint-dependent | Carrier-grade NAT at the ISP |
For a deep dive into how these modes are implemented in nftables and how hole-punching works across them, see the NAT Hole-Punching reference.
Custom NAT configurations
When the presets do not match your scenario, you can build a NatConfig
directly and choose the mapping, filtering, and timeout behavior
independently:
#![allow(unused)]
fn main() {
use patchbay::nat::{NatConfig, NatMapping, NatFiltering};
let custom = Nat::Custom(NatConfig {
mapping: NatMapping::EndpointIndependent,
filtering: NatFiltering::EndpointIndependent,
..Default::default()
});
let router = lab.add_router("custom").nat(custom).build().await?;
}
Changing NAT at runtime
You can switch a router’s NAT mode after the topology is built. This is
useful for testing how your application reacts when the NAT environment
changes mid-session, for example simulating a network migration. Call
flush_nat_state() afterward to clear stale conntrack entries so that
new connections use the updated rules:
#![allow(unused)]
fn main() {
router.set_nat_mode(Nat::Corporate).await?;
router.flush_nat_state().await?;
}
IPv6 NAT
IPv6 NAT is configured separately from IPv4 using .nat_v6(). In most
real-world deployments, IPv6 does not use NAT at all: devices receive
globally routable addresses and a stateful firewall handles inbound
filtering. patchbay defaults to this behavior. For the scenarios where
IPv6 NAT does exist in practice, four modes are available:
#![allow(unused)]
fn main() {
use patchbay::NatV6Mode;
let router = lab.add_router("r")
.ip_support(IpSupport::DualStack)
.nat_v6(NatV6Mode::Nptv6)
.build().await?;
}
| Mode | Description |
|---|---|
| None | No IPv6 NAT. Devices get globally routable addresses. This is the default and the most common real-world configuration. |
| Nat64 | Stateless IP/ICMP Translation (RFC 6145). Allows IPv6-only devices to reach IPv4 hosts through the well-known prefix 64:ff9b::/96. The most important v6 NAT mode in practice; used by major mobile carriers. |
| Nptv6 | Network Prefix Translation (RFC 6296). Performs stateless 1:1 prefix mapping at the border, preserving end-to-end connectivity while hiding internal prefixes. |
| Masquerade | IPv6 masquerade, analogous to IPv4 NAPT. Rare in production but useful for testing applications that must handle v6 address rewriting. |
NAT64
NAT64 is the mechanism that lets IPv6-only mobile networks (T-Mobile US,
Jio, NTT Docomo) provide IPv4 connectivity. The router runs a userspace
SIIT translator that rewrites packet headers between IPv6 and IPv4.
When an IPv6-only device sends a packet to an address in the 64:ff9b::/96
prefix, the translator extracts the embedded IPv4 address, rewrites the
headers, and forwards the packet as IPv4. Return traffic is translated
back to IPv6.
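The prefix embedding itself is plain RFC 6052 arithmetic: the four IPv4 octets become the low 32 bits of the IPv6 address. A self-contained sketch using only the standard library (independent of patchbay's embed_v4_in_nat64 helper):

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Embed an IPv4 address in the NAT64 well-known prefix 64:ff9b::/96.
fn embed(v4: Ipv4Addr) -> Ipv6Addr {
    let [a, b, c, d] = v4.octets();
    Ipv6Addr::new(
        0x64, 0xff9b, 0, 0, 0, 0,
        u16::from_be_bytes([a, b]), // high 16 bits of the v4 address
        u16::from_be_bytes([c, d]), // low 16 bits
    )
}

fn main() {
    let v6 = embed(Ipv4Addr::new(203, 0, 113, 10));
    // 203 = 0xcb, 0 = 0x00, 113 = 0x71, 10 = 0x0a
    assert_eq!(v6.to_string(), "64:ff9b::cb00:710a");
}
```

Extracting the IPv4 address on the return path is the inverse: read the last four octets of the IPv6 address.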
You can configure NAT64 explicitly or use the IspV6 preset, which
sets up a V6Only router with NAT64 and an inbound firewall, matching the
configuration of a typical mobile carrier gateway:
#![allow(unused)]
fn main() {
use patchbay::{IpSupport, NatV6Mode, Nat, RouterPreset};
// Explicit configuration:
let carrier = lab
.add_router("carrier")
.ip_support(IpSupport::V6Only)
.nat_v6(NatV6Mode::Nat64)
.firewall(Firewall::BlockInbound)
.build()
.await?;
// Or equivalently, using the preset:
let carrier = lab
.add_router("carrier")
.preset(RouterPreset::IspV6)
.build()
.await?;
}
To reach an IPv4 server from an IPv6-only device, embed the server’s IPv4
address in the NAT64 prefix using the embed_v4_in_nat64 helper:
#![allow(unused)]
fn main() {
use patchbay::nat64::embed_v4_in_nat64;
let server_v4: Ipv4Addr = dc.uplink_ip().unwrap();
let nat64_addr = embed_v4_in_nat64(server_v4);
// nat64_addr is 64:ff9b::<v4 octets>, e.g. 64:ff9b::cb00:710a
let target = SocketAddr::new(IpAddr::V6(nat64_addr), 8080);
// Connecting to this address goes through the NAT64 translator.
}
The IPv6 Deployments reference covers how real carriers deploy NAT64 and how to simulate each scenario in patchbay.
Firewalls
Firewall presets control which traffic a router allows in each direction. They are independent of NAT: a router can have a firewall without NAT (common for datacenter servers behind a stateful firewall), NAT without a firewall, or both.
#![allow(unused)]
fn main() {
use patchbay::Firewall;
let corp = lab.add_router("corp")
.firewall(Firewall::Corporate)
.build().await?;
}
The following presets are available:
| Preset | Inbound policy | Outbound policy |
|---|---|---|
| None | All traffic allowed | All traffic allowed |
| BlockInbound | Block unsolicited connections (RFC 6092 CE router behavior) | All traffic allowed |
| Corporate | Block unsolicited connections | Allow only TCP 80, 443 and UDP 53 |
| CaptivePortal | Block unsolicited connections | Allow only TCP 80, 443 and UDP 53; block all other UDP |
The Corporate and CaptivePortal presets are particularly useful for
testing P2P applications: corporate firewalls block STUN and direct UDP,
forcing applications to fall back to TURN relaying over TLS on port 443.
Captive portal firewalls additionally kill QUIC by blocking all
non-DNS UDP.
Custom firewall rules
When the presets do not match your test scenario, build a FirewallConfig
directly:
#![allow(unused)]
fn main() {
use patchbay::firewall::FirewallConfig;
let config = FirewallConfig::builder()
.block_inbound(true)
.allow_tcp_ports(&[80, 443, 8080])
.allow_udp_ports(&[53, 443])
.build();
let router = lab.add_router("strict")
.firewall(Firewall::Custom(config))
.build().await?;
}
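Under the hood these options become nftables rules in the router's namespace. A hand-written ruleset of roughly the shape this config implies (illustrative only, not patchbay's actual generated output) looks like:

```nft
table inet firewall {
    chain forward {
        type filter hook forward priority 0; policy drop;

        # Replies to established sessions always pass.
        ct state established,related accept

        # Outbound from the LAN side: only the allowed ports.
        iifname "lan*" tcp dport { 80, 443, 8080 } accept
        iifname "lan*" udp dport { 53, 443 } accept

        # Everything else, including unsolicited inbound, hits the drop policy.
    }
}
```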
Composing NAT and firewalls
NAT and firewalls are orthogonal. A router can have any combination of the two, and they operate at different points in the nftables pipeline. Some typical compositions:
#![allow(unused)]
fn main() {
// Home router: NAT + inbound firewall. The most common residential setup.
let home = lab.add_router("home")
.nat(Nat::Home)
.firewall(Firewall::BlockInbound)
.build().await?;
// Datacenter with strict outbound rules but no NAT.
let dc = lab.add_router("dc")
.firewall(Firewall::Corporate)
.build().await?;
// Double NAT: ISP carrier-grade NAT in front of a home router.
let isp = lab.add_router("isp").nat(Nat::Cgnat).build().await?;
let home = lab.add_router("home")
.upstream(isp.id())
.nat(Nat::Home)
.build().await?;
}
Router presets set both NAT and firewall to sensible defaults for each
deployment pattern. Calling individual methods after .preset() overrides
the preset’s defaults, so you can start from a known configuration and
adjust only what your test needs.
Running Code in Namespaces
Every node in a patchbay topology, whether it is a device, a router, or
the IX itself, has its own Linux network namespace. Each namespace comes
with two workers: an async worker backed by a single-threaded tokio
runtime, and a sync worker backed by a dedicated OS thread. You never
interact with setns directly; the workers enter the correct namespace
before executing your code.
This chapter describes all the execution methods available on node handles, when to use each one, and how to modify the topology at runtime.
Async tasks
The spawn method is the primary way to run networking code inside a
namespace. It takes an async closure, dispatches it to the namespace’s
tokio runtime, and returns a join handle that resolves when the task
completes:
#![allow(unused)]
fn main() {
use tokio::io::AsyncReadExt;
let handle = dev.spawn(async move |_dev| {
let mut stream = tokio::net::TcpStream::connect("203.0.113.10:80").await?;
let mut buf = vec![0u8; 1024];
let n = stream.read(&mut buf).await?;
anyhow::Ok(n)
})?;
let bytes_read = handle.await??;
}
The closure receives a clone of the device handle, which you can use to
query addresses or spawn further tasks. All tokio networking primitives
work inside spawn: TcpStream, TcpListener, UdpSocket, timeouts,
intervals, and anything built on top of them. Because the runtime is
single-threaded and pinned to the namespace, all socket operations happen
against the namespace’s isolated network stack.
You should use spawn for any work that involves network I/O. The
alternative, blocking I/O in a sync context, will stall the worker thread
and can cause kernel-level timeouts for TCP (SYN retransmit takes roughly
127 seconds to exhaust). Always prefer async networking via spawn.
Sync closures
The run_sync method dispatches a closure to the namespace’s sync worker
thread and blocks until it returns. It is intended for quick, non-I/O
operations: reading a sysctl value, creating a socket to inspect its local
address, or spawning an OS process.
#![allow(unused)]
fn main() {
let local_addr = dev.run_sync(|| {
let sock = std::net::UdpSocket::bind("0.0.0.0:0")?;
Ok(sock.local_addr()?)
})?;
}
Because run_sync blocks both the calling thread and the sync worker,
avoid doing anything slow inside it. TCP connects, HTTP requests, and
other blocking network I/O belong in spawn, not in run_sync.
OS commands
spawn_command runs an OS process inside the namespace and registers
the child with the namespace’s tokio reactor, so .wait() and
.wait_with_output() work as non-blocking futures. It takes a
tokio::process::Command and returns a tokio::process::Child:
#![allow(unused)]
fn main() {
let mut child = dev.spawn_command({
let mut cmd = tokio::process::Command::new("curl");
cmd.arg("http://203.0.113.10");
cmd
})?;
let status = child.wait().await?;
assert!(status.success());
}
When you need a synchronous std::process::Child instead (for example
to pass to spawn_blocking or manage outside of an async context), use
spawn_command_sync:
#![allow(unused)]
fn main() {
let mut child = dev.spawn_command_sync({
let mut cmd = std::process::Command::new("curl");
cmd.arg("http://203.0.113.10");
cmd
})?;
let output = tokio::task::spawn_blocking(move || {
child.wait_with_output()
}).await??;
assert!(output.status.success());
}
Dedicated threads
When you have long-running blocking work that would starve the sync
worker, spawn_thread creates a dedicated OS thread inside the
namespace. Unlike run_sync, this thread does not compete with other
sync operations on the same namespace:
#![allow(unused)]
fn main() {
let handle = dev.spawn_thread(|| {
// long-running blocking work here
Ok(())
})?;
}
UDP reflectors
spawn_reflector starts a UDP echo server in the namespace. It is a
convenience method for connectivity tests: send a datagram to the
reflector and measure the round-trip time to verify that the path works.
#![allow(unused)]
fn main() {
use std::net::{IpAddr, SocketAddr};
let bind_addr = SocketAddr::new(IpAddr::V4(server_ip), 9000);
server.spawn_reflector(bind_addr)?;
}
The reflector runs on the namespace’s async worker and echoes every received datagram back to its sender.
Dynamic topology operations
A patchbay topology is not static. After building the initial layout, you can modify interfaces, routes, link conditions, and NAT configuration at runtime. These operations are useful for simulating network events during a test: a WiFi handoff, a link failure, or a NAT policy change.
Replugging interfaces
Move a device’s interface from one router to another. The interface receives a new IP address from the new router’s pool, and routes are updated automatically:
#![allow(unused)]
fn main() {
dev.replug_iface("wlan0", other_router.id()).await?;
}
This models scenarios like roaming between WiFi access points or switching between ISPs.
Switching the default route
For multi-homed devices, change which interface carries the default route. This simulates a WiFi-to-cellular handoff or a VPN tunnel activation:
#![allow(unused)]
fn main() {
dev.set_default_route("cell0").await?;
}
Bringing interfaces down and up
Simulate link failures by administratively disabling an interface. While the interface is down, packets sent to or from it are dropped:
#![allow(unused)]
fn main() {
dev.link_down("wlan0").await?;
// All traffic over wlan0 is now dropped.
dev.link_up("wlan0").await?;
// The interface is back and traffic flows again.
}
Changing link conditions at runtime
Modify link impairment on the fly to simulate degrading or improving network quality:
#![allow(unused)]
fn main() {
use patchbay::{LinkCondition, LinkLimits};
// Switch to a 3G-like link.
dev.set_link_condition("wlan0", Some(LinkCondition::Mobile3G)).await?;
// Apply custom impairment.
dev.set_link_condition("wlan0", Some(LinkCondition::Manual(LinkLimits {
rate_kbit: 500,
loss_pct: 15.0,
latency_ms: 200,
..Default::default()
}))).await?;
// Remove all impairment and return to a clean link.
dev.set_link_condition("wlan0", None).await?;
}
Changing NAT at runtime
Switch a router’s NAT mode and flush stale connection tracking state. This is covered in more detail in the NAT and Firewalls chapter:
#![allow(unused)]
fn main() {
router.set_nat_mode(Nat::Corporate).await?;
router.flush_nat_state().await?;
}
Handles
Device, Router, and Ix are lightweight, cloneable handles. All three
types support the same set of execution methods described above: spawn,
run_sync, spawn_thread, spawn_command, spawn_command_sync, and
spawn_reflector. Cloning a handle is cheap; it does not duplicate the
underlying namespace or its workers.
Handle methods return Result or Option when the underlying node has
been removed from the lab. If you hold a handle to a device that no longer
exists, calls will return an error rather than panicking.
When debugging IPv6 behavior, inspect interface snapshots instead of only
top-level ip6() accessors:
- device.default_iface().and_then(|i| i.ip6()) for global/ULA IPv6.
- device.default_iface().and_then(|i| i.ll6()) for link-local fe80::/10.
- router.interfaces() for RouterIface snapshots on ix/wan and bridge.
Cleanup
When the Lab is dropped, it shuts down all async and sync workers, then
closes the namespace file descriptors. The kernel removes veth pairs,
routes, and nftables rules when the last reference to a namespace
disappears. No explicit cleanup is needed, and no state leaks onto the
host between test runs.
Testing with patchbay
This chapter shows how to write integration tests that use patchbay, run them on Linux and macOS, and inspect the results in the browser.
Project setup
Add patchbay as a dev dependency alongside tokio and anyhow. If you want
test output directories that persist across runs, add testdir too:
[dev-dependencies]
patchbay = "0.1"
tokio = { version = "1", features = ["rt", "macros", "net", "io-util", "time"] }
anyhow = "1"
ctor = "0.2"
testdir = "0.9"
Writing a test
Create a test file (for example tests/netsim.rs) with the namespace
init, a topology, and assertions:
#![allow(unused)]
fn main() {
use std::net::{IpAddr, SocketAddr};
use anyhow::{Context, Result};
use patchbay::{Lab, LabOpts, Nat, OutDir};
use testdir::testdir;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
/// Runs once before any test thread, entering the user namespace.
#[ctor::ctor]
fn init() {
patchbay::init_userns().expect("user namespace");
}
#[tokio::test(flavor = "current_thread")]
async fn tcp_through_nat() -> Result<()> {
// Write topology events and logs into a testdir for later inspection.
let outdir = testdir!();
let lab = Lab::with_opts(
LabOpts::default()
.outdir(OutDir::Exact(outdir))
.label("tcp-nat"),
)
.await?;
// Datacenter router (public IPs) and home router (NAT).
let dc = lab.add_router("dc").build().await?;
let home = lab
.add_router("home")
.nat(Nat::Home)
.build()
.await?;
// Server in the datacenter, client behind NAT.
let server = lab
.add_device("server")
.iface("eth0", dc.id(), None)
.build()
.await?;
let client = lab
.add_device("client")
.iface("eth0", home.id(), None)
.build()
.await?;
// Start a TCP echo server.
let server_ip = server.ip().context("no server ip")?;
let addr = SocketAddr::new(IpAddr::V4(server_ip), 9000);
server.spawn(move |_| async move {
let listener = tokio::net::TcpListener::bind(addr).await?;
let (mut stream, _) = listener.accept().await?;
let mut buf = vec![0u8; 64];
let n = stream.read(&mut buf).await?;
stream.write_all(&buf[..n]).await?;
anyhow::Ok(())
})?;
tokio::time::sleep(std::time::Duration::from_millis(100)).await;
// Send "hello" from the client, expect it echoed back.
let echoed = client.spawn(move |_| async move {
let mut stream = tokio::net::TcpStream::connect(addr).await?;
stream.write_all(b"hello").await?;
let mut buf = vec![0u8; 64];
let n = stream.read(&mut buf).await?;
anyhow::Ok(buf[..n].to_vec())
})?.await??;
assert_eq!(echoed, b"hello");
Ok(())
}
}
Key points:
- #[ctor::ctor] calls init_userns() once before any threads start. Without this, namespace creation will fail.
- #[tokio::test(flavor = "current_thread")] is required. patchbay namespaces use single-threaded tokio runtimes internally.
- testdir!() creates a numbered directory next to the test binary (e.g. target/testdir-current/tcp_through_nat/). Previous runs are kept automatically.
- OutDir::Exact(path) tells the lab to write events and logs into that directory. After the test, you can browse them in the devtools UI.
Running on Linux
On Linux, tests run natively. Install patchbay’s CLI if you want the
serve command for viewing results:
cargo install --git https://github.com/n0-computer/patchbay patchbay-runner
Then run your tests and serve the output:
# Run the test.
cargo test tcp_through_nat
# Serve the testdir output in the browser.
patchbay serve --testdir --open
The --testdir flag automatically locates <target-dir>/testdir-current
using cargo metadata, so you don’t need to pass a path.
Running on macOS
macOS lacks Linux network namespaces, so tests must run inside a QEMU
VM. Install patchbay-vm:
cargo install --git https://github.com/n0-computer/patchbay patchbay-vm
You also need QEMU installed (brew install qemu on macOS). On first
run, patchbay-vm downloads a Debian cloud image and boots a VM with
all required tools pre-installed.
Run your tests:
# Run all tests in a package.
patchbay-vm test -p myproject
# Run a specific test file and filter by name.
patchbay-vm test -p myproject --test netsim tcp_through_nat
# Pass environment variables through (RUST_LOG, RUST_BACKTRACE, etc).
RUST_LOG=debug patchbay-vm test -p myproject tcp_through_nat
The test binary is cross-compiled for x86_64-unknown-linux-musl,
staged into the VM, and executed there. Output written to testdir ends
up in .patchbay-work/binaries/tests/ which is shared back to the host.
Serve the results:
patchbay-vm serve --testdir --open
The VM stays running between commands, so subsequent runs skip the boot
step. Use patchbay-vm down to stop it, or --recreate to start fresh.
Viewing results
Both patchbay serve and patchbay-vm serve open the devtools UI with:
- Topology — a graph of routers and devices in the lab.
- Logs — per-namespace tracing output and structured event files.
- Timeline — custom events plotted across nodes over time.
To emit custom events that show up on the timeline, use the _events::
tracing target convention:
#![allow(unused)]
fn main() {
tracing::info!(target: "myapp::_events::ConnectionEstablished", peer = %addr);
}
Reading logs from the terminal
The fmt-log command re-renders .tracing.jsonl files as human-readable
ANSI output, matching the familiar tracing_subscriber console format:
# Print a log file.
patchbay fmt-log target/testdir-current/tcp_through_nat/device.client.tracing.jsonl
# Pipe from stdin.
cat device.client.tracing.jsonl | patchbay fmt-log
# Follow a file in real time (like tail -f).
patchbay fmt-log -f device.client.tracing.jsonl
Controlling log output
Per-namespace tracing logs are written to {kind}.{name}.tracing.jsonl
files in the output directory. The filter is read from PATCHBAY_LOG,
falling back to RUST_LOG, falling back to info. Full directive
syntax is supported:
# Only capture trace-level output from your crate's networking code.
PATCHBAY_LOG=myapp::net=trace cargo test tcp_through_nat
Limitation: the file filter can only capture events at levels the
global subscriber (console output) already enables. tracing-core caches
callsite interest globally, so if the global subscriber rejects TRACE,
those callsites are permanently disabled — including for the file
writer. To get TRACE in file output, ensure the global subscriber also
enables TRACE (e.g. RUST_LOG=trace).
Common flags
patchbay-vm test supports the same flags as cargo test:
| Flag | Short | Description |
|---|---|---|
| --package <name> | -p | Test a specific package |
| --test <name> | | Select a test target (binary) |
| --jobs <n> | -j | Parallel compilation jobs |
| --features <f> | -F | Activate cargo features |
| --release | | Build in release mode |
| --lib | | Test only the library |
| --no-fail-fast | | Run all tests even if some fail |
| --recreate | | Stop and recreate the VM |
| -- <args> | | Extra args passed to cargo |
Running in CI
If you run a patchbay-serve instance (see patchbay-serve
below), you can push test results from GitHub Actions and get a link
posted as a PR comment.
Set two repository secrets: PATCHBAY_URL (e.g. https://patchbay.example.com)
and PATCHBAY_API_KEY.
Add this to your workflow after the test step:
- name: Push patchbay results
if: always()
env:
PATCHBAY_URL: ${{ secrets.PATCHBAY_URL }}
PATCHBAY_API_KEY: ${{ secrets.PATCHBAY_API_KEY }}
run: |
set -euo pipefail
PROJECT="${{ github.event.repository.name }}"
TESTDIR="$(cargo metadata --format-version=1 --no-deps | jq -r .target_directory)/testdir-current"
if [ ! -d "$TESTDIR" ]; then
echo "No testdir output found, skipping push"
exit 0
fi
# Create run.json manifest
cat > "$TESTDIR/run.json" <<MANIFEST
{
"project": "$PROJECT",
"branch": "${{ github.head_ref || github.ref_name }}",
"commit": "${{ github.sha }}",
"pr": ${{ github.event.pull_request.number || 'null' }},
"pr_url": "${{ github.event.pull_request.html_url || '' }}",
"title": "${{ github.event.pull_request.title || github.event.head_commit.message || '' }}",
"created_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
MANIFEST
# Upload as tar.gz
RESPONSE=$(tar -czf - -C "$TESTDIR" . | \
curl -s -w "\n%{http_code}" \
-X POST \
-H "Authorization: Bearer $PATCHBAY_API_KEY" \
-H "Content-Type: application/gzip" \
--data-binary @- \
"$PATCHBAY_URL/api/push/$PROJECT")
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
BODY=$(echo "$RESPONSE" | head -n -1)
if [ "$HTTP_CODE" != "200" ]; then
echo "Push failed ($HTTP_CODE): $BODY"
exit 1
fi
INVOCATION=$(echo "$BODY" | jq -r .invocation)
VIEW_URL="$PATCHBAY_URL/#/inv/$INVOCATION"
echo "PATCHBAY_VIEW_URL=$VIEW_URL" >> "$GITHUB_ENV"
echo "Results uploaded: $VIEW_URL"
- name: Comment on PR
if: always() && github.event.pull_request && env.PATCHBAY_VIEW_URL
uses: actions/github-script@v7
with:
script: |
const marker = '<!-- patchbay-results -->';
const body = `${marker}\n**patchbay results:** ${process.env.PATCHBAY_VIEW_URL}`;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
});
const existing = comments.find(c => c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
body,
});
}
The PR comment is auto-updated on each push, so you always see the latest run.
patchbay-serve
patchbay-serve is a standalone server for hosting run results. CI
pipelines push test output to it; the devtools UI lets you browse them.
Install
cargo install --git https://github.com/n0-computer/patchbay patchbay-server --bin patchbay-serve
Quick start
patchbay-serve \
--accept-push \
--api-key "$(openssl rand -hex 32)" \
--http-bind 0.0.0.0:8080 \
--retention 10GB
With automatic TLS:
patchbay-serve \
--accept-push \
--api-key "$(openssl rand -hex 32)" \
--acme-domain patchbay.example.com \
--acme-email you@example.com \
--retention 10GB
This will:
- Serve the runs index at /runs
- Accept pushed runs at POST /api/push/{project}
- Auto-provision TLS via Let’s Encrypt (when --acme-domain is set)
- Store data in ~/.local/share/patchbay-serve/ (runs + ACME certs)
- Delete oldest runs when total size exceeds the retention limit
Flags
| Flag | Description |
|---|---|
| --run-dir <path> | Override run storage location |
| --data-dir <path> | Override data directory (default: ~/.local/share/patchbay-serve) |
| --accept-push | Enable the push API |
| --api-key <key> | Required with --accept-push; also reads PATCHBAY_API_KEY env |
| --acme-domain <d> | Enable automatic TLS for domain |
| --acme-email <e> | Contact email for Let’s Encrypt (required with --acme-domain) |
| --retention <size> | Max total run storage (e.g. 500MB, 10GB) |
| --http-bind <addr> | HTTP listen address (default: 0.0.0.0:8080; redirect when ACME is active) |
| --https-bind <addr> | HTTPS listen address (default: 0.0.0.0:4443; only with --acme-domain) |
systemd
A unit file is included at patchbay-server/patchbay-serve.service.
To install:
# Create service user and data directory
sudo useradd -r -s /usr/sbin/nologin patchbay
sudo mkdir -p /var/lib/patchbay-serve
sudo chown patchbay:patchbay /var/lib/patchbay-serve
# Install the binary
cargo install --git https://github.com/n0-computer/patchbay patchbay-server --bin patchbay-serve
sudo cp ~/.cargo/bin/patchbay-serve /usr/local/bin/
# Install and configure the unit file
sudo cp patchbay-server/patchbay-serve.service /etc/systemd/system/
sudo systemctl edit patchbay-serve # set PATCHBAY_API_KEY, --acme-domain, --acme-email
sudo systemctl enable --now patchbay-serve
Check status:
sudo systemctl status patchbay-serve
journalctl -u patchbay-serve -f
Running in a VM
patchbay requires Linux network namespaces, which means it cannot run
natively on macOS or Windows. The patchbay-vm crate solves this by
wrapping your simulations and tests in a QEMU Linux VM, giving you the
same experience on any development machine.
Installing patchbay-vm
cargo install --git https://github.com/n0-computer/patchbay patchbay-vm
Running simulations
The run command boots a VM (or reuses a running one), stages the
simulation files and binaries, and executes them inside the guest:
patchbay-vm run ./sims/iperf-baseline.toml
Results and logs are written to the work directory (.patchbay-work/ by
default). You can pass multiple simulation files, and they run
sequentially in the same VM.
Controlling the patchbay version
By default, patchbay-vm downloads the latest release of the patchbay
runner binary. You can pin a version, build from a Git ref, or point to a
local binary:
patchbay-vm run sim.toml --patchbay-version v0.10.0
patchbay-vm run sim.toml --patchbay-version git:main
patchbay-vm run sim.toml --patchbay-version path:/usr/local/bin/patchbay
Binary overrides
If your simulation references custom binaries (test servers, protocol implementations), you can stage them into the VM:
patchbay-vm run sim.toml --binary myserver:path:./target/release/myserver
The binary is copied into the guest’s work directory and made available at the path the simulation expects.
Running tests
The test command cross-compiles your Rust tests for musl, stages the
test binaries in the VM, and runs them:
patchbay-vm test
patchbay-vm test --package patchbay
patchbay-vm test -- --test-threads=4
This is the recommended way to run patchbay integration tests on macOS. The VM has all required tools pre-installed (nftables, iproute2, iperf3) and unprivileged user namespaces enabled.
VM lifecycle
The VM boots on first use and stays running between commands. Subsequent
run or test calls reuse the existing VM, which avoids the 30-60
second boot time on repeated invocations.
patchbay-vm up # Boot the VM (or verify it is running)
patchbay-vm status # Show VM state, SSH port, mount paths
patchbay-vm down # Shut down the VM
patchbay-vm cleanup # Remove stale sockets and PID files
You can also SSH into the guest directly for debugging:
patchbay-vm ssh -- ip netns list
patchbay-vm ssh -- nft list ruleset
How it works
patchbay-vm downloads a Debian cloud image (cached in
~/.local/share/patchbay/qemu-images/), creates a COW disk backed by
it, and boots QEMU with cloud-init for initial provisioning. The guest
gets SSH access via a host-forwarded port (default 2222) and three shared
mount points:
| Guest path | Host path | Access | Purpose |
|---|---|---|---|
/app | Workspace root | Read-only | Source code and simulation files |
/target | Cargo target dir | Read-only | Build artifacts |
/work | Work directory | Read-write | Simulation output and logs |
File sharing uses virtiofs when available (faster, requires virtiofsd on the host) and falls back to 9p. Hardware acceleration is auto-detected: KVM on Linux, HVF on macOS, TCG emulation as a last resort.
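The acceleration fallback order can be probed from a shell. This sketch mirrors the detection order described above (illustrative only; patchbay-vm's actual probing logic may differ):

```shell
# Pick an accelerator the way a QEMU wrapper typically does:
# KVM on Linux (needs a writable /dev/kvm), HVF on macOS, TCG otherwise.
if [ "$(uname -s)" = "Linux" ] && [ -w /dev/kvm ]; then
    accel=kvm
elif [ "$(uname -s)" = "Darwin" ]; then
    accel=hvf
else
    accel=tcg
fi
echo "accel=$accel"
```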
Configuration
All settings have sensible defaults. Override them through environment variables when needed:
| Variable | Default | Description |
|---|---|---|
| QEMU_VM_MEM_MB | 4096 | Guest RAM in megabytes |
| QEMU_VM_CPUS | 4 | Guest CPU count |
| QEMU_VM_SSH_PORT | 2222 | Host port forwarded to guest SSH |
| QEMU_VM_NAME | patchbay-vm | VM instance name |
| QEMU_VM_DISK_GB | 40 | Disk size in gigabytes |
VM state lives in .qemu-vm/<name>/ in your project directory. The disk
image uses COW backing, so it only consumes space for blocks that differ
from the base image.
Real-World IPv6 Deployments
IPv6 deployment varies widely across ISPs, carriers, and enterprises. The differences matter for testing: a P2P application that works over a residential dual-stack connection may fail on a corporate network that blocks non-web UDP, or on a mobile carrier that assigns only IPv6 addresses and translates IPv4 traffic through NAT64. This page explains how each environment works and how to reproduce it in patchbay.
IPv6 Terms Used Here
A few IPv6 terms appear throughout this page:
- GUA (Global Unicast Address) — a publicly routable address, the IPv6 equivalent of a public IPv4 address. Devices with GUAs are reachable from anywhere on the internet unless a firewall intervenes.
- ULA (Unique Local Address) — an address in fd00::/8, routable only within a site. Analogous to RFC 1918 private IPv4 space, but rarely used as the sole address family.
- Link-local address — an address in fe80::/10, valid only on the directly connected link. Every IPv6 interface has one. Used for neighbor discovery, router solicitation, and as next-hop addresses in routing tables.
- SLAAC (Stateless Address Autoconfiguration) — the mechanism by which a host picks its own address from a prefix advertised by a router. No DHCP server involved.
- RA (Router Advertisement) — a message a router sends to announce its presence, the prefix it serves, and default-route information.
- RS (Router Solicitation) — a message a host sends to ask nearby routers to send an RA immediately instead of waiting for the next periodic one.
- DAD (Duplicate Address Detection) — a probe the kernel sends before using an address, to verify no other host on the link already claims it.
How ISPs Actually Deploy IPv6
Residential (FTTH, Cable, DSL)
The ISP assigns the home router a globally routable prefix — typically a /56 or /60 — via DHCPv6 Prefix Delegation (DHCPv6-PD). The home router carves /64 subnets from this prefix, one per LAN segment, and announces them via Router Advertisements. Devices on the LAN run SLAAC to pick their own addresses within the /64. The result is that every device gets a public, globally routable IPv6 address with no NAT involved.
The security boundary is a stateful firewall on the home router (the CE router in RFC 6092 terms). It blocks unsolicited inbound connections while allowing outbound traffic and replies to established sessions. This firewall is what prevents the outside world from reaching devices directly despite their public addresses. Privacy extensions (RFC 4941) rotate the source address periodically so that outbound connections do not reveal a stable device identifier.
IPv4 access runs in parallel, either via a separate IPv4 address with traditional NAT44, or via transition mechanisms like DS-Lite, MAP-E, or MAP-T that tunnel IPv4 inside IPv6 to the ISP’s gateway.
Carriers that deploy this model include Deutsche Telekom, Comcast, AT&T, Orange, BT, and NTT.
Mobile (4G/5G)
Mobile carriers assign each device a single /64 prefix via Router Advertisement. The device is the only host on its /64 — there is no home router between the device and the carrier gateway. This means the carrier gateway is the first IP hop, and it controls all routing and policy.
For IPv4 connectivity, carriers take one of two approaches. Some run
pure IPv6 with NAT64: the device has no IPv4 address at all, and the
carrier gateway translates IPv4-bound traffic using the well-known
prefix 64:ff9b::/96. DNS64 synthesizes AAAA records so applications
connect to IPv6 addresses that the gateway maps back to IPv4. T-Mobile
US and Jio operate this way. Other carriers like Verizon and NTT Docomo
run dual-stack, giving devices both IPv4 (often behind CGNAT) and IPv6
addresses.
Mobile networks typically do not run per-device firewalls. Instead, they rely on the fact that each device has its own /64 prefix, which provides natural isolation — no other subscriber shares the prefix.
Enterprise / Corporate
Enterprises typically run dual-stack internally using provider-allocated (PA) or provider-independent (PI) address space. The defining characteristic is a strict outbound firewall: only TCP 80/443 and UDP 53 are allowed. All other ports are blocked, which means STUN and TURN on non-standard ports fail. Applications that need relay connectivity must use TURN-over-TLS on port 443.
Some enterprises use ULA (fd00::/8) internally with NAT66 at the
border, though this is discouraged by RFC 4864 and IETF best practices.
See the section on ULA + NAT66 below.
Hotel / Airport / Guest WiFi
After captive portal authentication, guest networks allow web traffic (TCP 80 and 443) and DNS (TCP/UDP 53) but block most other UDP. This kills QUIC, STUN, and direct P2P connectivity. Unlike corporate networks, some guest networks allow TCP on non-standard ports, but this varies. Many guest networks are still IPv4-only. Those that offer IPv6 assign GUA addresses behind a restrictive firewall.
ULA + NAT66: Mostly a Myth
RFC 4193 ULA (fd00::/8) was designed for stable internal addressing,
not as an IPv6 equivalent of RFC 1918 private space. No major ISP
deploys NAT66 — it defeats the end-to-end principle that IPv6 was
designed to restore. Android does not support NAT66 at all because it
lacks a DHCPv6 client and relies entirely on SLAAC. Where ULA appears
in practice, it is used alongside GUA for stable internal service
addresses, never as the sole address family.
RFC 6296 NPTv6 (Network Prefix Translation) does exist for stateless
1:1 prefix mapping at site borders, primarily for multihoming. If you
need to simulate “NATted IPv6” in patchbay, use NatV6Mode::Nptv6,
but understand that this configuration is rare in production.
Simulating Real-World Scenarios in Patchbay
Each ISP deployment model described above maps to a patchbay router
configuration. RouterPreset captures the most common combinations in
a single call, and individual builder methods let you override any
default when your test scenario diverges from the preset.
#![allow(unused)]
fn main() {
// One-liner for each common case:
let home = lab.add_router("home").preset(RouterPreset::Home).build().await?;
let dc = lab.add_router("dc").preset(RouterPreset::Public).build().await?;
let corp = lab.add_router("corp").preset(RouterPreset::Corporate).build().await?;
// Override one knob:
let home = lab.add_router("home")
.preset(RouterPreset::Home)
.nat(Nat::FullCone) // swap NAT type, keep everything else
.build().await?;
}
The full preset table:
| Preset | NAT | NAT v6 | Firewall | IP | Pool |
|---|---|---|---|---|---|
| Home | Home (EIM+APDF) | None | BlockInbound | DualStack | Private |
| Public | None | None | None | DualStack | Public |
| PublicV4 | None | None | None | V4Only | Public |
| IspCgnat | Cgnat (EIM+EIF) | None | None | DualStack | Private |
| IspV6 | None | Nat64 | BlockInbound | V6Only | Public |
| Corporate | Corporate (sym) | None | Corporate | DualStack | Private |
| Hotel | Corporate (sym) | None | CaptivePortal | V4Only | Private |
| Cloud | CloudNat (sym) | None | None | DualStack | Private |
Scenario 1: Residential Dual-Stack (Most Common)
Most residential connections today are dual-stack: IPv4 behind NAT, IPv6 with public addresses behind a stateful firewall. This is the baseline for testing home-user connectivity. Applications using Happy Eyeballs (RFC 8305) will prefer IPv6 when both families are available.
#![allow(unused)]
fn main() {
let home = lab.add_router("home").preset(RouterPreset::Home).build().await?;
let laptop = lab.add_device("laptop").uplink(home.id()).build().await?;
// laptop.ip() -> 10.0.x.x (private IPv4, NATted)
// laptop.ip6() -> fd10:0:x::2 (ULA v6, firewalled)
}
Scenario 2: IPv6-Only Mobile with NAT64
T-Mobile US, Jio, and other large carriers run IPv6-only networks. Your
application receives no IPv4 address. To reach an IPv4 server, the
carrier gateway translates between IPv6 and IPv4 using the well-known
prefix 64:ff9b::/96: the device connects to an IPv6 address that
embeds the IPv4 destination, and the gateway rewrites the headers.
This is one of the most important scenarios to test against, because it breaks applications that hardcode IPv4 addresses or assume a dual-stack environment.
#![allow(unused)]
fn main() {
let carrier = lab.add_router("carrier")
.preset(RouterPreset::IspV6)
.build().await?;
let phone = lab.add_device("phone").uplink(carrier.id()).build().await?;
// phone.ip6() -> 2001:db8:1:x::2 (public GUA)
// phone.ip() -> None (no IPv4 on the device)
// Reach an IPv4 server via NAT64:
use patchbay::nat64::embed_v4_in_nat64;
let nat64_addr = embed_v4_in_nat64(server_v4_ip);
// Connect to [64:ff9b::<server_v4>]:port, translated to IPv4 by the router
}
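The embedding itself is mechanical (RFC 6052): the four IPv4 octets fill the low 32 bits of the 64:ff9b::/96 prefix. The sketch below shows the computation; `embed_v4` is an illustrative stand-in, not necessarily how patchbay's `embed_v4_in_nat64` is implemented.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Embed an IPv4 address in the NAT64 well-known prefix 64:ff9b::/96
/// (RFC 6052): the four IPv4 octets become the last 32 bits of the
/// IPv6 address. Illustrative stand-in, not patchbay's own helper.
fn embed_v4(v4: Ipv4Addr) -> Ipv6Addr {
    let o = v4.octets();
    Ipv6Addr::new(
        0x0064, 0xff9b, 0, 0, 0, 0,
        u16::from_be_bytes([o[0], o[1]]),
        u16::from_be_bytes([o[2], o[3]]),
    )
}
```

For example, 192.0.2.1 maps to 64:ff9b::c000:201.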
The IspV6 preset configures IpSupport::V6Only,
NatV6Mode::Nat64, Firewall::BlockInbound, and a public GUA pool.
You can also configure NAT64 manually on any router when you need a
different combination:
#![allow(unused)]
fn main() {
let carrier = lab.add_router("carrier")
.ip_support(IpSupport::DualStack) // or V6Only
.nat_v6(NatV6Mode::Nat64)
.build().await?;
}
Scenario 3: Corporate Firewall (Restrictive)
Enterprise networks block everything except web traffic. STUN binding requests on non-standard ports are silently dropped, so ICE candidates never resolve. P2P applications must detect this and fall back to TURN-over-TLS on port 443 — the only UDP port that survives the firewall is DNS on 53.
#![allow(unused)]
fn main() {
let corp = lab.add_router("corp").preset(RouterPreset::Corporate).build().await?;
let workstation = lab.add_device("ws").uplink(corp.id()).build().await?;
}
Scenario 4: Hotel / Captive Portal
Guest WiFi networks allow web browsing but block most UDP, which kills QUIC and prevents direct P2P connections. The difference from corporate is that some hotel networks allow TCP on non-standard ports, so TURN-over-TCP (not just TLS on 443) may work.
#![allow(unused)]
fn main() {
let hotel = lab.add_router("hotel").preset(RouterPreset::Hotel).build().await?;
let guest = lab.add_device("guest").uplink(hotel.id()).build().await?;
}
Scenario 5: Mobile Carrier (CGNAT + Dual-Stack)
Carriers that still offer IPv4 typically share a single public IPv4 address across many subscribers via CGNAT. The device has both IPv4 and IPv6, but the IPv4 address is behind carrier-grade NAT — an extra layer on top of any home NAT.
#![allow(unused)]
fn main() {
let carrier = lab.add_router("carrier").preset(RouterPreset::IspCgnat).build().await?;
let phone = lab.add_device("phone").uplink(carrier.id()).build().await?;
}
Scenario 6: Peer-to-Peer Connectivity Test Matrix
The real value of these presets is composing them to test how two peers connect across different network types. A home user behind cone NAT can hole-punch with another home user, but a corporate user behind a strict firewall forces a relay fallback. Testing the full matrix catches connectivity regressions that single-topology tests miss.
#![allow(unused)]
fn main() {
let home = lab.add_router("home")
.preset(RouterPreset::Home)
.nat(Nat::FullCone)
.build().await?;
let alice = lab.add_device("alice").uplink(home.id()).build().await?;
let mobile = lab.add_router("mobile").preset(RouterPreset::IspCgnat).build().await?;
let bob = lab.add_device("bob").uplink(mobile.id()).build().await?;
let corp = lab.add_router("corp").preset(RouterPreset::Corporate).build().await?;
let charlie = lab.add_device("charlie").uplink(corp.id()).build().await?;
// Test: can alice reach bob? bob reach charlie? etc.
}
IPv6 Feature Reference
| Feature | API | Notes |
|---|---|---|
| Dual-stack | IpSupport::DualStack | Both v4 and v6 |
| IPv6-only | IpSupport::V6Only | No v4 routes |
| IPv4-only | IpSupport::V4Only | No v6 routes (default) |
| NPTv6 | NatV6Mode::Nptv6 | Stateless 1:1 prefix translation |
| NAT66 (masquerade) | NatV6Mode::Masquerade | Like NAT44 but for v6 |
| Block inbound | Firewall::BlockInbound | RFC 6092 CE router |
| Corporate FW | Firewall::Corporate | Block inbound + TCP 80,443 + UDP 53 |
| Captive portal FW | Firewall::CaptivePortal | Block inbound + block non-web UDP |
| Custom FW | Firewall::Custom(cfg) | Full control via FirewallConfig |
| NAT64 | NatV6Mode::Nat64 | Userspace SIIT + nftables masquerade |
| DHCPv6-PD | not planned | Use static /64 allocation |
Link-Local Addressing and Scope
Every IPv6 interface has a link-local address in fe80::/10. Unlike
global or ULA addresses, link-local addresses are valid only on the
directly connected link — they cannot be routed across hops. The kernel
uses them for neighbor discovery (finding other hosts on the link) and
as next-hop addresses in routing tables. They are always present, even
when no global prefix has been assigned.
In patchbay, you can inspect link-local addresses through interface snapshots:
- Device side: DeviceIface::ll6()
- Router side: RouterIface::ll6()
- Router snapshots: Router::iface(name) and Router::interfaces()
Use ip6() when you need a global/ULA source or destination. Use
ll6() for neighbor/router-local checks and link-local route
assertions.
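A quick way to assert that an address really is link-local is to check the fe80::/10 prefix bits. The helper below is illustrative; it is the kind of check an assertion on ll6() output amounts to.

```rust
use std::net::Ipv6Addr;

/// True if `addr` is in fe80::/10: the top 10 bits are 1111 1110 10.
/// Illustrative helper for test assertions on link-local addresses.
fn is_link_local(addr: Ipv6Addr) -> bool {
    addr.segments()[0] & 0xffc0 == 0xfe80
}
```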
Provisioning mode and DAD mode
patchbay supports two IPv6 provisioning modes, configured at lab creation. The choice controls how IPv6 routes and addresses are set up in each namespace.
Ipv6ProvisioningMode::Static installs routes during topology wiring.
This is the simpler model: routes are deterministic, and there is no
timing dependency on router advertisements. Use this when your test
cares about connectivity and routing outcomes, not about the
provisioning process itself.
Ipv6ProvisioningMode::RaDriven models the RA/RS-driven provisioning
path. patchbay emits structured RA and RS events and installs link-local
scoped default routes for default interfaces. This models real host
routing behavior while keeping tests deterministic and introspectable.
Use this when your application depends on RA timing, default-route
installation order, or link-local gateway behavior.
DAD (Duplicate Address Detection) is disabled by default to keep test
setup deterministic — the kernel DAD probe adds a delay before an
address becomes usable, which introduces timing variance. Enable it with
Ipv6DadMode::Enabled when you specifically need to test DAD-related
behavior.
#![allow(unused)]
fn main() {
let lab = Lab::with_opts(
LabOpts::default()
.ipv6_provisioning_mode(Ipv6ProvisioningMode::Static)
.ipv6_dad_mode(Ipv6DadMode::Enabled),
).await?;
}
Fidelity boundaries
patchbay models RA and RS behavior at the control-plane level: it updates routes and emits structured events in tracing logs, but it does not emit raw ICMPv6 RA or RS packets on virtual links. Application-level route and connectivity behavior is covered, but packet-capture workflows that expect real RA/RS frames are not.
Specific areas outside the model:
- Full SLAAC state-machine behavior across all timers and transitions.
- Neighbor Discovery timing details, including exact probe/retransmit timing.
- Host temporary address rotation and privacy-address lifecycles.
For the complete list, see Limitations.
Scoped default route behavior
When an IPv6 default gateway is link-local (fe80::/10), the route
must include the outgoing interface as scope — without it, the kernel
does not know which link the gateway lives on. patchbay handles this
automatically during route installation, so default routing remains
valid after interface changes.
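For illustration, the installed route in iproute2 notation looks like this (the gateway address is an example):

```
# the "dev eth0" scope is required: fe80:: addresses exist on every
# link, so the gateway address alone does not identify the interface
default via fe80::1 dev eth0
```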
Common Pitfalls
NPTv6 and NDP
NPTv6 dnat prefix to rules must include address match clauses (e.g.,
ip6 daddr <wan_prefix>) to avoid translating NDP packets. Without
this, neighbor discovery breaks and the router becomes unreachable.
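A sketch of the restricted form, using the document's own rule syntax with example prefixes (not patchbay's actual generated ruleset):

```
# match the translated prefixes explicitly so NDP/ICMPv6 traffic
# (link-local and multicast) is never rewritten
ip6 daddr 2001:db8:ffff::/64 dnat prefix to fd00:1::/64
ip6 saddr fd00:1::/64 snat prefix to 2001:db8:ffff::/64
```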
IPv6 Firewall Is Not Optional
On IPv4, NAT implicitly blocks inbound connections — no port mapping
means no access. On IPv6 with public GUA addresses, there is no NAT
and devices are directly addressable from the internet. Without
Firewall::BlockInbound, any host on the IX can connect to your
devices. This matches reality: every residential CE router ships with an
IPv6 stateful firewall enabled by default.
On-Link Prefix Confusion
When IX-level routers share a /64 IX prefix, their WAN addresses are on-link with each other. If downstream routing prefixes are carved from the same range, the kernel may treat them as on-link too, sending packets directly via NDP rather than through the gateway. patchbay avoids this by using distinct prefix ranges for the IX (/64) and downstream pools (/48 from a different range).
Real-World Network Patterns
Patterns for testing P2P applications against common real-world network conditions. Each section describes what happens from the application’s perspective and how to simulate it.
VPN Connect / Disconnect
What happens when a VPN connects
A VPN client performs three operations:
1. IP change: a new tunnel interface (wg0, tun0) gets a VPN-assigned address. The device now has two IPs: physical and tunnel.
2. Route change: for full-tunnel VPNs, a new default route via the tunnel is installed. All traffic exits through the VPN server. For split-tunnel, only specific CIDRs (corporate ranges) route through the tunnel.
3. DNS change: the VPN pushes its own DNS servers. Private hostnames become resolvable.
Impact on existing connections: existing TCP connections are not torn down automatically, but in practice they break. The remote peer knows the old physical source IP; after the route change, outgoing packets exit via the tunnel with a different source IP, so the remote keeps sending responses to the old address. Connections stall and eventually time out. QUIC connections can migrate if both sides support it.
Full-tunnel VPN
All traffic exits through the VPN server. STUN reports the VPN server’s public IP as the reflexive address. Direct connections between two VPN peers go through two VPN hops.
#![allow(unused)]
fn main() {
// VPN exit node (NATs all clients behind server IP)
let vpn_exit = lab.add_router("vpn-exit")
.nat(Nat::Home)
.mtu(1420) // WireGuard overhead
.build().await?;
// Before VPN: device on home network
let home = lab.add_router("home").nat(Nat::Home).build().await?;
let device = lab.add_device("client").uplink(home.id()).build().await?;
// Connect VPN: device moves to VPN router, gets new IP
device.replug_iface("eth0", vpn_exit.id()).await?;
// Disconnect VPN: device returns to home router
device.replug_iface("eth0", home.id()).await?;
}
Split-tunnel VPN
Some traffic goes through VPN, rest uses physical interface. Model with two interfaces on different routers:
#![allow(unused)]
fn main() {
let device = lab.add_device("client")
.iface("eth0", home.id(), None) // physical: internet traffic
.iface("wg0", vpn_exit.id(), None) // tunnel: corporate traffic
.default_via("eth0") // default route on physical
.build().await?;
// Corporate server only reachable via VPN
let corp_server = lab.add_device("server").uplink(vpn_exit.id()).build().await?;
// Internet server reachable via physical
let public_server = lab.add_device("relay").uplink(dc.id()).build().await?;
// Switch from split to full tunnel
device.set_default_route("wg0").await?;
// Switch back
device.set_default_route("eth0").await?;
}
VPN kill switch
A kill switch drops all non-tunnel traffic immediately:
#![allow(unused)]
fn main() {
device.link_down("eth0").await?; // kill switch fires
device.replug_iface("eth0", vpn_exit.id()).await?; // tunnel established
device.link_up("eth0").await?;
}
VPN MTU impact
VPN encapsulation reduces effective MTU. Common values:
| Protocol | Overhead | Inner MTU |
|---|---|---|
| WireGuard | 60B (v4) / 80B (v6) | 1420 / 1400 |
| OpenVPN UDP | ~50-60B | ~1400 |
| IPsec ESP (NAT-T) | 52-72B | ~1400 |
If ICMP “fragmentation needed” is blocked (common in corporate/cloud), PMTUD fails silently. Small requests work, large transfers hang.
#![allow(unused)]
fn main() {
// Simulate VPN MTU + PMTUD blackhole
let vpn = lab.add_router("vpn")
.mtu(1420)
.block_icmp_frag_needed() // PMTU blackhole
.build().await?;
}
NAT Traversal
See NAT Hole-Punching for the full NAT implementation reference (nftables fullcone map, conntrack behavior, and debugging notes).
Hole punching (STUN + simultaneous open)
Both peers discover their reflexive address via STUN, exchange it through a signaling channel, then send UDP probes simultaneously. Each probe creates a NAT mapping that the peer’s probe can traverse.
#![allow(unused)]
fn main() {
// Both behind cone NATs: hole punching works
let nat_a = lab.add_router("nat-a").nat(Nat::Home).build().await?;
let nat_b = lab.add_router("nat-b").nat(Nat::Home).build().await?;
// Assert: direct connection established
// One side symmetric: hole punching fails, relay needed
let nat_a = lab.add_router("nat-a").nat(Nat::Home).build().await?;
let nat_b = lab.add_router("nat-b").nat(Nat::Corporate).build().await?;
// Assert: falls back to relay (TURN/DERP)
}
Double NAT (CGNAT + home router)
The device is behind two NAT layers. STUN returns the outermost public IP. Port forwarding (UPnP) only works on the home router, not the CGNAT. Hole punching is more timing-sensitive.
#![allow(unused)]
fn main() {
let cgnat = lab.add_router("cgnat").nat(Nat::Cgnat).build().await?;
let home = lab.add_router("home")
.upstream(cgnat.id())
.nat(Nat::Home)
.build().await?;
let device = lab.add_device("client").uplink(home.id()).build().await?;
}
NAT mapping timeout
After a period of inactivity, NAT mappings expire. The application must send keepalives to prevent this. Default UDP timeouts vary by NAT type (120-350s). Test by waiting beyond the timeout period then verifying connectivity.
#![allow(unused)]
fn main() {
// Custom short timeout for fast testing
let nat = lab.add_router("nat")
.nat(Nat::Custom(
NatConfig::builder()
.mapping(NatMapping::EndpointIndependent)
.filtering(NatFiltering::AddressAndPortDependent)
.udp_timeout(5) // seconds, short for testing
.build(),
))
.build().await?;
// Wait for timeout, verify mapping expired
tokio::time::sleep(Duration::from_secs(6)).await;
nat.flush_nat_state().await?;
// Assert: reflexive address changed (new mapping)
}
WiFi to Cellular Handoff
When a device switches from WiFi to cellular, it loses its WiFi IP address and receives a new one from the cellular carrier. Existing TCP connections break because the remote peer is sending replies to the old address. QUIC connections can survive if both sides support connection migration. In practice there is a 0.5–5 second gap with no connectivity during the transition while the cellular radio attaches and the new address is assigned.
#![allow(unused)]
fn main() {
let wifi_router = lab.add_router("wifi").nat(Nat::Home).build().await?;
let cell_router = lab.add_router("cell").nat(Nat::Cgnat).build().await?;
let device = lab.add_device("phone")
.iface("eth0", wifi_router.id(), Some(LinkCondition::Wifi))
.build().await?;
// Simulate handoff with connectivity gap
device.link_down("eth0").await?;
tokio::time::sleep(Duration::from_millis(500)).await;
device.replug_iface("eth0", cell_router.id()).await?;
device.set_link_condition("eth0", Some(LinkCondition::Mobile4G)).await?;
device.link_up("eth0").await?;
// Assert: application reconnects within X seconds
}
Corporate Firewall Blocking UDP
UDP packets are silently dropped. STUN requests time out. ICE falls back through: UDP direct -> UDP relay (TURN) -> TCP relay -> TLS/TCP relay on 443.
#![allow(unused)]
fn main() {
let corp = lab.add_router("corp")
.nat(Nat::Corporate)
.firewall(Firewall::Corporate) // TCP 80,443 + UDP 53 only
.build().await?;
let workstation = lab.add_device("ws").uplink(corp.id()).build().await?;
// Assert: connection type is Relay, not Direct
// Assert: relay uses TCP/TLS on port 443
}
Asymmetric Bandwidth
Most consumer connections have significantly less upload bandwidth than download. Residential cable runs around 100/10 Mbps, cellular around 50/10 Mbps, satellite around 100/10 Mbps. The asymmetry matters for P2P applications: the bottleneck for transfers is the uploader's upload speed, not the downloader's download speed, and for video calls each direction is limited by the sender's upload.
#![allow(unused)]
fn main() {
// 20 Mbps down, 2 Mbps up (10:1 ratio)
let router = lab.add_router("isp")
.nat(Nat::Home)
.downlink_condition(LinkCondition::Manual(LinkLimits {
rate_kbit: 20_000,
..Default::default()
}))
.build().await?;
let device = lab.add_device("client").uplink(router.id()).build().await?;
device.set_link_condition("eth0", Some(LinkCondition::Manual(LinkLimits {
rate_kbit: 2_000,
..Default::default()
}))).await?;
}
IPv6 Transition
See IPv6 Deployments for the full IPv6 deployment reference and router preset table.
Dual-stack
Device has both v4 and v6 addresses. Applications using Happy Eyeballs (RFC 8305) try v6 first. ICE collects both v4 and v6 candidates. Direct v6 connections skip NAT traversal entirely if both peers have public v6 addresses.
#![allow(unused)]
fn main() {
let router = lab.add_router("dual")
.ip_support(IpSupport::DualStack)
.nat(Nat::Home)
.build().await?;
}
v6-only with NAT64
Device has only an IPv6 address. IPv4 destinations are reached via NAT64:
the router translates packets between IPv6 and IPv4 using the well-known
prefix 64:ff9b::/96. Applications connect to [64:ff9b::<ipv4>]:port
and the router handles the rest. ICE candidates are v6 only; TURN must
be dual-stack.
#![allow(unused)]
fn main() {
use patchbay::nat64::embed_v4_in_nat64;
// One-liner: IspV6 preset = V6Only + NAT64 + BlockInbound
let carrier = lab.add_router("carrier")
.preset(RouterPreset::IspV6)
.build().await?;
let phone = lab.add_device("phone").uplink(carrier.id()).build().await?;
// Reach an IPv4 server via NAT64:
let nat64_addr = embed_v4_in_nat64(server_v4_ip);
let target = SocketAddr::new(IpAddr::V6(nat64_addr), 443);
}
Captive Portal
The device has L3 connectivity but no internet access. HTTP requests redirect to the portal. HTTPS and UDP fail. All connection attempts time out.
#![allow(unused)]
fn main() {
// Isolated router with no upstream (simulates pre-auth portal)
let portal = lab.add_router("portal").build().await?; // no upstream
let device = lab.add_device("victim").uplink(portal.id()).build().await?;
// Assert: all connections fail/timeout
// User "authenticates" - move to real router
device.replug_iface("eth0", real_router.id()).await?;
// Assert: connections now succeed
}
DHCP Renewal (IP Change on Same Network)
The device stays on the same network but its IP address changes. This happens during DHCP lease renewal, cloud instance metadata refresh, or ISP-side reassignment.
#![allow(unused)]
fn main() {
let old_ip = device.ip();
let new_ip = device.renew_ip("eth0").await?;
assert_ne!(old_ip, new_ip);
// Assert: application detects IP change and re-establishes connections
}
Degraded Network Conditions
Progressive degradation
Network conditions worsen over time (moving away from WiFi AP, entering tunnel on cellular, weather affecting satellite).
#![allow(unused)]
fn main() {
device.set_link_condition("eth0", Some(LinkCondition::Wifi)).await?;
tokio::time::sleep(Duration::from_secs(5)).await;
device.set_link_condition("eth0", Some(LinkCondition::WifiBad)).await?;
tokio::time::sleep(Duration::from_secs(5)).await;
device.set_link_condition("eth0", None).await?; // remove impairment
}
Intermittent connectivity
Network flaps briefly, simulating tunnels, elevators, or brief signal loss.
#![allow(unused)]
fn main() {
for _ in 0..3 {
device.link_down("eth0").await?;
tokio::time::sleep(Duration::from_millis(200)).await;
device.link_up("eth0").await?;
tokio::time::sleep(Duration::from_secs(2)).await;
}
// Assert: application recovers after each flap
}
Simulator Primitive Reference
| Real-World Event | Simulator Primitive |
|---|---|
| VPN connects (full tunnel) | device.replug_iface("eth0", vpn_router) |
| VPN disconnects | device.replug_iface("eth0", original_router) |
| VPN kill switch | link_down then replug_iface |
| VPN split tunnel | Two interfaces on different routers + set_default_route |
| WiFi to cellular | replug_iface + change set_link_condition |
| Network goes down briefly | link_down, sleep, link_up |
| Cone NAT | Nat::Home |
| Symmetric NAT | Nat::Corporate |
| Double NAT / CGNAT | Chain routers: home.upstream(cgnat.id()) |
| Corporate UDP block | Firewall::Corporate on router |
| Captive portal | Router with no upstream |
| DHCP renewal | device.renew_ip("eth0") |
| Asymmetric bandwidth | downlink_condition on router + set_link_condition on device |
| Degrading conditions | Sequential set_link_condition calls |
| MTU reduction (VPN) | .mtu(1420) on router or device builder |
| PMTU blackhole | .block_icmp_frag_needed() on router builder |
| IPv6 dual-stack | .ip_support(IpSupport::DualStack) |
| IPv6 only | .ip_support(IpSupport::V6Only) |
| IPv6-only + NAT64 | .preset(RouterPreset::IspV6) or .nat_v6(NatV6Mode::Nat64) |
| Mobile carrier (CGNAT) | .preset(RouterPreset::IspCgnat) |
| Mobile carrier (v6-only) | .preset(RouterPreset::IspV6) |
NAT Hole-Punching
This is an advanced reference for readers who want to understand how patchbay implements NAT traversal at the nftables level. You do not need to read this to use patchbay; the NAT and Firewalls guide covers the user-facing API.
patchbay implements NAT mapping and filtering using nftables. Getting UDP hole-punching to work across different NAT types in Linux network namespaces required solving several problems that are not obvious from the nftables documentation.
RFC 4787: mapping and filtering
Two independent axes define NAT behavior for UDP. Mapping controls how external ports are assigned: endpoint-independent mapping (EIM) reuses the same external port for all destinations, while endpoint-dependent mapping (EDM) assigns a different port per destination. Filtering controls which inbound packets are forwarded to a mapped port: endpoint-independent filtering (EIF) accepts packets from any external host, while endpoint-dependent filtering only forwards replies from hosts the internal client has already contacted.
Combined, these axes produce the real-world NAT profiles that patchbay simulates:
| Preset | Mapping | Filtering | Hole-punch? | Real-world examples |
|---|---|---|---|---|
| Nat::Home | EIM | APDF | Yes, simultaneous open | FritzBox, Unifi, TP-Link, ASUS RT, OpenWRT |
| Nat::FullCone | EIM | EIF | Always | Old FritzBox firmware, some CGNAT |
| Nat::Corporate | EDM | APDF | Never (need relay) | Cisco ASA, Palo Alto, Fortinet, Juniper SRX |
| Nat::CloudNat | EDM | APDF | Never (need relay) | AWS/Azure/GCP NAT Gateway |
| Nat::Cgnat | – | – | Varies | ISP-level, stacks with home NAT |
The fullcone dynamic map
The only reliable way to get endpoint-independent mapping in nftables is to
explicitly track port mappings in a dynamic map. The kernel’s built-in
snat and masquerade statements do not preserve ports across independent
conntrack entries, even when there is no port conflict (see the pitfalls
section below). patchbay works around this with an @fullcone map:
table ip nat {
map fullcone {
type inet_service : ipv4_addr . inet_service
flags dynamic,timeout
timeout 300s
size 65536
}
chain prerouting {
type nat hook prerouting priority dstnat; policy accept;
iif "ix" meta l4proto udp dnat to udp dport map @fullcone
}
chain postrouting {
type nat hook postrouting priority srcnat; policy accept;
oif "ix" meta l4proto udp update @fullcone {
udp sport timeout 300s : ip saddr . udp sport
}
oif "ix" snat to <wan_ip>
}
}
The postrouting chain records the pre-SNAT source address and port in the
map before the snat rule executes. The map key is the UDP source port
and the value is internal_ip . internal_port. Even if snat later
remaps the port, the map holds the correct mapping keyed by the original
port. On the inbound side, the prerouting chain looks up incoming UDP
packets by destination port in the map and DNATs them to the internal
host, bypassing conntrack reverse-NAT entirely.
The update statement must come before snat in the postrouting chain.
nftables NAT statements record the transformation, but the conntrack
entry’s reply tuple is not yet available during the same chain evaluation.
By recording udp sport and ip saddr before SNAT, we capture the
original tuple. Map entries time out after 300 seconds and are refreshed
by outbound traffic.
Filtering modes
Endpoint-independent filtering (fullcone)
Nat::FullCone uses the fullcone map above with no additional filtering.
The prerouting DNAT fires for any inbound packet whose destination port
appears in the map, regardless of source address. Once an internal device
sends one outbound packet, any external host can reach it on the mapped
port.
Address-and-port-dependent filtering (home NAT)
Nat::Home uses the same fullcone map for endpoint-independent mapping,
plus a forward filter that restricts inbound traffic to established
connections:
table ip filter {
chain forward {
type filter hook forward priority 0; policy accept;
iif "ix" ct state established,related accept
iif "ix" drop
}
}
This combination is what makes hole-punching work with home NATs. The sequence is:
1. The internal device sends a UDP packet to the peer. Postrouting SNAT creates a conntrack entry and the fullcone map records the port mapping.
2. The peer sends a packet to the device’s mapped address. Prerouting DNAT via the fullcone map rewrites the destination from the router’s WAN IP to the device’s internal IP.
3. After DNAT, the packet’s 5-tuple matches the reply direction of the outbound conntrack entry from step 1. Conntrack marks it as ct state established.
4. The forward filter allows the packet through.
An unsolicited packet from an unknown host also gets DNATed in step 2, but
no matching outbound conntrack entry exists, so the packet arrives with
ct state new and the filter drops it.
Endpoint-dependent mapping (corporate and cloud NAT)
Nat::Corporate and Nat::CloudNat use plain masquerade random
without a fullcone map:
table ip nat {
chain postrouting {
type nat hook postrouting priority 100;
oif "ix" masquerade random
}
}
The random flag randomizes the source port for each conntrack entry.
Without a fullcone map and without a prerouting chain, hole-punching is
impossible because the peer cannot predict the mapped port from a STUN
probe.
nftables pitfalls
Port preservation is unreliable
The single biggest surprise during implementation. Conventional wisdom
says snat to <ip> without a port range is “port-preserving”. In
practice, Linux conntrack assigns different external ports for different
conntrack entries from the same source socket, even when there is no port
conflict.
For example: a device binds port 40000, sends to a STUN server (port preserved to 40000), then sends to a peer. Conntrack assigns port 27028 instead of 40000, despite the absence of any conflict on that port.
None of the following fix this:
oif "ix" snat to 203.0.113.11 # port NOT preserved across entries
oif "ix" snat to 203.0.113.11 persistent # still remaps
oif "ix" masquerade persistent # still remaps
The persistent flag is documented to “give a client the same
source-ip,source-port”, but the kernel’s NAT tuple uniqueness check still
triggers port reallocation across independent conntrack entries. This is
why the fullcone dynamic map is necessary for endpoint-independent mapping.
A prerouting nat chain is required even if empty
Without a type nat hook prerouting chain registered in the nat table,
the kernel does not perform conntrack reverse-NAT lookup on inbound
packets. Packets destined for the router’s WAN IP that should be
reverse-DNATed are delivered to the router’s INPUT chain instead of being
forwarded to the internal device.
Conntrack reverse-NAT depends on port consistency
Even with a prerouting chain, conntrack reverse-NAT only works when the inbound packet’s 5-tuple matches the reply tuple of an existing conntrack entry. If SNAT changed the port (which it does, as described above), the peer sends to the wrong port and conntrack cannot match the entry.
Test helper subtlety
Both sides of a hole-punch test call holepunch_send_recv, which sends
UDP probes every 200ms and checks for a response. There is a critical
ordering issue: when one side receives a probe first, it must send a few
more packets before returning. Otherwise, side A receives side B’s probe,
returns success, and stops sending. But side B’s early probes may have
arrived before side A created its outbound conntrack entry at side B’s
NAT, so those probes were dropped by APDF filtering. With side A no
longer sending, side B never receives a packet.
The fix is to send three additional “ack” packets after receiving, to ensure the peer’s NAT has an established conntrack entry in both directions.
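The described behavior can be sketched with plain std::net sockets. This is illustrative; patchbay's actual holepunch_send_recv helper may differ in signature and details.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Send probes every ~200 ms until a packet arrives from the peer, then
/// send a few extra "ack" packets before returning so the peer's NAT has
/// traffic in both directions. Illustrative sketch of the ordering fix.
fn holepunch_send_recv(sock: &UdpSocket, peer: &str) -> io::Result<bool> {
    // The read timeout doubles as the probe pacing interval.
    sock.set_read_timeout(Some(Duration::from_millis(200)))?;
    let mut buf = [0u8; 64];
    for _ in 0..25 {
        sock.send_to(b"probe", peer)?;
        match sock.recv_from(&mut buf) {
            Ok(_) => {
                // Keep sending briefly: our earlier probes may have been
                // dropped by the peer NAT's APDF filter before our outbound
                // conntrack entry existed there, so the peer may not have
                // heard from us yet.
                for _ in 0..3 {
                    sock.send_to(b"ack", peer)?;
                }
                return Ok(true);
            }
            Err(e) if matches!(
                e.kind(),
                io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut
            ) => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(false)
}
```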
NatConfig architecture
The Nat enum provides named presets. Each preset expands via
Nat::to_config() to a NatConfig struct that drives rule generation:
#![allow(unused)]
fn main() {
pub struct NatConfig {
pub mapping: NatMapping, // EIM or EDM
pub filtering: NatFiltering, // EIF or APDF
pub timeouts: ConntrackTimeouts, // udp, udp_stream, tcp_established
}
}
The generate_nat_rules() function in core.rs builds nftables rules
from NatConfig alone, without matching on Nat variants. This means
users can either use the named presets (router.nat(Nat::Home)) or build
custom configurations with arbitrary mapping and filtering combinations.
CGNAT is a special case: Nat::Cgnat is applied at the ISP router level
via apply_isp_cgnat() rather than through NatConfig. It uses plain
masquerade (without the random flag) on the IX-facing interface and
stacks with the downstream home router’s NAT.
NPTv6 implementation notes
NPTv6 (Network Prefix Translation for IPv6) translates source and
destination prefixes while preserving the host part, using nftables
snat prefix to and dnat prefix to. Several issues were found during
implementation:
1. Prefix length mismatch breaks translation. NPTv6 requires matching prefix lengths on LAN and WAN sides. The nptv6_wan_prefix() function derives a unique /64 from the router’s IX address.
2. Unrestricted dnat prefix breaks NDP. Without an address match clause, NDP and ICMPv6 packets get translated, making the router unreachable. The rules are restricted to ip6 saddr/daddr matching the WAN or LAN prefix.
3. WAN prefix must be outside the IX on-link range. The IX CIDR was changed from /32 to /64 so WAN prefixes are off-link and routed via the gateway.
4. Return routes needed for private v6 downstreams. IPv6 return routes are added for all IX-level routers regardless of downstream pool configuration.
See IPv6 Deployments for the full IPv6 deployment reference.
Limitations
The fullcone map tracks UDP only. TCP hole-punching (simultaneous SYN) relies on plain conntrack, which matches real-world behavior where TCP hole-punching is unreliable.
There is also a port preservation assumption in the map: if snat to <ip>
remaps the source port, the fullcone map key (the original port) differs
from the actual mapped port. In practice this does not happen in patchbay
simulations because there are few concurrent flows relative to the 64k
port space.
Future work
- Address-restricted cone (EIM + address-dependent filtering): extend the fullcone map to track contacted remote IPs.
- Hairpin NAT: add a prerouting rule for LAN packets addressed to the router’s own WAN IP.
- TCP fullcone: extend @fullcone to TCP for a complete NAT model.
- Port-conflict-safe fullcone: two-stage postrouting to read ct reply proto-dst after conntrack finalizes the mapping.
Sim TOML Reference
A simulation is defined by one TOML file. That file describes what topology to use, what binaries to run, and the sequence of steps to execute. This page covers every field.
File layout
[[extends]] # optional: inherit from a shared defaults file
file = "..."
[matrix] # optional: generate multiple sims via Cartesian product
topo = ["1to1", "1to3"]
[sim] # simulation metadata
name = "..."
topology = "..."
[[binary]] # optional: binary definitions (repeatable)
...
[[prepare]] # optional: prebuild configuration (repeatable)
...
[[step-template]] # optional: reusable single-step templates (repeatable)
...
[[step-group]] # optional: reusable multi-step groups (repeatable)
...
[[step]] # the actual steps to execute (repeatable)
...
Inline topology tables ([[router]], [device.*], [region.*]) can also
appear directly in the sim file instead of referencing an external topology.
An optional [matrix] table generates multiple simulations from one file via
Cartesian product expansion. See [matrix] below.
[matrix]
Defines axes whose Cartesian product generates multiple simulation variants
from a single TOML file. Each axis is an array of string values. Placeholders
of the form ${matrix.<key>} in string values throughout the file are
replaced with the corresponding axis value for each variant.
Files without a [matrix] table produce exactly one simulation, unchanged.
Basic usage
[matrix]
topo = ["1to1", "1to3", "1to5"]
[sim]
name = "iroh-${matrix.topo}-baseline"
topology = "${matrix.topo}-public"
This produces three simulations. ${matrix.topo} is replaced with each value
in order.
Multi-axis expansion
Multiple axes produce the Cartesian product of all values:
[matrix]
topo = ["1to1", "1to3"]
cond = ["baseline", "impaired"]
This produces four simulations (2 x 2). Each combination of topo and cond
generates one variant.
Params
When a matrix axis needs more than one substitution value per variant, use
[matrix.params.<axis>]. Each key in the params table corresponds to an axis
value and maps to a table of additional placeholder values:
[matrix]
cond = ["baseline", "impaired"]
[matrix.params.cond]
baseline = { latency = "0", rate = "0", impaired = "false" }
impaired = { latency = "200", rate = "4000", impaired = "true" }
When cond = "impaired", the placeholders resolve as follows:
${matrix.cond} becomes impaired, ${matrix.latency} becomes 200,
${matrix.rate} becomes 4000, and ${matrix.impaired} becomes true.
Param keys are flattened into the ${matrix.*} namespace alongside the axis
value itself.
All param values are strings. Fields that expect numbers (like latency_ms in
link conditions) accept both native TOML numbers and string representations,
so latency_ms = "200" and latency_ms = 200 are equivalent.
Conditional steps with when
Steps can include a when field to conditionally skip execution based on a
matrix variable. A step with when = "false" is skipped; any other value
(or no when field) means the step runs normally:
[[step]]
when = "${matrix.impaired}"
action = "set-link-condition"
device = "fetcher"
condition = { latency_ms = "${matrix.latency}", rate_kbit = "${matrix.rate}" }
In the baseline variant (impaired = "false"), this step is skipped. In the
impaired variant (impaired = "true"), it runs and applies the condition.
Interaction with extends
Matrix expansion runs after extends are loaded. An [[extends]] file can
contribute templates, groups, and binaries; the [matrix] table in the main
sim file then expands the merged result.
[[extends]]
Pulls in definitions from another TOML file. The loaded file can contribute
[[binary]], [[prepare]], [[step-template]], and [[step-group]] entries.
The sim file’s own declarations always win on name collision. Multiple
[[extends]] blocks are supported and processed in order.
| Key | Type | Description |
|---|---|---|
file | string | Path to the shared file. Searched relative to the sim file, then one directory up, then the working directory. |
Example:
[[extends]]
file = "iroh-defaults.toml"
[sim]
| Key | Type | Description |
|---|---|---|
name | string | Identifier used in output filenames and the report header. |
topology | string | Name of a topology file to load from the topos/ directory next to the sim file. Overrides any topology from [[extends]]. |
[[binary]]
Declares a named binary that steps can reference as ${binary.<name>}. Exactly
one source field is required (or mode can be set explicitly).
| Key | Type | Description |
|---|---|---|
name | string | Reference key. Used as ${binary.relay}, ${binary.transfer}, etc. |
mode | string | Source mode: "path", "fetch", "build", or "target". Inferred from other fields when omitted. |
path | string | Local path to a prebuilt binary or source directory. Prefix target: to resolve relative to the Cargo target directory. |
url | string | Download URL. Supports .tar.gz archives; the binary is extracted automatically. |
repo | string | Git repository URL. Must pair with example or bin. |
commit | string | Branch, tag, or SHA for repo source. Defaults to "main". |
example | string | Build with cargo --example <name>. Works with repo (build mode) or mode = "target". |
bin | string | Build with cargo --bin <name>. Works with repo (build mode) or mode = "target". |
features | array | Cargo feature list to enable when building. |
all-features | boolean | Build with --all-features. |
Mode inference: When mode is omitted, it is inferred: path → "path";
url → "fetch"; repo, example, or bin → "build". Use
mode = "target" explicitly to reference a pre-built artifact in the Cargo target
directory by example or bin name (skips building).
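The inference rules can be sketched with one entry per mode (the names, URL, and repo below are placeholders, not real artifacts):

```toml
# mode inferred as "path" (target: prefix resolves against the Cargo target dir)
[[binary]]
name = "relay"
path = "target:release/examples/relay"

# mode inferred as "fetch" (URL is hypothetical; .tar.gz is extracted automatically)
[[binary]]
name = "iperf"
url = "https://example.com/iperf3.tar.gz"

# mode inferred as "build" (repo and example are hypothetical)
[[binary]]
name = "transfer"
repo = "https://example.com/iroh.git"
commit = "main"
example = "transfer"

# explicit "target": reference an already-built example without building
[[binary]]
name = "transfer-prebuilt"
mode = "target"
example = "transfer"
```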
[[prepare]]
Declares binaries to prebuild from the project workspace before execution. Multiple entries are supported; each produces release-mode artifacts.
| Key | Type | Description |
|---|---|---|
mode | string | Preparation mode. Currently only "build" (the default). |
examples | array | Example names to build with cargo build --example. |
bins | array | Binary names to build with cargo build --bin. |
features | array | Cargo feature list to enable. |
all-features | boolean | Build with --all-features. |
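A minimal sketch of a prepare entry (the example and binary names are placeholders for workspace artifacts):

```toml
[[prepare]]
examples = ["transfer", "relay"]   # cargo build --example, release mode
bins = ["iperf-shim"]              # cargo build --bin, release mode
features = ["test-utils"]          # hypothetical feature name
```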
[[step-template]]
A named, reusable step definition. Contains the same fields as a [[step]]
plus a name. Referenced with use = "<name>" in a step; the call-site fields
are merged on top before the step executes.
[[step-template]]
name = "transfer-fetcher"
action = "spawn"
parser = "ndjson"
cmd = ["${binary.transfer}", "--output", "json", "fetch"]
[step-template.captures.size]
match = { kind = "DownloadComplete" }
pick = ".size"
[step-template.results]
down_bytes = ".size"
Call site:
[[step]]
use = "transfer-fetcher"
id = "fetcher"
device = "fetcher"
args = ["${provider.endpoint_id}"]
The call site’s id, device, timeout, args, env, requires,
captures, and results fields are merged into the template. args is
appended to the template’s cmd. env is merged (call site wins on
collision). captures is merged (call site wins). results replaces entirely
if supplied.
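Applying those merge rules to the template and call site above, the effective step would look roughly like this (a sketch of the merge, not literal runner output; note how `args` is appended to `cmd` and the `.size` shorthand is rewritten with the call site's `id`):

```toml
[[step]]
action = "spawn"
id = "fetcher"
device = "fetcher"
parser = "ndjson"
cmd = ["${binary.transfer}", "--output", "json", "fetch", "${provider.endpoint_id}"]

[step.captures.size]
match = { kind = "DownloadComplete" }
pick = ".size"

[step.results]
down_bytes = "fetcher.size"
```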
[[step-group]]
A named sequence of steps that expands inline wherever use = "<group-name>"
appears. Groups support variable substitution for parameterization.
| Key | Type | Description |
|---|---|---|
name | string | Group identifier. |
[[step-group.step]] | array | Ordered step definitions. |
The call site uses a [[step]] with use and vars:
[[step]]
use = "relay-setup"
vars = { device = "relay" }
Inside group steps, ${group.<key>} is substituted with the caller-supplied
value before the steps execute. This substitution happens at expansion time
(before runtime), so a two-stage pattern is used for nested references:
# In the group step:
content = "cert_path = \"${${group.device}-cert.cert_pem_path}\""
# After group expansion (e.g. device="relay"):
# -> cert_path = "${relay-cert.cert_pem_path}"
# Then resolved at runtime as a capture reference.
Group steps can themselves use use = "<step-template-name>" to inherit from
a template. Groups cannot nest other groups.
[[step]]
Common fields
These fields apply to most or all step types.
| Key | Type | Description |
|---|---|---|
action | string | Step type. See the sections below for valid values. Defaults to "run" when cmd is present. |
id | string | Step identifier. Required for spawn, gen-certs, gen-file. Referenced as ${id.capture_name} in later steps. |
use | string | Template or group name. When referencing a group, only vars is used from this entry; all other fields come from the group. |
vars | table | Group substitution variables. Only meaningful when use references a [[step-group]]. |
device | string | Name of the network namespace to run the command in. |
env | table | Extra environment variables, merged with any template env. |
requires | array of strings | Capture keys to wait for before this step starts. Format: "step_id.capture_name". Blocks until all are resolved. |
when | string | Conditional guard. If "false", the step is skipped. Any other value or absent means run. Typically set via ${matrix.*} substitution. |
Counted device expansion
When a step targets a device that has count > 1 in the topology, the step is
automatically expanded into N copies. Each copy’s device and id fields are
suffixed with -0, -1, …, -N-1. For example, a step with device = "peer"
against a topology with [device.peer] count = 3 produces three steps targeting
peer-0, peer-1, and peer-2.
wait-for steps are similarly expanded when their id matches a counted device
name.
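The expansion described above can be sketched end to end (the `dc` router and `provider` device are assumed from the topology examples elsewhere in this page):

```toml
# Topology: three peers behind the same router
[device.peer]
count = 3

[device.peer.eth0]
gateway = "dc"

# Sim: one step definition...
[[step]]
action = "run"
id = "ping"
device = "peer"
cmd = ["ping", "-c", "1", "$NETSIM_IP_provider"]

# ...is expanded into three steps, with device and id suffixed:
#   peer-0 / ping-0, peer-1 / ping-1, peer-2 / ping-2
```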
action = "run"
Runs a command and waits for it to exit before moving to the next step.
| Key | Type | Default | Description |
|---|---|---|---|
cmd | array | required | Command and arguments. Supports ${binary.<n>}, $NETSIM_IP_<device>, ${id.capture}. |
args | array | none | Appended to the template’s cmd. Does not replace it. |
parser | string | "text" | Output parser. See parsers. |
captures | table | none | Named captures. See [captures]. |
results | table | none | Normalized result fields. See [results]. |
action = "spawn"
Starts a process in the background. A later wait-for step waits for it to exit.
| Key | Type | Default | Description |
|---|---|---|---|
cmd | array | required | Command and arguments. |
args | array | none | Appended to the template’s cmd. |
parser | string | "text" | Output parser. See parsers. |
ready_after | duration | none | How long to wait after spawning before the next step runs. Useful when a process needs startup time but doesn’t print a ready signal. |
captures | table | none | Named captures. See [captures]. |
results | table | none | Normalized result fields. Collected when the process exits. |
action = "wait-for"
Waits for a spawned process to exit. Collects its captures and results.
| Key | Type | Default | Description |
|---|---|---|---|
id | string | required | ID of a previously spawned step. |
timeout | duration | "300s" | How long to wait before failing. |
action = "wait"
Sleeps for a fixed duration.
| Key | Type | Description |
|---|---|---|
duration | duration | Required. How long to sleep. |
action = "set-link-condition" (alias "set-impair")
Applies link impairment (rate limit, loss, latency) to a device interface using
tc netem and tc tbf. Pass null / omit condition to clear impairment.
| Key | Type | Description |
|---|---|---|
device | string | Target device. |
interface | string | Interface name (e.g. "eth0"). Defaults to the device’s first interface. |
condition | string or table | Preset name or a custom table. See link conditions. Alias: impair. |
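For example, a preset can be applied mid-run and later cleared by omitting the `condition` field:

```toml
# Degrade the fetcher's link to the mobile-3g preset
[[step]]
action = "set-link-condition"
device = "fetcher"
condition = "mobile-3g"

# Later: clear all impairment (no condition field)
[[step]]
action = "set-link-condition"
device = "fetcher"
```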
action = "link-down" / action = "link-up"
Brings a device interface up or down.
| Key | Type | Description |
|---|---|---|
device | string | Target device. |
interface | string | Interface name. |
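A common pattern is a link flap: take the interface down, hold it, then bring it back:

```toml
[[step]]
action = "link-down"
device = "fetcher"
interface = "eth0"

[[step]]
action = "wait"
duration = "5s"

[[step]]
action = "link-up"
device = "fetcher"
interface = "eth0"
```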
action = "set-default-route"
Switches the default route on a device to a given interface. Useful for simulating path changes.
| Key | Type | Description |
|---|---|---|
device | string | Target device. |
to | string | Interface to set as the new default route. |
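As a sketch, a handoff from one uplink to another might look like this (the `phone` device and its two interfaces are hypothetical and would need to exist in the topology):

```toml
# Switch the phone's default route from its first interface to eth1
[[step]]
action = "set-default-route"
device = "phone"
to = "eth1"
```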
action = "gen-certs"
Generates a self-signed TLS certificate and key using rcgen. The outputs are
written to {work_dir}/certs/{id}/ and also stored as captures.
| Key | Type | Default | Description |
|---|---|---|---|
id | string | required | Step ID, prefixes the output captures. |
device | string | none | Device whose IP is automatically added to the Subject Alternative Names. |
cn | string | "patchbay" | Certificate Common Name. |
san | array of strings | [device_ip] | Additional SANs. $NETSIM_IP_<device> variables are expanded. Values that parse as IP addresses become IP SANs; others become DNS SANs. |
Output captures: {id}.cert_pem_path, {id}.key_pem_path.
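A sketch of a gen-certs step (the `cn` value is a placeholder; the `relay` device is assumed to exist):

```toml
[[step]]
action = "gen-certs"
id = "relay-cert"
device = "relay"                           # relay's IP is added to the SANs
cn = "relay.local"                         # hypothetical Common Name
san = ["$NETSIM_IP_relay", "relay.local"]  # IP values become IP SANs, others DNS SANs

# Later steps can reference:
#   ${relay-cert.cert_pem_path}
#   ${relay-cert.key_pem_path}
```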
action = "gen-file"
Writes an interpolated string to disk and records the path as a capture. Useful for generating config files that reference captures from earlier steps.
| Key | Type | Description |
|---|---|---|
id | string | Required. |
content | string | Required. ${...} tokens are interpolated; blocks on unresolved capture references. |
Output capture: {id}.path.
The file is written to {work_dir}/files/{id}/content.
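A sketch of generating a config file from an earlier gen-certs step (the config keys and the `relay-cert` step ID are illustrative assumptions):

```toml
[[step]]
action = "gen-file"
id = "relay-config"
content = """
cert_path = "${relay-cert.cert_pem_path}"
key_path = "${relay-cert.key_pem_path}"
"""

# A later step can pass ${relay-config.path} as a command-line argument.
```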
action = "assert"
Checks one or more assertion expressions. All must pass; the sim fails on the first that doesn’t.
| Key | Type | Description |
|---|---|---|
check | string | Single assertion expression. |
checks | array of strings | Multiple expressions; equivalent to multiple check fields. |
Expression syntax:
step_id.capture_name operator rhs
The LHS must be a capture key in the form step_id.capture_name. The value
used is the most recent one recorded for that capture.
| Operator | Passes when |
|---|---|
== rhs | Exact string match. |
!= rhs | Not an exact match. |
contains rhs | rhs is a substring of the capture value. |
matches rhs | rhs is a Rust regex that matches the capture value. |
>= rhs | Both sides parsed as numbers; LHS is greater or equal. |
Examples:
[[step]]
action = "assert"
checks = [
"fetcher.conn_type contains Direct",
"fetcher.size matches [0-9]+",
"iperf-run.bps != 0",
"ping-check.avg_rtt >= 50",
]
Parsers
Set on run or spawn steps with parser = "...".
| Value | When it fires | What it can do |
|---|---|---|
"text" | Streaming, per line | regex captures only. |
"ndjson" | Streaming, per line | regex captures, plus match/pick on JSON lines. |
"json" | After process exits | pick on the single JSON document. No per-line matching. |
[captures]
Defined as sub-tables of a run or spawn step:
[[step]]
action = "run"
id = "iperf"
parser = "json"
cmd = ["iperf3", "-J", ...]
[step.captures.bytes]
pick = ".end.sum_received.bytes"
[step.captures.seconds]
pick = ".end.sum_received.seconds"
Or on a template:
[[step-template]]
name = "transfer-provider"
...
[step-template.captures.endpoint_id]
match = { kind = "EndpointBound" }
pick = ".endpoint_id"
| Key | Type | Default | Description |
|---|---|---|---|
pipe | string | "stdout" | Which output stream to read: "stdout" or "stderr". |
regex | string | none | Regex applied to the raw text line. Group 1 is captured if present, otherwise the full match. Works with all parsers. |
match | table | none | Key=value guards on a parsed JSON object. All keys must match. Requires pick. Only valid with "ndjson" or "json" parser. |
pick | string | none | Dot-path into the parsed JSON value, e.g. ".endpoint_id" or ".end.sum_received.bytes". Requires "ndjson" or "json" parser. |
With "ndjson", every matching line updates the capture value. With "json",
the capture is set once from the parsed document. With "text", only regex
matching is available.
The latest value is available for interpolation as ${step_id.capture_name}.
[results]
Maps well-known output fields to capture references, so the report and UI can show normalized throughput and latency comparisons across steps and runs.
[step.results]
duration = "iperf-run.seconds"
down_bytes = "iperf-run.bytes"
latency_ms = "ping-check.avg_rtt"
Inside a [[step-template]], the shorthand .capture_name (leading dot, no
step ID) refers to the template step’s own captures. It gets rewritten to
{id}.capture_name when the template is expanded:
[step-template.results]
duration = ".duration" # becomes "fetcher.duration" when id="fetcher"
down_bytes = ".size"
| Field | Type | Description |
|---|---|---|
duration | string | Capture key for the duration of the transfer or test (microseconds as integer, or seconds as float). |
up_bytes | string | Capture key for bytes sent (upload). |
down_bytes | string | Capture key for bytes received (download). |
latency_ms | string | Capture key for round-trip or one-way latency in milliseconds. |
Throughput (down_bytes / duration) is computed in the UI. Unset fields are
omitted from the output.
Link conditions
Used by the set-link-condition step (condition / impair field) and by
device interface impair fields in the topology.
Presets:
| Value | Latency | Jitter | Loss | Rate limit |
|---|---|---|---|---|
"lan" | 0 ms | 0 ms | 0 % | unlimited |
"wifi" | 5 ms | 2 ms | 0.1 % | unlimited |
"wifi-bad" | 40 ms | 15 ms | 2 % | 20 Mbit |
"mobile-4g" | 25 ms | 8 ms | 0.5 % | unlimited |
"mobile-3g" | 100 ms | 30 ms | 2 % | 2 Mbit |
"satellite" | 40 ms | 7 ms | 1 % | unlimited |
"satellite-geo" | 300 ms | 20 ms | 0.5 % | 25 Mbit |
Custom table:
impair = { latency_ms = 100, jitter_ms = 10, loss_pct = 0.5, rate_kbit = 10000 }
All numeric fields also accept string representations (latency_ms = "100" is
equivalent to latency_ms = 100). This enables matrix variable substitution
in link condition tables.
| Field | Type | Default | Description |
|---|---|---|---|
rate_kbit | u32 | 0 | Rate limit in kbit/s (0 = unlimited). |
loss_pct | f32 | 0.0 | Packet loss percentage (0.0–100.0). |
latency_ms | u32 | 0 | One-way latency in milliseconds. |
jitter_ms | u32 | 0 | Jitter in milliseconds (uniform ±jitter around latency). |
reorder_pct | f32 | 0.0 | Packet reordering percentage. |
duplicate_pct | f32 | 0.0 | Packet duplication percentage. |
corrupt_pct | f32 | 0.0 | Bit-error corruption percentage. |
Variable interpolation
Supported in cmd, args, env values, content (gen-file), and san
(gen-certs).
| Pattern | Resolves to |
|---|---|
${binary.<name>} | Resolved filesystem path to the named binary. |
$NETSIM_IP_<DEVICE> | IP address of the device (name uppercased, non-alphanumeric characters replaced with _). |
${step_id.capture_name} | Latest value of the named capture. Blocks until the capture resolves. |
${matrix.<key>} | Matrix axis value or param. Substituted at load time before deserialization. See [matrix]. |
Duration format
Durations are strings of the form "<n>s", "<n>ms", or "<n>m".
Examples: "30s", "500ms", "2m", "300s".
Output files
For each invocation of patchbay run, a timestamped run directory is created
under the work root (default .patchbay-work/):
.patchbay-work/
latest -> sim-YYMMDD-HHMMSS # symlink to most recent run
sim-YYMMDD-HHMMSS/ # run root
manifest.json # run-level metadata and sim summaries
progress.json # live progress (updated during execution)
combined-results.json # aggregated results across all sims
combined-results.md # human-readable combined summary
<sim-name>/ # per-sim subdirectory
sim.json # sim-level summary (status, setup, errors)
results.json # captures and normalized results
results.md # human-readable results table
events.jsonl # lab lifecycle events
nodes/
<device>/
stdout.log
stderr.log
files/ # gen-file outputs
<id>/content
certs/ # gen-certs outputs
<id>/cert.pem
<id>/key.pem
results.json structure:
{
"sim": "iperf-baseline",
"steps": [
{
"id": "iperf-run",
"duration": "10.05",
"down_bytes": "1234567890",
"latency_ms": null,
"up_bytes": null
}
]
}
Topology files
A topology file (in topos/) defines the network graph: routers with optional
NAT, and devices with their interfaces and gateways.
# A datacenter router (no NAT)
[[router]]
name = "dc"
# A home NAT router (endpoint-independent mapping, port-restricted filtering)
[[router]]
name = "lan-client"
nat = "home"
# A device with one interface behind the DC router
[device.server.eth0]
gateway = "dc"
# A device behind the NAT router
[device.client.eth0]
gateway = "lan-client"
# A device with initial link impairment
[device.sender.eth0]
gateway = "dc"
impair = { latency_ms = 100 }
# Multiple devices of the same name (count expansion)
[device.fetcher]
count = 10
[device.fetcher.eth0]
gateway = "dc"
Device interface fields:
| Key | Type | Description |
|---|---|---|
gateway | string | Required. Name of the upstream router. |
impair | string or table | Initial link impairment. Accepts the same values as link conditions. Applied after network setup. |
Device-level fields:
| Key | Type | Default | Description |
|---|---|---|---|
count | integer | 1 | Number of instances. Creates {name}-0 through {name}-{N-1}. Steps targeting the base name are automatically expanded. |
NAT modes:
| Value | Behavior |
|---|---|
| (absent) | No NAT; device has a public IP on the upstream network. |
"home" | EIM+APDF: same external port for all destinations (port-restricted cone). |
"corporate" | EDM+APDF: different port per destination (symmetric NAT). |
"cgnat" | EIM+EIF: carrier-grade NAT, stacks with home NAT. |
"cloud-nat" | EDM+APDF: symmetric NAT with longer timeouts (AWS/Azure/GCP). |
"full-cone" | EIM+EIF: any host can reach the mapped port. |
Region latency can be added to introduce inter-router delays:
[region.us-west]
latencies = { us-east = 80, eu-central = 140 }
Values are one-way latency in milliseconds. Attach a router to a region with
region = "us-west" in the [[router]] table.
Example: minimal iperf sim
[sim]
name = "iperf-baseline"
topology = "1to1-public"
[[step]]
action = "spawn"
id = "iperf-server"
device = "provider"
cmd = ["iperf3", "-s", "-1"]
ready_after = "1s"
[[step]]
action = "run"
id = "iperf-run"
device = "fetcher"
parser = "json"
cmd = ["iperf3", "-c", "$NETSIM_IP_provider", "-t", "10", "-J"]
[step.captures.bytes]
pick = ".end.sum_received.bytes"
[step.captures.seconds]
pick = ".end.sum_received.seconds"
[step.results]
duration = "iperf-run.seconds"
down_bytes = "iperf-run.bytes"
[[step]]
action = "wait-for"
id = "iperf-server"
[[step]]
action = "assert"
checks = [
"iperf-run.bytes matches [0-9]+",
]
Example: ping with latency capture
[sim]
name = "ping-latency"
[[router]]
name = "dc"
[device.sender.eth0]
gateway = "dc"
impair = { latency_ms = 100 }
[device.receiver.eth0]
gateway = "dc"
[[step]]
action = "run"
id = "ping-check"
device = "sender"
cmd = ["ping", "-c", "3", "$NETSIM_IP_receiver"]
parser = "text"
[step.captures.avg_rtt]
pipe = "stdout"
regex = "rtt min/avg/max/mdev = [\\d.]+/([\\d.]+)/"
[step.results]
latency_ms = "ping-check.avg_rtt"
Example: iroh transfer with relay (NAT topology)
This uses templates and a step group defined in iroh-defaults.toml.
[[extends]]
file = "iroh-defaults.toml"
[sim]
name = "iroh-1to1-nat"
topology = "1to1-nat"
# Expands to: gen-certs -> gen-file (relay config) -> spawn relay
[[step]]
use = "relay-setup"
vars = { device = "relay" }
[[step]]
use = "transfer-provider"
id = "provider"
device = "provider"
requires = ["relay.ready"]
args = ["--relay-url", "https://$NETSIM_IP_relay:3340"]
[[step]]
use = "transfer-fetcher"
id = "fetcher"
device = "fetcher"
args = ["${provider.endpoint_id}",
"--relay-url", "https://$NETSIM_IP_relay:3340",
"--remote-relay-url", "https://$NETSIM_IP_relay:3340"]
[[step]]
action = "wait-for"
id = "fetcher"
timeout = "45s"
[[step]]
action = "assert"
checks = [
"fetcher.size matches [0-9]+",
]
The relay-setup group (from iroh-defaults.toml) runs gen-certs, writes a
relay config file with gen-file, and spawns the relay binary. The relay step
captures a ready signal from stderr; provider uses requires = ["relay.ready"]
to block until it fires.