Introducing Sailboxes: Persistent Sandboxes for Long-Horizon Cloud Agents

Nirvik Baruah · Apr 27, 2026 · 4 min read

Today we are unveiling the Sailbox, our purpose-built sandbox product for long-horizon agents. Later in this post, we describe how a swarm of 4 Sailboxes built a wire-compatible clone of Redis in Rust over an uninterrupted 27.5-hour run, at roughly 30% better cost efficiency than other leading sandbox providers.

Why Sailboxes

We believe today's sandboxing products are inadequate for long-horizon background agents for a number of reasons:

Pay for waiting. Multi-turn agents spend the majority of their time waiting on inference results as opposed to doing useful work. Most providers continue to charge you even during these idle periods. We believe you should only ever be charged for active sandbox usage.
Capped lifetimes. Few providers allow you to run sandboxes indefinitely. Background agents are most valuable when they can reliably retain weeks of context at a time.
Brittle by default. Fault tolerance is currently the user's problem. Given enough runtime, every classic distributed-systems issue becomes inevitable: hosts fail, network partitions occur, and commands may be partially executed. Developers are forced to write boilerplate code to handle these failures; we believe a good sandboxing product should abstract them away.

Sailboxes were designed to solve all of these issues. We combine the ergonomics of traditional sandboxing APIs with the durable-execution semantics of systems like Temporal. This fault-tolerant architecture allows us to charge you only for active sandbox usage by pausing your VM when it is idle, and guarantees that your cloud agent will be available for any period of time.

How Sailboxes Work

Sailboxes provide durable execution by aggressively snapshotting all VM state and, after a failure, replaying any commands that were executed since the latest snapshot. The resulting at-least-once execution semantics are safe for any deterministic operation within a sandbox, and non-deterministic operations can be made safe by manually triggering a checkpoint immediately after the operation. At the network level, we proxy all inbound network traffic and provide the ability to proxy outbound network traffic so that stateful operations can have exactly-once execution semantics.
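
The snapshot-and-replay idea can be sketched with a toy model. This is a minimal illustration under stated assumptions, not the Sailbox implementation: the class ToyDurableBox, its methods, and the helper functions are all hypothetical names, and the "VM state" is just a dictionary.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, int]
Command = Callable[[State], None]


@dataclass
class ToyDurableBox:
    """Hypothetical model of snapshot-plus-replay recovery, for illustration only."""

    state: State = field(default_factory=dict)
    snapshot: State = field(default_factory=dict)
    log: List[Command] = field(default_factory=list)  # commands run since the last snapshot

    def exec(self, command: Command) -> None:
        self.log.append(command)  # record first so the command can be replayed
        command(self.state)

    def checkpoint(self) -> None:
        self.snapshot = copy.deepcopy(self.state)
        self.log.clear()  # nothing before this point is ever replayed

    def crash_and_recover(self) -> None:
        # Restore the last snapshot, then re-run everything executed after it.
        self.state = copy.deepcopy(self.snapshot)
        for command in self.log:
            command(self.state)


def set_x(state: State) -> None:
    state["x"] = 1  # deterministic: replaying gives the same result


def make_token_command() -> Command:
    values = iter([7, 3])  # stands in for a random or time-dependent value

    def command(state: State) -> None:
        state["token"] = next(values)  # non-deterministic: differs on each run

    return command


def run(command: Command, checkpoint_after: bool) -> State:
    box = ToyDurableBox()
    box.exec(command)
    if checkpoint_after:
        box.checkpoint()
    box.crash_and_recover()
    return box.state


deterministic = run(set_x, checkpoint_after=False)
unsafe = run(make_token_command(), checkpoint_after=False)
safe = run(make_token_command(), checkpoint_after=True)
print(deterministic, unsafe, safe)
```

In this toy model the deterministic command survives a replay unchanged, the non-deterministic command ends up with a different value after recovery, and checkpointing right after it pins the original value because the command is no longer in the replay log.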

Our work at both the VM and network layers allows us to pause and resume any sandboxed workload, even agent harnesses that depend on HTTP servers during long-horizon tasks. Additionally, users can configure idle Sailboxes to sleep after a period of inactivity and to resume automatically within seconds when they receive network traffic. We leverage this to ensure that users are only charged for the time when they are actually using a sandbox.
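
To make the billing effect of sleep-on-idle concrete, here is a rough sketch of how awake time could be tallied for a box that wakes on traffic and sleeps after a quiet period. The function name, the idle_timeout parameter, and the model itself are assumptions for illustration; resume latency and checkpoint time are ignored.

```python
from typing import Iterable


def billed_active_seconds(
    traffic_times: Iterable[float], idle_timeout: float, end_time: float
) -> float:
    """Seconds a box is awake if it wakes on traffic and sleeps after idle_timeout of quiet.

    Hypothetical model: resume latency and checkpoint time are ignored.
    """
    billed = 0.0
    window_start = None  # when the current awake window began
    awake_until = None   # when the box would fall asleep absent new traffic
    for t in sorted(x for x in traffic_times if 0 <= x < end_time):
        if awake_until is None or t > awake_until:
            if awake_until is not None:
                billed += awake_until - window_start  # close the previous window
            window_start = t  # traffic arrived while asleep: wake up
        awake_until = t + idle_timeout  # any traffic resets the idle timer
    if awake_until is not None:
        billed += min(awake_until, end_time) - window_start
    return billed


# Requests at t=0s, 10s, and 100s with a 30s idle timeout over a 200s period.
example = billed_active_seconds([0, 10, 100], idle_timeout=30, end_time=200)
print(example)  # 70.0
```

In this example the box is awake from 0s to 40s and from 100s to 130s, so only 70 of the 200 seconds would count as active.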

import sail

# Starts a sailbox. Charges will begin once this function returns.
sb = sail.Sailbox.create(image=sail.Image.debian_arm)
sb.exec("echo Hello!")

# Performs a network request. Since blocking=True, we will put your sailbox
# to sleep for the duration of the network call and will automatically resume
# once a response is received. You will not be charged during this time.
sb.request("POST", "...", blocking=True)

# Checkpoints the internal state of the sailbox. Once this function returns, we
# guarantee that we will never replay any commands from before this point.
# Checkpoints take ~1s, and you are never charged while one is in progress.
sb.checkpoint()

# Starts a daemon for a long-horizon workflow on a Sailbox that listens on port 3000.
sb = sail.Sailbox.create_daemon(
    "python3 -m http.server 3000",
    image=sail.Image.debian_arm,
    ingress_ports=[3000],
)

# Get the public URL of the sailbox.
# e.g. https://sb-...sailresearch.com/
sb.listener(3000).url

# Puts the Sailbox to sleep. It will be automatically resumed when it receives
# incoming network traffic. You will not be charged while it is asleep.
sb.sleep()

# Freezes the state of a VM. Can be resumed again at any point in the future.
sb.pause()

sb.terminate()
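
One way to apply the checkpoint-after-a-non-deterministic-operation pattern is a small wrapper around the two calls shown above. This is a sketch, assuming only that a Sailbox object exposes exec and checkpoint as in the snippet; the helper exec_then_checkpoint and the FakeSailbox stand-in are hypothetical and exist only so the example runs without the SDK.

```python
from typing import Any, List


def exec_then_checkpoint(sb: Any, command: str) -> Any:
    """Run a command, then checkpoint so it is not replayed after this call returns."""
    result = sb.exec(command)
    sb.checkpoint()
    return result


class FakeSailbox:
    """Hypothetical stand-in that only records calls, so the sketch runs without the SDK."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def exec(self, command: str) -> str:
        self.calls.append(f"exec:{command}")
        return "ok"

    def checkpoint(self) -> None:
        self.calls.append("checkpoint")


fake = FakeSailbox()
exec_then_checkpoint(fake, "date +%s > started_at.txt")
print(fake.calls)
```

In this sketch, a failure between the two calls could still lead to a replay of the command, which is why the checkpoint is issued immediately after it.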

Pricing

Sailboxes only ever charge for active time. We never charge for events such as cold starts, and we provide first-class support for temporarily quiescing sandboxes during blocking operations, such as waiting for the result of an inference request. Over the course of a long-horizon, multi-turn agent run, this makes us more cost-efficient than any other provider.

Sailbox pricing

Active vCPU/hour	Active RAM (GB)/hour	Active disk (GB)/hour
$0.075	$0.015	$0.00012
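
As a quick worked example, the rates above can be combined into a per-hour price for a given configuration. The sketch assumes the three charges are simply additive; the function name and constants are illustrative, not part of any SDK.

```python
# Rates copied from the pricing table, in USD per active hour.
VCPU_RATE = 0.075    # per vCPU
RAM_RATE = 0.015     # per GB of RAM
DISK_RATE = 0.00012  # per GB of disk


def active_hourly_cost(vcpus: float, ram_gb: float, disk_gb: float) -> float:
    """Cost of one active hour, assuming the three charges simply add up."""
    return vcpus * VCPU_RATE + ram_gb * RAM_RATE + disk_gb * DISK_RATE


# A Sailbox with 1 vCPU, 2GB of RAM, and a 16GB disk.
rate = active_hourly_cost(vcpus=1, ram_gb=2, disk_gb=16)
print(f"${rate:.5f} per active hour")
```

Under that assumption, a 1 vCPU, 2GB RAM, 16GB disk Sailbox works out to roughly $0.107 per active hour.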

Sailboxes in Action: A 110-Hour Coding Agent Swarm

To demonstrate Sailboxes, we tasked a swarm of 4 Sailboxes, each with 1 vCPU, 2GB of RAM, and a 16GB disk, with creating a wire-compatible clone of Redis in Rust, given just a set of test cases and performance benchmarks. The Sailboxes ran OpenCode using GLM 5.0 on Sail Inference and maintained persistent sessions, freezing their state automatically during idle periods such as inference requests. Even though our Sailboxes were alive for a cumulative ~110 hours (27.5 hours each), they used only about 64 vCPU-hours of active time in total and were charged accordingly. Paying only for active Sailbox usage made this workflow significantly more cost-efficient than on other providers:

Workload usage

Metric	Idle	Active
CPU hours (vCPU-hours)	45.89	64.11
Memory hours (GB-hours)	91.78	128.22
Disk hours (GB-hours)	590.24	1025.76

Cost of running workload

Sandbox Provider	Estimated cost
Daytona	~$9.30
Modal	N/A (Modal sandboxes cannot run for longer than 24 hours)
Vercel	N/A (Vercel sandboxes cannot run for longer than 5 hours)
E2B	N/A (E2B sandboxes cannot run for longer than 24 hours)
Sailboxes	$6.85
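
The Sailboxes figure can be reproduced from the Active column of the workload usage table and the listed rates. The short calculation below is a sketch that assumes the bill is the sum of active usage times rate for each resource; the variable names are illustrative.

```python
# USD per active unit-hour, from the pricing table.
RATES = {"vcpu": 0.075, "ram_gb": 0.015, "disk_gb": 0.00012}

# Active column of the workload usage table (unit-hours).
ACTIVE_USAGE = {"vcpu": 64.11, "ram_gb": 128.22, "disk_gb": 1025.76}

line_items = {name: ACTIVE_USAGE[name] * RATES[name] for name in RATES}
total = sum(line_items.values())

# Share of wall-clock vCPU-hours that were active (idle + active from the table).
active_share = 64.11 / (45.89 + 64.11)

for name, cost in line_items.items():
    print(f"{name}: ${cost:.2f}")
print(f"total: ${total:.2f}")
print(f"active share of vCPU-hours: {active_share:.1%}")
```

The total rounds to $6.85, matching the table, and about 58% of the swarm's wall-clock vCPU-hours were active.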

Try Us Today

Sailboxes are currently in private beta, with plans to make them generally available in the coming weeks. If you're interested in piloting Sailboxes for your agentic workflow, reach out to us at founders@sailresearch.com.
