TL;DR
Headscale is an open-source, Tailscale-compatible coordination server. It is a real alternative to Tailscale’s managed control plane, with a predictable set of trade-offs. The “free” in open source is misleading: on a fully loaded cost basis that includes engineering time, backups, monitoring, and high availability, self-host often costs more than managed until a team reaches roughly 20–60 active users, and it comes with slower feature delivery. Self-host wins on data sovereignty, licensing flexibility, and customisation. Managed wins on feature velocity, time to first connection, and compliance paperwork. This post compares the two honestly, with costs and operational patterns we have seen in real deployments.
Who this is for
Platform engineers deciding whether to self-host a mesh-VPN coordination server, infrastructure leads doing a build-vs-buy analysis, and CISOs whose regulators impose self-hosting constraints. This post assumes you already know the basics of mesh VPNs and that the choice of coordination server is the open question.
Table of contents
- What Headscale actually does
- The case for managed (Tailscale, QuickZTNA, NetBird Cloud)
- The case for self-host (Headscale, NetBird self-host, QuickZTNA Workforce)
- Hidden cost categories in self-host
- A fully loaded cost model
- When self-host wins
- When managed wins
- Hybrid patterns
- Operational playbook for Headscale
- Decision framework
1. What Headscale actually does
Headscale is a Go program that implements the Tailscale control-plane protocol. It runs as a single binary plus a database (SQLite or PostgreSQL). You point official Tailscale clients at its URL via configuration, and the clients authenticate, register, and receive peer lists from Headscale instead of from Tailscale’s managed service.
Headscale handles:
- User registration and authentication (including OIDC integration for SSO).
- Node key registration and expiry.
- ACL rules (using a format compatible with Tailscale’s HuJSON ACL language).
- MagicDNS.
- Subnet routes and exit nodes.
- DERP server configuration.
Headscale does not implement every advanced Tailscale server feature; see the Headscale docs and issue tracker for the current status of specific feature parity.
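To make the ACL item above more concrete, here is a minimal sketch that builds a small policy in the Tailscale-style structure (`groups`, `tagOwners`, `acls`) and serialises it as plain JSON, which is also valid HuJSON. The group, tag, and user names are hypothetical, and the helper `undefined_groups` is an illustration, not part of Headscale. Check the fields your Headscale version accepts against its documentation before relying on this shape.

```python
import json


def build_policy() -> dict:
    """Return a minimal Tailscale-style policy. All names are hypothetical."""
    return {
        "groups": {"group:platform": ["alice@example.com", "bob@example.com"]},
        "tagOwners": {"tag:staging": ["group:platform"]},
        "acls": [
            # Let the platform group reach SSH and HTTPS on staging-tagged nodes.
            {"action": "accept", "src": ["group:platform"], "dst": ["tag:staging:22,443"]},
        ],
    }


def undefined_groups(policy: dict) -> set[str]:
    """Groups referenced in rule sources or tag owners but never defined."""
    defined = set(policy.get("groups", {}))
    used: set[str] = set()
    for rule in policy.get("acls", []):
        used.update(s for s in rule.get("src", []) if s.startswith("group:"))
    for owners in policy.get("tagOwners", {}).values():
        used.update(o for o in owners if o.startswith("group:"))
    return used - defined


policy = build_policy()
print(json.dumps(policy, indent=2))
print("undefined groups:", sorted(undefined_groups(policy)))
```

A check like `undefined_groups` is cheap to run in CI before a policy is loaded, which is one way to keep ACL changes reviewable.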
2. The case for managed (Tailscale, QuickZTNA, NetBird Cloud)
Why teams choose managed.
- Zero ops. No server to patch, no database to back up, no DERP fabric to maintain.
- Feature velocity. New features land in the managed product first; self-host catches up.
- Compliance attestations. SOC 2, ISO 27001, GDPR DPAs: the vendor holds them, and you inherit them for the coordination plane.
- SLA. Uptime guarantees backed by money.
- Support. Humans to ask when something breaks.
- Time to first tunnel. Typically under five minutes from signup.
The managed cost is typically per-user per-month. For a small team, the whole monthly bill is often about what a competent engineer costs, fully loaded, for an hour or two.
3. The case for self-host (Headscale, NetBird self-host, QuickZTNA Workforce)
Why teams choose self-host.
- Data sovereignty. The coordination plane never leaves your infrastructure.
- No per-user pricing. Costs scale with infrastructure, not seat count.
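- Licence flexibility. Headscale is BSD-3-Clause; NetBird’s client is BSD-3-Clause, while recent releases license its management, signal, and relay services under AGPLv3, so check the repository for the parts you plan to run; QuickZTNA self-host is the same managed codebase shipped to your infrastructure.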
- Customisation. Fork-friendly for open-source options. API hooks for specific workflows.
- Air-gapped capability. Isolated environments where managed SaaS is not reachable.
- Long-term cost predictability. Known infrastructure line items over years.
4. Hidden cost categories in self-host
The “open source is free” framing misses real costs. Honest total cost breaks into six categories.
4.1 Infrastructure
- Compute. A single-instance Headscale deployment is small — 1–2 vCPU, 2 GB RAM. A highly available deployment with PostgreSQL replica and multiple Headscale instances behind a load balancer is larger. Typical AWS bill: $20–$100/month for single-instance; $200–$500/month for HA.
- Database. SQLite is free but single-node; PostgreSQL is recommended for HA. RDS or managed Postgres adds $40–$200/month depending on size.
- DERP servers. Headscale can be configured to use Tailscale’s public DERP relays. If you want your own relay fabric instead, run DERP nodes: typically small VMs at $5–$20 each per month, spread across a few regions.
- Certificate and DNS. Let’s Encrypt is free; DNS hosting is a few dollars.
4.2 Engineering time for initial deployment
- Typical first deployment: one to two days of a platform engineer’s time for a minimal setup.
- HA deployment with monitoring and backups: one to two weeks.
- Integration with identity provider for OIDC SSO: half a day to two days depending on IdP.
- Initial ACL authoring: half a day to one week depending on policy complexity.
4.3 Ongoing operations
- Security patching. Monthly or on CVE. Budget 2–4 hours per month.
- Upgrades. Headscale releases every few months. Each upgrade is typically 1–4 hours including testing.
- Database maintenance. Vacuum, analyse, backup verification. 1–2 hours monthly.
- Monitoring and alerting maintenance. 1–2 hours monthly.
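The ranges above add up to a monthly budget. The sketch below does that sum; the only input not taken from the list is the upgrade interval, where “every few months” is read as once every three months, which is an assumption.

```python
# (low, high) hours per month, taken from the list above.
MONTHLY_TASKS = {
    "security patching": (2, 4),
    "database maintenance": (1, 2),
    "monitoring and alerting": (1, 2),
}
UPGRADE_HOURS = (1, 4)        # hours per upgrade, from the list above
UPGRADE_INTERVAL_MONTHS = 3   # assumption: "every few months" read as quarterly


def monthly_ops_hours() -> tuple[float, float]:
    """Return the (low, high) steady-state hours per month, excluding incidents."""
    low = sum(lo for lo, _ in MONTHLY_TASKS.values()) + UPGRADE_HOURS[0] / UPGRADE_INTERVAL_MONTHS
    high = sum(hi for _, hi in MONTHLY_TASKS.values()) + UPGRADE_HOURS[1] / UPGRADE_INTERVAL_MONTHS
    return low, high


low, high = monthly_ops_hours()
print(f"{low:.1f} to {high:.1f} hours per month")
```

Under that reading the steady-state load is a little over 4 to a little over 9 hours a month, before any incident time, so a budget of 4 hours a month sits at the low end.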
4.4 Incident response
- The first incident typically costs 4–20 engineering hours, depending on severity and familiarity.
- Later incidents are usually less painful if the team has seen the failure mode before.
- SLAs vs self-imposed targets. Self-host means you set and hit your own availability target; there is nobody else to page.
4.5 Feature delivery
- Waiting for features. New features land in the managed product first; Headscale catches up in a subsequent release, sometimes months later.
- Porting features yourself. Headscale is genuinely open to contributions, but building a missing feature costs paid engineering time.
- Maintaining forks. If you patch locally, you maintain the patch across upstream changes.
4.6 Compliance
- Self-host means self-attestation. Your SOC 2 audit now includes the Headscale service operation. Your HIPAA BAA is your responsibility, not a vendor’s.
- Compliance tooling. Logging, SIEM integration, access-control review cadence — all yours.
- Auditor questions. Prepare to answer them about your specific deployment.
5. A fully loaded cost model
A simplified model for a 20-user team running Headscale on AWS in 2026.
| Line item | Monthly cost (USD) |
|---|---|
| 2 × EC2 t3.small for Headscale (one per AZ, for modest HA) | 30 |
| RDS PostgreSQL db.t3.micro, Multi-AZ | 60 |
| DERP servers × 2 small VMs in different regions | 20 |
| Monitoring (self-hosted Prometheus/Grafana) | 15 |
| Backup storage (S3) | 5 |
| Data transfer (modest) | 10 |
| Infrastructure subtotal | ~140 |
| Platform engineer time (amortised 4h/month @ $100/h loaded) | 400 |
| Total loaded cost | ~540 |
Comparison: 20 users on a managed product at $6–$15/user/month come to $120–$300/month. At 20 users, self-host at ~$540/month is more expensive on a fully loaded basis by roughly 2–4× (1.8× against the top of that price range, 4.5× against the bottom).
At 200 users, the math changes. Infrastructure rises marginally (larger VM, more DERP); engineering time is similar. Managed is now $1,200–$3,000/month. Self-host at ~$600/month is 2–5× cheaper.
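The break-even depends on the per-user managed price and the loaded engineer rate. With the table’s ~$540/month against $6–$15 per user it falls at roughly 36–90 users; halve the engineering cost and it drops to about 23–57, in line with the 20–60 range we typically see. Below break-even, managed is cheaper; above it, self-host is. Adjust for your specific team.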
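The arithmetic in this section is easy to rerun with your own inputs. The sketch below uses the table’s line items and treats self-host cost as flat in user count, which is a simplification; the function names are illustrative.

```python
import math

# Monthly infrastructure line items (USD) from the table above.
INFRA_USD = {
    "headscale_ec2": 30,
    "rds_postgres": 60,
    "derp": 20,
    "monitoring": 15,
    "backup_storage": 5,
    "data_transfer": 10,
}


def self_host_monthly(engineer_hours: float = 4, loaded_rate: float = 100) -> float:
    """Infrastructure plus amortised engineering time, per month."""
    return sum(INFRA_USD.values()) + engineer_hours * loaded_rate


def managed_monthly(users: int, per_user: float) -> float:
    return users * per_user


def break_even_users(per_user: float, self_host_cost: float) -> int:
    """Smallest user count at which managed costs at least as much as self-host."""
    return math.ceil(self_host_cost / per_user)


cost = self_host_monthly()
for price in (6, 15):
    print(
        f"${price}/user: managed at 20 users = ${managed_monthly(20, price):.0f}/month, "
        f"break-even at {break_even_users(price, cost)} users "
        f"(self-host ${cost:.0f}/month)"
    )
```

With the table’s inputs this gives break-even points of 90 and 36 users. At 2 engineer hours a month, or equivalently a $50/h loaded rate, it gives 57 and 23, which is why the inputs matter more than any single break-even number.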
6. When self-host wins
Five scenarios where self-host is the clear answer. All except 6.4, which is itself a cost argument, hold regardless of the cost math.
6.1 Data sovereignty
Your regulator requires the coordination plane inside a specific jurisdiction, inside your own infrastructure, or inside your VPC. Managed is off the table. Self-host.
6.2 Air-gapped or disconnected
Your environment cannot reach the public internet from the coordination plane. Managed requires internet-reachable coordination. Self-host.
6.3 Regulatory override
Your industry regulator explicitly requires self-hosted cryptographic key material and policy. Common in defence, intelligence, and some financial sectors.
6.4 Scale at which managed becomes expensive
Very large fleets — thousands of users — where managed per-user pricing dominates and the incremental ops cost of self-host is small relative to the saving.
6.5 Customisation
You need a specific feature or integration that the managed vendor will not add. Forking an open-source codebase is the only path.
7. When managed wins
Five scenarios where managed is the clear answer.
7.1 Small team with no platform engineer
A 10-person startup cannot afford to have a developer lose a week to Headscale setup. Managed’s $100/month is trivial compared to that week’s salary.
7.2 Time to first connection matters
If the evaluation is “how fast can we get remote access to staging”, five minutes of signup beats two days of self-host deployment.
7.3 Compliance attestation is your requirement, not your capability
You need a SOC 2 Type II report for an auditor. You do not have the capability to produce one for a self-hosted service. Managed comes with one in the box.
7.4 Feature velocity is the priority
You want every new feature the vendor ships the day it ships. Managed gets them first.
7.5 Consistent 24/7 operations across the globe
Your team is small and cannot staff 24/7 on-call for the coordination plane. Managed SLA covers it.
8. Hybrid patterns
Some teams run both.
- Primary managed, self-host failover. Managed is the default; a self-hosted instance is maintained for disaster recovery or specific regulated workloads.
- Self-host production, managed for developers. Developers have their own managed account for personal experimentation; production infrastructure runs on self-host.
- Per-region split. Managed in regions where compliance allows; self-host in regions where it does not.
These patterns add operational complexity. They are only worth it if a specific constraint rules out running a single mode.
9. Operational playbook for Headscale
Specific operational recommendations for a production Headscale deployment.
9.1 Infrastructure baseline
- Linux VM with 2 vCPU and 2–4 GB RAM (the t3.small in the cost model has 2 GB; size up if memory gets tight).
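- Separate PostgreSQL instance if you follow the HA layout in 9.3. Check upstream guidance first: the Headscale example configuration describes SQLite as the recommended backend and PostgreSQL as legacy support.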
- Reverse proxy (Caddy or nginx) handling TLS.
- Let’s Encrypt certificates with auto-renewal.
9.2 Database
- PostgreSQL 15 or 16.
- Daily automated backups to object storage (S3 or equivalent).
- Weekly restore tests. A backup is worthless until you have tested the restore.
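A backup-and-restore-test job along these lines can be quite small. The sketch below only builds the `pg_dump` and `pg_restore` command lines and a dated file name; the connection strings, file paths, and function names are hypothetical, and the upload to object storage is left out.

```python
import datetime as dt
import subprocess


def dump_name(day: dt.date) -> str:
    """Dated file name for one daily dump."""
    return f"headscale-{day.isoformat()}.dump"


def backup_command(db_url: str, out_path: str) -> list[str]:
    """pg_dump in custom format, which pg_restore can read back."""
    return ["pg_dump", "--format=custom", f"--file={out_path}", f"--dbname={db_url}"]


def restore_command(scratch_db_url: str, dump_path: str) -> list[str]:
    """Restore into a scratch database for the weekly test, never the live one."""
    return [
        "pg_restore", "--clean", "--if-exists", "--no-owner",
        f"--dbname={scratch_db_url}", dump_path,
    ]


def run(cmd: list[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values; printed rather than executed so the sketch is safe to run.
    path = "/var/backups/" + dump_name(dt.date.today())
    print(" ".join(backup_command("postgresql://headscale@db.internal/headscale", path)))
    print(" ".join(restore_command("postgresql://headscale@db.internal/restore_test", path)))
```

In a real job you would call `run` on each command and follow the restore with a sanity query, for example a row count on a table you know is never empty.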
9.3 High availability
- Two Headscale instances behind a load balancer.
- PostgreSQL with streaming replica.
- DERP across two or more regions.
- Health checks at the load-balancer level.
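Most load balancers do health checking natively, but the logic is simple enough to sketch, for example to run as an external probe. The version below treats an instance as healthy only if an HTTP GET returns 200 within a timeout. The `/health` path and the backend URLs are assumptions; confirm the health endpoint your Headscale version exposes.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError, OSError):
        # Covers refused connections, DNS failures, timeouts, and non-2xx responses.
        return False


def healthy_backends(urls: list[str]) -> list[str]:
    """Filter a list of backend health URLs down to the ones that respond."""
    return [u for u in urls if is_healthy(u)]


# Example call with hypothetical instance addresses:
# healthy_backends(["http://10.0.1.10:8080/health", "http://10.0.2.10:8080/health"])
```

An empty result from `healthy_backends` is the condition worth paging on, since it means no instance can serve clients.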
9.4 Observability
- Prometheus metrics endpoint exposed by Headscale.
- Grafana dashboard for key metrics: active nodes, failed auth attempts, database query latency.
- Log aggregation (Loki, ELK, or cloud-native log service).
- Alert rules for: server health, database connectivity, certificate expiry, backup failures.
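Of the alert rules above, certificate expiry is the easiest to check from outside the server. The sketch below reads the `notAfter` field from the certificate a host presents and converts it to days remaining. The 14-day threshold and the host name in the usage comment are assumptions.

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now_epoch: float) -> float:
    """Days left, given a 'notAfter' string such as 'Jan  5 09:34:43 2018 GMT'."""
    return (ssl.cert_time_to_seconds(not_after) - now_epoch) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read 'notAfter' from the certificate a server presents.

    An already expired or untrusted certificate makes the handshake raise
    ssl.SSLCertVerificationError, which should also be treated as an alert.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def should_alert(days_left: float, threshold_days: float = 14) -> bool:
    """True when the certificate is closer to expiry than the threshold."""
    return days_left < threshold_days


# Example call with a hypothetical host name:
# should_alert(days_until_expiry(fetch_not_after("headscale.example.com"), time.time()))
```

With auto-renewal in place this alert should never fire, which is the point: it catches the case where renewal has silently stopped working.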
9.5 Upgrades
- Tag stable versions; avoid nightly builds in production.
- Test upgrade in staging first. Rehearse rollback procedure.
- Schedule upgrade windows outside peak hours.
9.6 Key rotation
- Node keys expire rather than rotate silently. Check how your Headscale version sets expiry, and plan for nodes to re-authenticate when a key lapses.
- Pre-authorisation keys should be short-lived and scoped to specific users or purposes.
10. Decision framework
Five questions in sequence.
- Do you have a hard data-sovereignty or air-gap requirement? If yes → self-host.
- Is your team size below the break-even (typically 20–60 users)? If yes → managed is likely cheaper.
- Do you have in-house platform engineering capacity? If no → managed.
- Must the compliance attestation for the coordination plane be your own (auditors need your SOC 2 to cover it)? If yes → self-host and invest in the attestation. If a vendor’s report satisfies them → managed.
- Is there a feature-velocity premium in your evaluation? If yes → managed. If no → either.
At the end, if the cost math, compliance requirements, and team capability all point to self-host, Headscale is a good choice for a Tailscale-compatible deployment, NetBird for an open-source-first deployment, and QuickZTNA Workforce for a proprietary self-hosted deployment that includes post-quantum on tunnels.
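The five questions can be encoded as a small function. This is one possible reading, not a definitive procedure: it treats the first and third questions as hard stops and the other three as votes, and it uses a hypothetical default break-even of 40 users, the midpoint of the 20–60 range. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Team:
    hard_sovereignty_or_air_gap: bool   # question 1
    users: int                          # question 2
    has_platform_engineering: bool      # question 3
    must_hold_own_attestation: bool     # question 4
    values_feature_velocity: bool       # question 5


def recommend(team: Team, break_even_users: int = 40) -> str:
    """Return 'self-host', 'managed', or 'either' under the reading described above."""
    if team.hard_sovereignty_or_air_gap:
        return "self-host"              # hard constraint, nothing else matters
    if not team.has_platform_engineering:
        return "managed"                # nobody to run it (treated as a hard stop)

    votes = [
        "managed" if team.users < break_even_users else "self-host",
        "self-host" if team.must_hold_own_attestation else "managed",
    ]
    if team.values_feature_velocity:
        votes.append("managed")         # a "no" here is neutral, so it adds no vote

    managed, self_host = votes.count("managed"), votes.count("self-host")
    if managed == self_host:
        return "either"
    return "managed" if managed > self_host else "self-host"


print(recommend(Team(False, 200, True, True, False)))
```

Passing your own `break_even_users`, taken from your cost model, is the main knob; the vote weighting is deliberately crude and is worth adjusting to your priorities.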
Further reading
- Headscale GitHub repository. Source, issue tracker, changelog.
- Headscale documentation.
- Tailscale knowledge base.
- NetBird self-host docs.
- QuickZTNA Workforce documentation.
Related reading on this blog
- The Best Tailscale Alternatives in 2026
- NetBird vs Tailscale vs QuickZTNA
- Open-Source vs Managed ZTNA: A Decision Framework
- ML-KEM-768 Explained
Try QuickZTNA
QuickZTNA is managed by default, with self-host available on the Workforce tier. If the cost math above tips toward managed for your team and post-quantum is a requirement, start on Free. If self-host is non-negotiable and open source is not required, contact sales for the Workforce self-host brief.