Engineering a Platform at Atlassian: An 8-Year Journey in Edge Infrastructure
Original URL: https://www.youtube.com/watch?v=55pTFVoclvE
Introduction
This article summarizes a detailed reflection from an engineer who spent eight years at Atlassian, building and scaling the company's edge infrastructure. The narrative covers both the technical achievements—such as creating a self-service load balancing platform using Envoy proxy and AWS—and the personal growth that came from navigating complex team dynamics, mentoring, and maintaining large‑scale systems. The insights are valuable for platform engineers, infrastructure architects, and anyone interested in the practical realities of building internal developer platforms.
Technical Foundations: The Open Service Broker
The engineer's first major project was an Open Service Broker—a web application that allowed internal developers to provision load‑balancing resources on demand. Built initially in Python with Flask and later migrated to FastAPI, the broker acted as an API gateway between developers and infrastructure.
Key components:
- Client → FastAPI web server, which sends provisioning requests to an SQS queue
- A worker processes tasks asynchronously, creating DNS records, CloudFront distributions, or other AWS resources
- DynamoDB stores the state of each provisioning task
- The client polls the broker until the resource is ready
This architecture enabled self‑service load balancing, replacing manual requests to the platform team and reducing time‑to‑provision from days to minutes.
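The request flow described above can be sketched in a few lines. This is a minimal, in-memory illustration rather than the broker's actual code: a `queue.Queue` stands in for SQS, a plain dict stands in for the DynamoDB table, and the function names (`provision`, `last_operation`, `work_once`) are hypothetical. The state strings follow the Open Service Broker convention of "in progress" and "succeeded".

```python
import queue
import uuid

# Hypothetical stand-ins: a Queue for the SQS queue, a dict for the DynamoDB table.
task_queue = queue.Queue()
task_table = {}


def provision(resource_type, params):
    """Broker endpoint: record the task, enqueue it, return an ID to poll."""
    task_id = str(uuid.uuid4())
    task_table[task_id] = {"state": "in progress", "resource": None}
    task_queue.put({"task_id": task_id, "type": resource_type, "params": params})
    return task_id


def last_operation(task_id):
    """Polling endpoint: report the stored state of a task."""
    return task_table[task_id]


def work_once():
    """Worker: take one task off the queue and 'create' the resource."""
    task = task_queue.get()
    # A real worker would call AWS here (DNS records, CloudFront, ...).
    resource = f"{task['type']}:{task['params']['name']}"
    task_table[task["task_id"]] = {"state": "succeeded", "resource": resource}
    task_queue.task_done()


task_id = provision("dns-record", {"name": "my-service.example.com"})
print(last_operation(task_id)["state"])  # in progress
work_once()
print(last_operation(task_id)["state"])  # succeeded
```

Because the web server only enqueues work and the worker owns the slow AWS calls, the API stays responsive and the client learns the outcome by polling.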
Building the Envoy Control Plane
The next major piece was an Envoy‑based management server (open‑sourced as Sovereign). The goal was to replace expensive enterprise load balancers with commodity, cloud‑native proxies that could be reconfigured dynamically.
- Sovereign is a FastAPI app that serves configuration to Envoy proxies.
- It uses templates (for clusters, routes, listeners) and context (dynamic data from the broker and other sources like S3) to generate Envoy configuration.
- As developers provision or modify their services, the context changes, and the proxies update their behavior in real time.
The system gave the team a highly flexible, programmable proxy layer that could centralize concerns like authentication, rate limiting, and access logging before traffic reached backend services.
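The template-plus-context idea can be illustrated with a small sketch. It is not Sovereign's API: `cluster_template` and `render` are hypothetical names, the context shape is invented for the example, and the cluster dictionaries are a simplified version of Envoy's real cluster schema. The content-derived `version_info` is an assumption about how a control plane could let proxies detect that configuration has changed.

```python
import hashlib
import json


def cluster_template(context):
    """Hypothetical template: one Envoy-style cluster per provisioned service."""
    return [
        {
            "name": svc["name"],
            "type": "STRICT_DNS",
            "endpoints": [
                {"address": host, "port": svc["port"]} for host in svc["hosts"]
            ],
        }
        for svc in context["services"]
    ]


def render(context):
    """Render resources and tag them with a version derived from their content."""
    resources = cluster_template(context)
    payload = json.dumps(resources, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    return {"version_info": digest[:12], "resources": resources}


# Context as it might look after one service has been provisioned.
context = {"services": [{"name": "service-a", "hosts": ["10.0.0.1"], "port": 8080}]}
before = render(context)

# A developer provisions a second service, so the context changes.
context["services"].append({"name": "service-b", "hosts": ["10.0.0.2"], "port": 8080})
after = render(context)

print(before["version_info"] != after["version_info"])  # True
```

The template itself never changes here; only the context does, which is what lets proxies pick up new services without a redeploy.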
Infrastructure Automation and Scaling
With the broker and control plane in place, the engineer focused on automating the proxy fleet. Key elements:
- AMI creation: Built using Packer and SaltStack to produce a standard image containing Envoy, logging agents, security hardening, and observability tools.
- CloudFormation templates deployed the proxies across 13 AWS regions, each stack including VPC, subnets, Auto Scaling Groups, security groups, and NLBs.
- The result was a fleet of 2,000+ proxy instances that automatically configured themselves using the control plane, handling traffic for Atlassian's products (Jira, Confluence, Bitbucket, Statuspage, and others).
This infrastructure provided the foundation for a multi‑tenant platform where any service could opt into advanced routing, access control, and observability without requiring each team to implement those features themselves.
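As a rough illustration of the per-region stacks, the sketch below builds a heavily trimmed CloudFormation template as a Python dict. Everything specific in it is an assumption: the region list is an illustrative subset (the source does not name the 13 regions), the AMI IDs and logical resource names are placeholders, and subnets are passed in as a parameter rather than created in the stack. A deployable template would also need the VPC, security groups, target groups, and listeners.

```python
import json

# Illustrative subset only; the source says 13 regions but does not list them.
REGIONS = ["us-east-1", "eu-west-1", "ap-southeast-2"]


def proxy_stack_template(ami_id, min_size, max_size):
    """Build a trimmed CloudFormation template for one regional proxy stack."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {"SubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"}},
        "Resources": {
            "ProxyLaunchTemplate": {
                "Type": "AWS::EC2::LaunchTemplate",
                "Properties": {"LaunchTemplateData": {"ImageId": ami_id}},
            },
            "ProxyGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                    "VPCZoneIdentifier": {"Ref": "SubnetIds"},
                    "LaunchTemplate": {
                        "LaunchTemplateId": {"Ref": "ProxyLaunchTemplate"},
                        "Version": {
                            "Fn::GetAtt": ["ProxyLaunchTemplate", "LatestVersionNumber"]
                        },
                    },
                },
            },
            "ProxyNlb": {
                "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
                "Properties": {"Type": "network", "Subnets": {"Ref": "SubnetIds"}},
            },
        },
    }


# AMI IDs are region-specific, so each region gets its own (placeholder) image.
ami_by_region = {region: f"ami-EXAMPLE-{region}" for region in REGIONS}
templates = {
    region: proxy_stack_template(ami_by_region[region], 2, 10) for region in REGIONS
}
print(json.dumps(templates["us-east-1"]["Resources"]["ProxyNlb"], indent=2))
```

Generating the template from one function keeps every region structurally identical, with only the image and sizing varying per stack.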
Migrating Products and Centralizing Concerns
Once the platform was stable, the team undertook a multi‑year effort to migrate all Atlassian microservices to use the new edge infrastructure. This involved:
- Enforcing a zero‑trust model—services could no longer be accidentally public; explicit configuration through the broker was required.
- Adding sidecar services for authentication (written in Rust), authorization, and rate limiting, all dynamically configured via the control plane.
- Using network filters (e.g., HTTP Connection Manager) to add access logging, DoS protection (via CloudFront), and other cross‑cutting concerns at the proxy level rather than in each backend.
The result was a centralized security and observability layer that drastically reduced duplication of effort across hundreds of microservices.
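To make the idea of centralizing these checks concrete, here is a toy request handler that applies them in order at a single choke point. It is a sketch of the pattern, not of Envoy or the team's sidecars: the route table, the `valid-token` check standing in for the authentication sidecar, and the fixed-window rate limiter are all assumptions chosen for brevity.

```python
import time

# Hypothetical route table, populated only by explicit broker configuration.
ROUTES = {"service-a.example.com": {"backend": "service-a", "limit_per_sec": 2}}
ACCESS_LOG = []
_buckets = {}  # (host, one-second window) -> request count; never pruned here


def allow_rate(host, limit, now):
    """Fixed-window counter: at most `limit` requests per host per second."""
    window = int(now)
    count = _buckets.get((host, window), 0)
    if count >= limit:
        return False
    _buckets[(host, window)] = count + 1
    return True


def handle(host, token, now=None):
    """Run the cross-cutting checks in order and return an HTTP-style status."""
    now = time.time() if now is None else now
    route = ROUTES.get(host)
    if route is None:
        status = 404  # default deny: nothing is reachable unless configured
    elif token != "valid-token":  # stand-in for the authentication sidecar
        status = 401
    elif not allow_rate(host, route["limit_per_sec"], now):
        status = 429
    else:
        status = 200  # the request would be forwarded to route["backend"]
    ACCESS_LOG.append({"host": host, "status": status})  # logged in every case
    return status


print(handle("unknown.example.com", "valid-token", now=0.0))  # 404
print(handle("service-a.example.com", "bad-token", now=0.0))  # 401
print([handle("service-a.example.com", "valid-token", now=0.0) for _ in range(3)])
# [200, 200, 429]
```

Backends behind such a layer only ever see requests that were explicitly routed, authenticated, and within their rate limit, which is the duplication the migration removed.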
Non-Technical Growth and Lessons
The engineer also reflected on the softer skills developed over the eight years:
- Diplomacy and conflict resolution: Working with diverse personalities and management styles required learning to anticipate and navigate disagreements constructively.
- Mentorship: While skilled at explaining complex systems, the engineer found true mentoring—balancing guidance with autonomy—challenging, especially as a first‑time mentor. An intern project achieved the highest rating, but the engineer modestly credits the intern's own talent and support from other colleagues.
- Maintenance and technical debt: Building software is easy; keeping it maintainable over time is hard. The engineer observed that codebases develop "churn patterns" that indicate areas of increasing complexity. Proactive documentation, onboarding, and modular design are essential to avoid degradation.
Conclusion
This eight‑year journey at Atlassian is a case study in building a platform from the ground up: starting with a simple broker, adding a dynamic control plane, scaling to thousands of proxies, and centralizing critical concerns. Beyond the technical architecture, the engineer's growth in communication, conflict management, and mentorship highlights the human side of platform engineering. The lessons here—both technical and personal—offer a realistic blueprint for anyone undertaking similar infrastructure transformations.