A Guide to Platform Engineering: Unlocking the Power of Automation

Introduction

Organizations today have realized that technology is integral to staying competitive and is often a strategic investment. Companies differentiate themselves through their technology investments and how well they execute on them.

There are several examples of this. In banking, a technologically savvy bank that provides the right tools and capabilities on its digital platform is able to serve the end user better and realize operational efficiencies. The same can be seen in airlines: how many times have you pulled out your phone to scan a boarding pass for a more streamlined experience?

But at the center of this technological advantage is a question: how fast can the software organization deliver the innovations needed to achieve it?

The last few years have seen the advent of many tools and cloud-native technologies, and cloud-native architectures have become complex. Concepts and tools such as Kubernetes, infrastructure provisioning APIs, pipeline software for CI/CD, and configuration management all need to be well understood. This slows down the software release process, since selecting tools and building operational infrastructure to host the software takes longer.

Platform engineering is a key discipline for building velocity in software releases. You can't power an airplane with a steam engine.

An airplane can't work with a steam engine; it needs a jet engine! (Acknowledgement: image generated using imagine.meta.com)

Definition of Platform Engineering

Platform engineering is the process of creating reusable workflows that help developers deploy their applications faster on infrastructure.

Platform engineers, who are part of the platform engineering team, work with tools, cloud-native projects, clouds and cloud APIs to provide a set of abstractions that enable -

  • self-service: help the developer in that organization deploy their software
  • tools-choice: meet the organization's requirements for tool use, for example:
      • tools that meet the licensing requirements, e.g. Apache- or MIT-licensed tools
      • tools that have the key capabilities required, e.g. if a team needs machine learning workflows, tools for them need to be a part of the platform
  • cloud-choice: ability to deploy on a cloud or a set of clouds
  • security: ensure that the underlying infrastructure is secure by design
  • compliance: ensure that compliance requirements are met
  • flexibility: how easy it is to swap one tool for another. This is often a requirement when multiple teams in an organization use the platform engineers' work but need different tools for their workflows, e.g. ArgoCD vs. FluxCD
  • rapid deployment: reduce or eliminate wait time for developers
  • scalable platform: ability to deploy compute and storage depending on application requirement
  • high availability and disaster recovery: when an application has stringent uptime requirements, the infrastructure may run in multiple zones to meet the necessary SLAs. The platform team can abstract these capabilities so that the application architect gets them from the infrastructure more easily. Similarly, disaster recovery abilities can be baked into the platform
  • automation: a platform built for developers may have a front-end portal or an API to facilitate automation
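
The flexibility point above can be illustrated with a minimal Python sketch. It assumes a hypothetical platform that hides the delivery tool behind a small interface, so a team picks a tool by name in its configuration. The class names, the `delivery_tool` key and the returned strings are all invented for illustration; nothing here calls the real ArgoCD or FluxCD APIs.

```python
from typing import Protocol


class DeliveryBackend(Protocol):
    """Hypothetical interface every delivery tool adapter must satisfy."""

    def deploy(self, app: str, version: str) -> str: ...


class ArgoStyleBackend:
    # Stand-in adapter: a real one would talk to ArgoCD; this only reports.
    def deploy(self, app: str, version: str) -> str:
        return f"argocd: synced {app} at {version}"


class FluxStyleBackend:
    # Stand-in adapter: a real one would talk to FluxCD; this only reports.
    def deploy(self, app: str, version: str) -> str:
        return f"fluxcd: reconciled {app} at {version}"


# Registry of supported tools; adding a tool means adding one entry here.
BACKENDS = {"argocd": ArgoStyleBackend, "fluxcd": FluxStyleBackend}


def deploy(team_config: dict, app: str, version: str) -> str:
    """Deploy with whichever tool the team declared (assumed default: argocd)."""
    tool = team_config.get("delivery_tool", "argocd")
    if tool not in BACKENDS:
        raise ValueError(f"unsupported delivery tool: {tool!r}")
    backend: DeliveryBackend = BACKENDS[tool]()
    return backend.deploy(app, version)
```

Because callers only see `deploy`, one team can declare `{"delivery_tool": "fluxcd"}` and another can keep the default without either changing how they release.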

The set of abstractions built by the platform team is referred to as the Internal Developer Platform. Specific workflows in the Internal Developer Platform allow developers to accelerate the release velocity of their software.

Platform as a Product with Developers as its customers

More tools today are designed with automation in mind and are API-first, so they lend themselves well to automation. For example, infrastructure-as-code principles allow infrastructure provisioning to be automated. The challenge, however, is the number of tools available: there is Terraform (and now its fork, OpenTofu), Pulumi, cloud SDKs, Crossplane and others. Which one would you use? Which one fits the licensing requirements? Which one works well with the cloud of choice?

The DevOps approach promoted developers running operations using these tools. A developer would own a service all the way from development to deployment to support. But as the number of tools increases, so does the complexity of running operations, which puts excessive load on the developer. Feature development velocity suffers because of the choice and complexity of tools and the time spent on operations.

Instead of the developer building workflows using tools, what if an operations engineer well versed in these tools set up an API for the developer to call? This would keep things simple for the developer while still letting them drive infrastructure that matches their software flow.
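
As a rough illustration of such an API, here is a minimal Python sketch. It assumes a hypothetical `provision_cluster` call that checks a request against organization policy and returns the steps the platform would run. The policy values, names and step descriptions are made up; a real platform would hand the steps to an infrastructure-as-code tool rather than return strings.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_REGIONS = {"us-east", "eu-west"}  # assumed organization policy
MAX_NODES = 5                             # assumed cost guardrail


@dataclass(frozen=True)
class ProvisionRequest:
    """What the developer supplies; everything else is decided by the platform."""
    team: str
    region: str
    nodes: int


def provision_cluster(request: ProvisionRequest) -> List[str]:
    """Validate the request against policy and return the planned steps."""
    if request.region not in ALLOWED_REGIONS:
        raise ValueError(f"region {request.region!r} is not approved")
    if not 1 <= request.nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    name = f"{request.team}-{request.region}"
    # Security baselines are applied by the platform, not left to the caller.
    return [
        f"create network for {name}",
        f"create cluster {name} with {request.nodes} nodes",
        f"apply baseline security policies to {name}",
    ]
```

The developer makes one call, such as `provision_cluster(ProvisionRequest("payments", "eu-west", 3))`, while tool selection, security and cost guardrails stay with the platform team.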

This operations engineer, who provides the developer with a workflow to provision infrastructure, is the platform engineer. Platform engineers iterate on the platform to make it friendlier for developers while addressing the other concerns the platform must meet: security, tool selection, automation, cost optimization and so on.

The platform is a product built to meet the needs of the developers and teams in an organization. Platform engineers are well versed in the tools needed to build the platform and deeply aware of what developers and software teams in the organization expect from it.

What is Platform Engineering

Platform engineering is focused on building the Internal Developer Platform as a product. Its primary goals are -

  • Improving the developer experience
  • Giving developers a choice of what works best for their application
  • Building the platform as a product, with developer self-service as the goal

This is a continuous process. New tools get incorporated, and developer workflows and application requirements change. The platform engineering team researches applications and their infrastructure requirements by talking to developers, then builds and updates the Internal Developer Platform accordingly.

Platform engineering treats the Internal Developer Platform as an evolving product: developers file issues, releases incorporate new features, and a product manager ensures the product meets requirements.

How Does Platform Engineering Work

Platform engineering creates golden paths using tools and workflows for developers in an organization. The team studies the application lifecycle to understand the operational requirements of deploying, scaling and maintaining uptime.

Because platform engineering applies to all development teams in an organization, the platform team finds problems that are common across those teams and makes the solutions a part of the platform. The goal is to ensure software delivery is optimized for all teams.

Typically, the output of these activities is software that connects developers with operations and infrastructure. But the work also involves conducting architectural reviews to explain the platform approach to different teams and to collect their feedback.

Finally, a feedback loop established with multiple development teams ensures that developers have the flexibility to choose the tools they want declaratively.

Benefits of Platform Engineering and Automation

Often, more than one team needs self-serve capabilities. An Internal Developer Platform provides automation for all the teams. This automation promotes self-service and saves a lot of operational overhead for each team using the platform.

Automating developer workflows aligns the platform with developers' expectations, which reduces their cognitive load and frees up more of their time to focus on shipping features.

Infrastructure provisioning is an integral part of deploying and releasing software. Without effective automation that fits developer workflows, provisioning can slow down operations. It can also cost developer productivity if developers are ultimately responsible for infrastructure. Platform engineering ensures that developer time is not lost on these tasks.

The benefits of platform engineering are clear, and Gartner predicts that over 75% of organizations will have platform engineering teams, eliminating friction between operations and developers.

Are DevOps and Platform Engineering different?

In DevOps, developers run operations using tools that they have built themselves. Platform engineering shifts tool development from the developer to the platform engineer. The developer is still the user of the tools that drive provisioning and operations, but does not have to build them.

Platform engineering also formalizes this tool development into a platform that is developed like any other software. A developer can file issues against the platform, and the platform team addresses them using infrastructure as versioned code.

Who is a part of the Platform Engineering Team

A platform engineering team brings several roles together -

  • SREs, DevOps engineers and product managers
  • Automation engineers who automate parts of infrastructure or workflows
  • Infrastructure engineers who write Infrastructure as Code scripts to automate the infrastructure needed to deploy software or enable developer workflows, e.g. canary releases
  • Product managers who build the Internal Developer Platform and iterate over it

A developer’s view of the Platform Engineering team

The developer is a beneficiary of the platform approach. For a developer, the platform provides abstractions that save time and make them more efficient -

  • A developer relies on the platform engineering team to help provision infrastructure
  • A developer gives the platform engineering team feedback so that it can build workflows supporting their day-to-day infrastructure operations and what they expect from infrastructure when deploying their application
  • Say an engineer needs a Kubernetes cluster to test their own application. The platform engineering team can provide simple abstractions for declaring this. The platform team's product ingests this spec and creates an environment for the developer
  • Additionally, during microservices development, several interdependent microservices need to be provisioned to test an application. The platform engineering team can set this up with specific versions of the software
  • A developer can release or roll back their application without involving operations. The abstractions for these operations are designed and built by the platform engineering team for the developer to leverage
  • The platform lets developers focus on building the application while eliminating the cognitive load associated with operations and tools
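
The spec-driven flow in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the spec keys (`name`, `services`), the `Environment` class and the in-memory bookkeeping are hypothetical, and no real cluster is created.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Environment:
    """In-memory stand-in for an environment the platform would provision."""
    name: str
    services: Dict[str, str]  # service name -> deployed version
    history: Dict[str, List[str]] = field(default_factory=dict)

    def release(self, service: str, version: str) -> None:
        """Deploy a new version, remembering the one it replaces."""
        if service in self.services:
            self.history.setdefault(service, []).append(self.services[service])
        self.services[service] = version

    def rollback(self, service: str) -> str:
        """Restore the most recently replaced version and return it."""
        previous = self.history.get(service, [])
        if not previous:
            raise RuntimeError(f"no earlier version recorded for {service!r}")
        self.services[service] = previous.pop()
        return self.services[service]


def create_environment(spec: dict) -> Environment:
    """Ingest a declarative spec that pins each microservice to a version."""
    missing = {"name", "services"} - set(spec)
    if missing:
        raise ValueError(f"spec is missing keys: {sorted(missing)}")
    return Environment(name=spec["name"], services=dict(spec["services"]))
```

A developer would declare something like `{"name": "dev-alice", "services": {"cart": "1.4.0", "payments": "2.1.3"}}`, then call `release("cart", "1.5.0")` or `rollback("cart")` on the result without involving operations.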

Conclusion

Platform engineering done right can increase developer velocity, making the software development and delivery process more efficient.

Because platform engineering can easily drift into reliability engineering, it’s necessary to stay focused on the mission at hand: building an efficient platform for developers. This often means clearly defining the roles and responsibilities of the members of the platform engineering team.

Platform engineering involves striking the right balance between developer autonomy and standardization, while ensuring that the organization's security and compliance requirements are met.

A platform engineering team needs a mix of skills, including product development, automation and DevOps.

Cloud-native tools bring containers and a cloud-agnostic approach to software delivery. Tools like EnRoute Kubernetes Ingress API Gateway provide a way to connect and secure traffic for microservices. Coupled with platform engineering, an Ingress API gateway helps automate security and connectivity in the software delivery process.