Technology

Scaling Kubernetes Creates New Security Challenges - Here's How to Overcome Them

It shouldn’t really be surprising that the more expansive your Kubernetes environment becomes, the harder it is to manage and secure. But while some Kubernetes security practices, such as container scanning, have become relatively standard, many blind spots and open attack vectors unique to Kubernetes remain largely unaddressed by most organizations today, even those that are quite advanced in their Kubernetes adoption journeys.

Rather than diving into an endless list of niche one-off tactics or (even worse) acronyms, we would like to take a step back and think more strategically about where Kubernetes security fits within a cloud-native security framework.

What really is the relationship between Kubernetes, application services, and infrastructure? It seems like a simple enough question, and yet the answers can be surprisingly wide-ranging depending on who you ask.

Understanding the nuance can help teams wrap their heads around some important moving parts that have critical impacts on modern application security.

Phase One - The Lift and Shift

In the first phase of cloud adoption, most applications were either lifted and shifted to the cloud “as is” or architected as monoliths. This first phase remains a present reality for many organizations, especially those that saw tremendous technology-driven growth before Kubernetes became the de facto standard.

From an application-versus-infrastructure perspective, the boundaries in this monolithic architecture are more clearly defined. Applications are the processes running inside VMs, while everything else - the VMs, VPCs, network subnets, storage volumes, databases, and so on - is infrastructure: abstracted layers from the perspective of application development teams.

In this phase of cloud adoption, what counts as an application and what counts as infrastructure is determined, more or less, by which team ends up being responsible for provisioning and operating each layer.

Developers code and test the applications, while DevOps, networking, and security infrastructure engineers provision and configure the underlying infrastructure: VMs, networks, storage, databases, and the like. They operate these layers to achieve the desired application availability, resilience, security, and performance goals in line with business goals.

If there is a problem, central SRE/DevOps/Network Security teams are responsible for troubleshooting the infrastructure layers, while application developers, fairly unaware of the underlying cloud architecture, troubleshoot problems within their application code.

Security engineers are responsible for configuring things like VPC isolation, public and private subnet architectures, security groups, and firewalls at the network layer to secure applications at runtime. Because the cloud and application architecture is relatively static, teams build institutional knowledge of the patterns and behaviors that surface in operational logs and metrics, which may be good enough for manually troubleshooting both performance and security-related scenarios.

However, businesses born today have an ever-growing need to accelerate their pace of innovation, and they are able to do so because they have built on top of the cloud since their inception.

Phase Two - Acceleration of Scale

Enter the second phase of cloud adoption - with microservices, containers, and Kubernetes.

In this phase of cloud adoption, teams are either building their apps from the start as cloud-native, domain-based microservices or breaking apart their monoliths into collections of microservices, in the hope that this architectural shift will help them innovate rapidly while staying agile. Kubernetes has become the de facto operating system and platform of choice for such containerized microservices, and its adoption keeps growing.

While Kubernetes presents many benefits, one of which is adding a layer of abstraction for application developers from underlying cloud infrastructure details, it remains complex to adopt and operate.

Even the boundary between what is an application and what is infrastructure starts to blur when applications run on top of Kubernetes.

Kubernetes adds a whole slew of fine-grained abstractions that are more application-centric than coarser abstractions like VMs. An application is no longer just application code; it is also, at the very least, six different kinds of Kubernetes resources: Containers, Deployments, Pods, ReplicaSets, Services, and Ingress.
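To make that fan-out concrete, here is a minimal sketch of what a single hypothetical web service might look like in manifest form (the names, image, ports, and hostname are all placeholders). The Deployment generates a ReplicaSet, which in turn creates the Pods that run the container, while the Service and Ingress expose it:

```yaml
# Deployment: declares the desired Pods; Kubernetes derives a ReplicaSet,
# which creates the Pods that run the container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:1.0   # placeholder image
        ports:
        - containerPort: 8080
---
# Service: a stable virtual IP and DNS name in front of the Pods.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
---
# Ingress: HTTP routing from outside the cluster to the Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
  - host: web.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web
            port:
              number: 80
```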

Abstractions make systems easier to comprehend (at least on the surface), so why are Kubernetes applications so hard to secure?

Challenge One: Distributed Systems Are Harder to See and Secure

Kubernetes is inherently a distributed, desired-state-driven system. Kubernetes resources like containers, deployments, pods, and services are created and mapped to an application. However, it is often hard to understand which resources map to which application.

Kubernetes uses a label-based mechanism to maintain this mapping. However, existing tools do not carry this context over into easy-to-understand visualizations that could help discover this distributed mapping, making it difficult or nearly impossible to capture all of the context around application interactions, and the vulnerabilities that lurk within them.
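At the command line, that label mechanism looks something like the following sketch, which assumes a hypothetical application labeled with the recommended app.kubernetes.io/name key:

```sh
# List every resource carrying the label for a hypothetical "checkout" app,
# across all namespaces - a one-shot view of the distributed mapping.
kubectl get deployments,replicasets,pods,services,ingresses \
  --all-namespaces -l app.kubernetes.io/name=checkout
```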

Additionally, identity and RBAC create their own challenges. Teams need a framework to map user identities/roles and access permissions to each of these fine-grained abstractions, as well as a mapping of machine identities/roles and what activities they perform at runtime. Creating and maintaining this kind of mapping in a dynamic and ephemeral system is extremely challenging and must be approached with the right kind of automation in mind that can be executed consistently, without drift, and without impact on application performance.
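For a sense of what that mapping involves at even the smallest scale, here is a hedged sketch of least-privilege RBAC for one machine identity (the namespace "shop", the Role, and the ServiceAccount "checkout" are all hypothetical):

```yaml
# A namespaced Role granting only read access to Pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: shop          # hypothetical namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# Bind the Role to a workload's ServiceAccount (a machine identity).
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: checkout-pod-reader
  namespace: shop
subjects:
- kind: ServiceAccount
  name: checkout           # hypothetical service account
  namespace: shop
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Multiply this pair of objects by every user, workload, and namespace in a cluster, and the maintenance burden becomes clear.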

Applications are also not monoliths anymore - they are collections of services, third-party APIs, and data connectors that together make up an application. Imagine how the number of containers, deployments, pods, and other resource configurations to be tracked starts to scale as the number of services and APIs in an application grows.

To make application security work in this modern context, the first hurdle is understanding which Kubernetes resource a security engineer should even be looking at. Faced with hundreds of distributed resources, deciding basic priorities becomes nearly impossible. Which application container should they patch to mitigate a vulnerability? Which pod should a Kubernetes Network Policy select to disallow unknown outgoing traffic? This exercise alone often takes many hours of engineering time and still tends to leave gaps behind.
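Once the right pods have been identified, the policy itself is the easy part. Here is a minimal sketch of the egress restriction mentioned above, again using the hypothetical "checkout" application and "shop" namespace:

```yaml
# Select the "checkout" Pods by label and deny all outgoing traffic
# except DNS; legitimate destinations must then be allowed explicitly.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: checkout-restrict-egress
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: checkout
  policyTypes:
  - Egress
  egress:
  - ports:                 # DNS only; everything else is dropped
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```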

Challenge Two: System Data and Configuration is Ephemeral

Some resources that Kubernetes creates when an application is deployed may not live as long as certain other more permanent resources.

Pods are recreated with different names when the containers or configurations within them are updated. Kubernetes events and audit logs that contain valuable information on security events also expire after some time unless a log backend is configured. Meanwhile, the pace at which events happen within a cluster far exceeds what anyone can process manually. This makes gleaning actionable insights about insecure application behavior from container and cluster logs very challenging.
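One mitigation is pointing the API server at an audit policy and a persistent log backend so that security-relevant events outlive the cluster's short-lived event window. A minimal sketch, where the resource selection is illustrative rather than a recommendation:

```yaml
# audit-policy.yaml: record metadata for security-sensitive resources
# and full request/response bodies for RBAC changes.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  resources:
  - group: ""
    resources: ["pods", "secrets", "serviceaccounts"]
- level: RequestResponse
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
```

The policy takes effect via kube-apiserver flags such as --audit-policy-file and --audit-log-path, or --audit-webhook-config-file when shipping events to an external backend.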

IP addresses are ephemeral too: pods and containers receive new IP addresses every time they restart, so IP-address-based rules for enforcing segmentation quickly become obsolete. Segmentation rules instead need to be defined in terms of more permanent identifiers, such as application identities.
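In Kubernetes Network Policies, that means selecting peers by label rather than by CIDR block. A sketch, reusing the hypothetical "shop" namespace with made-up "payments" and "checkout" services and an illustrative port:

```yaml
# Allow traffic into the "payments" Pods only from "checkout" Pods,
# identified by label (application identity), not by ephemeral IPs.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-allow-checkout
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: payments
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: checkout
    ports:
    - protocol: TCP
      port: 8443
```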

Challenge Three: Taking Action is Hard

Kubernetes makes the way application developers define desired application behavior explicit and programmable.

This means that all aspects of an application’s behavior - its security, lifecycle management, CPU and memory resourcing, scaling, health checks, load balancing, and traffic behavior - are programmable within a Kubernetes application definition via its many resources, such as pods and services. This is an important departure from earlier VM-based platforms, where developers had no say in things like how VMs were sized, how VM traffic was routed, or how database connections were load-balanced.
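As a concrete illustration, here is a fragment of a Deployment spec showing several of those behaviors declared next to the application itself (all names and values are illustrative, not recommendations):

```yaml
# Fragment of a Deployment spec: scale, resourcing, and health checks
# are declared by the application team, not a separate infra team.
spec:
  replicas: 4                    # horizontal scale
  template:
    spec:
      containers:
      - name: web
        image: example.com/web:1.0
        resources:               # CPU and memory resourcing
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        readinessProbe:          # gates whether traffic is routed here
          httpGet:
            path: /healthz
            port: 8080
        livenessProbe:           # failures trigger automatic restarts
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
```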

While the Kubernetes model of making all system configuration application-centric and desired-state-driven offers many benefits, it also means that someone must define those requirements and resources in ways that are correct, consistent across application versions, and optimal.

The process of configuring these resources as YAML and JSON remains extremely manual, sprawls quickly, and is prone to error. A lack of clear ownership between Developer, Platform, and DevSecOps teams leads to risky misconfigurations and sub-optimal application behavior. Code analysis and infrastructure management tools cannot tackle this either, since the configurations that optimize application behavior require deep runtime insights that are not available to static tools.

A high-severity misconfiguration could allow a pod or set of pods to run in ‘privileged’ mode, giving the containers within them unauthorized access to host-level resources such as filesystems. User and workload identities within Kubernetes clusters are grossly over-permissioned by default, making the likelihood of a security incident spreading through over-permissioned service connections much higher than many engineers would like to admit.
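The difference between the risky and the hardened configuration can come down to a handful of lines in a container's securityContext. These are fragments with illustrative values, not a complete spec:

```yaml
# Risky: a single field hands the container near host-level power.
securityContext:
  privileged: true

# Hardened alternative: drop privileges explicitly.
securityContext:
  privileged: false
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  capabilities:
    drop: ["ALL"]
```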

Current RBAC best practice - allowing these identities only a minimal set of permissions - gets in the way of application development velocity and causes friction between developers and security teams. But it doesn’t have to be that way.

While Kubernetes brings a net-new set of challenges and complexity for security teams, it also provides a flexible, programmatic way to codify security principles alongside application containers and enforce them within the Kubernetes runtime, if configured correctly. The Kubernetes runtime layer can become an effective interface between applications and cloud infrastructure, where developers and security teams come together to define how applications can be Secure-by-Default, using proactive and scalable techniques such as Policies-as-Code to codify and enforce secure behavior from the get-go.
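As one illustration of Policies-as-Code, recent Kubernetes releases ship a built-in ValidatingAdmissionPolicy API that can reject privileged Pods at admission time, before they ever run. A minimal sketch, with hypothetical policy names:

```yaml
# Reject any Pod whose containers request privileged mode.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: deny-privileged-pods
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  - expression: >-
      object.spec.containers.all(c,
        !has(c.securityContext) || !has(c.securityContext.privileged) ||
        c.securityContext.privileged == false)
    message: "Privileged containers are not allowed."
---
# A binding activates the policy; this one enforces it cluster-wide.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: deny-privileged-pods-binding
spec:
  policyName: deny-privileged-pods
  validationActions: ["Deny"]
```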

This blog post contains excerpts from our recently released book, Redefining Application Security for the Modern World, co-authored by our CTO, Priyanka Tembey, and our CEO, Vrajesh Bhavsar. Request a free copy while supplies last.

To learn more about how Operant can help make your Kubernetes environments Secure-by-Default, securing your modern applications from the inside out with a 5-minute, zero instrumentation install, please reach out to hello@operant.ai or request your trial.