Kubernetes Workload Identity
Applications talk to each other, more often than anything else, to get work done. Applications and microservices typically have fixed patterns of communication: for example, a backend server will most likely talk to a database server and a log server, and take ingress traffic from the frontend server. These fixed patterns usually translate into access control rules, and setting up these access control (authorization) rules requires one to “identify” the applications, also known as workloads. The Principle of Least Privilege (PoLP) states that every workload must be able to access only the information and resources that are necessary for its legitimate purpose. Workloads therefore need to be identified at a granular level, using a unique and complete set of attributes, so that authorization frameworks have the flexibility to express the access control rules as desired.
Identity and Zero-Trust (Identity boundary = Trust boundary)
Zero-trust requires that services within the network use strong authentication and authorization policies, and that the principle of least privilege is followed even for east-west traffic and even for services within a single application provider's space. A strong identity solution is the basis for achieving this.
How is Identity handled today?
K8s Service Accounts
K8s service accounts are the popular choice for service-based identity in k8s-orchestrated pods. For access to cloud services, an application developer could generate an individual IAM service account for each application, then download its key and store it as a Kubernetes secret that is rotated manually. Every service account has a corresponding credential that is used for service authentication. Service account keys expire only every 10 years by default, so if you want short-lived secrets you need a system to rotate them yourself. The possibility of long-term key exposure and the management overhead of key inventory and rotation are usually a major hindrance to the security promise of this system. The onus of handling security falls on application developers, who may be less equipped to handle security primitives and their corresponding best practices.
K8s ServiceAccountToken (or Projected Service Account Token) improves on this: it allows you to generate a token from a service account and specify additional constraints, such as the expected audience and the token lifetime.
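To make the audience and lifetime constraints concrete, here is a minimal sketch of the checks a receiving service could apply to the claims of such a token. It is an illustration under stated assumptions, not the Kubernetes implementation: the token is built locally and is unsigned, the `orders-db` audience is a made-up name, and a real verifier must validate the signature before trusting any claim.

```python
import base64
import json


def b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_demo_token(audience: str, issued_at: int, lifetime_seconds: int) -> str:
    """Build an UNSIGNED, demo-only token shaped like a JWT (assumption: no real issuer)."""
    header = {"alg": "none", "typ": "JWT"}
    claims = {
        # Subject in the form system:serviceaccount:<namespace>:<name>
        "sub": "system:serviceaccount:backendns:roleX",
        "aud": [audience],
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
    }
    parts = [b64url(json.dumps(p).encode("utf-8")) for p in (header, claims)]
    return ".".join(parts) + ".demo-signature"


def decode_claims(token: str) -> dict:
    """Return the claims WITHOUT verifying the signature (demo only)."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def claims_acceptable(claims: dict, expected_audience: str, now: float) -> bool:
    """Accept only tokens minted for this audience that have not yet expired."""
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return expected_audience in audiences and now < claims.get("exp", 0)


token = make_demo_token("orders-db", issued_at=1_000, lifetime_seconds=600)
claims = decode_claims(token)
print(claims_acceptable(claims, "orders-db", now=1_300))  # right audience, inside lifetime
print(claims_acceptable(claims, "orders-db", now=1_700))  # same token after expiry
```

A token minted for one audience is rejected by every other service, and a short lifetime bounds the damage if the token leaks.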
In general, ServiceAccount-based solutions suffer from the following problems:
- Service accounts may not take into consideration all the attributes of the workload. For example, if an application developer has clusters across multiple regions, then the workloads could be spawned across these regions. A service-account-based identity cannot express finer attributes, such as the location of the pod, for use in authorization.
- Service accounts or Service account tokens are valid only within a given cluster.
- No server authentication! The client is authenticated by the server using the service account credentials but there is no way for the client to authenticate the server.
- Dependency on the k8s control plane. Tokens have to be validated against the K8s API server, which means the data plane is impacted by any control plane downtime.
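The first limitation is easier to see with a small sketch. Below, an identity carries more attributes than just the service account, and a rule can require any subset of them. All names here (`WorkloadIdentity`, `Rule`, `is_allowed`, the regions and the `orders-db` resource) are hypothetical and only illustrate the idea of attribute-based, least-privilege authorization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadIdentity:
    namespace: str
    service_account: str
    cluster: str
    region: str


@dataclass(frozen=True)
class Rule:
    resource: str
    required: tuple  # pairs of (attribute name, required value)


def is_allowed(identity: WorkloadIdentity, resource: str, rules: list) -> bool:
    """Default deny: allow only if some rule for the resource matches every attribute."""
    for rule in rules:
        if rule.resource != resource:
            continue
        if all(getattr(identity, attr) == value for attr, value in rule.required):
            return True
    return False


rules = [
    Rule(
        resource="orders-db",
        required=(("namespace", "backendns"), ("service_account", "roleX"), ("region", "us-east")),
    ),
]

east = WorkloadIdentity("backendns", "roleX", "cluster-a", "us-east")
west = WorkloadIdentity("backendns", "roleX", "cluster-b", "eu-west")

print(is_allowed(east, "orders-db", rules))
print(is_allowed(west, "orders-db", rules))
```

Both pods use the same namespace and service account, so a service-account-only identity could not tell them apart; the region attribute is what lets the rule separate them.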
K8s Certificate APIs
K8s Certificate APIs allow a pod with appropriate RBAC permissions to send a certificate signing request (CSR) based on a locally generated key pair. The following call flow shows how a pod can acquire a signed certificate using k8s certificate APIs.
The “Approver” could be a user or automated tooling that validates the CSR before the certificate is signed. This is no simple job, since a failure to validate or attest the originator of the CSR could easily open a big hole in the overall security. The signed certificate can eventually be used for the mTLS handshake, and at the end of the handshake the pods can use the certificate information to authorize the flow.
The application developer will have to handle the following aspects of the design:
- Defining an “Approver”. On what basis will the approver attest or authorize the pod’s CSRs? Here the service provider has to put in place a strong attestation framework that authorizes and allows certificate signing. The security promise of the identity depends on this attestation framework.
- What part of the certificate would the pods use to identify each other? A pod can be tied to a node, a cloud service provider, a location, a namespace and so on, and these attributes need to be carried in the certificate. The application developer has to define this format and make sure that all the entities adhere to it.
- Handling key rotations. The application developer has to put in place a system to rotate the keys on a periodic basis.
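As a rough sketch of the first and last points, the snippet below models an approver that signs only when the identity requested in the CSR matches facts it attested independently, plus a simple rotation rule. Everything here is hypothetical: the `CsrRequest` shape, the `namespace/service-account` identity format and the "rotate at half the lifetime" threshold are assumptions for illustration, and no real CSR parsing or signing is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CsrRequest:
    requested_identity: str        # identity the pod asks to be certified as
    attested_namespace: str        # facts the approver verified on its own
    attested_service_account: str


def expected_identity(namespace: str, service_account: str) -> str:
    """Assumed certificate identity format: <namespace>/<service-account>."""
    return f"{namespace}/{service_account}"


def approve(csr: CsrRequest) -> bool:
    """Approve only if the requested identity matches the attested facts."""
    return csr.requested_identity == expected_identity(
        csr.attested_namespace, csr.attested_service_account
    )


def needs_rotation(not_before: float, not_after: float, now: float,
                   fraction: float = 0.5) -> bool:
    """Rotate once the given fraction of the certificate lifetime has elapsed."""
    lifetime = not_after - not_before
    return now >= not_before + fraction * lifetime


honest = CsrRequest("backendns/roleX", "backendns", "roleX")
spoofed = CsrRequest("backendns/roleX", "frontendns", "roleY")

print(approve(honest))
print(approve(spoofed))
print(needs_rotation(not_before=0, not_after=900, now=450))
```

The spoofed request claims an identity its attested namespace and service account do not support, so it is refused. The quality of that attestation step is what the whole scheme rests on.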
GKE Workload Identity: Cloud Provider managed Identity
GKE workload Identity was developed with one primary use case in mind.
“How can multi-cluster GKE k8s workloads be given simple access to Google Cloud services using Google Service Accounts (GSA)?”
Without GKE workload identity, the flow an application typically followed was:
- Download the JSON blob for the GSA and add it as a k8s secret. This k8s secret had to be added to every GKE cluster that has a workload accessing the Google cloud service.
- Tell the k8s workload to use the k8s secret to allow access to the Google cloud service.
The primary problems here are the same ones a developer faces with any other credential baked into the workload:
- How do I rotate the credential in case of a compromise? GSA keys are long-lived (default 10 years).
- There could be thousands of user-managed Google services, and thus thousands of GSA keys that have to be managed by the developer.
GKE workload identity introduces the notion of an identity namespace. The identity namespace is created when workload identity is enabled for the cluster. While a k8s-namespace is local to a given k8s cluster, the identity namespace is valid across multiple clusters within a given project. The identity namespace binds the k8s-namespace and k8s-service account to a given identity.
As an example, consider the following deployment:
Note that the identity is the same across both clusters, since the k8s-namespace and k8s-service account names are the same. The backend pods here are part of two different clusters belonging to the same project (project-acme), and the admin has used the same namespace and service account in each.
Note that nothing so far binds this identity to a GSA (GCP Service Account). On the Kubernetes side, the binding is declared through an annotation on the k8s ServiceAccount.
Once this is done, one needs to bind the GSA to the identity namespace using the ‘add-iam-policy-binding’ command provided as part of the gcloud SDK (GSA_NAME and GSA_PROJECT below are placeholders for the Google Service Account being bound):
gcloud iam service-accounts add-iam-policy-binding \
  GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:project-acme.svc.id.goog[backendns/roleX]"
In short, GKE workload identity lets one bind a k8s-namespace/service-account name to an identity namespace. The resulting identity can then be bound to a GSA, enabling the k8s workloads to use that GSA indirectly through their namespace + service-account name.
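The naming scheme is simple enough to sketch in a few lines. The helper below builds the member string used in the binding from a project, a k8s-namespace and a k8s-service account, and a second helper builds the ServiceAccount annotation that names the GSA. The function names, the cluster names and the `backend-gsa` account are hypothetical; only the string formats are taken from the example above and from GKE conventions.

```python
def workload_identity_member(project_id: str, k8s_namespace: str,
                             k8s_service_account: str) -> str:
    """Build the IAM member string for a k8s namespace/service-account pair."""
    return (
        f"serviceAccount:{project_id}.svc.id.goog"
        f"[{k8s_namespace}/{k8s_service_account}]"
    )


def gsa_annotation(gsa_name: str, gsa_project: str) -> dict:
    """Annotation placed on the k8s ServiceAccount to name the GSA it maps to."""
    return {
        "iam.gke.io/gcp-service-account":
            f"{gsa_name}@{gsa_project}.iam.gserviceaccount.com"
    }


# Two hypothetical clusters in the same project, using the same namespace
# and service account names. The cluster name is not part of the identity.
clusters = ["cluster-a", "cluster-b"]
members = {
    cluster: workload_identity_member("project-acme", "backendns", "roleX")
    for cluster in clusters
}

print(members["cluster-a"])
print(members["cluster-a"] == members["cluster-b"])
print(gsa_annotation("backend-gsa", "project-acme"))
```

Because the cluster never enters the member string, workloads in every cluster of the project that share the namespace and service account names resolve to the same identity, and therefore to the same GSA.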
Using Shared Secrets
This solution comes up very often in identity discussions. What if we have a secrets manager, such as Vault or the k8s secrets manager, that keeps the credentials and provides access to them to the right applications? The Vault can be shared across multiple clusters and thus does not have to operate within the confines of a given k8s cluster. For example, a MySQL client can retrieve from the Vault the credentials it needs to access the MySQL server.
Fundamentally, this does not solve the problem of identity, since it is still not established which application should be given access to the credentials. How do we write authorization rules for credential access? In the MySQL example above, how would the Vault know which application is the MySQL client? A shared-secrets manager therefore does not solve the problem of identity. Keeping shared secrets in safe storage and limiting access to them is a different problem, and one that itself requires a strong notion of application identity.
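A toy secrets store makes the dependency visible. In the sketch below, which does not use or imitate the Vault API, every read takes the caller's identity as an input that the store has to trust. The class, the paths and the identity strings are all hypothetical.

```python
class SecretStore:
    """Toy in-memory secrets store: access is granted per caller identity."""

    def __init__(self) -> None:
        self._secrets = {}   # path -> secret value
        self._grants = {}    # caller identity -> set of readable paths

    def put(self, path: str, value: str) -> None:
        self._secrets[path] = value

    def grant(self, identity: str, path: str) -> None:
        self._grants.setdefault(identity, set()).add(path)

    def read(self, identity: str, path: str) -> str:
        # The store can only be as strict as the identity it is handed.
        # How `identity` was established is outside this class entirely.
        if path not in self._grants.get(identity, set()):
            raise PermissionError(f"{identity} may not read {path}")
        return self._secrets[path]


store = SecretStore()
store.put("mysql/orders", "demo-password")
store.grant("backendns/roleX", "mysql/orders")

print(store.read("backendns/roleX", "mysql/orders"))
try:
    store.read("frontendns/roleY", "mysql/orders")
except PermissionError as err:
    print("denied:", err)
```

The authorization rule is trivial to write once `identity` is trustworthy. Establishing that it is trustworthy is exactly the problem the secrets manager leaves open.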
To sum up
- Getting certificate management right is hard; getting identity handling right is even harder.
- Given the thrust towards Zero-Trust Architecture, Identity will play a central role in any organization that intends to be securely online.
- Application Identity is intricately tied to several other problems (such as Secrets Management, Authorization, etc) and thus there is no excuse for not doing it right.
- There are several ways to get identity wrong. Leave it to the experts. Developers should not be dealing with security elements such as authn, authz, accounting and secrets management. These should be architectural elements that are transparently available to any application.