How we secured calls between microservices at ManoMano

Another step towards zero trust for ManoMano’s information system.

No matter how strong your perimeter defenses may be, someone, someday, will manage to get past them. You should therefore never consider your backend a fundamentally secure place.

If, like us, you have opted for a microservices-based architecture, it means that someone, one day, will take control of one of your components and will try to exploit every opportunity to steal your data or damage your system. Your mission is therefore twofold: on the one hand, to do everything to make their task difficult, if not impossible, and on the other hand, to put in place tools to detect any suspicious activity.

In this article, I will focus on the first part by explaining how, at ManoMano, we limit a hacker’s ability to move around our system as he pleases. How we monitor for suspicious activity will be the subject of another article.

In a microservices-based architecture, it’s all about the call tree where clients follow paths offered by service providers. It’s a constructive and positive way to look at it, but from a hacker’s perspective, it’s also an opportunity to explore all of your APIs from the moment they’ve managed to break into the system. So the first thing to do is to limit the possibilities of calls between components to those legitimately accepted by the functional architecture and to reject all the others.

In this way, a hacker who has taken control of a component is limited in his movements: the only APIs he can query are those that this component has been authorized to use. Reaching a particular API will require much more effort, since he will have to find the right entry point, the one that ultimately leads to the API or backend he is after. Finally, this requires detailed prior knowledge of your microservices architecture on his part: things then get slightly complicated for him.

Limit the possibilities of calls

ManoMano has adopted Istio as its service mesh to secure communication between microservices through its authorization policies. Istio provides all the necessary components for microservice identification, authentication, and authorization, including a Certificate Authority (CA), an API server for distributing policies and secure naming information, and sidecar and perimeter proxies that function as Policy Enforcement Points (PEPs). Envoy is used to implement the PEPs, and the control plane takes configuration from the API server and configures the data plane PEPs. This architecture is illustrated in the security architecture diagram of the Istio documentation.

The Istio identity model uses service identity to authenticate and authorize communication between services. Each service instance has its own unique identity, which can be used to establish trust between services in the mesh.

Service accounts are used as service identities in our Kubernetes platform, while other identities such as user accounts or custom service accounts can be used in other platforms. By using service identities, Istio can provide fine-grained control over access to services, enforce policies, and provide audit trails.

Istio uses X.509 certificates to provision strong identities to every workload and to automate key and certificate rotation at scale. Istio agents work with istiod to sign certificate signing requests (CSRs) and generate certificates. Envoy requests the certificate and key from the Istio agent via the Envoy secret discovery service (SDS) API. The Istio agent monitors the certificate’s expiration and performs periodic certificate and key rotation.

Istio provides two types of authentication:

peer authentication for service-to-service authentication, and request authentication for end-user authentication.

Peer authentication uses mutual TLS for transport authentication with strong identity, security, and key management automation. Request authentication supports JSON Web Token (JWT) validation and custom/OpenID Connect providers. Authentication policies are stored in the Istio config store and Istiod keeps them up-to-date for each proxy.
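To make the second type concrete, a request authentication policy is declared through a RequestAuthentication resource. Below is a minimal sketch; the namespace, workload label, and JWT issuer are hypothetical and not taken from our setup.

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: httpbin-jwt        # hypothetical example policy
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin         # workload whose requests must carry a valid JWT
  jwtRules:
  - issuer: "https://accounts.google.com"
    jwksUri: "https://www.googleapis.com/oauth2/v3/certs"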

ManoMano has adopted Kubernetes to automate the deployment, scaling, and management of containerized applications. Istio therefore uses Kubernetes service accounts to identify workloads and establish mTLS connections. At the moment, we haven’t connected our Keycloak user authorization server to Istio, so we only perform service-to-service authorization, not per-request authorization.

Service-to-service authorization policies

Istio’s access control is enforced through authorization policies, which are specified using YAML files. For example:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin
      version: v1
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/sleep"]
    - source:
        namespaces: ["dev"]
    to:
    - operation:
        methods: ["GET"]
    when:
    - key: request.auth.claims[iss]
      values: ["https://accounts.google.com"]

These policies define rules for controlling access to specific microservices, or to paths within a microservice, based on various criteria; at ManoMano, we base them solely on the identity of the calling microservice. When a request is received by an Envoy proxy, it is processed through a chain of filters, including the authorization filter. This filter consults the appropriate authorization policy to determine whether the request should be allowed or denied based on the defined rules.
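Since we only rely on the identity of the calling microservice, our policies essentially reduce to a from/source/principals rule. Below is a minimal sketch of what such a policy could look like; the service names, namespaces, and service accounts are hypothetical.

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: orders-allow-checkout   # hypothetical provider-side policy
  namespace: orders
spec:
  selector:
    matchLabels:
      app: orders               # the service provider being protected
  action: ALLOW
  rules:
  - from:
    - source:
        # only the hypothetical "checkout" workload, identified by its
        # Kubernetes service account, is allowed to call this service
        principals: ["cluster.local/ns/checkout/sa/checkout"]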

Istio’s authorization features are available after installation, and there’s no need to enable them explicitly. However, to enforce access control for your workloads, you’ll need to apply an authorization policy. If there are no authorization policies applied to a workload, Istio will allow all requests. Conversely, as soon as an authorization policy is applied to a component, all calls not explicitly authorized are now prohibited. This is an extremely important rule to know when deploying a security policy based on this model.

Authorization policies in Istio support three actions: ALLOW, DENY, and CUSTOM. You can apply multiple policies with different actions to secure access to your workloads.

To configure an authorization policy in Istio, you create an AuthorizationPolicy custom resource with a selector, an action, and rules. The selector specifies the target workload, the action specifies whether to allow or deny, and the rules specify when to trigger the action based on sources, operations, and conditions. Several matching schemas (exact, prefix, suffix, presence) can be used for field values, and exclusion matching is available for negative conditions. Additionally, the “deny by default” behavior applies as soon as at least one ALLOW policy targets a workload: any request not matched by an ALLOW rule is denied. Finally, the source section of a rule can be left empty to allow access for both authenticated and unauthenticated identities.
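The “deny by default” behavior is sometimes made explicit with an “allow-nothing” policy, as described in the Istio documentation: an ALLOW policy with no rules matches no request, so everything targeting the workloads it covers is denied unless another ALLOW policy permits it. A sketch for a hypothetical namespace:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: orders   # hypothetical namespace
spec:
  # no rules: the policy never matches, so all requests to workloads
  # in this namespace are denied unless another ALLOW policy matches
  action: ALLOW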

At ManoMano, each development team responsible for a client microservice has the ability to push an authorization policy into a service provider’s Git repository. This results in a merge request which, if accepted, deploys the client’s authorization policy to the Istio infrastructure. But in order to simplify the deployment process for developers and avoid requiring them to learn a new technology, ManoMano has decided to integrate it into the existing Infrastructure as Code library as a dedicated extension for dependency management.

Spinak-IaC

Spinak-IaC is our next-generation Infrastructure as Code library, built on top of CDK-Terraform and CDK8s, to manage application infrastructure. It provides a curated experience to reduce cognitive load and infrastructure complexity, while also embedding security best practices by default.

With Spinak-IaC, we improve developer experience, speed up time to production, reduce misconfiguration and infrastructure deployment issues, and get started quickly with a turnkey solution that is secure and ready to go. It allows us to code our infrastructure the same way we code any other application, and provides a powerful and fully customizable pipeline based on our guidelines to help our developers get started with ease.

SpinaK has a dependency and authorization system that distinguishes between local and external components. Local components are owned by the current project, while external components belong to other projects. Dependency management involves configuring a component with the necessary information, such as environment variables, to access another component using the configureAccess function:

target.configureAccess(client);

Additionally, SpinaK has added an authorization mechanism to explicitly grant access to specific components between different projects in a controlled and secure way. Each component has its own specific authorization process, but SpinaK provides a common way to manage it by calling a function starting with “grant” on the target component from within the owning project.

target.grantAccess(client);

The SpinaK service-to-service authorization feature requires the use of Istio as the service mesh provider, with the authorization feature enabled.

In a scenario where there are two projects (project A and project B) and three services (service A, service B, and service C), service A from project A needs to call both service B from project B and service C from project A.

To achieve this, the following configuration is required:

Service A must be configured by an ExternalService B object, using the function extServiceB.configureAccess(serviceA), to know how to contact Service B.
Service A must also be configured by a Service C object, using the function serviceC.configureAccess(serviceA), to know how to contact Service C.
Service C must grant access to Service A using the function serviceC.grantAccess(serviceA, {…options}).
Service B must grant access to Service A, through an ExternalService A object, using the function serviceB.grantAccess(extServiceA, {…options}).

Below is the corresponding SpinaK configuration.

Project A configuration

const app = new App();

const serviceA = new EksApi(app, …);
const serviceC = new EksApi(app, …);
const extServiceB = new ExternalEksComponent(app, …);

extServiceB.configureAccess(serviceA); // Injection of the variables needed to contact service B from service A

serviceC.configureAccess(serviceA);
serviceC.grantAccess(serviceA, {
  resources: [
    {
      paths: ['/api/*'],
      methods: ['GET']
    }
  ]
});

app.synth();

Project B configuration

const app = new App();

const serviceB = new EksApi(app, …);

const extServiceA = new ExternalEksComponent(app, …);

serviceB.grantAccess(extServiceA, {
  resources: [
    {
      paths: ['/api/*'],
      methods: ['GET']
    }
  ]
});

app.synth();

There are additional factors to consider in achieving the goal of authorizing every service to consume another. The adoption process will result in some services running inside the Istio mesh, which will be authenticated and known by Istio, while others will run outside the mesh and be anonymous from the Istio perspective. This distinction will have significant implications for the ability to authorize services.

Deployment

When we implemented this new security model, Istio was not yet deployed to all of our components. Some were still using our old service mesh. We therefore considered three types of components: those that were managed by Istio along with all of their consumers, those that were not managed by Istio at all, and those that were managed by Istio but had at least one consumer that was not.

The first category is the simplest in terms of security policy. In this category, each component has migrated to Istio and has the ability to deploy an authorization policy. Once all consumers of a service have migrated to Istio, authorization policies must be implemented in the service provider either for all of them at the same time or not at all. This is because, upon deploying the first policy, all other calls not explicitly allowed by it will be denied, potentially breaking your environment. The service provider itself then operates in STRICT mode, which means that it only accepts mTLS connections. This strict security policy is the target for optimal service security.

The second category is simply a candidate to be integrated into the first and is therefore included in the Istio deployment roadmap. However, to achieve this goal, it will be necessary to pass temporarily through the third category. In this third category, the Istio deployment is partial, in the sense that the service provider is already supported by Istio, but not all of its consumers are. For this, Istio allows the component to operate in PERMISSIVE mode, which means that it will accept mTLS connections from consumers already supported by Istio and non-secure connections from those that are not yet supported. This is a kind of transitional mode.
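The mTLS mode itself is set with a PeerAuthentication resource. Below is a minimal sketch for a hypothetical service provider; the name, namespace, and label are illustrative only.

apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: orders            # hypothetical service provider
  namespace: orders
spec:
  selector:
    matchLabels:
      app: orders
  mtls:
    # PERMISSIVE while some consumers are still outside the mesh;
    # switched to STRICT once they have all migrated to Istio
    mode: PERMISSIVE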

However, it is important to be cautious when deploying authorization policies on a service provider in this mode, as once such a policy is in place, any unauthorized calls, whether by this policy or another, will be automatically rejected. This will inevitably be the case for components that are not yet supported by Istio, as these policies are based on the identity of a component, which can only be known once the consumer is supported by Istio. A workaround would be to implement a very permissive authorization policy that allows all incoming calls, but this would not be very useful and could create problems in the future. It is therefore preferable to do nothing until all consumers have migrated.

Conclusion

This management of authorized peer-to-peer call paths has substantially increased the difficulty, for a hacker who breaks into our backend, of compromising our system. This new architecture represents an important step forward in the implementation of our zero trust policy, one that will require a long phase of adjustment and deployment. It also calls for further steps, such as layering on authorization policies based on end-user requests, or entrusting the validation of internal tokens to Istio rather than to our in-house libraries.

We ❤️ learning and sharing

If you’d like to get in touch on any of the subjects above or about security in general, I’m always reachable through my LinkedIn profile. Drop me a line! Whether you had a similar or totally different experience, I’d love to hear about it.

