The Supertubes approach to handling Kafka ACLs in Kubernetes provides a clearer way of seeing what's actually happening by introducing a logical separation of ACL components under the names:
- KafkaACL
- KafkaRole
- KafkaResourceSelector
That way we get reusable parts that help maintain the system in the long term, allowing us to handle ACLs with a declarative approach, and overcoming the difficulties inherent in handling ACLs in a Kubernetes environment.
We rely on the Istio service mesh as the foundation of our Supertubes Kafka clusters because it provides seamless security checks for traffic between Kafka components and clients from outside the cluster. It accomplishes this automatically, through mutual TLS authentication with built-in certificate rotation and management, and is actually faster than Kafka's built-in TLS implementation.
For further benefits of running Kafka inside an Istio service mesh, see our blog post, The benefits of integrating Apache Kafka with Istio.
In an environment like this, handling Kafka ACLs can be difficult.
While you could use kafka-acls.sh, the traditional solution, it becomes difficult over time to operate a Kafka cluster inside a Kubernetes cluster and keep the ACLs up to date.
When working in such an environment, there are two ways to set ACLs.
You can set them from outside of Kubernetes, meaning you will have to set up a certificate in order to maintain access to the cluster. Or you can set them from inside the cluster by exec-ing into a pod that already contains kafka-acls.sh, which isn't ideal from a usability perspective either.
To complicate things, ACLs are difficult to follow. We wanted something that separates
- What somebody can access
- When somebody has access to it
- and Who the somebody is that we're giving permissions to.
Before getting into the nitty-gritty, let's take a look at the following example ACL configurations for the Kafka Schema Registry.
If we used kafka-acls.sh, for instance, we would have to execute the following commands against our cluster.
bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --producer --consumer --topic _schemas --group schema-registry

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --operation DescribeConfigs --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --operation Describe --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --operation Read --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --operation Write --topic _schemas

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --operation Describe --topic __consumer_offsets

bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf \
  --add --allow-principal 'User:schema-registry' --allow-host '*' \
  --operation Create --cluster kafka-cluster
So what exactly are we doing here?
Quite a few things are happening here.
First, we have set both consumer and producer access to the topic _schemas and the consumer group schema-registry for the User:schema-registry service principal. So far so good.
In the second through fifth commands, we set DescribeConfigs, Describe, Read, and Write permissions on the same topic.
Lastly, we set Describe on the __consumer_offsets topic, alongside the Create operation on the Kafka cluster.
As you can see, with some repetition configuring ACLs quickly becomes a relatively labor-intensive job, which is the perfect way to make typos and mistakes - something we absolutely do not want when talking about security.
Let's take a look at how we would solve this problem.
We provide a way that is not just declarative - in line with the GitOps and Configuration as Code trends of today - but is also easier to maintain, since we provide many reusable shortcuts in the form of KafkaRoles and KafkaResourceSelectors.
KafkaResourceSelectors - Beginning the What
Let's start with figuring out what we're trying to protect with the authorization.
This is something that's getting used very frequently, over and over again. Just think about the example given above.
We used the topic _schemas five times over the course of seven commands, and that was just one principal.
Elevating this topic into its own CR gives us the flexibility to reuse it, not just making our lives easier but making the solution more error resilient as well.
KafkaResourceSelectors are filters for one or more Kafka resources of the same type. These types can be any of the following:
- topic
- group
- transactionalId
- cluster
Here are two examples:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaResourceSelector
metadata:
  name: schemas-topic
  namespace: kafka
spec:
  type: topic
  name: _schemas
  pattern: literal
Here, we're selecting the topic named _schemas. By saying that the pattern is literal, we're specifying that the selector should be an exact match for the name field.
You can also use the pattern prefixed to indicate that the name field acts as a prefix, creating even more versatile selectors that group multiple topics together.
This is especially handy if you have a lot of smartly named topics, and you don't want to create a selector for every single one of them.
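As a rough sketch of the prefixed pattern, the selector below would cover every topic whose name starts with a given prefix. The selector name orders-topics and the prefix orders- are hypothetical, chosen only for illustration; the fields mirror the literal examples in this post.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaResourceSelector
metadata:
  name: orders-topics   # hypothetical selector name
  namespace: kafka
spec:
  type: topic
  name: orders-         # hypothetical prefix: matches orders-eu, orders-us, ...
  pattern: prefixed
```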
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaResourceSelector
metadata:
  name: consumer-offsets-topic
  namespace: kafka
spec:
  type: topic
  name: __consumer_offsets
  pattern: literal
ResourceSelectors are a tool that helps us avoid repeating ourselves every time we'd like to refer to a Kafka resource.
They also provide a centralized place to modify and track all the resources handled by our ACLs.
With a simple kubectl get KafkaResourceSelector command we can quickly see which resources are covered by our ACLs, making it easy to spot if we're missing something.
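The same shape should carry over to the other resource types. As a sketch, and assuming group selectors accept the same name and pattern fields as topic selectors, the schema-registry consumer group from the kafka-acls.sh example could be selected like this (the selector name schema-registry-group is hypothetical):

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaResourceSelector
metadata:
  name: schema-registry-group   # hypothetical selector name
  namespace: kafka
spec:
  type: group                   # one of the four types listed above
  name: schema-registry         # consumer group used in the CLI example
  pattern: literal
```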
KafkaRoles - Providing the When
Roles have been part of access control systems for years. We don't have to look very hard for an example, since Kubernetes RBAC works in practically the same way.
KafkaRole builds on the same principle we discussed for KafkaResourceSelectors, easing our job by encouraging reusability and helping us make the process clearer, easier to follow, and therefore more maintainable.
KafkaRoles provide a way to easily group multiple ACL permissions into one single reusable resource.
Let's look at our examples:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaRole
metadata:
  name: consumer
spec:
  topic: # operations on topics
    operations:
      allow:
        - read
        - describe
  group: # operations on consumer groups
    operations:
      allow:
        - read

apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaRole
metadata:
  name: producer
spec:
  topic:
    operations:
      allow:
        - write
        - describe
        - create
  transactionalId:
    operations:
      allow:
        - write
        - describe
The consumer and producer permissions are so frequently necessary that kafka-acls.sh has built-in flags to handle them.
We wanted to take the next step and provide a way to create your own custom groups of ACL operations.
The two KafkaRoles listed above, consumer and producer, are deployed by default when you install Supertubes, meaning they're ready to use, and you do not have to manually apply them to the cluster.
In spec you can specify your allow and deny permissions under the same four resource types listed in ResourceSelectors:
- topic
- group
- transactionalId
- cluster
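A custom role could look like the following sketch. The role name topic-auditor is hypothetical, and placing deny next to allow under operations is an assumption based on the description above; the operation names are the ones already used in this post.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaRole
metadata:
  name: topic-auditor     # hypothetical role name
spec:
  topic:
    operations:
      allow:
        - describe
        - describe_configs
      deny:               # assumed to sit alongside allow
        - write
```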
Also, it's important to note that roles are not mandatory. They are a reusable tool to ease headaches that might arise from overuse of copy-paste design patterns, saving you from a handful of bugs in the long run but also helping you keep a tight grip on access control to your Kafka cluster.
Imagine one day wanting to add a read operation to all producers. Unlike with kafka-acls.sh, you just have to add read under the producer role, and it will automatically get propagated to every KafkaACL CR that references it.
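As a sketch of that change, the producer role shown earlier would need only one extra line under its topic operations. Whether you edit the default role in place or maintain your own copy under a different name is a choice this example does not settle.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaRole
metadata:
  name: producer
spec:
  topic:
    operations:
      allow:
        - write
        - describe
        - create
        - read          # the newly added operation
  transactionalId:
    operations:
      allow:
        - write
        - describe
```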
So how does all this come together?
KafkaACL - Providing the Who
The KafkaACL custom resource is at the heart of this system; it provides a binding between a subject and the two other components we've discussed, KafkaRole and KafkaResourceSelector.
Through it, you tell the system which principal should receive the permissions defined in a Role, and which Kafka resources those permissions apply to through a ResourceSelector.
Let's take a look.
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaACL
metadata:
  name: schema-registry
  namespace: kafka
spec:
  kind: User
  name: schema-registry
  clusterRef:
    name: kafka
    namespace: kafka
  acls:
    - topic:
        operations:
          allow:
            - read
            - write
            - describe
            - describe_configs
      resourceSelectors:
        - name: schemas-topic
          namespace: kafka
    - topic:
        operations:
          allow:
            - describe
      resourceSelectors:
        - name: consumer-offsets-topic
          namespace: kafka
    - cluster:
        operations:
          allow:
            - create
  roles:
    - name: consumer
      resourceSelectors:
        - name: schemas-topic
          namespace: kafka
    - name: producer
      resourceSelectors:
        - name: schemas-topic
          namespace: kafka
That's a lot of YAML, but it's really quite simple, so stay with me. If we go through spec, the first thing we have to provide is the subject of the ACL.
kind: User
name: schema-registry
Here, you need to give a kind, which is User or anything else that the configured Kafka authorizer supports (see the authorizer.class.name Kafka config for details), then a name, which should be the exact name of the service account we'd like to give the permissions to.
Wait, service account?
Yes. Basically, we authenticate Kafka clients using their Kubernetes namespaces and service accounts. The power of combining Kafka, Istio, and Kubernetes really shines through here, sprinkled with some WebAssembly Envoy filters. The best part is that you don't actually need to know the minutiae of how this works, because Supertubes handles all of it behind the scenes for you. If that piqued your curiosity and you want to know more, check out our previous blog that goes into more detail.
Back to our example. The next thing you have to do is determine which KafkaCluster you'd like to bind the ACL to.
A lot of our users run multiple Kafka clusters on a single Kubernetes cluster, so binding each KafkaACL to a specific cluster helps keep their ACLs separate from each other.
clusterRef:
  name: kafka
  namespace: kafka
Then comes the lion's share of the YAML.
acls:
  - topic:
      operations:
        allow:
          - read
          - write
          - describe
          - describe_configs
    resourceSelectors:
      - name: schemas-topic
        namespace: kafka
  - topic:
      operations:
        allow:
          - describe
    resourceSelectors:
      - name: consumer-offsets-topic
        namespace: kafka
  - cluster:
      operations:
        allow:
          - create
acls let you define inline permissions for your principal. Seem familiar? It's the same way we define them in KafkaRoles.
If you remember, we said that Roles are optional, and acls is exactly the reason why. KafkaRoles are reusable sections of acls, so you can use them again somewhere else later, avoiding duplication in the process.
That being said, acls are still very useful; here we're using them for one-off operations, so that we don't need to create Roles with only one operation in them.
In resourceSelectors we can provide a list of selectors, telling the system What resource we are trying to bind the operations to.
Note that the cluster part is without a resourceSelector. That's because, in this case, the clusterRef at the top of the CR is being used as an anchor for the operation.
Last but not least comes the optional part, providing a place where we can reference and use our KafkaRoles CRs.
Notice that Roles are without a namespace, so you can reuse them across your other KafkaClusters.
roles:
  - name: consumer
    resourceSelectors:
      - name: schemas-topic
        namespace: kafka
  - name: producer
    resourceSelectors:
      - name: schemas-topic
        namespace: kafka
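To illustrate the reuse, here is a sketch of a second KafkaACL that grants another principal the consumer role on the same selector. The service account name orders-reader is hypothetical, and the sketch assumes that a KafkaACL may contain only roles, without an inline acls section.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaACL
metadata:
  name: orders-reader        # hypothetical resource name
  namespace: kafka
spec:
  kind: User
  name: orders-reader        # hypothetical service account
  clusterRef:
    name: kafka
    namespace: kafka
  roles:
    - name: consumer         # default role, referenced without a namespace
      resourceSelectors:
        - name: schemas-topic
          namespace: kafka
```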
In the diagram below you can see how it all comes together.
Remember, both the consumer and producer roles are on the cluster and ready to be used when you install Supertubes. The only thing you have to do is set the authorizer in your KafkaCluster CR
readOnlyConfig: |
  authorizer.class.name=kafka.security.authorizer.AclAuthorizer
  allow.everyone.if.no.acl.found=false
and create the KafkaACL and the KafkaResourceSelector CRs. The latter, of course, is reusable, and helps you out in the long run.
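For orientation, this sketch shows where the readOnlyConfig block would sit, assuming a KafkaCluster named kafka in the kafka namespace, as in the clusterRef used earlier. The rest of the cluster spec is omitted, so this fragment is not a complete manifest.

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: kafka          # assumed to match clusterRef.name
  namespace: kafka     # assumed to match clusterRef.namespace
spec:
  readOnlyConfig: |
    authorizer.class.name=kafka.security.authorizer.AclAuthorizer
    allow.everyone.if.no.acl.found=false
  # remaining cluster settings omitted
```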
One thing that you might have noticed is that this solution provides no way of setting a host field where you can specify the IP from which the principal can access resources. The reason for that is simple if you think about it.
In Kubernetes, pods do not have a permanent IP address; they move around the cluster constantly and in accordance with a variety of factors, mainly resource allocation quotas and what the Scheduler thinks is the best place in any given moment for the pod. But that's great! One of the reasons we love Kubernetes is this kind of flexibility, and the fact that the vast majority of the time, we don't even have to think about IPs in the cluster - making our job that much easier.
Filtering on the client IP address is also not that great an idea if the client comes from outside the cluster.
IPs can be spoofed, making it easy to get around such a filter.
So what can we do?
Istio provides many ways of denying a client's access to the service mesh, in the process denying access to the Kafka cluster as well. The ultimate solution relies on certificates for authentication of the client. We already do that, but currently it only works one way: we can give certificates to trusted clients, but Istio does not support revoking those certificates as of right now. Until Istio provides support for certificate revocation lists, we can instead, as a workaround, make the lifetime of a given certificate as short as possible and renew it only when required - a good security practice in its own right.
The feedback we get from our customers and from the community is overwhelmingly positive. I'd like to thank all of you for reaching out to us and sharing your thoughts and helping to shape the future of Supertubes in ways we hadn't even begun to think about. We are not done yet though, and are continuously improving and developing new features and capabilities. We always have new ideas that we're eager to show you. Here is a little sneak peek of what's next on our roadmap:
- Observability and management UI leveraging Istio telemetry data and the ability to drill down into the root cause of anomalies. Thanks to Istio we can provide data and telemetry about the state of the Kafka cluster that wasn't previously possible.
- Envoy protocol filter-based audits as an extension of the Envoy Kafka protocol filter
About Supertubes
Banzai Cloud Supertubes (Supertubes) is the automation tool for setting up and operating production-ready Kafka clusters on Kubernetes, leveraging a Cloud-Native technology stack. Supertubes includes Zookeeper, the Banzai Cloud Kafka operator, Envoy, Istio and many other components that are installed, configured, and managed to operate a production-ready Kafka cluster on Kubernetes. Some of the key features are fine-grained broker configuration, scaling with rebalancing, graceful rolling upgrades, alert-based graceful scaling, monitoring, TLS with automatic certificate renewal and multiple options for disaster recovery.