Anatomy of a Kafka CVE

With the recent disclosure of CVE-2018-1288 that affected Kafka, we decided to release a blog post with a few more details about the issue.

Affected releases:

First of all, you should upgrade to one of the following releases or greater:

- 1.0.1
- 0.11.0.3
- 0.10.2.2

Kafka 1.1 and 2.0 are not affected as they were released after the issue was fixed.

Note that 0.10.1, 0.10.0 and 0.9.0 are vulnerable to the issue and have not been fixed. So, if you are still running one of these old releases, you should update to something more recent.

This issue has been present since the Authorizer interface was added, back in 0.9.0.0 (November 2015)!

Background:

Inter-broker requests are messages exchanged between brokers in a cluster. They are used not only to signal a change of state (a broker shutting down or new cluster metadata) but also for data replication.

In order to replicate data within a cluster, Kafka brokers use Fetch requests like normal consumers. As described in the Kafka protocol, Fetch requests contain a “replica_id” field which is set to the broker id when used for replication, whereas normal consumers always set this field to -1.
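As a minimal sketch of that distinction, the Java snippet below classifies a Fetch request by its replica_id. The class and method names are hypothetical and are not Kafka's own code; only the -1 convention comes from the protocol description above.

```java
// Hypothetical helper, not part of Kafka: classifies a Fetch request by its replica_id.
final class FetchOrigin {
    // Value that normal consumers put in the replica_id field.
    static final int CONSUMER_REPLICA_ID = -1;

    // True when the request claims to come from a broker replicating data.
    // This is only a claim: the field is filled in by whoever sends the request.
    static boolean claimsToBeBroker(int replicaId) {
        return replicaId != CONSUMER_REPLICA_ID;
    }

    public static void main(String[] args) {
        System.out.println(claimsToBeBroker(-1)); // normal consumer
        System.out.println(claimsToBeBroker(2));  // follower with broker id 2
    }
}
```

Because the field is set by the sender, its value alone says nothing about whether the sender really is a broker.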

When using an Authorizer, all Kafka inter-broker messages require the sender to have an ACL of ClusterAction on the Cluster resource.

Details of the issue:

Kafka was not checking authorizations correctly when handling Fetch requests for replication (inter-broker). This allowed malicious consumers to impersonate brokers. Starting from the default KafkaConsumer, only a small code change is required to forge Fetch requests that contain a broker id and hence confuse the cluster.

When impersonating brokers, a malicious Consumer can cause a number of issues:

- Render cluster metadata completely incoherent as it can make brokers that are out-of-sync, offline or do not exist appear in-sync
- Use replication quotas to bypass user quotas and impact replication traffic

The discovery:

The issue was discovered by Edoardo Comar and myself while we were pair programming on the 16th of January 2018. That same day, we reported the issue to the Kafka security team and suggested a fix.

At the time, we were in the process of deleting some custom logic we had added to Kafka to check permissions and reimplementing it using the Authorizer interface. While doing that, we realized that when Kafka was handling Fetch requests that had a “replica_id” not equal to -1, hence appearing as coming from a broker, it was not checking if the sender had the required ACL (ClusterAction on Cluster).

The fix:

The fix is simply to validate that the sender of any Fetch request that appears to come from a broker has the necessary ACL: ClusterAction on the Cluster resource.
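The check can be sketched in Java as follows. This is a simplified model and not the actual Kafka patch: every type and name below (SimpleAuthorizer, isFetchAllowed, the example principals) is a hypothetical stand-in. The Read-on-Topic requirement for normal consumers is not discussed in this post and is included only to complete the example.

```java
// Simplified model of the check. All types are hypothetical stand-ins, not Kafka classes.
final class FetchAuthorizationSketch {
    enum Operation { READ, CLUSTER_ACTION }
    enum ResourceType { TOPIC, CLUSTER }

    // Stand-in for an Authorizer: may this principal perform this operation on this resource type?
    interface SimpleAuthorizer {
        boolean authorize(String principal, Operation operation, ResourceType resourceType);
    }

    static final int CONSUMER_REPLICA_ID = -1;

    // Returns true if the Fetch request should be served.
    static boolean isFetchAllowed(SimpleAuthorizer authorizer, String principal, int replicaId) {
        if (replicaId != CONSUMER_REPLICA_ID) {
            // The request claims to come from a broker: require ClusterAction on Cluster.
            return authorizer.authorize(principal, Operation.CLUSTER_ACTION, ResourceType.CLUSTER);
        }
        // Normal consumer: require Read on the topic.
        return authorizer.authorize(principal, Operation.READ, ResourceType.TOPIC);
    }

    // Example ACLs: "User:broker" may do anything, "User:alice" may only read topics.
    static SimpleAuthorizer exampleAuthorizer() {
        return (principal, operation, resourceType) -> {
            if (principal.equals("User:broker")) {
                return true;
            }
            return principal.equals("User:alice")
                    && operation == Operation.READ
                    && resourceType == ResourceType.TOPIC;
        };
    }

    public static void main(String[] args) {
        SimpleAuthorizer auth = exampleAuthorizer();
        System.out.println(isFetchAllowed(auth, "User:alice", -1)); // true: ordinary consumer fetch
        System.out.println(isFetchAllowed(auth, "User:alice", 1));  // false: forged broker id is rejected
        System.out.println(isFetchAllowed(auth, "User:broker", 1)); // true: genuine replication fetch
    }
}
```

In this model, the vulnerable behaviour corresponds to skipping the first branch, so that a consumer holding only Read on a topic could send a replica_id other than -1 and still be served.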

We followed the Apache Security process and submitted a fix that was merged into trunk and 1.0 on the 18th of January. We backported the fix to the 0.11 and 0.10.2 branches as well.

Note that the process requires the fix not to have a matching JIRA issue and the PR to be labelled as a ‘MINOR’ fix, without mentioning security flaws in the comments.

Follow ups:

Following this discovery, we reviewed all other ACL checks to ensure no other code path had a similar issue and fortunately we could not find any.

Many new requests have been added in the past couple of years, each adding some ACLs. We decided to fully document all the ACL checks that are performed for all requests. We published this data in the Kafka wiki so everyone can refer to it when adding some more or when implementing their own Authorizer.

Reviewing this data, we noticed a few inconsistencies, as ACLs have been added organically and we never had a full view of them all. We reported these inconsistencies back to the developer mailing list. For the most pressing issue (in our eyes) we opened KIP-277, and I’m happy to say it has been voted on and merged, and will be in Kafka 2.0.0.

Finally, we feel that the delay between the discovery of the issue and the public disclosure was too long: over 6 months. The Apache Security process works best when releases are published in a timely manner following the security report.

IBM Message Hub is Apache Kafka as a service for IBM Cloud. You can get started at https://console.bluemix.net/docs/services/MessageHub/index.html#messagehub .

