End-To-End Data Security Using Apache Kafka


For those not yet familiar with Kafka: it is a scalable, developer-friendly, and highly fault-tolerant messaging system, widely used for building distributed applications.

Kafka is rapidly gaining ground in Apache Hadoop enterprise deployments and has become a popular messaging bus in many other Big Data solutions as well. At present, Kafka is used by well-known web-based services and companies such as LinkedIn, Twitter, Cisco, SAP, PayPal, Dropbox, and Airbnb.

What makes Kafka so popular across so many companies and services? It stands out for its capacity to handle high message volumes at low latency while remaining reliable and supporting security. Kafka is suitable for solutions based on log data, Big Data applications, sensor and device data, stream monitoring and processing, call data records, real-time monitoring and analysis, asynchronous applications, fraud and security, bridging to the cloud, and much more.

  • Security Options in Apache Kafka

In versions older than 0.9, security was achieved by controlling access at the network level, which is not a good option when a multi-tenant cluster serves larger applications. Consequently, securing Kafka became one of the most requested features. Security is one of the most important concerns today, when everyone wants access to all kinds of data.

The Kafka community has since added a number of features that can be used to increase cluster security. Addressing security threats is crucial today, given the wide variety of cyber-attacks, and with these features Apache Kafka can be a good choice for an enterprise messaging system. The following security options are available in Kafka.

  • Authentication & Authorization

Kafka provides authentication and authorization mechanisms with various options. SSL is the most popular method.

Authentication

  • SSL certificate support for 1-way (broker only) or 2-way (broker and client) authentication
  • SASL challenge/response support via Kerberos
  • Mix-and-match: SSL for wire-level encryption, SASL for authentication
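As a rough illustration of the options above, the client-side settings could be assembled as follows. This is a minimal sketch, not a complete setup: the property keys are standard Kafka client configuration names, while the class name, host name, file paths, and passwords are placeholders assumed for the example. The Kerberos login (JAAS) settings and the broker-side configuration, such as `ssl.client.auth=required` for 2-way authentication, are omitted.

```java
import java.util.Properties;

final class KafkaClientSecurityConfig {

    /** SSL with 2-way authentication: the client verifies the broker and presents its own certificate. */
    static Properties sslClientProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1.example.com:9093"); // placeholder host
        props.setProperty("security.protocol", "SSL");
        // The truststore lets the client verify the broker certificate (enough for 1-way).
        props.setProperty("ssl.truststore.location", "/path/to/client.truststore.jks"); // placeholder
        props.setProperty("ssl.truststore.password", "changeit");                       // placeholder
        // The keystore is only needed when the broker also authenticates the client (2-way).
        props.setProperty("ssl.keystore.location", "/path/to/client.keystore.jks");     // placeholder
        props.setProperty("ssl.keystore.password", "changeit");                         // placeholder
        props.setProperty("ssl.key.password", "changeit");                              // placeholder
        return props;
    }

    /** Mix-and-match: SSL for wire-level encryption, SASL/Kerberos for authentication. */
    static Properties saslSslClientProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1.example.com:9094"); // placeholder host
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "GSSAPI");             // GSSAPI is the Kerberos mechanism
        props.setProperty("sasl.kerberos.service.name", "kafka");  // assumed broker service name
        props.setProperty("ssl.truststore.location", "/path/to/client.truststore.jks"); // placeholder
        props.setProperty("ssl.truststore.password", "changeit");                       // placeholder
        return props;
    }

    public static void main(String[] args) {
        // Print the mutual-SSL settings; these would be passed to a producer or consumer.
        sslClientProps().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The returned `Properties` would be merged with the usual serializer and group settings before creating a producer or consumer.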

Authorization

  • Access control lists
    • Operations: Read, Write, Create, Describe, ClusterAction, ALL
    • Resources: Topic, Cluster, ConsumerGroup
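To make the access control list idea concrete, here is a small in-memory model of how a rule combines a principal, an operation, and a resource. It is an illustrative sketch only and is not Kafka's authorizer: the class and method names are hypothetical, only allow rules are modeled, and anything without a matching rule is denied.

```java
import java.util.ArrayList;
import java.util.List;

final class AclSketch {
    enum Operation { READ, WRITE, CREATE, DESCRIBE, CLUSTER_ACTION, ALL }
    enum ResourceType { TOPIC, CLUSTER, CONSUMER_GROUP }

    /** One allow rule: who may do what on which resource. */
    static final class Acl {
        final String principal;
        final Operation operation;
        final ResourceType resourceType;
        final String resourceName;

        Acl(String principal, Operation operation, ResourceType resourceType, String resourceName) {
            this.principal = principal;
            this.operation = operation;
            this.resourceType = resourceType;
            this.resourceName = resourceName;
        }

        boolean permits(String who, Operation op, ResourceType type, String name) {
            return principal.equals(who)
                    && (operation == Operation.ALL || operation == op) // ALL covers every operation
                    && resourceType == type
                    && resourceName.equals(name);
        }
    }

    private final List<Acl> acls = new ArrayList<>();

    void allow(String principal, Operation op, ResourceType type, String name) {
        acls.add(new Acl(principal, op, type, name));
    }

    /** Deny by default: access is granted only if some rule matches. */
    boolean isAllowed(String principal, Operation op, ResourceType type, String name) {
        for (Acl acl : acls) {
            if (acl.permits(principal, op, type, name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AclSketch acls = new AclSketch();
        acls.allow("User:alice", Operation.WRITE, ResourceType.TOPIC, "orders");
        acls.allow("User:bob", Operation.READ, ResourceType.TOPIC, "orders");
        System.out.println(acls.isAllowed("User:alice", Operation.WRITE, ResourceType.TOPIC, "orders")); // true
        System.out.println(acls.isAllowed("User:bob", Operation.WRITE, ResourceType.TOPIC, "orders"));   // false
    }
}
```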

The overall solution with authentication and authorization looks like this:

[Figure: Apache Kafka Security 101 (source: confluent.io)]

  • Data Encryption

The benefit of setting up encryption on Kafka messages is stronger confidentiality assurance for the messages being sent. The original messages are encrypted with a key before they are transmitted to Kafka. On the receiving side, the consumer decrypts each message to recover the original. In this case, Kafka is never exposed to the clear-text messages. The data encryption flow is shown in the diagram below.

[Figure: end-to-end data encryption with Apache Kafka]
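The flow described above can be sketched with the JDK's built-in crypto classes. This is a minimal sketch under stated assumptions: the article does not specify a cipher mode, so AES in GCM mode is assumed here, and the class name, method names, and the IV-then-ciphertext layout are hypothetical. The byte array returned by `encrypt` is what a producer would send as the record value, and `decrypt` is what a consumer would call on the value it receives.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class PayloadCrypto {
    private static final int IV_BYTES = 12;   // IV size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh 256-bit AES key. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Producer side: returns IV followed by ciphertext and tag. */
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a fresh IV for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(iv.length + ciphertext.length).put(iv).put(ciphertext).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Consumer side: reads the IV from the front of the envelope and decrypts the rest. */
    static byte[] decrypt(SecretKey key, byte[] envelope) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, envelope, 0, IV_BYTES));
            return cipher.doFinal(envelope, IV_BYTES, envelope.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] recordValue = encrypt(key, "sensor-42: 21.5C".getBytes(StandardCharsets.UTF_8));
        // Only recordValue would travel through Kafka; the broker never sees the clear text.
        System.out.println(new String(decrypt(key, recordValue), StandardCharsets.UTF_8));
    }
}
```

Because GCM is an authenticated mode, a message that was tampered with in transit, or decrypted with the wrong key, fails in `decrypt` instead of yielding garbage.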

  • Encryption Algorithm for Kafka

Since Kafka is a high-volume, low-latency message broker, we need a fast and secure encryption algorithm that can encrypt large amounts of data quickly. The obvious choice in this scenario is AES (Advanced Encryption Standard), mainly because of its popularity and widely available hardware support. Modern Intel and AMD processors support AES encryption and decryption natively within the CPU, which is faster than software-only AES implementations. The problem with AES in our context is that it is a symmetric cipher: a single key is used for both encryption and decryption. This raises the question of how the key can be exchanged securely between producer and consumer. The short answer: it is not possible unless you have a secure channel.
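The symmetric-key point can be illustrated with a short sketch: the producer generates an AES key, and the consumer must end up with exactly the same bytes. The class and method names are hypothetical, and the Base64 string stands in for whatever is handed over the secure channel; how that channel is provided is outside the scope of this sketch.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

final class SharedKeySketch {

    /** Producer side: generate a 256-bit AES key. */
    static SecretKey generateKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encode the raw key bytes as text so they can be handed over a secure channel. */
    static String exportKey(SecretKey key) {
        return Base64.getEncoder().encodeToString(key.getEncoded());
    }

    /** Consumer side: rebuild the same AES key from the transferred text. */
    static SecretKey importKey(String encoded) {
        return new SecretKeySpec(Base64.getDecoder().decode(encoded), "AES");
    }

    public static void main(String[] args) {
        SecretKey producerKey = generateKey();
        String transferred = exportKey(producerKey); // must never travel over an insecure channel
        SecretKey consumerKey = importKey(transferred);
        // Both sides now hold identical key material.
        System.out.println(Arrays.equals(producerKey.getEncoded(), consumerKey.getEncoded())); // true
    }
}
```

Anyone who can read the transferred string can decrypt every message, which is why the exchange needs a secure channel.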

  • Performance and Space Analysis with Encryption Algorithm

Adding end-to-end encryption increases message processing time and CPU usage.

Our development team at VOLANSYS measured the performance of encrypted and non-encrypted message exchange on different systems (a live Kafka server, a development server, developers’ laptops, and VMs) using a 500-byte message envelope.

We ran the producer API with encryption enabled and found that sending messages to Kafka took 25% to 30% longer. With encryption, CPU usage increased by 40% to 55%.

We also compared the space used when sending encrypted and non-encrypted messages to Kafka and found the encrypted case to be 5-10% more space efficient.
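For readers who want to take a similar measurement, the sketch below times only the encryption step for 500-byte messages. It is not the harness used for the figures above: it assumes AES-GCM with a 256-bit key, uses hypothetical names, and leaves out the Kafka producer, the network, and any compression, so its numbers are not comparable to the end-to-end results.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class EncryptionCostSketch {
    static final int MESSAGE_BYTES = 500; // same message size as in the text

    /** Encrypts `count` messages; returns {elapsed nanoseconds, bytes per encrypted message}. */
    static long[] run(int count) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            SecretKey key = generator.generateKey();

            SecureRandom random = new SecureRandom();
            byte[] message = new byte[MESSAGE_BYTES];
            random.nextBytes(message);
            byte[] iv = new byte[12];
            long encryptedBytes = 0;

            long start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                random.nextBytes(iv); // fresh IV per message
                Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
                cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
                // Size on the wire if the IV is sent along with ciphertext and tag.
                encryptedBytes = iv.length + cipher.doFinal(message).length;
            }
            long elapsed = System.nanoTime() - start;
            return new long[] { elapsed, encryptedBytes };
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run(10_000); // warm-up pass so the JIT compiler has a chance to optimize the loop
        int count = 100_000;
        long[] result = run(count);
        System.out.println("average ns per message: " + result[0] / count);
        System.out.println("encrypted bytes per message: " + result[1]);
    }
}
```

Comparing these timings against a full producer run with and without encryption is what would show the end-to-end overhead on a given system.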

  • What Can VOLANSYS Offer with Apache Kafka?

VOLANSYS Technologies is an enterprise-ready cloud solution provider and business software development company that has worked extensively on the Apache Kafka platform. Our data streaming solutions built on Apache Kafka enable digital business communication in real time. Streaming real-time data helps a business stay equipped and future-driven, and our Apache Kafka solutions are designed to deliver high-fidelity streaming and security features. Does your business need the edge of real-time streaming solutions? We can offer you highly customized, enterprise-ready streaming solutions on Apache Kafka. To know more, drop us a mail at business@volansys.com or call us at +1 510 358 4310.


About Author: Chandani Patel

Chandani works in software technologies as a system designer and system architect. She continuously pushes herself to bring noticeable changes to the cloud field and persistently stays up to date with trends, even though they tend to fade with time. You can connect with her on LinkedIn: chandanipatel
