5 Kafka Management Tips and Tricks

What exactly is Kafka?

Apache Kafka is an open source project that allows for robust distributed processing of continuous data streams. It is used in production by hundreds of companies around the world, including Netflix, Twitter, Spotify, Uber, and many others.

Kafka’s architecture and implementation let stream processing applications work with geographically distributed data streams while remaining highly dependable and available. While Kafka is simple to use, it can be complex to optimize.

Here are our first five pointers to help you improve your Kafka system and get ahead!

Logs

Kafka provides a large number of log configuration options. The defaults are generally reasonable, but most users will need to change at least a few settings to suit their needs. You must consider retention policies, clean-ups, compaction, and compression.
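As a starting point, here are a few of the broker-level log settings worth reviewing. The values shown are illustrative, not recommendations; tune them to your retention and storage requirements:

```properties
# server.properties -- illustrative broker-level log settings
log.retention.hours=168          # delete segments older than 7 days
log.retention.bytes=-1           # no size-based retention limit
log.segment.bytes=1073741824     # roll segments at 1 GB (the default)
log.cleanup.policy=delete        # use "compact" for changelog-style topics
log.cleaner.enable=true          # required if any topic uses compaction
```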

Requirements for Hardware

When tech teams first start playing with Kafka, they often merely ‘ballpark’ the hardware by spinning up a large server and hoping it works. Kafka does not require a large amount of resources. Because it’s built for horizontal scaling, you can get away with using rather inexpensive commodity hardware.

Unless you’re using SSL and compressing logs, the CPU doesn’t need to be too powerful. The more cores you have, the better your parallelization will be. If you do need compression, we recommend the LZ4 codec for the best performance in most circumstances.
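Compression is typically configured on the producer, though it can also be set at the broker or topic level. A minimal sketch, assuming a standard producer properties file:

```properties
# producer.properties -- illustrative compression setting
compression.type=lz4
# Topic- or broker-level alternative: compression.type=producer keeps
# whatever codec the producer used, avoiding recompression on the broker.
```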

Memory: For heap space, Kafka works best with at least 6 GB of memory. The rest of the machine’s RAM should be left to the OS page cache, which is critical for client throughput. Kafka can work with less RAM, but it won’t be able to manage a lot of data. For high-throughput use cases, at least 32 GB is recommended.
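One way to apply this is to cap the broker JVM heap so the remaining RAM stays available to the OS page cache. The standard `kafka-server-start.sh` launcher reads the `KAFKA_HEAP_OPTS` environment variable; the 6 GB value here is illustrative:

```shell
# Cap the broker JVM heap at 6 GB; fixing -Xms to the same value
# avoids heap resizing pauses at runtime.
export KAFKA_HEAP_OPTS="-Xms6g -Xmx6g"
```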

SSDs will not provide significant value due to Kafka’s sequential disk I/O architecture. NAS should not be used. Multiple drives in a RAID configuration can be effective.

Network and file system: If at all possible, use XFS and maintain your cluster in a single datacenter. The more network bandwidth available, the better.

Zookeeper

We could write an entire article about ZooKeeper alone. It’s a versatile piece of software that may be used for service discovery as well as a variety of distributed configuration scenarios.

ZooKeeper should not be installed on the same servers as Kafka in a high-volume production setting.

Thanks to Docker’s widespread adoption, many businesses now employ this shortcut. If you take the proper measures, this is fine for development environments or even small production deployments. In a larger system, you run the risk of losing more of your infrastructure if a single server fails. It’s also bad for security, because Kafka and ZooKeeper are likely to have quite different sets of users, and you won’t be able to isolate them well.

Use no more than five ZooKeeper nodes unless you have a compelling reason to do so.

One node is sufficient for a development environment. You should use the same number of nodes in your staging environment as in production. For a normal Kafka cluster, three ZooKeeper nodes should be enough. If you have a large Kafka deployment, expanding to five ZooKeeper nodes to improve latency can be worth it, but keep in mind that this will place greater demand on the nodes.

Aim for the lowest possible latency.

Use servers with plenty of network bandwidth. Use the right drives for the job, and keep logs on a separate disk. Isolate the ZooKeeper process and make sure swap is turned off. Make sure your instrumentation dashboards track latency.
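A sketch of a `zoo.cfg` that follows these recommendations, with the transaction log on its own disk. Paths and hostnames are placeholders:

```properties
# zoo.cfg -- illustrative settings for a dedicated ZooKeeper node
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper/snapshot
dataLogDir=/var/lib/zookeeper/txlog   # transaction log on a separate disk
autopurge.snapRetainCount=3           # clean up old snapshots automatically
autopurge.purgeInterval=24
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```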

Redundancy and Replication

When thinking about redundancy with Kafka, there are a few factors to consider. The replication factor is the first and most noticeable. Kafka defaults to 1, but for most production needs, 3 is the best option. It will allow you to lose a broker without losing your mind: your system will continue to function even if a second broker fails independently. You must also consider datacenter racks and availability zones in addition to the replication factor.
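These settings can be expressed in the broker configuration. The rack identifier below is a placeholder; setting `broker.rack` enables rack-aware replica placement across racks or availability zones:

```properties
# server.properties -- illustrative replication settings
default.replication.factor=3   # applies to automatically created topics
min.insync.replicas=2          # with producer acks=all, tolerates one replica down
broker.rack=us-east-1a         # placeholder rack/zone identifier
```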

Topic Configuration

How you configure your topics will have a big impact on the performance of your Kafka cluster. In general, you should treat topic settings as immutable, because changes to partition count or replication factor can be quite painful. If you need to make a significant change to a topic, creating a new one is generally the best option. Always run new topics through their paces in a staging environment first.

Start at 3 for the replication factor, as previously said. If you have a long message to process, see if you can either divide it up into ordered parts (simple using partition keys) or just send pointers to the actual data (links to S3, for example). If you absolutely must handle larger messages, make sure the producer’s compression is enabled. The 1 GB default log segment size should suffice (if you are sending messages larger than 1 GB, reconsider your use case). The next section discusses partition count, which is possibly the most essential setting.
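Several of these limits can be set as topic-level overrides. A minimal sketch with default or near-default values, shown here only to illustrate where each knob lives:

```properties
# Topic-level overrides -- illustrative values
max.message.bytes=1048576     # ~1 MB cap per message batch (the default)
segment.bytes=1073741824      # 1 GB log segments (the default)
compression.type=producer     # keep the codec chosen by the producer
retention.ms=604800000        # 7 days
```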

The competitive advantage of Instaclustr:

With the addition of Managed Kafka to the suite of solutions available through Instaclustr’s Open Source-as-a-Service platform, organizations using Instaclustr-managed Kafka are choosing an experienced provider distinguished by more than 20 million node hours under management and available technical teams with deep Kafka-specific expertise.

The managed Kafka service follows the same rigorous provisioning and management principles as the Instaclustr platform’s other prominent open source technologies, such as Apache Cassandra, Apache Spark, and Elassandra. Advanced data technologies underpin Instaclustr Managed Apache Kafka, ensuring easy scalability, excellent performance, and continuous availability. Instaclustr also offers a SOC2 certified Kafka managed service to customers, ensuring secure data management and protecting client privacy.

What is Instaclustr?

Instaclustr is an Open Source-as-a-Service firm that provides scalability and reliability. We provide database, analytics, search, and messaging services in an automated, proven, and trusted managed environment. We allow businesses to concentrate their own development and operational resources on developing cutting-edge customer-facing solutions.

About the Author: Prak