A brief discussion on using FlinkKafkaProducer to implement Exactly Once semantics on the sink side

Abstract: Some important Flink data processing scenarios require Exactly Once processing. Exactly Once means that when Flink processes data, it guarantees that no record is lost and no record is duplicated. The entire Flink processing pipeline is roughly divided into three stages: Source -> Transform -> Sink. Selecting […]

CVE-2023-25194 Kafka JNDI injection analysis

Apache Kafka Clients JNDI Injection. Vulnerability description: Apache Kafka is a distributed data stream processing platform that can publish, subscribe to, store, and process data streams in real time. Kafka Connect is a tool for scalable, reliable streaming of data between Kafka and other systems. An attacker can use any Kafka client based on SASL JAAS […]

ZooKeeper+Kafka+ELK+Filebeat cluster construction to realize large batch log collection and display

The general process: Filebeat collects the logs of the nginx server (web-filebeat) and stores them in Kafka, which serves as the buffer. Logstash then pulls the corresponding logs from Kafka, processes them, and writes them to Elasticsearch, where they are displayed in Kibana. 1. Cluster environment preparation 4c/8G/100G 10.10.200.33 Kafka + ZooKeeper […]

Kafka practice: hot data display

1 Real-time streaming computing 1.1 Concept: Streaming computing generally has high real-time requirements. The target calculation is usually defined first, and the calculation logic is then applied to the data as it arrives. To improve calculation efficiency, incremental calculation is often used instead of […]

KafkaConsumer consumption logic

Version: kafka-clients-2.0.1.jar. Previously, I wanted to write a plug-in that modifies the logic of KafkaConsumer and filters some messages based on their headers. To do that, I needed to understand how KafkaConsumer pulls messages for consumption, and to confirm whether filtering out messages before consumption has any side effects. The following is the relevant source code, explained […]

Configure a Kafka cluster on Windows 10

1. Introduction to Kafka: Kafka is an open source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput distributed publish-subscribe messaging system that can process all the activity stream data generated by users of a website. 2. Kafka download: Kafka address 3. Kafka cluster configuration 1. ZooKeeper configuration # Licensed […]

Using Kafka 3.6.0 on Windows

1. Introduction to Kafka: Kafka is an open source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput distributed publish-subscribe messaging system that can process all the activity stream data generated by users of a website. 2. Kafka download: Kafka address 3. Kafka configuration 1. ZooKeeper configuration # Licensed to […]

Kafka JNDI injection analysis (CVE-2023-25194)

Apache Kafka Clients JNDI Injection. Vulnerability description: Apache Kafka is a distributed data stream processing platform that can publish, subscribe to, store, and process data streams in real time. Kafka Connect is a tool for scalable, reliable streaming of data between Kafka and other systems. An attacker can use any Kafka client based on SASL JAAS […]