Kafka Installation, Usage, and Internal/External Network Isolation Configuration

admin
2023-08-13

Environment and Versions

Environment: CentOS 7

Versions: JDK 1.8, ZooKeeper 3.4.14, Kafka 1.0.2 (Scala 2.12 build, kafka_2.12-1.0.2)

JDK Installation

  1. Install JDK 1.8

    rpm -ivh jdk-8u261-linux-x64.rpm
  2. Configure environment variables

    vim /etc/profile

    Append the following at the end of the file:

    export JAVA_HOME=/usr/java/jdk1.8.0_261-amd64
    export PATH=$PATH:$JAVA_HOME/bin
  3. Verify the JDK

    java -version

ZooKeeper Installation

  1. Install

    tar -zxf zookeeper-3.4.14.tar.gz -C /opt
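  2. Configure environment variables

    vim /etc/profile

    Point ZOOKEEPER_PREFIX at the directory where ZooKeeper was extracted:

    export ZOOKEEPER_PREFIX=/opt/zookeeper-3.4.14

    Add ZooKeeper's bin directory to PATH:

    export PATH=$PATH:$ZOOKEEPER_PREFIX/bin

    Set the ZOO_LOG_DIR environment variable to specify where ZooKeeper writes its logs:

    export ZOO_LOG_DIR=/var/zookeeper/log

    Apply the configuration: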

    source /etc/profile
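  3. Edit the ZooKeeper configuration

    cd /opt/zookeeper-3.4.14/conf
    cp zoo_sample.cfg zoo.cfg

    Edit the configuration file zoo.cfg:

    vi zoo.cfg

    Change the directory where ZooKeeper stores its data.

    Before:

    dataDir=/tmp/zookeeper

    After: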

    dataDir=/var/zookeeper/data
  4. Start ZooKeeper

    Enter the bin directory of the ZooKeeper installation:

    cd /opt/zookeeper-3.4.14/bin

    Start ZooKeeper:

    zkServer.sh start

    Check ZooKeeper's status:

    zkServer.sh status
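When the startup is scripted rather than typed by hand, the mode that zkServer.sh status reports can be extracted and checked. The sketch below is only an illustration: zk_mode is a hypothetical helper name, and it assumes the status output contains a line of the form "Mode: standalone", as ZooKeeper 3.4 prints for a running server.

```shell
#!/bin/sh
# zk_mode (hypothetical helper): read the output of `zkServer.sh status`
# on stdin and print the value of its "Mode:" line, e.g. "standalone".
# Returns non-zero when no "Mode:" line is present (server not running).
zk_mode() {
    awk -F': *' '/^Mode:/ { print $2; found = 1 } END { exit found ? 0 : 1 }'
}

# Typical use (requires a running ZooKeeper):
#   zkServer.sh status 2>&1 | zk_mode
```

A non-zero return status can be used to stop a script before it tries to start Kafka against a ZooKeeper that is not running.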

Kafka Installation

  1. Install Kafka

    tar -zxf kafka_2.12-1.0.2.tgz -C /opt
  2. Configure environment variables

    vi /etc/profile

    Append the following at the end of the file:

    export KAFKA_HOME=/opt/kafka_2.12-1.0.2
    export PATH=$PATH:$KAFKA_HOME/bin

    Apply the configuration:

    source /etc/profile
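  3. Edit the Kafka configuration

    Edit the server.properties configuration file:

    vi /opt/kafka_2.12-1.0.2/config/server.properties
    1. Change the ZooKeeper connection address (line 123 of the file)

      Before:

      zookeeper.connect=localhost:2181

      After:

      zookeeper.connect=localhost:2181/mykafka

      Note: the /mykafka suffix makes Kafka create a mykafka node under the ZooKeeper root and keep its data there.

    2. Change the message persistence directory (line 60 of the file)

      Before:

      log.dirs=/tmp/kafka-logs

      After:

      log.dirs=/var/kafka/kafka-logs
    3. Create the persistence directory

      mkdir -p /var/kafka/kafka-logs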
  4. Script overview

    cd /opt/kafka_2.12-1.0.2/bin

    The main scripts:

    kafka-topics.sh             manages topics
    kafka-server-start.sh       starts Kafka
    kafka-server-stop.sh        stops Kafka
    kafka-console-consumer.sh   command-line consumer
    kafka-console-producer.sh   command-line producer
  5. Start Kafka

    Note: run this from the bin directory.

    kafka-server-start.sh ../config/server.properties
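  6. Client test

    Log in to ZooKeeper with its command-line client by running zkCli.sh in a duplicate of the terminal session in which the server was started. (Note: the session must be a duplicate of the server's startup session; otherwise the script will not be found.)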

    zkCli.sh
    1. List the ZooKeeper root node

      ls /
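    2. List the contents of mykafka

      ls /mykafka

      The nodes are:

      cluster                      cluster information
      controller                   the current controller
      controller_epoch             controller epoch data
      brokers                      broker information
      admin                        administrative operations
      isr_change_notification      in-sync replica change notifications
      consumers                    consumers
      log_dir_event_notification   log directory event notifications
      latest_producer_id_block     the most recently allocated block of producer IDs
      config                       configuration
    3. Exit the ZooKeeper client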

      quit
  7. Stop Kafka

    kafka-server-stop.sh
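The two server.properties edits in step 3 can also be scripted instead of being made by hand in vi. The following is a minimal sketch, not part of Kafka: set_property is a hypothetical helper, and it assumes that keys contain no regular-expression metacharacters other than dots and that values contain neither "|" nor "&".

```shell
#!/bin/sh
# set_property FILE KEY VALUE  (hypothetical helper, not shipped with Kafka)
# Replace the line "KEY=..." in FILE, or append "KEY=VALUE" if KEY is absent.
# Commented-out lines such as "#KEY=..." are left untouched.
set_property() {
    file=$1 key=$2 value=$3
    # Escape dots so that "log.dirs" only matches a literal dot.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}=" "$file"; then
        sed "s|^${pattern}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage with the values from step 3:
#   cfg=/opt/kafka_2.12-1.0.2/config/server.properties
#   set_property "$cfg" zookeeper.connect localhost:2181/mykafka
#   set_property "$cfg" log.dirs /var/kafka/kafka-logs
```

Matching on the key rather than on a line number keeps the script working even if the property sits on a different line than 123 or 60 in another Kafka release.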

Topics

  1. Start Kafka in the background

    kafka-server-start.sh -daemon /opt/kafka_2.12-1.0.2/config/server.properties
  2. View the Kafka process

    ps aux | grep kafka
  3. Topic script help: running the script with no arguments prints its options.

    kafka-topics.sh

    The options are as follows:

    [root@default-dev bin]# kafka-topics.sh 
    Create, delete, describe, or change a topic.
    Option                                   Description                            
    ------                                   -----------                            
    --alter                                  Alter the number of partitions,        
                                               replica assignment, and/or           
                                               configuration for the topic.         
    --config <String: name=value>            A topic configuration override for the 
                                               topic being created or altered.The   
                                               following is a list of valid         
                                               configurations:                      
                                                 cleanup.policy                        
                                                 compression.type                      
                                                 delete.retention.ms                   
                                                 file.delete.delay.ms                  
                                                 flush.messages                        
                                                 flush.ms                              
                                                 follower.replication.throttled.       
                                               replicas                             
                                                 index.interval.bytes                  
                                                 leader.replication.throttled.replicas 
                                                 max.message.bytes                     
                                                 message.format.version                
                                                 message.timestamp.difference.max.ms   
                                                 message.timestamp.type                
                                                 min.cleanable.dirty.ratio             
                                                 min.compaction.lag.ms                 
                                                 min.insync.replicas                   
                                                 preallocate                           
                                                 retention.bytes                       
                                                 retention.ms                          
                                                 segment.bytes                         
                                                 segment.index.bytes                   
                                                 segment.jitter.ms                     
                                                 segment.ms                            
                                                 unclean.leader.election.enable        
                                             See the Kafka documentation for full   
                                               details on the topic configs.        
    --create                                 Create a new topic.                    
    --delete                                 Delete a topic                         
    --delete-config <String: name>           A topic configuration override to be   
                                               removed for an existing topic (see   
                                               the list of configurations under the 
                                               --config option).                    
    --describe                               List details for the given topics.     
    --disable-rack-aware                     Disable rack aware replica assignment  
    --force                                  Suppress console prompts               
    --help                                   Print usage information.               
    --if-exists                              if set when altering or deleting       
                                               topics, the action will only execute 
                                               if the topic exists                  
    --if-not-exists                          if set when creating topics, the       
                                               action will only execute if the      
                                               topic does not already exist         
    --list                                   List all available topics.             
    --partitions <Integer: # of partitions>  The number of partitions for the topic 
                                               being created or altered (WARNING:   
                                               If partitions are increased for a    
                                               topic that has a key, the partition  
                                               logic or ordering of the messages    
                                               will be affected                     
    --replica-assignment <String:            A list of manual partition-to-broker   
      broker_id_for_part1_replica1 :           assignments for the topic being      
      broker_id_for_part1_replica2 ,           created or altered.                  
      broker_id_for_part2_replica1 :                                                
      broker_id_for_part2_replica2 , ...>                                           
    --replication-factor <Integer:           The replication factor for each        
      replication factor>                      partition in the topic being created.
    --topic <String: topic>                  The topic to be create, alter or       
                                               describe. Can also accept a regular  
                                               expression except for --create option
    --topics-with-overrides                  if set when describing topics, only    
                                               show topics that have overridden     
                                               configs                              
    --unavailable-partitions                 if set when describing topics, only    
                                               show partitions whose leader is not  
                                               available                            
    --under-replicated-partitions            if set when describing topics, only    
                                               show under replicated partitions     
    --zookeeper <String: urls>               REQUIRED: The connection string for    
                                               the zookeeper connection in the form 
                                               host:port. Multiple URLS can be      
                                               given to allow fail-over.    

    Note: options marked REQUIRED are mandatory, such as the --zookeeper connection string above.

  4. List all available topics

    kafka-topics.sh --zookeeper localhost:2181/mykafka --list
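  5. Create a topic

    kafka-topics.sh --zookeeper localhost/mykafka --create --topic topic_1 --partitions 1 --replication-factor 1

    Note: the ZooKeeper port can be omitted, in which case the default port (2181) is used.

    --topic: the name of the topic to create.

    --partitions: the number of partitions to create, which allows horizontal scaling.

    --replication-factor: the number of replicas per partition, which provides high availability.

    Note: with a single server running a single broker, replication serves no purpose: if that server goes down, the data goes with it. Replicas must therefore live on different Kafka servers to provide high availability.

    List the available topics again:

    kafka-topics.sh --zookeeper localhost:2181/mykafka --list

  6. Describe a topic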

    kafka-topics.sh --zookeeper localhost/mykafka --describe --topic topic_1

    Note: topic_1 has a single partition, partition 0, hosted on broker 0.

  7. Example

    Create a topic named topic_2 with 5 partitions and 1 replica per partition:

    kafka-topics.sh --zookeeper localhost/mykafka --create --topic topic_2 --partitions 5 --replication-factor 1

    List the available topics:

    kafka-topics.sh --zookeeper localhost:2181/mykafka --list

    Describe topic_2:

    kafka-topics.sh --zookeeper localhost/mykafka --describe --topic topic_2

    Note: all 5 partitions are on broker 0.
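The note on --replication-factor in step 5 can be turned into a quick sanity check before a topic is created. This is a sketch under stated assumptions: check_replication is a hypothetical helper, the broker count is supplied by the caller, and the rule it encodes is that a replication factor larger than the number of live brokers cannot be satisfied.

```shell
#!/bin/sh
# check_replication BROKERS FACTOR  (hypothetical helper)
# BROKERS: number of live brokers, supplied by the caller.
# FACTOR:  the value intended for --replication-factor.
# Prints a verdict and returns non-zero if the factor cannot be satisfied.
check_replication() {
    brokers=$1 factor=$2
    if [ "$factor" -lt 1 ]; then
        echo "replication factor must be at least 1"
        return 1
    fi
    if [ "$factor" -gt "$brokers" ]; then
        echo "replication factor $factor exceeds $brokers available broker(s)"
        return 1
    fi
    if [ "$factor" -eq 1 ]; then
        echo "ok, but a single replica gives no redundancy"
    else
        echo "ok"
    fi
}

# Usage for the single-broker setup in this article:
#   check_replication 1 1
```

On the single-broker setup used here, only a factor of 1 passes, which matches the topics created above.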

Consuming

Consumer script help:

kafka-console-consumer.sh

Options:

The console consumer is a tool that reads data from Kafka and outputs it to standard output.
Option                                   Description                            
------                                   -----------                            
--blacklist <String: blacklist>          Blacklist of topics to exclude from    
                                           consumption.                         
--bootstrap-server <String: server to    REQUIRED (unless old consumer is       
  connect to>                              used): The server to connect to.     
--consumer-property <String:             A mechanism to pass user-defined       
  consumer_prop>                           properties in the form key=value to  
                                           the consumer.                        
--consumer.config <String: config file>  Consumer config properties file. Note  
                                           that [consumer-property] takes       
                                           precedence over this config.         
--csv-reporter-enabled                   If set, the CSV metrics reporter will  
                                           be enabled                           
--delete-consumer-offsets                If specified, the consumer path in     
                                           zookeeper is deleted when starting up
--enable-systest-events                  Log lifecycle events of the consumer   
                                           in addition to logging consumed      
                                           messages. (This is specific for      
                                           system tests.)                       
--formatter <String: class>              The name of a class to use for         
                                           formatting kafka messages for        
                                           display. (default: kafka.tools.      
                                           DefaultMessageFormatter)             
--from-beginning                         If the consumer does not already have  
                                           an established offset to consume     
                                           from, start with the earliest        
                                           message present in the log rather    
                                           than the latest message.             
--group <String: consumer group id>      The consumer group id of the consumer. 
--isolation-level <String>               Set to read_committed in order to      
                                           filter out transactional messages    
                                           which are not committed. Set to      
                                           read_uncommittedto read all          
                                           messages. (default: read_uncommitted)
--key-deserializer <String:                                                     
  deserializer for key>                                                         
--max-messages <Integer: num_messages>   The maximum number of messages to      
                                           consume before exiting. If not set,  
                                           consumption is continual.            
--metrics-dir <String: metrics           If csv-reporter-enable is set, and     
  directory>                               this parameter isset, the csv        
                                           metrics will be output here          
--new-consumer                           Use the new consumer implementation.   
                                           This is the default, so this option  
                                           is deprecated and will be removed in 
                                           a future release.                    
--offset <String: consume offset>        The offset id to consume from (a non-  
                                           negative number), or 'earliest'      
                                           which means from beginning, or       
                                           'latest' which means from end        
                                           (default: latest)                    
--partition <Integer: partition>         The partition to consume from.         
                                           Consumption starts from the end of   
                                           the partition unless '--offset' is   
                                           specified.                           
--property <String: prop>                The properties to initialize the       
                                           message formatter.                   
--skip-message-on-error                  If there is an error when processing a 
                                           message, skip it instead of halt.    
--timeout-ms <Integer: timeout_ms>       If specified, exit if no message is    
                                           available for consumption for the    
                                           specified interval.                  
--topic <String: topic>                  The topic id to consume on.            
--value-deserializer <String:                                                   
  deserializer for values>                                                      
--whitelist <String: whitelist>          Whitelist of topics to include for     
                                           consumption.                         
--zookeeper <String: urls>               REQUIRED (only when using old          
                                           consumer): The connection string for 
                                           the zookeeper connection in the form 
                                           host:port. Multiple URLS can be      
                                           given to allow fail-over. 

REQUIRED: the option is mandatory.

unless old consumer is used: mandatory unless the old consumer is used, that is, mandatory for the new consumer:

--bootstrap-server <String: server to    REQUIRED (unless old consumer is       
  connect to>                              used): The server to connect to.  

only when using old consumer: needed only when the old consumer is used:

--zookeeper <String: urls>               REQUIRED (only when using old          
                                           consumer): The connection string for 
                                           the zookeeper connection in the form 
                                           host:port. Multiple URLS can be      
                                           given to allow fail-over.

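Consuming with the console consumer

Connect to the Kafka server. When there are several Kafka brokers, connecting to any one of them is enough. Note that the consumer connects to Kafka itself on port 9092, not to ZooKeeper.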

kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic_1

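Notes:

--bootstrap-server localhost:9092: the address and port of the Kafka server.

--topic topic_1: the topic to consume.

The terminal appears to hang; this is expected, because the consumer is waiting for messages.

Consumer API configuration parameters

[Figure: consumer API configuration parameters]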

Producing

Producer script help:

kafka-console-producer.sh

Options:

Read data from standard input and publish it to Kafka.
Option                                   Description                            
------                                   -----------                            
--batch-size <Integer: size>             Number of messages to send in a single 
                                           batch if they are not being sent     
                                           synchronously. (default: 200)        
--broker-list <String: broker-list>      REQUIRED: The broker list string in    
                                           the form HOST1:PORT1,HOST2:PORT2.    
--compression-codec [String:             The compression codec: either 'none',  
  compression-codec]                       'gzip', 'snappy', or 'lz4'.If        
                                           specified without value, then it     
                                           defaults to 'gzip'                   
--key-serializer <String:                The class name of the message encoder  
  encoder_class>                           implementation to use for            
                                           serializing keys. (default: kafka.   
                                           serializer.DefaultEncoder)           
--line-reader <String: reader_class>     The class name of the class to use for 
                                           reading lines from standard in. By   
                                           default each line is read as a       
                                           separate message. (default: kafka.   
                                           tools.                               
                                           ConsoleProducer$LineMessageReader)   
--max-block-ms <Long: max block on       The max time that the producer will    
  send>                                    block for during a send request      
                                           (default: 60000)                     
--max-memory-bytes <Long: total memory   The total memory used by the producer  
  in bytes>                                to buffer records waiting to be sent 
                                           to the server. (default: 33554432)   
--max-partition-memory-bytes <Long:      The buffer size allocated for a        
  memory in bytes per partition>           partition. When records are received 
                                           which are smaller than this size the 
                                           producer will attempt to             
                                           optimistically group them together   
                                           until this size is reached.          
                                           (default: 16384)                     
--message-send-max-retries <Integer>     Brokers can fail receiving the message 
                                           for multiple reasons, and being      
                                           unavailable transiently is just one  
                                           of them. This property specifies the 
                                           number of retires before the         
                                           producer give up and drop this       
                                           message. (default: 3)                
--metadata-expiry-ms <Long: metadata     The period of time in milliseconds     
  expiration interval>                     after which we force a refresh of    
                                           metadata even if we haven't seen any 
                                           leadership changes. (default: 300000)
--old-producer                           Use the old producer implementation.   
--producer-property <String:             A mechanism to pass user-defined       
  producer_prop>                           properties in the form key=value to  
                                           the producer.                        
--producer.config <String: config file>  Producer config properties file. Note  
                                           that [producer-property] takes       
                                           precedence over this config.         
--property <String: prop>                A mechanism to pass user-defined       
                                           properties in the form key=value to  
                                           the message reader. This allows      
                                           custom configuration for a user-     
                                           defined message reader.              
--queue-enqueuetimeout-ms <Integer:      Timeout for event enqueue (default:    
  queue enqueuetimeout ms>                 2147483647)                          
--queue-size <Integer: queue_size>       If set and the producer is running in  
                                           asynchronous mode, this gives the    
                                           maximum amount of  messages will     
                                           queue awaiting sufficient batch      
                                           size. (default: 10000)               
--request-required-acks <String:         The required acks of the producer      
  request required acks>                   requests (default: 1)                
--request-timeout-ms <Integer: request   The ack timeout of the producer        
  timeout ms>                              requests. Value must be non-negative 
                                           and non-zero (default: 1500)         
--retry-backoff-ms <Integer>             Before each retry, the producer        
                                           refreshes the metadata of relevant   
                                           topics. Since leader election takes  
                                           a bit of time, this property         
                                           specifies the amount of time that    
                                           the producer waits before refreshing 
                                           the metadata. (default: 100)         
--socket-buffer-size <Integer: size>     The size of the tcp RECV size.         
                                           (default: 102400)                    
--sync                                   If set message send requests to the    
                                           brokers are synchronously, one at a  
                                           time as they arrive.                 
--timeout <Integer: timeout_ms>          If set and the producer is running in  
                                           asynchronous mode, this gives the    
                                           maximum amount of time a message     
                                           will queue awaiting sufficient batch 
                                           size. The value is given in ms.      
                                           (default: 1000)                      
--topic <String: topic>                  REQUIRED: The topic id to produce      
                                           messages to.                         
--value-serializer <String:              The class name of the message encoder  
  encoder_class>                           implementation to use for            
                                           serializing values. (default: kafka. 
                                           serializer.DefaultEncoder) 

Note: options marked REQUIRED are mandatory.

Connecting a producer to the Kafka server

kafka-console-producer.sh --broker-list localhost:9092 --topic topic_1

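Notes:

--broker-list: the brokers to connect to. When there are many Kafka servers, listing two addresses is enough; there is only one Kafka server here, so only one is listed.

--topic: the topic to send messages to.

The producer window also appears to hang at this point, which means it is ready to accept messages: each line typed is sent as one message.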
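Because the console producer reads its messages from standard input, test messages can be piped in instead of typed. The sketch below assumes a hypothetical helper named numbered_messages that prints one message per line; the producer command in the usage comment is the one shown above.

```shell
#!/bin/sh
# numbered_messages COUNT [PREFIX]  (hypothetical helper)
# Print COUNT lines of the form "PREFIX-1" ... "PREFIX-COUNT".
# PREFIX defaults to "msg". Each line becomes one Kafka message
# when the output is piped into the console producer.
numbered_messages() {
    count=$1 prefix=${2:-msg}
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s-%d\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# Usage (requires a running broker and an existing topic):
#   numbered_messages 5 | kafka-console-producer.sh --broker-list localhost:9092 --topic topic_1
```

Numbered messages make it easy to see in the consumer window which messages arrived and in what order.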

Producer API configuration parameters

[Figure: producer API configuration parameters]

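Message send/receive test

Note: if the following warning appears, check whether the topic name is wrong:

WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 8

List the available topics to confirm that the consumer and the producer are using the same topic: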

kafka-topics.sh --zookeeper localhost:2181/mykafka --list
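The topic-name check can be automated by matching the name against the listing line by line. A minimal sketch, assuming a hypothetical helper named topic_in_list that reads the output of the --list command above on standard input:

```shell
#!/bin/sh
# topic_in_list TOPIC  (hypothetical helper)
# Succeed if TOPIC appears as an exact, whole line on stdin.
# -x matches whole lines only, -F treats the name as a fixed string,
# so "topic" does not match "topic_1" and dots are not wildcards.
topic_in_list() {
    grep -qxF -- "$1"
}

# Usage (requires a running Kafka and ZooKeeper):
#   kafka-topics.sh --zookeeper localhost:2181/mykafka --list | topic_in_list topic_1 \
#       || echo "topic_1 does not exist"
```

An exact match catches typos such as topc_2 that a substring search might let through.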

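Sending and receiving messages

With the consumer and the producer both running, each line typed in the producer window is sent to the topic and printed in the consumer window.

Note: if the consumer is closed while the producer keeps sending, then after the consumer reconnects it receives only the messages the producer sends from that point on.

Consuming earlier messages

To consume messages sent earlier, pass the --from-beginning option: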

kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic_1 --from-beginning

Viewing the persisted data

ls /var/kafka/kafka-logs

The listing contains many offset-related files.

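Broker Configuration Parameters

These are mainly the settings in the Kafka server configuration file server.properties.

zookeeper.connect

This parameter sets the address of the ZooKeeper server or ensemble that Kafka connects to.

Its value is a string of ZooKeeper addresses separated by commas. Each address has the form host:port, and the path of Kafka's root node in ZooKeeper can be appended at the end.

For example:

 zookeeper.connect=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181,192.168.1.4:2181/mykafka

Ideally, list more than half of the ZooKeeper servers. The storage path /mykafka is written only once, at the end.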
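The structure of the connection string, a comma-separated host list followed by a single root path, can be made concrete by splitting it apart. The helpers below are hypothetical names chosen for illustration and do plain string handling only; they do not contact ZooKeeper.

```shell
#!/bin/sh
# zk_hosts CONNECT  (hypothetical helper)
# Print the host:port entries of a zookeeper.connect value, one per line.
zk_hosts() {
    # Drop everything from the first "/" on, then split on commas.
    printf '%s\n' "${1%%/*}" | tr ',' '\n'
}

# zk_chroot CONNECT  (hypothetical helper)
# Print the root path appended to the string, or "/" if there is none.
zk_chroot() {
    case $1 in
        */*) printf '/%s\n' "${1#*/}" ;;
        *)   printf '/\n' ;;
    esac
}

# Usage:
#   zk_hosts  '192.168.1.1:2181,192.168.1.2:2181/mykafka'
#   zk_chroot '192.168.1.1:2181,192.168.1.2:2181/mykafka'
```

The path belongs to the whole string rather than to the last host, which is why it is written only once.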

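listeners

Specifies the address and port on which the current broker offers its service.

Used together with advertised.listeners to isolate internal and external networks.

Note: the port number can also be changed.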

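Internal/External Network Isolation Configuration

listeners

Configures the list of URLs the broker listens on together with their listener names; multiple entries are separated by commas.

For example, suppose the server has two IP addresses, 10.0.2.15 and 192.168.56.30 (the addresses used in the complete configuration below), each with its own listener.

Notes on the listeners setting: listener names must not repeat, and neither must ports. PLAINTEXT denotes the PLAINTEXT protocol and also serves as the listener name. Because names must be unique, two listeners cannot both be called PLAINTEXT; instead, custom names are mapped to protocols with the listener.security.protocol.map parameter.

Even with the settings above the broker still fails to start, so further settings are required; they are included in the complete configuration below.

Complete configuration (note that Kafka uses the PLAINTEXT protocol here):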

listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT    
listeners=INTERNAL://10.0.2.15:9000,EXTERNAL://192.168.56.30:9001
advertised.listeners=EXTERNAL://192.168.56.30:9001
inter.broker.listener.name=EXTERNAL

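listener.security.protocol.map: maps each listener name to a security protocol, which resolves the clash that arises when several listeners use the same protocol and would otherwise share its name.

listeners: the addresses and ports this broker listens on, one entry per listener name.

advertised.listeners: the address and port exposed to producers and consumers and used for communication between nodes. It is published to ZooKeeper for clients to use, which is needed when the address clients must use differs from the one in listeners. The other listener, INTERNAL://10.0.2.15:9000, can then be used for managing Kafka internally.

inter.broker.listener.name: the name of the listener used for communication between brokers. It takes a single listener name, and that name must be one defined in listeners.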
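With this configuration, clients have to connect to the advertised endpoint rather than to the default localhost:9092. As an illustration, the address belonging to a listener name can be looked up in a listeners-style string; listener_endpoint is a hypothetical helper that does string handling only.

```shell
#!/bin/sh
# listener_endpoint NAME LISTENERS  (hypothetical helper)
# Print the host:port of the entry called NAME in a listeners-style value
# such as "INTERNAL://10.0.2.15:9000,EXTERNAL://192.168.56.30:9001".
# Prints nothing if no entry has that name.
listener_endpoint() {
    name=$1
    printf '%s\n' "$2" | tr ',' '\n' | while IFS= read -r entry; do
        # The listener name is the part before "://".
        if [ "${entry%%://*}" = "$name" ]; then
            printf '%s\n' "${entry#*://}"
        fi
    done
}

# Usage with the advertised listener from the configuration above:
#   addr=$(listener_endpoint EXTERNAL 'EXTERNAL://192.168.56.30:9001')
#   kafka-console-consumer.sh --bootstrap-server "$addr" --topic topic_1
```

For the configuration in this section the lookup yields 192.168.56.30:9001, which is the address producers and consumers would pass as --broker-list or --bootstrap-server.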

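listener.security.protocol.map

Maps listener names to security protocols; it is what makes the internal/external network isolation configuration possible.

For example, to separate the internal and external networks even though both use SSL, the problem described above is solved by adding the following parameter: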

listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL

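Notes:

INTERNAL, EXTERNAL: the listener names.

SSL: both listeners use the SSL protocol.

Adding this parameter resolves the clash between listener names and protocols described above.

Note: each listener name may appear only once in the map. If a listener's name is not itself the name of a security protocol, listener.security.protocol.map must be configured.

Each listener must use a different network port.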
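The two uniqueness rules, distinct listener names and distinct ports, can be checked on a listeners value before the broker is restarted. This is a sketch of such a check, not Kafka's own validation; check_listeners is a hypothetical helper.

```shell
#!/bin/sh
# check_listeners LISTENERS  (hypothetical helper)
# Verify that no listener name and no port occurs twice in a value such as
# "INTERNAL://10.0.2.15:9000,EXTERNAL://192.168.56.30:9001".
# Prints "ok" or the first kind of duplicate found; non-zero on failure.
check_listeners() {
    entries=$(printf '%s\n' "$1" | tr ',' '\n')
    # Name: everything before "://". Port: everything after the last ":".
    dup_names=$(printf '%s\n' "$entries" | sed 's|://.*||' | sort | uniq -d)
    dup_ports=$(printf '%s\n' "$entries" | sed 's|.*:||' | sort | uniq -d)
    if [ -n "$dup_names" ]; then
        echo "duplicate listener name: $dup_names"
        return 1
    fi
    if [ -n "$dup_ports" ]; then
        echo "duplicate port: $dup_ports"
        return 1
    fi
    echo "ok"
}

# Usage:
#   check_listeners 'INTERNAL://10.0.2.15:9000,EXTERNAL://192.168.56.30:9001'
```

Two listeners both named PLAINTEXT fail the name check, which is the situation that listener.security.protocol.map is meant to avoid.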

Viewing the information in ZooKeeper:

Connect to ZooKeeper with the client script:

zkCli.sh

View the broker's registration:

get /mykafka/brokers/ids/0

List the available topics:

kafka-topics.sh --zookeeper localhost:2181/mykafka --list

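broker.id

This property uniquely identifies a Kafka broker. Its value can be any integer, as long as it is unique within the cluster.

It matters a great deal when Kafka is deployed as a distributed cluster. Ideally the value is derived from the physical host the broker runs on: for example, broker.id=1 for a host named host1.yanxizhu.com, or broker.id=30 for a host at 192.168.56.30.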
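The naming convention above can be applied mechanically when many brokers are provisioned. The sketch below is one possible reading of that convention, not a Kafka feature: broker_id_from_host is a hypothetical helper that takes the trailing digits of the first hostname label, or the last octet of an IPv4 address.

```shell
#!/bin/sh
# broker_id_from_host HOST  (hypothetical helper)
# For an IPv4 address, print the last octet (192.168.56.30 -> 30).
# For a hostname, print the digits that end its first label
# (host1.yanxizhu.com -> 1); print nothing if there are none.
broker_id_from_host() {
    case $1 in
        *[!0-9.]*)
            # Contains something other than digits and dots: a hostname.
            label=${1%%.*}
            printf '%s\n' "$label" | sed -n 's/.*[^0-9]\([0-9][0-9]*\)$/\1/p'
            ;;
        *)
            # Digits and dots only: treat as an IPv4 address.
            printf '%s\n' "${1##*.}"
            ;;
    esac
}

# Usage:
#   echo "broker.id=$(broker_id_from_host host1.yanxizhu.com)"
```

Whatever scheme is used, the resulting IDs still have to be checked for uniqueness across the cluster, since two hosts can share a last octet or a trailing number.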

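log.dirs

This property specifies the directories on disk where Kafka stores the log segments that hold messages.

It is a comma-separated list of local file system paths.

If more than one path is given, the broker distributes partitions across them on a "least used" basis, keeping all log segments of one partition under the same path.

The broker adds a new partition to the path that holds the fewest partitions, not to the path with the least disk space in use.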
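The selection rule can be imitated with a few lines of shell to see which path would receive the next partition. This only simulates the rule as described above and is not Kafka's code; least_used_dir is a hypothetical helper, and it assumes each partition is stored as one subdirectory (for example topic_2-3) directly under a log directory.

```shell
#!/bin/sh
# least_used_dir DIR...  (hypothetical helper)
# Print the directory that contains the fewest immediate subdirectories,
# i.e. the fewest partitions. Disk usage is deliberately ignored.
# On a tie, the directory listed first wins.
least_used_dir() {
    best= best_count=
    for dir in "$@"; do
        # Count partition directories; $(( )) strips any padding from wc.
        count=$(( $(find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l) ))
        if [ -z "$best" ] || [ "$count" -lt "$best_count" ]; then
            best=$dir best_count=$count
        fi
    done
    printf '%s\n' "$best"
}

# Usage, assuming two hypothetical log directories:
#   least_used_dir /data1/kafka-logs /data2/kafka-logs
```

Because only the partition count matters, a nearly full disk holding a few large partitions would still be chosen over an emptier disk holding many small ones.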
