Kafka monitoring (JMX_exporter + Prometheus + Grafana)
2019-03-28 02:27:19
Tags: kafka, monitoring, JMX_exporter, prometheus, Grafana

Kafka's built-in mechanism

Kafka uses Yammer Metrics to record its metrics, which it exposes over JMX.

<dependency>
  <groupId>com.yammer.metrics</groupId>
  <artifactId>metrics-core</artifactId>
  <version>2.2.0</version>
</dependency>

https://www.cnblogs.com/caizhenghui/p/9132414.html
In theory this approach is not limited to Kafka: it applies to any Java process that exposes a JMX port and can be started with a Java agent.

JMX Exporter

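We run JMX Exporter as a Java agent to obtain more comprehensive internal data. Accordingly, the startup command has to be changed to the form below. (A Kafka broker is launched through its start scripts rather than java -jar, so there the same -javaagent option is usually passed through the KAFKA_OPTS environment variable.)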

java -javaagent:./jmx_prometheus_javaagent-0.3.1.jar=8080:config.yaml -jar yourJar.jar

A sample configuration provided by the official project is:

---
startDelaySeconds: 0
hostPort: 127.0.0.1:1234
username: 
password: 
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:1234/jmxrmi
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
whitelistObjectNames: ["org.apache.cassandra.metrics:*"]
blacklistObjectNames: ["org.apache.cassandra.metrics:type=ColumnFamily,*"]
rules:
  - pattern: 'org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
    name: cassandra_$1_$2
    value: $3
    valueFactor: 0.001
    labels: {}
    help: "Cassandra metric $1 $2"
    type: GAUGE
    attrNameSnakeCase: false
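
To make the rule in the sample configuration concrete, here is a minimal Python sketch of what such a rule does to one MBean attribute. This is not the exporter's own code (JMX Exporter is written in Java); the function name apply_rule and the input line are made up for illustration, and the sketch assumes the input string has the shape implied by the pattern itself.

```python
import re

# Same regular expression as the "pattern" field of the sample rule.
PATTERN = re.compile(
    r"org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value: (\d+)"
)


def apply_rule(bean_line, name="cassandra_$1_$2", value="$3", value_factor=0.001):
    """Return (metric_name, metric_value), or None if the rule does not match.

    Hypothetical helper: it only mimics the name/value/valueFactor fields.
    """
    match = PATTERN.search(bean_line)
    if match is None:
        return None

    def expand(template):
        # Replace $1, $2, ... with the corresponding capture group.
        return re.sub(r"\$(\d+)", lambda g: match.group(int(g.group(1))), template)

    return expand(name), float(expand(value)) * value_factor


if __name__ == "__main__":
    # Made-up input line, shaped like the pattern expects.
    print(apply_rule("org.apache.cassandra.metrics<type=Cache, name=Hits><>Value: 1500"))
```

The valueFactor of 0.001 is what would turn, say, a millisecond reading into seconds; an attribute that does not match the pattern is simply not produced by this rule.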

The configuration file we actually use to monitor Kafka is much simpler:

hostPort: 127.0.0.1:9095
lowercaseOutputName: true
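
With no rules defined, the exporter falls back to its default naming scheme, and lowercaseOutputName: true lowercases the result. The Python sketch below is only my reading of that documented default (domain, first bean property value and attribute name joined with underscores, with the remaining bean properties becoming labels). The function name default_metric_name is hypothetical, and composite attribute keys are ignored.

```python
import re


def default_metric_name(domain, bean_props, attr, lowercase=True):
    """Approximate the default metric name and labels for one MBean attribute.

    bean_props is a list of (key, value) pairs in ObjectName order.
    Hypothetical helper for illustration only.
    """
    first_value = bean_props[0][1]
    raw = "_".join([domain, first_value, attr])
    # Characters outside [a-zA-Z0-9:_] become underscores, then runs collapse.
    name = re.sub(r"[^a-zA-Z0-9:_]", "_", raw)
    name = re.sub(r"_+", "_", name)
    if lowercase:
        name = name.lower()
    labels = dict(bean_props[1:])
    return name, labels


if __name__ == "__main__":
    print(default_metric_name(
        "kafka.server",
        [("type", "ReplicaManager"), ("name", "UnderReplicatedPartitions")],
        "Value",
    ))
```

This is why an MBean such as kafka.server<type=ReplicaManager, name=UnderReplicatedPartitions> shows up in the exporter output as kafka_server_replicamanager_value with a name label.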

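If whitelistObjectNames is not configured, all MBeans are exported by default. If blacklistObjectNames is not set, it defaults to empty.
We can list every available metric by querying the port that JMX Exporter listens on. Port 9095 is the Kafka process's own JMX port; JMX Exporter republishes that monitoring data on port 9990 (the agent port given on the -javaagent command line: 9990 in our deployment, 8080 in the sample command above). Query port 9990 directly: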

curl http://localhost:9990/metrics

The output looks like this:

# HELP jmx_config_reload_failure_total Number of times configuration have failed to be reloaded.
# TYPE jmx_config_reload_failure_total counter
jmx_config_reload_failure_total 0.0
# HELP jvm_buffer_pool_used_bytes Used bytes of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_bytes gauge
jvm_buffer_pool_used_bytes{pool="direct",} 5319596.0
jvm_buffer_pool_used_bytes{pool="mapped",} 2.39075298E9
# HELP jvm_buffer_pool_capacity_bytes Bytes capacity of a given JVM buffer pool.
# TYPE jvm_buffer_pool_capacity_bytes gauge
jvm_buffer_pool_capacity_bytes{pool="direct",} 5319596.0
jvm_buffer_pool_capacity_bytes{pool="mapped",} 2.39075298E9
# HELP jvm_buffer_pool_used_buffers Used buffers of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_buffers gauge
jvm_buffer_pool_used_buffers{pool="direct",} 26.0
jvm_buffer_pool_used_buffers{pool="mapped",} 241.0
# HELP jvm_info JVM version info
# TYPE jvm_info gauge
jvm_info{version="1.8.0_102-b14",vendor="Oracle Corporation",runtime="Java(TM) SE Runtime Environment",} 1.0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 43.65
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.547018059742E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 338.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 65535.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 6.16871936E9
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 9.24340224E8
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="G1 Young Generation",} 35.0
jvm_gc_collection_seconds_sum{gc="G1 Young Generation",} 0.57
jvm_gc_collection_seconds_count{gc="G1 Old Generation",} 0.0
jvm_gc_collection_seconds_sum{gc="G1 Old Generation",} 0.0
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 69.0
# HELP jvm_threads_daemon Daemon thread count of a JVM
# TYPE jvm_threads_daemon gauge
jvm_threads_daemon 48.0
# HELP jvm_threads_peak Peak thread count of a JVM
# TYPE jvm_threads_peak gauge
jvm_threads_peak 70.0
# HELP jvm_threads_started_total Started thread count of a JVM
# TYPE jvm_threads_started_total counter
jvm_threads_started_total 84.0
# HELP jvm_threads_deadlocked Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers
# TYPE jvm_threads_deadlocked gauge
jvm_threads_deadlocked 0.0
# HELP jvm_threads_deadlocked_monitor Cycles of JVM-threads that are in deadlock waiting to acquire object monitors
# TYPE jvm_threads_deadlocked_monitor gauge
jvm_threads_deadlocked_monitor 0.0
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 5240.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 5240.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 0.0
# HELP kafka_server_replicamanager_value Attribute exposed for management (kafka.server<type=ReplicaManager, name=UnderReplicatedPartitions><>Value)
# TYPE kafka_server_replicamanager_value untyped
kafka_server_replicamanager_value{name="UnderReplicatedPartitions",} 0.0
# HELP kafka_server_brokertopicmetrics_count Attribute exposed for management (kafka.server<type=BrokerTopicMetrics, name=BytesInPerSec><>Count)
# TYPE kafka_server_brokertopicmetrics_count untyped
kafka_server_brokertopicmetrics_count{name="BytesInPerSec",} 502099.0
kafka_server_brokertopicmetrics_count{name="MessagesInPerSec",} 4206.0
# HELP kafka_server_brokertopicmetrics_oneminuterate Attribute exposed for management (kafka.server<type=BrokerTopicMetrics, name=BytesInPerSec><>OneMinuteRate)
# TYPE kafka_server_brokertopicmetrics_oneminuterate untyped
kafka_server_brokertopicmetrics_oneminuterate{name="BytesInPerSec",} 5438.520538936058
kafka_server_brokertopicmetrics_oneminuterate{name="MessagesInPerSec",} 45.54372300762534
# HELP kafka_server_brokertopicmetrics_meanrate Attribute exposed for management (kafka.server<type=BrokerTopicMetrics, name=BytesInPerSec><>MeanRate)
# TYPE kafka_server_brokertopicmetrics_meanrate untyped
kafka_server_brokertopicmetrics_meanrate{name="BytesInPerSec",} 8391.850324711393
kafka_server_brokertopicmetrics_meanrate{name="MessagesInPerSec",} 70.28899468899631
# HELP kafka_controller_kafkacontroller_value Attribute exposed for management (kafka.controller<type=KafkaController, name=OfflinePartitionsCount><>Value)
# TYPE kafka_controller_kafkacontroller_value untyped
kafka_controller_kafkacontroller_value{name="OfflinePartitionsCount",} 0.0
kafka_controller_kafkacontroller_value{name="ActiveControllerCount",} 0.0
# HELP kafka_server_replicafetchermanager_value Attribute exposed for management (kafka.server<type=ReplicaFetcherManager, name=MaxLag, clientId=Replica><>Value)
# TYPE kafka_server_replicafetchermanager_value untyped
kafka_server_replicafetchermanager_value{name="MaxLag",clientId="Replica",} 0.0
# HELP kafka_server_brokertopicmetrics_fifteenminuterate Attribute exposed for management (kafka.server<type=BrokerTopicMetrics, name=BytesInPerSec><>FifteenMinuteRate)
# TYPE kafka_server_brokertopicmetrics_fifteenminuterate untyped
kafka_server_brokertopicmetrics_fifteenminuterate{name="BytesInPerSec",} 488.174726685667
kafka_server_brokertopicmetrics_fifteenminuterate{name="MessagesInPerSec",} 4.0869490140718785
# HELP kafka_server_brokertopicmetrics_fiveminuterate Attribute exposed for management (kafka.server<type=BrokerTopicMetrics, name=BytesInPerSec><>FiveMinuteRate)
# TYPE kafka_server_brokertopicmetrics_fiveminuterate untyped
kafka_server_brokertopicmetrics_fiveminuterate{name="BytesInPerSec",} 1400.7117556545422
kafka_server_brokertopicmetrics_fiveminuterate{name="MessagesInPerSec",} 11.727110166511496
# HELP jmx_scrape_duration_seconds Time this JMX scrape took, in seconds.
# TYPE jmx_scrape_duration_seconds gauge
jmx_scrape_duration_seconds 0.018793424
# HELP jmx_scrape_error Non-zero if this scrape failed.
# TYPE jmx_scrape_error gauge
jmx_scrape_error 0.0
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 6.28849256E8
jvm_memory_bytes_used{area="nonheap",} 5.4556888E7
# HELP jvm_memory_bytes_committed Committed (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_committed gauge
jvm_memory_bytes_committed{area="heap",} 1.073741824E9
jvm_memory_bytes_committed{area="nonheap",} 5.57056E7
# HELP jvm_memory_bytes_max Max (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_max gauge
jvm_memory_bytes_max{area="heap",} 1.073741824E9
jvm_memory_bytes_max{area="nonheap",} -1.0
# HELP jvm_memory_bytes_init Initial bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_init gauge
jvm_memory_bytes_init{area="heap",} 1.073741824E9
jvm_memory_bytes_init{area="nonheap",} 2555904.0
# HELP jvm_memory_pool_bytes_used Used bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_used gauge
jvm_memory_pool_bytes_used{pool="Code Cache",} 1.7908864E7
jvm_memory_pool_bytes_used{pool="Metaspace",} 3.2436808E7
jvm_memory_pool_bytes_used{pool="Compressed Class Space",} 4211216.0
jvm_memory_pool_bytes_used{pool="G1 Eden Space",} 4.64519168E8
jvm_memory_pool_bytes_used{pool="G1 Survivor Space",} 3145728.0
jvm_memory_pool_bytes_used{pool="G1 Old Gen",} 1.6118436E8
# HELP jvm_memory_pool_bytes_committed Committed bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_committed gauge
jvm_memory_pool_bytes_committed{pool="Code Cache",} 1.8481152E7
jvm_memory_pool_bytes_committed{pool="Metaspace",} 3.2899072E7
jvm_memory_pool_bytes_committed{pool="Compressed Class Space",} 4325376.0
jvm_memory_pool_bytes_committed{pool="G1 Eden Space",} 6.73185792E8
jvm_memory_pool_bytes_committed{pool="G1 Survivor Space",} 3145728.0
jvm_memory_pool_bytes_committed{pool="G1 Old Gen",} 3.97410304E8
# HELP jvm_memory_pool_bytes_max Max bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="Code Cache",} 2.5165824E8
jvm_memory_pool_bytes_max{pool="Metaspace",} -1.0
jvm_memory_pool_bytes_max{pool="Compressed Class Space",} 1.073741824E9
jvm_memory_pool_bytes_max{pool="G1 Eden Space",} -1.0
jvm_memory_pool_bytes_max{pool="G1 Survivor Space",} -1.0
jvm_memory_pool_bytes_max{pool="G1 Old Gen",} 1.073741824E9
# HELP jvm_memory_pool_bytes_init Initial bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_init gauge
jvm_memory_pool_bytes_init{pool="Code Cache",} 2555904.0
jvm_memory_pool_bytes_init{pool="Metaspace",} 0.0
jvm_memory_pool_bytes_init{pool="Compressed Class Space",} 0.0
jvm_memory_pool_bytes_init{pool="G1 Eden Space",} 5.6623104E7
jvm_memory_pool_bytes_init{pool="G1 Survivor Space",} 0.0
jvm_memory_pool_bytes_init{pool="G1 Old Gen",} 1.01711872E9
# HELP jmx_config_reload_success_total Number of times configuration have successfully been reloaded.
# TYPE jmx_config_reload_success_total counter
jmx_config_reload_success_total 0.0
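
The output above is the Prometheus text exposition format, which is easy to read programmatically when you want a quick check without Prometheus. Below is a minimal Python sketch, assuming the exporter is reachable at http://localhost:9990/metrics as in the curl command. The helper names are my own, and the parser is deliberately simple: it skips comment lines, does not unescape label values, and ignores samples that carry a timestamp.

```python
import re
from urllib.request import urlopen

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # unsupported line shape, e.g. a sample with a timestamp
        name, label_text, value = match.groups()
        labels = dict(LABEL_RE.findall(label_text or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:9990/metrics"):
    """Download and parse the exporter output (URL is an assumption)."""
    with urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))


if __name__ == "__main__":
    for name, labels, value in fetch_metrics():
        if name.startswith("kafka_"):
            print(name, labels, value)
```

Filtering on the kafka_ prefix separates the broker metrics from the jvm_, process_ and jmx_ metrics that the agent adds on its own.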

prometheus

https://prometheus.io/
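See the site above for download and installation. Prometheus is essentially a data store built for monitoring: it actively pulls (scrapes) data, stores it, and triggers alerts based on it.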

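The central concept in Prometheus is the metric, roughly comparable to a table in Oracle. A single metric can contain several time series, one per distinct combination of label values. Times are in GMT (UTC) by default.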
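
As a small illustration of "one metric, several time series", the Python sketch below counts distinct label sets per metric name. The sample values are taken from the exporter output earlier in this post; the function name series_by_metric is made up.

```python
from collections import defaultdict

# (metric name, labels, value) samples copied from the exporter output.
SAMPLES = [
    ("kafka_server_brokertopicmetrics_count", {"name": "BytesInPerSec"}, 502099.0),
    ("kafka_server_brokertopicmetrics_count", {"name": "MessagesInPerSec"}, 4206.0),
    ("kafka_server_replicamanager_value", {"name": "UnderReplicatedPartitions"}, 0.0),
]


def series_by_metric(samples):
    """Count time series per metric: one series per distinct label set."""
    label_sets = defaultdict(set)
    for name, labels, _value in samples:
        label_sets[name].add(frozenset(labels.items()))
    return {name: len(sets) for name, sets in label_sets.items()}


if __name__ == "__main__":
    print(series_by_metric(SAMPLES))
```

Prometheus also attaches job and instance labels to everything it scrapes, so the same metric scraped from several hosts yields a separate set of series for each host.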
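The startup command is shown below. Prometheus listens on port 9090 by default; the --web.listen-address flag here changes that to 8080.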

./prometheus --config.file=prometheus.yml --web.listen-address=:8080

My configuration file is as follows:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing a single job.
# Despite the job name, the targets here are the JMX Exporter endpoints (port 9990) on the Kafka hosts.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['10.237.78.22:9990','10.237.78.21:9990','10.237.78.10:9990']

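Once Prometheus has started, we can open it in a browser.
My Prometheus instance was set up a long time ago and contained some unrelated data, and I did not know how to delete specific metrics. So I simply deleted everything under the data directory and restarted.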
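
A less drastic alternative to wiping the data directory is the TSDB admin API in Prometheus 2.x, which can delete only the series matching a selector. It is disabled unless Prometheus is started with --web.enable-admin-api. The following is a minimal Python sketch, assuming Prometheus listens on localhost:8080 as in the startup command above. The function names and the job="old_job" selector are placeholders, not something from my setup.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_delete_request(base_url, matchers):
    """Build the POST request for /api/v1/admin/tsdb/delete_series."""
    query = urlencode([("match[]", matcher) for matcher in matchers])
    return Request(f"{base_url}/api/v1/admin/tsdb/delete_series?{query}", method="POST")


def delete_series(base_url, matchers):
    """Send the request and return the HTTP status (204 indicates success)."""
    with urlopen(build_delete_request(base_url, matchers), timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder selector: every series scraped under a job called "old_job".
    print(delete_series("http://localhost:8080", ['{job="old_job"}']))
```

Deleted series are only marked with tombstones at first; disk space is reclaimed at a later compaction, or right away by calling the clean_tombstones endpoint of the same admin API.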

Grafana

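Grafana seemed to start only with root privileges on my machine. As for the default port (3000), the only way I knew to change it was through firewall rules; it can also be set with http_port in the [server] section of Grafana's configuration file.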
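
Once Prometheus is added to Grafana as a data source, each panel boils down to a PromQL query sent to the Prometheus HTTP API. The Python sketch below issues that kind of instant query by hand, which is handy for checking an expression before building a dashboard. It assumes Prometheus at http://localhost:8080 (the port from the startup command above); the function names are made up, and the query is just one of the Kafka metrics from the exporter output.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against /api/v1/query."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def extract_vector(response_body):
    """Turn an instant-vector JSON response into (labels, value) pairs."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    # Each result has "metric" (labels) and "value" ([timestamp, "value as string"]).
    return [(item["metric"], float(item["value"][1])) for item in payload["data"]["result"]]


def run_query(base_url, promql):
    """Run the query over HTTP and return the parsed result."""
    with urlopen(instant_query_url(base_url, promql), timeout=10) as response:
        return extract_vector(response.read().decode("utf-8"))


if __name__ == "__main__":
    query = 'kafka_server_brokertopicmetrics_oneminuterate{name="BytesInPerSec"}'
    for labels, value in run_query("http://localhost:8080", query):
        print(labels.get("instance"), value)
```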
