Sending Kubernetes pod/container metrics with Telegraf running as a DaemonSet

ifmq2ha2 · posted 2021-06-06 · tagged Kafka

First, I'd like to clarify something: if I run a Telegraf DaemonSet in a Kubernetes cluster, does it collect metrics for the pods, or for the physical nodes?
I created a Telegraf DaemonSet in my test Kubernetes cluster, which runs under Hyper-V on my laptop, based on this Kubernetes cluster installation:
I want to collect pod metrics, but nothing is reaching the Kafka brokers. I see this error in the log:

2019-05-08T02:36:35Z I! Starting Telegraf 1.9.2
2019-05-08T02:36:35Z I! Using config file: /etc/telegraf/telegraf.conf
2019-05-08T02:46:36Z E! [agent] Failed to connect to output kafka, retrying in 15s, error was 'kafka: client has run out of available brokers to talk to (Is your cluster reachable?)'

Here is the definition file (ConfigMap plus DaemonSet):

apiVersion: v1
kind: ConfigMap
metadata:
  name: telegraf
  namespace: monitoring
  labels:
    k8s-app: telegraf
data:
  telegraf.conf: |+
    [global_tags]
      env = "$ENV"
    [agent]
      hostname = "$HOSTNAME"
      interval = "60s"
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      collection_jitter = "0s"
      flush_interval = "10s"
      flush_jitter = "2s"
      precision = ""
      debug = false
      quiet = true
      logfile = ""

    [[outputs.kafka]]
      brokers = ["10.121.63.5:9092", "10.121.63.18:9092", "10.121.62.64:9092", "10.121.62.80:9092", "10.121.63.22:9092"]
      topic = "telegraf-measurements-json"
      client_id = "golangsarama__1.18.0__serverinfra__telegraf"
      routing_tag = "host"
      version = "0.11.0.2"
      compression_codec = 2
      required_acks = 1
      data_format = "json"

    [[inputs.cpu]]
      percpu = true
      totalcpu = true
      collect_cpu_time = false
      report_active = false
    [[inputs.disk]]
      ignore_fs = ["tmpfs", "devtmpfs", "devfs"]
    [[inputs.diskio]]
    [[inputs.kernel]]
    [[inputs.mem]]
    [[inputs.processes]]
    [[inputs.swap]]
    [[inputs.system]]
    [[inputs.docker]]
      endpoint = "unix:///var/run/docker.sock"
    [[inputs.kubernetes]]
      url = "https://192.168.213.18:6443"
      insecure_skip_verify = true

---

# Section: Daemonset

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: telegraf
  namespace: monitoring
  labels:
    k8s-app: telegraf
spec:
  selector:
    matchLabels:
      name: telegraf
  template:
    metadata:
      labels:
        name: telegraf
    spec:
      containers:
      - name: telegraf
        image: docker.io/telegraf:1.9.2
        resources:
          limits:
            memory: 500Mi
          requests:
            cpu: 500m
            memory: 500Mi
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: "HOST_PROC"
          value: "/rootfs/proc"
        - name: "HOST_SYS"
          value: "/rootfs/sys"
        - name: ENV
          valueFrom:
            secretKeyRef:
              name: telegraf
              key: env
        volumeMounts:
        - name: sys
          mountPath: /rootfs/sys
          readOnly: true
        - name: proc
          mountPath: /rootfs/proc
          readOnly: true
        - name: docker-socket
          mountPath: /var/run/docker.sock
        - name: utmp
          mountPath: /var/run/utmp
          readOnly: true
        - name: config
          mountPath: /etc/telegraf
      terminationGracePeriodSeconds: 30
      volumes:
      - name: sys
        hostPath:
          path: /sys
      - name: docker-socket
        hostPath:
          path: /var/run/docker.sock
      - name: proc
        hostPath:
          path: /proc
      - name: utmp
        hostPath:
          path: /var/run/utmp
      - name: config
        configMap:
          name: telegraf
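One thing worth noting about the config above: `[[inputs.kubernetes]]` is the plugin that yields pod/container metrics, and it does so by scraping a kubelet's `/stats/summary` endpoint, so its `url` should normally point at the kubelet on the local node (port 10250, or 10255 for the read-only port), not at the API server on port 6443. A sketch of what that could look like in a DaemonSet, assuming a `NODE_IP` environment variable injected via the downward API (`status.hostIP`) — the variable name and token path are assumptions, not part of the original manifest:

```toml
[[inputs.kubernetes]]
  ## Scrape the kubelet on the node this pod is scheduled on.
  ## NODE_IP is assumed to be injected via the downward API (status.hostIP).
  url = "https://$NODE_IP:10250"
  ## Path to the service-account token (current Telegraf releases take a
  ## file path here; older releases may expect the token string itself).
  bearer_token = "/var/run/secrets/kubernetes.io/serviceaccount/token"
  insecure_skip_verify = true
```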

This is the article I followed to create the DaemonSet.
These are the pods:

NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
default       nginx-65f88748fd-jztrz               1/1     Running   0          7d18h
kube-system   coredns-fb8b8dccf-rl48l              1/1     Running   0          7d18h
kube-system   coredns-fb8b8dccf-x8fvx              1/1     Running   0          7d18h
kube-system   etcd-k8s-master                      1/1     Running   2          7d18h
kube-system   kube-apiserver-k8s-master            1/1     Running   2          7d18h
kube-system   kube-controller-manager-k8s-master   1/1     Running   0          7d18h
kube-system   kube-flannel-ds-amd64-96tsl          1/1     Running   0          7d18h
kube-system   kube-flannel-ds-amd64-b884r          1/1     Running   0          7d18h
kube-system   kube-flannel-ds-amd64-pdqmq          1/1     Running   0          7d18h
kube-system   kube-proxy-42k2g                     1/1     Running   0          7d18h
kube-system   kube-proxy-77pw9                     1/1     Running   0          7d18h
kube-system   kube-proxy-n5mbs                     1/1     Running   0          7d18h
kube-system   kube-scheduler-k8s-master            1/1     Running   2          7d18h
monitoring    telegraf-dvtcl                       1/1     Running   5          117m
monitoring    telegraf-n2mqz                       1/1     Running   5          117m

tcpdump shows what the DaemonSet pod is sending (only outbound SYN packets; no reply from the broker side appears in the capture):

09:52:59.002901 IP 192.168.1.10.45546 > sdsfdsf.XmlIpcRegSvc: Flags [S], seq 3040818525, win 28200, options [mss 1410,sackOK,TS val 158999344 ecr 0,nop,wscale 7], length 0
E..<2.@.@......

y?...#..?5]......n(._.........
        z#0........................

But I see nothing on our Grafana dashboards. If I install a standalone RPM-based Telegraf directly on a node, it emits fine and I can see the metrics. But I'm curious about the pod metrics.

8yparm6h 1#

This Telegraf error just means it could not connect to any of the brokers in the `brokers` array from your config — the 10.x private IP range. Depending on how your networking and routing are set up, you may simply have a routing problem from the pod network to the private IPs that host your Kafka cluster.
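To confirm the routing theory, a plain TCP reachability check against the broker list is enough (a minimal sketch; the IPs below are the ones from the `telegraf.conf` above):

```python
import socket

def broker_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Broker list taken from the [[outputs.kafka]] section of the question.
brokers = ["10.121.63.5", "10.121.63.18", "10.121.62.64",
           "10.121.62.80", "10.121.63.22"]
for host in brokers:
    status = "reachable" if broker_reachable(host, 9092) else "unreachable"
    print(f"{host}:9092 {status}")
```

Run it once from inside a pod and once from the node itself: if the node can connect but the pod cannot, the problem is routing from the pod network (flannel here) to those private IPs, not Kafka itself.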
