
Prometheus Exporter - Direct Instrumentation vs. Custom Collector

浮云間 2023-07-10 14:31:36

I'm currently writing a Prometheus exporter for a network telemetry application.

I have read the "Writing exporters" documentation, and while I understand the use case for implementing a custom collector to avoid race conditions, I'm not sure whether my use case is a fit for direct instrumentation.

Basically, the network devices stream their metrics over gRPC, so my exporter only receives them and never actually has to scrape them.

I've used direct instrumentation with the code below.

I declare my metrics with the promauto package to keep the code compact:

    package metrics

    import (
        "github.com/lucabrasi83/prom-high-obs/proto/telemetry"
        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promauto"
    )

    var (
        cpu5Sec = promauto.NewGaugeVec(
            prometheus.GaugeOpts{
                Name: "cisco_iosxe_iosd_cpu_busy_5_sec_percentage",
                Help: "The IOSd daemon CPU busy percentage over the last 5 seconds",
            },
            []string{"node"},
        )
    )

This is how I set the metric value from the decoded gRPC protocol buffer message:

    cpu5Sec.WithLabelValues(msg.GetNodeIdStr()).Set(float64(val))

Finally, here is my main loop, which handles the telemetry gRPC streams for the metrics I'm interested in:

    for {
        req, err := stream.Recv()
        if err == io.EOF {
            return nil
        }
        if err != nil {
            logging.PeppaMonLog(
                "error",
                fmt.Sprintf("Error while reading client %v stream: %v", clientIPSocket, err))
            return err
        }

        data := req.GetData()

        msg := &telemetry.Telemetry{}

        err = proto.Unmarshal(data, msg)
        if err != nil {
            log.Fatalln(err)
        }

        if !logFlag {
            logging.PeppaMonLog(
                "info",
                fmt.Sprintf(
                    "Telemetry Subscription Request Received - Client %v - Node %v - YANG Model Path %v",
                    clientIPSocket, msg.GetNodeIdStr(), msg.GetEncodingPath(),
                ),
            )
        }
    }

I'm using Grafana as the frontend, and so far I haven't seen any discrepancy between the metrics exposed by Prometheus and the values I read directly on the device.

So I would like to understand whether this follows Prometheus best practices, or whether I should still go the custom collector route.


1 Answer

鳳凰求蠱


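You are not following best practices, because you are using the global metrics that the article you linked to cautions against. With your current implementation, once a device disconnects your dashboard will keep showing some arbitrary, constant value for its CPU metric forever (or, more precisely, until your exporter is restarted).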
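To make the stale-value problem concrete, here is a minimal, library-free model of it. This is a sketch under stated assumptions, not the client library itself: `gaugeStore` is a hypothetical stand-in for a process-global `GaugeVec`, and `handleStream` is a hypothetical stand-in for the receive loop in the question.

```go
package main

import "sync"

// gaugeStore is a hypothetical stand-in for a process-global GaugeVec:
// it keeps the last value written per node until someone deletes it.
type gaugeStore struct {
	mu     sync.Mutex
	values map[string]float64
}

// cpu5Sec plays the role of the global metric declared with promauto.
var cpu5Sec = &gaugeStore{values: make(map[string]float64)}

// Set records the latest value for a node, in the spirit of
// WithLabelValues(node).Set(v).
func (g *gaugeStore) Set(node string, v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values[node] = v
}

// Scrape returns a copy of every value that is currently exposed.
func (g *gaugeStore) Scrape() map[string]float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	out := make(map[string]float64, len(g.values))
	for node, v := range g.values {
		out[node] = v
	}
	return out
}

// handleStream models one telemetry stream: it writes every sample and
// returns when the device disconnects. Nothing removes the node's entry.
func handleStream(node string, samples []float64) {
	for _, v := range samples {
		cpu5Sec.Set(node, v)
	}
}
```

Calling `handleStream("router1", []float64{12, 37})` and then `cpu5Sec.Scrape()` still reports a value for `router1` after the stream has ended, which is the frozen reading described above. With the real library, a labeled series on a vector likewise stays exposed until the process restarts or the series is deleted explicitly, for example with `DeleteLabelValues`.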

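Instead, the RPC method should maintain a set of local metrics and remove them once the method returns. That way, a device's metrics vanish from the scrape output when it disconnects.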

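Here is one way to do this. It uses a map that holds the currently active metrics. Each map element is the set of metrics for one particular stream (which, as I understand it, corresponds to one device). Once the stream ends, that entry is removed.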

    package main

    import (
        "io"
        "sync"

        "github.com/prometheus/client_golang/prometheus"
    )

    // StreamServer, Response, Telemetry, parseRequest, CpuYANGEncodingPath and
    // ParsePBMsgCpuBusyPercent are placeholders for your generated gRPC types
    // and existing helpers.

    // Exporter is a prometheus.Collector implementation.
    type Exporter struct {
        // We need some way to map gRPC streams to their metrics. Using the stream
        // itself as a map key is simple enough, but anything works as long as we
        // can remove metrics once the stream ends.
        sync.Mutex
        Metrics map[StreamServer]*DeviceMetrics // must be initialized with make before use
    }

    // DeviceMetrics holds the latest metrics received on one stream.
    type DeviceMetrics struct {
        sync.Mutex

        CPU prometheus.Metric
    }

    // Globally defined descriptions are fine.
    var cpu5SecDesc = prometheus.NewDesc(
        "cisco_iosxe_iosd_cpu_busy_5_sec_percentage",
        "The IOSd daemon CPU busy percentage over the last 5 seconds",
        []string{"node"},
        nil, // constant labels
    )

    // Collect implements prometheus.Collector.
    func (e *Exporter) Collect(ch chan<- prometheus.Metric) {
        // Copy current metrics so we don't lock for very long if ch's consumer is
        // slow.
        var metrics []prometheus.Metric

        e.Lock()
        for _, deviceMetrics := range e.Metrics {
            deviceMetrics.Lock()
            metrics = append(metrics,
                deviceMetrics.CPU,
            )
            deviceMetrics.Unlock()
        }
        e.Unlock()

        for _, m := range metrics {
            // A stream that has not reported a value yet has a nil metric.
            if m != nil {
                ch <- m
            }
        }
    }

    // Describe implements prometheus.Collector.
    func (e *Exporter) Describe(ch chan<- *prometheus.Desc) {
        ch <- cpu5SecDesc
    }

    // Service is the gRPC service implementation.
    type Service struct {
        exp *Exporter
    }

    func (s *Service) RPCMethod(stream StreamServer) (*Response, error) {
        deviceMetrics := new(DeviceMetrics)

        s.exp.Lock()
        s.exp.Metrics[stream] = deviceMetrics
        s.exp.Unlock()

        defer func() {
            // Stop emitting metrics for this stream.
            s.exp.Lock()
            delete(s.exp.Metrics, stream)
            s.exp.Unlock()
        }()

        for {
            req, err := stream.Recv()
            if err == io.EOF {
                // The device closed the stream cleanly.
                return &Response{}, nil
            }
            if err != nil {
                return nil, err
            }

            var msg *Telemetry = parseRequest(req) // Your existing code that unmarshals the nested message.

            var (
                metricField *prometheus.Metric
                metric      prometheus.Metric
            )

            switch msg.GetEncodingPath() {
            case CpuYANGEncodingPath:
                metricField = &deviceMetrics.CPU
                metric = prometheus.MustNewConstMetric(
                    cpu5SecDesc,
                    prometheus.GaugeValue,
                    ParsePBMsgCpuBusyPercent(msg), // func(*Telemetry) float64
                    msg.GetNodeIdStr(), // value for the "node" label
                )
            default:
                continue
            }

            deviceMetrics.Lock()
            *metricField = metric
            deviceMetrics.Unlock()
        }
    }
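The lifecycle used by the code above (register when the stream starts, publish the latest value, delete on return, and copy under the lock before sending) can be exercised without gRPC or Prometheus. The sketch below is a standard-library model only, not the real collector. Everything in it is an assumption made for illustration: `sample` stands in for `prometheus.Metric`, an integer ID stands in for the stream, and `onSample` is a hypothetical hook that lets a caller look at the exporter mid-stream.

```go
package main

import "sync"

// sample is a hypothetical stand-in for prometheus.Metric: one labeled value.
type sample struct {
	Node  string
	Value float64
}

// exporter tracks the latest sample per active stream.
type exporter struct {
	mu      sync.Mutex
	streams map[int]*sample // keyed by a hypothetical stream ID
}

func newExporter() *exporter {
	return &exporter{streams: make(map[int]*sample)}
}

// collect copies the current samples while holding the lock and sends them
// on ch only after releasing it, so a slow consumer never blocks handlers.
func (e *exporter) collect(ch chan<- sample) {
	e.mu.Lock()
	snapshot := make([]sample, 0, len(e.streams))
	for _, s := range e.streams {
		if s != nil { // registered streams without a value yet are skipped
			snapshot = append(snapshot, *s)
		}
	}
	e.mu.Unlock()

	for _, s := range snapshot {
		ch <- s
	}
}

// scrape drains one collect call into a slice, roughly as a registry would.
func (e *exporter) scrape() []sample {
	ch := make(chan sample)
	go func() {
		e.collect(ch)
		close(ch)
	}()
	var out []sample
	for s := range ch {
		out = append(out, s)
	}
	return out
}

// handleStream registers the stream, publishes each value, and unregisters
// the stream when it returns.
func (e *exporter) handleStream(id int, node string, values []float64, onSample func()) {
	e.mu.Lock()
	e.streams[id] = nil // registered, no sample yet
	e.mu.Unlock()

	defer func() {
		// Stop emitting samples for this stream.
		e.mu.Lock()
		delete(e.streams, id)
		e.mu.Unlock()
	}()

	for _, v := range values {
		e.mu.Lock()
		e.streams[id] = &sample{Node: node, Value: v}
		e.mu.Unlock()
		if onSample != nil {
			onSample()
		}
	}
}
```

If `onSample` calls `e.scrape()`, it sees the device's sample while the stream is active, and a second `e.scrape()` after `handleStream` returns comes back empty. That is the behavior you want on the dashboard: the series disappears when the device disconnects instead of freezing at its last value.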

Answered 2023-07-10
