
Mastering Alerts in Grafana: Kubernetes Provisioning and Slack Notifications


Grafana, combined with Prometheus and Loki, provides a comprehensive, widely adopted open-source stack for monitoring and observability. These tools, while certainly powerful, require skill to configure and manage, and even the most advanced observability system is ineffective without finely tuned alerts and notifications.

What’s the point of a monitoring system if you can’t react in time when your application encounters downtime?

Since the Grafana 9.1 release, it has been possible to provision the entire Grafana Alerting stack as code using Kubernetes or Terraform.

In this tutorial, I will show how to configure alerts in Grafana based on specific log keywords and how to receive notifications in a Slack channel using file provisioning in Kubernetes. Let’s get started.

Prerequisites

Install Grafana using the kube-prometheus-stack Helm Chart and deploy Loki to your cluster using the loki-stack Helm Chart.

Enable file provisioning for Grafana Alerting

File provisioning with Kubernetes uses ConfigMaps. The Grafana Alerting sidecar container, which runs next to the Grafana container in the same pod, scans ConfigMaps; when it finds one with the label grafana_alert, the provisioned resources it contains are loaded into Grafana.

Here is a preview of what such a ConfigMap looks like. Under the data field, you add templates of Grafana Alerting resources.

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-alerting
  namespace: monitoring
  labels:
    grafana_alert: "1"
data:
  grafana-alert.yaml: |-
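    apiVersion: 1
    # templates, contactPoints, policies, and alert rule groups go here
    # (see the full ConfigMap example at the end of this tutorial)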

The Grafana Alerting sidecar container is disabled by default in the Grafana Helm Chart. To enable it, add the following to your kube-prometheus-stack Helm Chart values:

grafana:
  sidecar:
    alerts:
      label: grafana_alert
      enabled: true
      searchNamespace: ALL
      labelValue: "1"

Deploy the Helm Chart and verify that the alerting sidecar runs alongside Grafana (the Grafana pod should now show an additional sidecar container).

Grafana Alerting resources

Let's explore the key concepts in Grafana Alerting; each of them maps to a top-level key of the provisioning file, as sketched after this list:

  • alert rules - These rules define the conditions that trigger an alert. In this case, we use Loki as a data source. This allows us to generate alerts based on specific log keywords such as error or exception.
  • contact point - This includes the configuration for sending notifications. In our setup, Slack serves as the contact point where alert notifications are dispatched.
  • notification policy - Notification policies can aggregate and route multiple alerts to different recipients. For example, you can route various alerts to the same contact point based on labels.
  • notification template - With notification templates, you can create custom messages and reuse them in alert notifications.
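A minimal sketch of that file layout (assuming the standard Grafana Alerting file-provisioning schema; each key is filled in throughout the rest of this tutorial):

apiVersion: 1
groups: []          # alert rules, organized into evaluation groups
contactPoints: []   # where notifications are sent (here: Slack)
policies: []        # notification policies routing alerts to contact points
templates: []       # reusable notification templates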

Alert rules

The easiest way to provision Grafana alert rules is to create them in the Grafana UI and export them as a YAML file.

To create alerts, go to the Alerting view and open the Alert Rules page. Click the Create Alert Rule button.

In Step 1, provide your alert rule name.

In Step 2, select Loki as the data source.

In our setup, we want to receive an alert when the application throws an exception. Add the first expression, A, with the query below, which matches log messages containing the keyword exception:

count_over_time({app="demo-app"} | json message="message" |~ `exception` [5m])

We are using the count_over_time function, which counts the entries for each log stream within the given range.


Add a second expression, B, of type Reduce, which takes expression A as its input. Add a third expression, C, of type Threshold: set its input to expression B and the IS ABOVE value to 0, then set C as the alert condition. This configuration will trigger a notification each time the exception keyword is detected in the logs.
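For reference, this is roughly how the Reduce and Threshold expressions end up looking in the provisioning YAML once the rule is exported. This is a trimmed sketch: __expr__ is the datasource UID Grafana typically uses for expressions, and the exact model fields may differ slightly between Grafana versions.

data:
  - refId: B
    datasourceUid: __expr__
    model:
      type: reduce
      reducer: last      # or whichever function you chose in the Reduce expression
      expression: A
      refId: B
  - refId: C
    datasourceUid: __expr__
    model:
      type: threshold
      conditions:
        - evaluator:
            type: gt     # "is above"
            params: [0]
      expression: B
      refId: C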


In Step 3, choose the folder in which to store the alert and select an evaluation group. Set the pending period (how long the condition must hold before the alert fires) to 0 seconds so the alert fires immediately.


In Step 4, add the alert summary and, if available, the runbook URL. For the alert description, use the extracted message label from the log. You can also specify the dashboard and panel associated with this alert rule; add them in the Link dashboard and panel option.


In Step 5, add labels, for example, app=demo-app and environment=production. With labels, you can route your alert notifications to different contact points.


Now you can export the alert rule and paste it into the template inside the ConfigMap.

How to export Grafana alert rules:

Go to https://grafana-url/api/v1/provisioning/alert-rules/{alert_rule_ID}/export and copy the rule.

To get the alert rule ID, open the alert rule details in the Grafana UI.


After exporting the alert rule, delete it in the Grafana UI so it can be provisioned as code. If you attempt to provision an alert rule via code with the same name as an existing rule created in the UI, without deleting it first, you will encounter an error.
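For orientation, the exported rule for this setup looks roughly like the sketch below. It is heavily trimmed: the uid and the Loki datasource UID are placeholders, fields such as relativeTimeRange and queryType are illustrative and will differ in your export, expressions B and C are the ones sketched earlier, and the description annotation shows one assumed way of surfacing the extracted message label.

groups:
  - orgId: 1
    name: demo-app
    folder: Log filtering
    interval: 5m
    rules:
      - uid: <generated-uid>
        title: Exception in demo-app
        condition: C
        data:
          - refId: A
            relativeTimeRange:
              from: 300
              to: 0
            datasourceUid: <loki-datasource-uid>
            model:
              expr: 'count_over_time({app="demo-app"} | json message="message" |~ `exception` [5m])'
              queryType: range
              refId: A
          # ... expressions B (Reduce) and C (Threshold) as sketched earlier ...
        noDataState: OK
        execErrState: Error
        for: 0s
        labels:
          app: demo-app
          environment: production
        annotations:
          summary: Exception detected in demo-app
          description: '{{ index $labels "message" }}'   # assumed syntax for surfacing the extracted log message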

Slack message template

Now, it’s time to create reusable templates for Slack messages.

Create two templates: one for the Slack message and one for the title.

Message template:

templates:
  - orgId: 1
    name: slack.message
    template: |-
      {{ define "slack.print_alert" -}}
      *Alert*: {{ .Labels.alertname }}
      {{ if .Annotations -}}
      *Summary*: {{ .Annotations.summary }}
      *Description*: {{ .Annotations.description }}
      {{ end -}}
      *Log message*: {{ index .Labels "message" }}
      *Explore logs:* https://grafanaURL.com/explore?orgId=1
      {{ if .DashboardURL -}}
      *Go to dashboard:* {{ .DashboardURL }}
      {{- end }}
      {{ if .PanelURL -}}
      *Go to panel:* {{ .PanelURL }}
      {{- end }}

      *Details:*
      {{ range .Labels.SortedPairs -}}
      - *{{ .Name }}:* `{{ .Value }}`
      {{ end -}}

      {{ if .SilenceURL -}}
      *Silence this alert:* {{ .SilenceURL }}
      {{- end }}
      {{- end }}

      {{ define "slack.message" -}}
      {{ if .Alerts.Firing -}}
      {{ len .Alerts.Firing }} firing alert(s):
      {{ range .Alerts.Firing }}
      {{ template "slack.print_alert" . }}
      {{ end -}}
      {{ end }}

      {{- end }}

Let's break down the components of the Slack message template. Starting from the top, we find the alert name, summary, and description. Following that, the extracted log message is included in the template.

Next, there is a link to the Explore logs page in Grafana.

If configured in the alert rule, you'll also have the option to navigate to specific dashboards and panels related to the alert.

The message includes any remaining labels for further details.

Lastly, you'll find a link that allows you to silence the alert quickly when necessary.

Title template:

templates:
  - orgId: 1
    name: slack.title
    template: |-
      {{ define "slack.title" -}}
      [{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] Grafana Alerting Notification
      {{- end -}}
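With a single firing alert, this renders a title like [FIRING:1] Grafana Alerting Notification.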

Contact point

Our contact point will be the Slack channel #grafana-alerts. To add Slack as a contact point, first create a simple Slack app and enable incoming webhooks, then create an Incoming Webhook for your Slack channel (see Slack's Incoming Webhooks documentation for details). Here is an example configuration:

contactPoints:
  - orgId: 1
    name: grafana-alerts
    receivers:
      - uid: grafana
        type: slack
        settings:
          username: grafana_bot
          url: incoming_webhook_url
          title: |
            {{ template "slack.title" . }}
          text: |
            {{ template "slack.message" . }}

Notification policy

The root notification policy is the default receiver for alerts that don’t match any of the specified labels; specify your default channel here or configure Grafana email notifications.
Then define a child notification policy that routes all alerts with the label app=demo-app to the #grafana-alerts Slack channel.

policies:
  - orgId: 1
    receiver: default-channel-name
    group_by: ['...']
    routes:
      - receiver: grafana-alerts
        object_matchers:
          - ['app', '=', 'demo-app']

Put all resources into a single ConfigMap or create a separate ConfigMap for each alerting resource type.

The ConfigMap with all defined resources should look like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-alerting
  namespace: monitoring
  labels:
    grafana_alert: "1"
data:
  grafana-alerting.yaml: |-
    apiVersion: 1
    templates:
      - orgId: 1
        name: slack.message
        template: |-
          # add here your notification template
    policies:
      - orgId: 1
        receiver: default-channel-name
        group_by: ['...']
        routes:
          - receiver: grafana-alerts
            object_matchers:
              - ['app', '=', 'demo-app']

    contactPoints:
      - orgId: 1
        name: grafana-alerts
        receivers:
          - uid: grafana
            type: slack
            settings:
              username: grafana_bot
              url: incoming_webhook_url
              title: |
                {{ template "slack.title" . }}
              text: |
                {{ template "slack.message" . }}

    groups:
      - orgId: 1
        name: demo-app
        folder: Log filtering
        interval: 5m
        rules:
          - title: Exception in demo-app
            # add here exported alert rules

Apply those manifests and go to the Grafana Alerting page to confirm that all resources were provisioned correctly; they should be labeled as Provisioned.

Note: Provisioned resources are immutable - you can’t edit them from the UI.

Next, navigate to the Contact points page and send a test notification to the contact point.


If you receive a test notification in your Slack channel, congratulations: you have set everything up correctly and are now ready to receive alerts.

Here's a preview of the message sent to the Slack channel. Since this is a test notification, it's a shortened version without a log message or links to a dashboard or panel.


Summary

Here's a valuable lesson from my experience: when you start setting up Grafana Alerting, first create all your essential resources in the Grafana UI, then use the Export button to save them. Since the 10.1 release, this feature covers not only alert rules but also contact points and notification policies, all exportable directly from the UI. In my early days with Grafana Alerting, these export capabilities weren't available, and I spent extra time troubleshooting YAML files to pinpoint errors. This tip can save you valuable time and frustration during setup.

Nevertheless, Kubernetes provisioning is a great option: you can seamlessly define your monitoring and observability setup within your GitOps repository. Grafana Alerting offers flexible and customizable alerting configuration, which is key to reacting quickly to firing alerts. I hope this guide helps you better understand Grafana Alerting concepts and configure basic alerts for a Slack channel.

Reviewed by: Mariusz Walczyk
