Troubleshoot Kubernetes Monitoring
This section includes common errors encountered while installing and configuring Kubernetes Monitoring components. If you used the easy configuration with the Grafana Kubernetes Monitoring Helm chart, refer to Helm chart overview for more information.
Resolve missing Efficiency data
If your Efficiency view shows no data, it could be due to missing Node Exporter metrics. Navigate to Configuration in the main menu, and click the Cluster status tab to determine what is not being reported.
Resolve missing metrics
After configuration, if metrics are missing even though the Metrics status tab under Configuration shows that the configuration is set up as you intended, check whether the Node Exporter instance label is configured incorrectly.
Make sure the Node Exporter instance label is set to the Node name. The kube-state-metrics node label and the Node Exporter instance label must contain the same values.
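As a rough illustration of that check, the following Python sketch compares two sets of label values and reports the ones that appear on only one side. It assumes you have already collected the values of the kube-state-metrics node label and the Node Exporter instance label yourself. The function name and the sample values are hypothetical and not part of Kubernetes Monitoring.

```python
def find_label_mismatches(ksm_nodes, exporter_instances):
    """Return label values that appear on only one side.

    ksm_nodes: values of the kube-state-metrics `node` label.
    exporter_instances: values of the Node Exporter `instance` label.
    Both inputs are assumed to have been collected beforehand.
    """
    ksm = set(ksm_nodes)
    exporter = set(exporter_instances)
    return {
        "only_in_kube_state_metrics": sorted(ksm - exporter),
        "only_in_node_exporter": sorted(exporter - ksm),
    }


# Hypothetical sample: one instance label still carries an address
# instead of the Node name.
result = find_label_mismatches(
    ["node-a", "node-b"],
    ["node-a", "10.0.0.12:9100"],
)
print(result)
```

Two empty lists mean the labels line up. Any entry points to a Node whose instance label is probably not set to the Node name.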
Resolve update error
If you attempted to upgrade Kubernetes Monitoring with the Update button on the Settings tab under Configuration and received an error message, complete the following instructions.
Warning
Uninstalling Grafana Alloy deletes its associated alert and recording rule namespace. Alerts added to the default locations are also removed. Save a copy of any customized item if you modified the provisioned version.
- Click Uninstall.
- Click Install to reinstall.
- Complete the instructions in Configure with Grafana Kubernetes Monitoring Helm chart.
Resolve duplicate metrics
View the Cardinality page in the app to narrow down where your active series are originating from.
OpenShift support
With OpenShift’s default SecurityContextConstraints (scc) of restricted (refer to the scc documentation for more info), you may run into the following errors while deploying Grafana Alloy using the default generated manifests:
msg="error creating the agent server entrypoint" err="creating HTTP listener: listen tcp 0.0.0.0:80: bind: permission denied"
By default, the Alloy StatefulSet container attempts to bind to port 80, which only the root user (0) and other privileged users are allowed to do. With the default restricted SCC on OpenShift, this results in the preceding error.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 3m55s (x19 over 15m) daemonset-controller Error creating: pods "grafana-agent-logs-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000650000, 1000659999], spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
By default, the Alloy DaemonSet attempts to run as the root user and also attempts to access directories on the host (to tail logs). With the default restricted SCC on OpenShift, this results in the preceding error.
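The event message above is dense, so it can help to break it into its parts. The following Python sketch is one possible way to do that, assuming the message keeps the shape shown in the preceding event: a bracketed, comma-separated list in which entries start with either spec. or provider. The function name is hypothetical, and the sample message is a shortened copy of the event above.

```python
import re

MARKER = "security context constraint: ["


def summarize_scc_rejection(message):
    """Split an SCC rejection message into field violations and unusable SCCs."""
    start = message.find(MARKER)
    end = message.rfind("]")
    if start == -1 or end <= start:
        return {"field_violations": [], "unusable_sccs": []}
    body = message[start + len(MARKER):end]
    # Split only on commas that introduce a new entry, so the comma inside
    # the runAsUser range "[1000650000, 1000659999]" is left alone.
    items = re.split(r", (?=spec\.|provider )", body)
    violations = [item for item in items if item.startswith("spec.")]
    unusable = []
    for item in items:
        match = re.match(r'provider "([^"]+)": Forbidden', item)
        if match:
            unusable.append(match.group(1))
    return {"field_violations": violations, "unusable_sccs": unusable}


# Shortened copy of the event message shown above.
sample = (
    'unable to validate against any security context constraint: '
    '[provider "anyuid": Forbidden: not usable by user or serviceaccount, '
    'spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not '
    'allowed to be used, spec.containers[0].securityContext.runAsUser: '
    'Invalid value: 0: must be in the ranges: [1000650000, 1000659999], '
    'provider "privileged": Forbidden: not usable by user or serviceaccount]'
)
summary = summarize_scc_rejection(sample)
for violation in summary["field_violations"]:
    print(violation)
```

The field violations (hostPath volumes, runAsUser, privileged) are the settings the Pod requested that were rejected. The provider entries list the SCCs that the service account is not allowed to use.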
To solve these errors, use the hostmount-anyuid SCC provided by OpenShift, which allows containers to run as root and mount directories on the host.
If this does not meet your security needs, create a new SCC with the required tailored permissions, or investigate running Alloy as a non-root container, which goes beyond the scope of this troubleshooting guide.
To use the hostmount-anyuid SCC, add the following stanza to the alloy and alloy-logs ClusterRoles:
. . .
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - hostmount-anyuid
. . .
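To show where the stanza sits inside a ClusterRole, the following Python sketch treats the manifest as a plain dictionary and appends the rule to its rules list only if it is not already there. This is an illustration under stated assumptions, not the method used by the generated manifests: the helper name is hypothetical, the sample role is trimmed down with a placeholder rule, and loading or applying the manifest is left out.

```python
import copy

# The stanza from above, expressed as a Python dictionary.
SCC_RULE = {
    "apiGroups": ["security.openshift.io"],
    "resources": ["securitycontextconstraints"],
    "verbs": ["use"],
    "resourceNames": ["hostmount-anyuid"],
}


def with_scc_rule(cluster_role):
    """Return a copy of a ClusterRole dict that contains the SCC rule once."""
    updated = copy.deepcopy(cluster_role)
    rules = updated.get("rules") or []
    if SCC_RULE not in rules:
        rules.append(copy.deepcopy(SCC_RULE))
    updated["rules"] = rules
    return updated


# Hypothetical, trimmed-down ClusterRole; a real one carries more rules.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "alloy"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
    ],
}
patched = with_scc_rule(role)
print(len(patched["rules"]))
```

The rule is an additional entry at the same level as the existing rules, and the same change applies to the alloy-logs ClusterRole. Running the helper a second time leaves the role unchanged.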