Although the New Relic Kubernetes OpenTelemetry Collector is designed to be robust and reliable, issues can still arise. This document provides troubleshooting steps for common problems you might encounter.
General Collector pod issues
Check out the logs of the Collector pod that's experiencing issues. Run this command:
$ kubectl logs <otel-pod-name> -n newrelic
To enable detailed DEBUG-level logging for troubleshooting, set the verboseLog parameter to true in the nr-k8s-otel-collector Helm chart.
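As a minimal sketch of one way to apply that setting to an existing installation, the helper below wraps helm upgrade. The function name enable_verbose_log is hypothetical, and the default release name, chart reference (newrelic/nr-k8s-otel-collector), and namespace are assumptions that may not match your installation. It also assumes verboseLog is a top-level chart value, as the text above suggests.

```shell
# Hypothetical helper: re-deploy the chart with DEBUG logging turned on.
# The release name, chart reference, and namespace defaults below are
# assumptions; replace them with the values from your own installation.
enable_verbose_log() {
  local release="${1:-nr-k8s-otel-collector}"
  local chart="${2:-newrelic/nr-k8s-otel-collector}"
  local namespace="${3:-newrelic}"

  # --reuse-values keeps the values of the previous release, so only
  # verboseLog changes.
  helm upgrade "$release" "$chart" \
    --namespace "$namespace" \
    --reuse-values \
    --set verboseLog=true
}
```

Calling enable_verbose_log with no arguments uses the assumed defaults; pass the release, chart, and namespace explicitly if yours differ.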
Metric collection failures
Problem: Metrics are not being collected or sent to New Relic.
Troubleshooting:
Verify scrape configurations: Ensure the prometheus receiver configurations within the collector's configuration (extraConfig or default) are correct.
$ kubectl describe configmap prometheus-config -n monitoring
Check pod annotations: If you're using Prometheus service discovery, confirm that your application pods carry the prometheus.io/scrape: "true" annotation. Annotations are separate from labels, so look for them in the pod description:
$ kubectl describe pods --namespace=[your-namespace] | grep 'prometheus.io/scrape'
Test network connectivity: Ensure the collector pod can reach the metric endpoints.
$ kubectl exec [collector-pod-name] -- curl http://service:port
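The annotation check can also be scripted. The sketch below reads the prometheus.io/scrape annotation itself through a kubectl JSONPath query and prints only the pods where it is set to true. The function name list_scrape_annotated_pods is hypothetical and introduced here for illustration.

```shell
# Hypothetical helper: print the names of pods whose prometheus.io/scrape
# annotation is exactly "true" in the given namespace.
list_scrape_annotated_pods() {
  local namespace="${1:?usage: list_scrape_annotated_pods <namespace>}"

  # Emit "<pod-name><TAB><annotation-value>" per pod. The backslash escapes
  # the dot inside the annotation key for kubectl's JSONPath parser.
  kubectl get pods --namespace "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.prometheus\.io/scrape}{"\n"}{end}' |
    awk -F '\t' '$2 == "true" { print $1 }'
}
```

An empty result for a namespace you expect to be scraped suggests the annotation is missing or misspelled on those pods.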
Configuration overrides not taking effect
Problem: Custom configurations are not properly applied.
Troubleshooting:
Review your values.yaml: Double-check your values.yaml file for typos or incorrect indentation in the extraConfig section.
$ grep extraConfig helm-charts/charts/nr-k8s-otel-collector/values.yaml
Validate applied ConfigMaps: The Helm chart generates ConfigMaps from your values.yaml. Inspect the resulting ConfigMap to ensure your custom settings are present.
$ kubectl describe configmap [collector-configmap-name] -n monitoring
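To confirm that a specific override reached the cluster, you can search the rendered ConfigMap for a string you expect to find, such as a receiver name from your extraConfig. The following is a sketch only; configmap_contains is a hypothetical helper, and the ConfigMap name and namespace you pass in depend on your installation.

```shell
# Hypothetical helper: succeed if the rendered ConfigMap contains a literal
# string, such as a receiver or exporter name from your extraConfig.
configmap_contains() {
  local name="${1:?configmap name required}"
  local namespace="${2:?namespace required}"
  local needle="${3:?search string required}"

  # grep -F matches the string literally; -q only sets the exit status.
  kubectl get configmap "$name" --namespace "$namespace" -o yaml |
    grep -F -q -- "$needle"
}
```

The function returns a non-zero status when the string is absent, which usually means the override was not rendered into the ConfigMap.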
Collector failing to start
Problem: The OpenTelemetry collector pod fails to initialize or crashes repeatedly.
Troubleshooting:
Inspect pod logs: This is the most common first step. Look for specific error messages that indicate misconfigurations or missing dependencies.
$ kubectl logs [collector-pod-name] --namespace=monitoring
Verify environment variables: Ensure required environment variables are correctly injected.
$ kubectl exec [collector-pod-name] -- env | grep -i [variable-name]
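When a pod crashes repeatedly, the logs of the current container may be empty because it has only just restarted. As a sketch under that assumption, the hypothetical helper collector_crash_report below prints each container's restart count and then the logs of the previously terminated container, using the standard --previous flag of kubectl logs.

```shell
# Hypothetical helper: gather the usual evidence for a crash-looping pod.
collector_crash_report() {
  local pod="${1:?pod name required}"
  local namespace="${2:?namespace required}"

  echo "== restart count per container =="
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'

  echo "== logs from the previous (crashed) container =="
  # --previous returns the logs of the last terminated container instance;
  # it fails if the container has never restarted, so fall back to a note.
  kubectl logs "$pod" --namespace "$namespace" --previous --tail=50 ||
    echo "no previous container logs available"
}
```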
Network failures
Problem: The collector cannot communicate or send data to New Relic.
Troubleshooting:
Check DNS resolution: Ensure the collector pod can resolve service names or New Relic endpoints.
$ kubectl exec [collector-pod-name] -- nslookup service-name
Run connectivity tests: Test connectivity to internal services or external New Relic endpoints.
$ kubectl exec [collector-pod-name] -- curl -I http://service-name:port
Review network policies: If you have strict network policies in your cluster, ensure they allow traffic from the OpenTelemetry Collector pods to internal services and external New Relic endpoints.
$ kubectl describe networkpolicy -n [namespace]
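The DNS and connectivity checks can be combined into one step. The sketch below runs both from inside a pod and reports which one failed. The function name check_from_pod is hypothetical, the host and URL are placeholders you supply, and the helper assumes nslookup and curl are available in the container image, as the commands in this section do.

```shell
# Hypothetical helper: check DNS and HTTP reachability from inside a pod.
# Assumes nslookup and curl exist in the container image.
check_from_pod() {
  local pod="${1:?pod name required}"
  local namespace="${2:?namespace required}"
  local host="${3:?host name required}"
  local url="${4:?url required}"

  if kubectl exec "$pod" --namespace "$namespace" -- nslookup "$host" >/dev/null; then
    echo "dns ok: $host"
  else
    echo "dns FAILED: $host"
    return 1
  fi

  # -sS hides the progress meter but keeps errors; --max-time bounds the wait.
  if kubectl exec "$pod" --namespace "$namespace" -- \
      curl -sS -o /dev/null --max-time 10 "$url"; then
    echo "http ok: $url"
  else
    echo "http FAILED: $url"
    return 1
  fi
}
```

A DNS failure points toward cluster DNS or network policy on port 53, while a DNS success followed by an HTTP failure points toward egress rules, proxies, or the target endpoint itself.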
Support
If you have issues with OpenTelemetry observability for Kubernetes, check the Issues section on GitHub for similar problems, or consider opening a new issue.