
Understand and monitor the service and application layer

In your Kubernetes system, each pod contains services and applications that provide the actual functionality your Kubernetes system supports. The system could support computation, a web app, or anything in between.

Your system might be healthy as a whole, but individual applications and services might fail or throw errors. The following steps guide you through a general strategy to monitor and triage your applications and services:

Triage your application

This page gives you a general overview of all the instances of your application within your Kubernetes cluster. There are various useful charts and graphs here, but take a close look at the activity stream on the far right. It highlights any important performance events for those application instances. Increase the time range as necessary to gather a full view of the performance history.

Only you can decide what's acceptable, but multiple events a day indicate you could improve performance. For example, in the image above there are multiple Apdex warnings within just a few hours. Apdex warnings indicate a degraded user experience.

The main overview dashboard for an APM service in a Kubernetes cluster
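Apdex is worth a quick illustration: it scores user satisfaction by comparing response times against a target threshold T, counting requests as satisfied (at or under T), tolerating (at or under 4T), or frustrated (slower than 4T). The following minimal Python sketch is not New Relic code; the sample durations and the 0.5 second threshold are assumptions for illustration only.

```python
def apdex(durations_s, threshold_s=0.5):
    """Compute an Apdex score: (satisfied + tolerating / 2) / total.

    satisfied:  response time <= threshold
    tolerating: threshold < response time <= 4 * threshold
    frustrated: anything slower (counts as zero)
    """
    satisfied = sum(1 for d in durations_s if d <= threshold_s)
    tolerating = sum(1 for d in durations_s if threshold_s < d <= 4 * threshold_s)
    return (satisfied + tolerating / 2) / len(durations_s)

# Hypothetical response times (in seconds) from one application instance.
samples = [0.12, 0.35, 0.48, 0.70, 1.10, 2.30, 0.20, 0.55]
print(f"Apdex: {apdex(samples):.2f}")  # 1.0 means every request was satisfied
```

The closer the score sits to 1.0, the more requests completed within the target threshold; slow outliers pull it down, which is what surfaces as Apdex warnings in the activity stream.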

Identify the cause of performance issues

Scroll down until you see four graphs. On the top left of each graph, select the dropdown and set the graphs to the following:

  • Service error rate

  • Service throughput

  • Service response time

  • Container restart count

The main overview dashboard for an APM service in a Kubernetes cluster

The first three graphs show you the health of your applications. The restart count graph helps you determine whether your performance has any effect on your general pod health.

In the screenshot above, you can note a few things:

  • The error rate stays at zero, which means errors are not affecting performance

  • The service throughput spikes frequently

  • The service response time regularly fluctuates close to 70 ms

  • The container restart graph stays at zero, which means the performance of your applications is not causing critical failures in your cluster

In this case, you can identify throughput and response time as the key indicators of your degraded performance. There are many ways to address these, from optimizing the application itself to simply allocating more CPU to the containers hosting the application.
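If you want to cross-check the container restart count outside the dashboard, the official Kubernetes Python client can read it straight from each pod's status. This is a minimal sketch, not part of New Relic; the namespace and label selector are assumptions you'd replace with your own.

```python
from kubernetes import client, config

# Load credentials from your local kubeconfig
# (use config.load_incluster_config() when running inside a pod).
config.load_kube_config()
v1 = client.CoreV1Api()

# Hypothetical namespace and label selector; replace with your application's.
pods = v1.list_namespaced_pod(namespace="default", label_selector="app=my-service")

for pod in pods.items:
    for status in pod.status.container_statuses or []:
        # A climbing restart_count means the container keeps crashing and restarting,
        # which is what the container restart count graph surfaces over time.
        print(f"{pod.metadata.name}/{status.name}: {status.restart_count} restarts")
```

A flat zero here, like the flat restart graph in the screenshot, tells you the performance issues are degrading the user experience but not yet destabilizing the pods themselves.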

What's next?

Now that you've learned how to use New Relic to monitor Kubernetes, you can explore our other tutorials:

  • Is your app running slow? Learn how to triage and diagnose latency in your app with our My app is slow tutorial.
  • If you have a peak demand day coming up, learn how New Relic can help you with capacity planning.
  • Do you want to create high quality alerts? Our alerts tutorial can help you set up an alerting system.

Previous step

Monitor Kubernetes deployments and pods.
