One of the most important things you can do to maintain a high-quality production environment is to make sure you have the web telemetry you need to detect and resolve poor user experience. This guide walks you through optimizing the data your browser monitoring reports. We'll help you ensure that you're:
Getting the most value from the data you're collecting
Seeing opportunities to optimize your service using your reported data
Able to quickly triage and troubleshoot issues
Getting the data you need to create real-time business KPI dashboards
Step 1 of 7
Tune your browser application naming and sub-account placement
First, you'll need to ensure your browser naming and data organization are in place. If needed, you can change the name of your browser application by following the renaming guide. If you have data from multiple environments reporting into one browser application, create new browser apps and update the JavaScript snippet in your pages so each environment reports to the right app.
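If you use the copy/paste installation method, each environment's pages should carry the snippet configuration for its own browser app. Here's a minimal sketch of the relevant part of the snippet; the values shown are hypothetical placeholders, so copy the real ones from each browser app's settings page:

// Hypothetical placeholders: each environment (dev/QA/production) gets its own applicationID.
window.NREUM = window.NREUM || {};
NREUM.info = {
  beacon: 'bam.nr-data.net',
  licenseKey: 'YOUR_BROWSER_LICENSE_KEY', // placeholder
  applicationID: 'PROD_APP_ID',           // placeholder: the production browser app's ID
  sa: 1
};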
Remember to keep the following in mind as you check your browser monitoring organization:
Web application instrumentation from different environments (dev, QA, production) should report into different browser applications.
Browser application names should make clear which environment they support (such as dev, QA, or production).
Browser application names should make clear the purpose of the application (customer facing, internal facing, website, website component, region or regions, etc.).
Step 2 of 7
Tune JavaScript errors
Next, you'll need to review your JavaScript errors, which negatively impact user experience and SEO by disrupting the page load process, displaying error messages, and preventing users from completing actions. First, make sure JavaScript errors are being captured, using either the UI or NRQL.
Open your web application in Browser. Open the Errors view from the left-hand menu and verify that you can see JavaScript errors. If your application doesn't get a lot of traffic, you may need to look back 24 hours or longer to see errors.
Run the following query:
SELECT count(*) FROM JavaScriptError WHERE appName = 'MyApp' SINCE 1 WEEK AGO
A 0 count means that no JavaScript errors have been captured.
You can check all your web applications in a sub-account by running the following:
SELECT count(*) FROM JavaScriptError FACET appName LIMIT MAX SINCE 1 WEEK AGO
Web applications not present in the results have not reported JavaScript errors.
If you're not seeing JavaScript errors, try the following:
Make sure your browser agent is up to date. Newer agent versions may capture JavaScript errors that were previously overlooked.
Make sure the browser agent snippet is placed in the <head> tag of your pages. You can use Chrome developer tools to verify this.
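Also note that errors your own code catches and handles never reach the agent on their own. You can report them explicitly with the browser agent's noticeError API. A minimal sketch, where riskyOperation and the module attribute are hypothetical:

try {
  riskyOperation(); // hypothetical function that may throw
} catch (err) {
  // Report the handled error to browser monitoring, with an optional custom attribute.
  newrelic.noticeError(err, { module: 'checkout' });
}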
Once you've ensured your JS errors are reporting, check that they also have event logs. The event log shows the browser interactions, AJAX calls, and traces that led up to a JS error, which can help you troubleshoot the error's root cause.
To check that you're capturing event logs, go to the JS Errors tab and inspect several different errors to verify that event logs appear.
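If the built-in event log doesn't give you enough context, one option is to record your own breadcrumbs with the browser agent's addPageAction API and query the resulting PageAction events alongside your errors. A minimal sketch, with a hypothetical event name and attributes:

// Records a custom event that you can later query as a PageAction event in NRQL.
newrelic.addPageAction('checkoutStepCompleted', { step: 2, cartSize: 3 });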
Finally, make sure your JavaScript errors have stack traces.
Check several errors via the JS Errors tab. Stack traces will appear under the error event log.
Run the following query:
SELECT count(*) FROM JavaScriptError WHERE appName = 'MyApp' AND stackTrace IS NOT NULL AND stackTrace NOT LIKE '' SINCE 1 WEEK AGO
A 0 count means that no JavaScript errors with stack traces have been captured.
You can check all your web applications in a sub-account by running the following:
SELECT count(*) FROM JavaScriptError WHERE stackTrace IS NOT NULL AND stackTrace NOT LIKE '' FACET appName LIMIT MAX SINCE 1 WEEK AGO
Web applications not present in the results don't have JavaScript errors with stack traces.
Follow these instructions to troubleshoot missing stack traces. Or, follow these instructions if you can see stack traces but can't expand them.
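A common cause of unreadable stack traces is minified production bundles, which you can fix by uploading source maps. One way to automate this in a build step is the @newrelic/publish-sourcemap package; the sketch below assumes its Node.js API, and the paths, URL, and credentials are placeholders, so check the package README for the current options:

// Node.js build step (not browser code), assuming @newrelic/publish-sourcemap is installed.
const { publishSourcemap } = require('@newrelic/publish-sourcemap');

publishSourcemap({
  sourcemapPath: './dist/app.min.js.map',          // placeholder path to the generated map
  javascriptUrl: 'https://example.com/app.min.js', // placeholder URL of the deployed bundle
  applicationId: 'YOUR_BROWSER_APP_ID',            // placeholder
  apiKey: 'YOUR_USER_API_KEY'                      // placeholder
}, (err) => {
  if (err) console.error('Source map upload failed:', err);
});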
Step 3 of 7
Check page view grouping
Next, check the page view grouping. Page URLs in the Page views UI are automatically grouped to help you manage page performance better. The algorithm that determines the automatic grouping runs when your web app is instrumented for the first time, so if your web traffic today is very different from when the app was first instrumented, you may be seeing too few groups.
Check the Page views UI for your app by selecting it from the left-hand menu. If what you see looks a lot like the screenshot below, make a note and follow the instructions in this guide on how to address it.
Run the following query:
SELECT count(*) FROM PageView WHERE appName = 'MyApp' AND (browserTransactionName LIKE '*.*.*%/%' OR browserTransactionName LIKE '%.%.%/*/*/*/%' OR browserTransactionName LIKE '%.%.%/*/*/*' OR browserTransactionName LIKE '%.%.%/*/*/%') FACET pageUrl LIMIT 100 SINCE 1 WEEK AGO
The results show you which page URLs may be over grouped for your app.
You can check all your web applications in a sub-account by running the following:
SELECT count(*) FROM PageView WHERE browserTransactionName LIKE '*.*.*%/%' OR browserTransactionName LIKE '%.%.%/*/*/*/%' OR browserTransactionName LIKE '%.%.%/*/*/*' OR browserTransactionName LIKE '%.%.%/*/*/%' FACET browserTransactionName, pageUrl LIMIT 100 SINCE 1 WEEK AGO
The results will give you the same data for multiple apps.
Step 4 of 7
Check AJAX call grouping
After you check your page views, do the same for your AJAX call grouping. AJAX calls are grouped to make it easier to navigate them at scale; sometimes there are so many AJAX calls that navigating them by individual request URL becomes difficult. Use the UI or a NRQL query to check whether you need to adjust AJAX grouping.
Check AJAX grouping for your app by selecting it from the left-hand menu and grouping by groupedRequestUrl. If what you see looks a lot like the screenshot below, make a note and follow the instructions in this guide on how to address it.
Run the following query:
SELECT count(*) FROM AjaxRequest WHERE appName = 'MyApp' FACET groupedRequestUrl LIMIT MAX SINCE 1 WEEK AGO
The results show how your AJAX calls are currently grouped. If most requests land in one or two groups, or unrelated endpoints are lumped together, you likely need to adjust your AJAX grouping.
You can check all your web applications in a sub-account by running the following:
SELECT count(*) FROM AjaxRequest FACET appName, groupedRequestUrl LIMIT MAX SINCE 1 WEEK AGO
The results will give you the same data for multiple apps.
Step 5 of 7
Enable distributed tracing
Next, enable distributed tracing in Browser to help you improve AJAX performance by tracing requests from the browser all the way to the final backend endpoint. Tracing information is useful for understanding which applications are impacting user experience. You can use this information to address service issues yourself or delegate them to the team responsible.
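If you installed via the copy/paste snippet, distributed tracing is switched on in the agent's init configuration. A minimal sketch; the allowed_origins entry is a hypothetical example and the cross-origin settings are only needed if your AJAX calls target other domains:

window.NREUM = window.NREUM || {};
NREUM.init = {
  distributed_tracing: {
    enabled: true,
    // The settings below are only needed for cross-origin calls; hypothetical origin shown.
    cors_use_newrelic_header: true,
    allowed_origins: ['https://api.example.com']
  }
};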
Step 6 of 7
Set up deployments
Next, use NerdGraph to track changes in your web application so you can see the impact of the changes you make on performance KPIs, conversions, and user engagement.
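As a sketch, a CI script could record a deployment by posting NerdGraph's changeTrackingCreateDeployment mutation; the entity GUID, version, and API key below are placeholders:

// Node.js CI step: record a deployment via NerdGraph (GraphQL over HTTPS).
const mutation = `
  mutation {
    changeTrackingCreateDeployment(deployment: {
      entityGuid: "YOUR_ENTITY_GUID",  # placeholder: your browser app's entity GUID
      version: "1.2.3"                 # placeholder: the version being deployed
    }) {
      deploymentId
    }
  }`;

fetch('https://api.newrelic.com/graphql', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'API-Key': 'YOUR_USER_API_KEY' },
  body: JSON.stringify({ query: mutation })
})
  .then((res) => res.json())
  .then((result) => console.log(result));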
Step 7 of 7
Add custom attributes
Use custom attributes to filter and group data. Though custom attributes are optional, you can get a lot of value from using them. Below are the most commonly recommended attributes, though you may find you want to add more:
Recommended for all sites that have identifiable users. Follow the convention described in the errors inbox documentation so you can identify how many users are impacted by errors, and which ones.
Measure the experience of a specific customer to meet SLAs or dig into support requests.
Additional custom attributes for retailers
Track conversion revenue in real time. Measure the impact of cart abandonment or issues during checkout.
Track items purchased in real time. Measure the impact of cart abandonment or issues during checkout.
Capture how many users come to your site as the result of an ad campaign or a promotion. Measure the impact of a promotion on conversions.
Capture the store location to gather information about click-to-collect performance. Measure the performance of in-store shopping web applications.
Useful if the product ID isn't already captured in the page URL. Use this information to know which product pages aren't performing well, and which product pages receive the most and least traffic.
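Custom attributes like these are set with the browser agent's setCustomAttribute API before the relevant data is harvested. A minimal sketch; aside from enduser.id (the errors inbox convention), the attribute names and values are hypothetical examples:

// Identify the user per the errors inbox convention.
newrelic.setCustomAttribute('enduser.id', 'user-12345'); // placeholder value

// Hypothetical retail examples: cart value, item count, campaign, and store.
newrelic.setCustomAttribute('cartValue', 149.95);
newrelic.setCustomAttribute('cartItemCount', 3);
newrelic.setCustomAttribute('campaignId', 'spring-promo');
newrelic.setCustomAttribute('storeId', 'store-042');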
Realizing the value
Like the process of monitoring services, your observability program will benefit from a dedicated team function that thinks critically about the return it expects for the effort it invests. The following section outlines an approach for estimating the costs and benefits you should expect when incorporating web instrumentation into your observability practice.
Investments
Ensure all developers are familiar with New Relic agent SDKs and platform capabilities.
Cost model: Dependent on your company's developer FTE model and project estimation.
Estimation: Typically a number of hours for a developer to become effective using New Relic instrumentation features.
Initial: 16 hours training / exploration
Recurring: 4 hours/Q review
Per developer, expect a yearly investment of 16-40 hours of training to develop core skills and keep them current on the New Relic platform.
The development effort required to implement and maintain instrumentation within a project.
Cost model: Dependent on your company's developer FTE model and project estimation.
Estimation: This tends to be dependent on the scope of the project and the amount of instrumentation work required.
Initial: 8 hours per developer per service
Recurring: 4 hours/Q maintenance
Per developer, expect a project estimate of 16-32 hours to develop and maintain web instrumentation.
Benefits
Alert quality management, covered in our doc on alert quality, delivers significant benefit to the operations team by ensuring that alert notifications about abnormal system performance are dealt with swiftly. This improves delivery and resource allocation during incident remediation.
An effective instrumentation practice federated into your observability program will greatly improve your team's ability to create meaningful alerts.
KPIs:
Volume: incident count
Volume: accumulated incident duration
Volume: mean-time-to-close (MTTC)
User engagement: mean time to investigate
Outcomes:
Less alert noise
Greater alert and incident responsiveness
Fewer incidents with unknown root cause
Increased operations productivity
Improved service delivery
Improving your web quality will have a direct impact on the key financial metrics for your service. This requires a well-rationalized financial model for your application. Typically, this return can be projected by associating a currency value with each percent improvement on a core web quality measure, like errors or Apdex attainment. For example, if each 1% improvement in Apdex attainment were worth $10,000 a month in retained conversions, a 3% improvement would project to $30,000 a month.
As your investment in service instrumentation increases, you should see improved attainment on your service quality measures.
KPI: Service quality (business KPI)
Outcomes:
Decreased number of user impacting errors
More performant and resilient service components
By providing better telemetry from your web service instances, your delivery organization should be able to detect volatility or downtime more quickly and remediate faster. This will lead to better overall service delivery KPIs and fewer episodes of outage or degradation.
Cost can be associated with the amount of time it takes to detect, investigate, and remediate an incident. This cost might reflect the value the web service provides to your organization that is lost during an event, or the general cost of dealing with the degraded behavior.