Our Apache Hadoop integration monitors the performance of your Hadoop cluster and applications.
After you set up our Apache Hadoop integration, we give you a dashboard for your Apache Hadoop metrics.
To get data into New Relic, install our infrastructure agent. Our infrastructure agent collects and ingests data so you can keep track of your app's performance. The agent version must be 1.10.7 or higher to support the NRI-Flex integration.
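As a rough illustration of the version requirement above, the following sketch compares a dotted agent version against the 1.10.7 minimum. The function names are hypothetical, and how you obtain the installed version string is left to you; this only shows the comparison.

```python
MINIMUM_AGENT_VERSION = "1.10.7"  # minimum stated in this guide for NRI-Flex


def parse_version(version):
    """Turn a dotted version such as '1.10.7' into a tuple of ints: (1, 10, 7)."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_flex(version, minimum=MINIMUM_AGENT_VERSION):
    """Return True if `version` is at least `minimum`, compared numerically."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    # Hypothetical version strings, used only to exercise the comparison.
    for candidate in ("1.9.0", "1.10.7", "1.48.1"):
        print(candidate, supports_flex(candidate))
```

The versions are compared as tuples of integers rather than as strings, because a plain string comparison would wrongly rank 1.9.0 above 1.10.7.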
You can install the infrastructure agent two different ways:
- Our guided install is a CLI tool that inspects your system and installs the infrastructure agent alongside the application monitoring agent that best works for your system. To learn more about how our guided install works, check out our Guided install overview.
- If you'd rather install our infrastructure agent manually, you can follow a tutorial for manual installation for Linux, Windows, or macOS.
Flex comes bundled with the New Relic infrastructure agent. To create a Flex configuration file, follow these steps:
Create a file named nri-flex-hadoop-config.yml in this path:

    /etc/newrelic-infra/integrations.d

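Use our configuration template to update the YOUR_DOMAIN field in the nri-flex-hadoop-config.yml file you created. The value of event_type is the name under which the metrics are stored in NRDB. EVENT_TYPE1 and EVENT_TYPE2 are placeholders that you replace with event type names. The sample NRQL queries later in this guide read from HadoopNameNodeSample and HadoopResourceManagerSample, which fit the NameNode endpoint (port 9870) and the ResourceManager endpoint (port 8088) respectively.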
The nri-flex-hadoop-config.yml file should look like this:

    integrations:
      - name: nri-flex
        # interval: 30s
        config:
          name: hadoopMetrics
          apis:
            - event_type: EVENT_TYPE1
              commands:
                # run any command, you could cat .json file, or run some commands that produce a json output
                # the example just calls an API that returns json
                - run: curl -s https://YOUR_DOMAIN:9870/jmx # json output is retrieved from this command
            - event_type: EVENT_TYPE2
              commands:
                - run: curl -s https://YOUR_DOMAIN:8088/jmx?qry=Hadoop:*

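Before relying on the Flex configuration, it can help to understand the shape of the JSON that the curl commands return. Hadoop's /jmx endpoints respond with a top-level "beans" list, where each bean has a "name" plus its attributes. The sketch below parses a small sample and pulls out the numeric attributes. The sample payload, its values, and the helper names are assumptions for illustration, not output captured from a real cluster.

```python
import json

# Hypothetical, trimmed-down sample of a Hadoop /jmx response.
# Real responses contain many more beans and attributes; values are made up.
SAMPLE_JMX = """
{
  "beans": [
    {
      "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
      "modelerType": "QueueMetrics,q0=root",
      "ActiveUsers": 2,
      "AppsRunning": 1
    },
    {
      "name": "Hadoop:service=ResourceManager,name=ClusterMetrics",
      "modelerType": "ClusterMetrics",
      "NumActiveNMs": 3
    }
  ]
}
"""


def load_beans(raw):
    """Parse a /jmx response body and return its list of beans."""
    payload = json.loads(raw)
    beans = payload.get("beans") if isinstance(payload, dict) else None
    if not isinstance(beans, list):
        raise ValueError("expected a top-level 'beans' list in the JMX output")
    return beans


def find_beans(beans, fragment):
    """Return the beans whose 'name' contains the given fragment."""
    return [bean for bean in beans if fragment in bean.get("name", "")]


def numeric_attributes(bean):
    """Return only the numeric attributes of a bean (booleans excluded)."""
    return {
        key: value
        for key, value in bean.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


if __name__ == "__main__":
    for bean in find_beans(load_beans(SAMPLE_JMX), "QueueMetrics"):
        print(bean["name"], numeric_attributes(bean))
```

To try this against your own cluster, you could save the output of one of the curl commands to a file and pass its contents to load_beans in place of SAMPLE_JMX.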
You can use our log forwarding to forward Apache Hadoop logs to New Relic.
On Linux machines, your log file named logging.yml should be present in this path:
After creating the log file, add the following script to the logging.yml file:

    logs:
      - name: hadoop_secondarynamenode_log
        file: /usr/local/hadoop/logs/hadoop-hadoopuser-secondarynamenode-hadoop-master.log
        attributes:
          logtype: hadoop_secondarynamenode_logs
      - name: hadoop_resourcemanager_log
        file: /usr/local/hadoop/logs/hadoop-hadoopuser-resourcemanager-hadoop-master.log
        attributes:
          logtype: hadoop_hadoop_resourcemanager_logs
      - name: hadoop_namenode_log
        file: /usr/local/hadoop/logs/hadoop-hadoopuser-namenode-hadoop-master.log
        attributes:
          logtype: hadoop_namenode_logs

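The file paths in the logging configuration appear to embed the example's user name (hadoopuser) and host name (hadoop-master), so they will likely differ on your cluster. A minimal sketch for confirming that each configured path exists before you restart the agent is shown below. The helper name is hypothetical, and the list simply repeats the paths from this guide.

```python
import os


def missing_log_files(paths):
    """Return the paths that do not point to an existing regular file."""
    return [path for path in paths if not os.path.isfile(path)]


# Paths copied from the logging configuration in this guide.
# Adjust them to match the log file names on your own hosts.
HADOOP_LOG_FILES = [
    "/usr/local/hadoop/logs/hadoop-hadoopuser-secondarynamenode-hadoop-master.log",
    "/usr/local/hadoop/logs/hadoop-hadoopuser-resourcemanager-hadoop-master.log",
    "/usr/local/hadoop/logs/hadoop-hadoopuser-namenode-hadoop-master.log",
]


if __name__ == "__main__":
    for path in missing_log_files(HADOOP_LOG_FILES):
        print("not found:", path)
```

Run it on the host where the infrastructure agent is installed; any path it prints is one to correct in logging.yml.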
Before you can start reading your data, use the instructions in our infrastructure agent docs to restart your infrastructure agent.

    sudo systemctl restart newrelic-infra.service

In a couple of minutes, your application will send metrics to one.newrelic.com.
You can choose our pre-built dashboard template named Apache Hadoop to monitor your Apache Hadoop server metrics. Follow these steps to use our pre-built dashboard template:
- From one.newrelic.com, go to the + Add data page.
- Click on Dashboards.
- In the search bar, type Apache Hadoop.
- The Apache Hadoop dashboard should appear. Click on it to install it.
Your Apache Hadoop dashboard is considered a custom dashboard and can be found in the Dashboards UI. For docs on using and editing dashboards, see our dashboard docs.
Here is a NRQL query to check the active users from the resource manager:
    SELECT latest(activeUsers) FROM HadoopResourceManagerSample
Here is a NRQL query to view the number of active clients from the name node:
    SELECT latest(numActiveClients) FROM HadoopNameNodeSample
To learn more about building NRQL queries and generating dashboards, check out these docs: