Example of using YARN (Yet Another Resource Negotiator) in Java:

Category: Hadoop | Sub Category: Hadoop Concepts | By Prasad Bonam | Last updated: 2023-07-12 10:58:12

Here is an example of using YARN (Yet Another Resource Negotiator) in Java:

Let's consider a simple application that runs as a MapReduce job on YARN and emits a count for every number in an input file, using multiple containers (tasks) running in parallel.

Mapper Class:

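import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

import java.io.IOException;

public class YarnExampleMapper extends Mapper<Object, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    // Emits (number, 1) for every whitespace-separated token in the input line.
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        String[] numbers = line.trim().split("\\s+"); // split on runs of whitespace
        for (String number : numbers) {
            if (number.isEmpty()) {
                continue; // blank line: nothing to emit
            }
            word.set(number);
            context.write(word, ONE);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new YarnConfiguration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000"); // Set HDFS URI
        conf.set("yarn.resourcemanager.hostname", "localhost"); // Set YARN ResourceManager hostname
        conf.set("mapreduce.framework.name", "yarn"); // Run on YARN instead of the local job runner

        // Configure a MapReduce job that uses this class as its mapper.
        // No reducer is set, so Hadoop's default identity reducer writes the pairs sorted by key.
        Job job = Job.getInstance(conf, "yarn example");
        job.setJarByClass(YarnExampleMapper.class);
        job.setMapperClass(YarnExampleMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // <input_file_path>
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // <output_directory_path>

        // Submit the job and wait for it to finish; exit code 0 means success.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}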

In this example, the YarnExampleMapper class extends Mapper and overrides the map() method to split the input text into numbers and emit each number with a count of 1.
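The splitting and emitting described above can be tried without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: the class name MapStepSketch and the method mapLine are hypothetical, and a list of key/value entries stands in for the calls to context.write(). It also assumes blank input should produce no pairs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the map step; no Hadoop classes are involved.
final class MapStepSketch {

    // Splits one input line on whitespace and pairs every token with the count 1,
    // standing in for what map() hands to context.write().
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String number : line.trim().split("\\s+")) {
            if (!number.isEmpty()) { // a blank line yields no pairs
                pairs.add(new SimpleEntry<>(number, 1));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(mapLine("3 5 3")); // prints [3=1, 5=1, 3=1]
    }
}
```

Note that a number appearing twice on a line is emitted twice; nothing is added up at this stage.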

The main() method sets the Hadoop and YARN configurations, including the HDFS URI and the YARN ResourceManager hostname, and then launches the job.

To run this YARN application, you need to package the classes into a JAR file and submit it to your YARN cluster using the yarn jar command:

yarn jar yarn-example.jar YarnExampleMapper <input_file_path> <output_directory_path>

Replace yarn-example.jar with the name of your JAR file, <input_file_path> with the path to your input file, and <output_directory_path> with the desired output directory path.

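The MapReduce framework divides the input file into splits, and YARN allocates a container for each map task, so instances of YarnExampleMapper process the splits in parallel. The emitted (number, 1) pairs are then gathered and written to the specified output directory, which must not already exist. Totalling the counts for each number requires a reducer, which this example does not define.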
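To make the parallel flow easier to picture, here is a rough single-machine simulation in plain Java. It is only an analogy under stated assumptions: a parallel stream stands in for the containers, each string in the list stands in for one slice of the input, and the final summing step plays the role of a reducer, which the article's class does not include. The names ParallelCountSketch and countNumbers are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Single-JVM analogy for parallel map tasks followed by aggregation; not Hadoop code.
final class ParallelCountSketch {

    // Each element of 'splits' stands in for the slice of input handled by one container.
    static Map<String, Integer> countNumbers(List<String> splits) {
        return splits.parallelStream()
                // "map" phase: break each split into whitespace-separated tokens
                .flatMap(split -> Arrays.stream(split.trim().split("\\s+")))
                .filter(token -> !token.isEmpty())
                // "reduce" phase: group identical tokens and add 1 per occurrence
                .collect(Collectors.groupingBy(
                        token -> token,
                        TreeMap::new,
                        Collectors.summingInt(token -> 1)));
    }

    public static void main(String[] args) {
        // prints {3=3, 5=2, 8=1}
        System.out.println(countNumbers(Arrays.asList("3 5 3", "5 8", "3")));
    }
}
```

The keys are strings, so the TreeMap orders them lexicographically rather than numerically, which is also how Text keys are sorted in a real job.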

Make sure to adjust the HDFS URI and the YARN ResourceManager hostname according to your cluster configuration.
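One way to avoid hard-coding these two values is to read them from the environment and fall back to the localhost defaults used above. The sketch below is an assumption, not part of the original example: the variable names HADOOP_FS_URI and YARN_RM_HOST and the class ClusterSettingsSketch are made up for illustration and are not Hadoop conventions. It uses a plain map so it runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Resolves the two cluster-specific properties; plain Java, no Hadoop dependency.
final class ClusterSettingsSketch {

    // Builds the properties from an environment-style map, falling back to the
    // localhost values from the example. HADOOP_FS_URI and YARN_RM_HOST are
    // hypothetical variable names chosen for this sketch.
    static Map<String, String> fromEnv(Map<String, String> env) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", env.getOrDefault("HADOOP_FS_URI", "hdfs://localhost:9000"));
        props.put("yarn.resourcemanager.hostname", env.getOrDefault("YARN_RM_HOST", "localhost"));
        return props;
    }

    public static void main(String[] args) {
        // In the example's main(), each entry would be passed to conf.set(key, value).
        fromEnv(System.getenv()).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

On a properly configured client node, these values usually come from the cluster's core-site.xml and yarn-site.xml, in which case setting them in code is unnecessary.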

This is a basic example to demonstrate running a YARN application. In real-world scenarios, you may have multiple stages (mappers, reducers) and more complex data processing requirements.
