Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology. It is one of the key features of Hadoop 2.0, the second generation of the Apache Software Foundation's open source distributed processing framework.
The fundamental idea of YARN is to split resource management and job scheduling/monitoring into separate daemons: a global Resource Manager (RM) and a per-application Application Master (AM). An application is either a single job or a DAG of jobs.
The Resource Manager and the Node Manager form the data-computation framework. The Resource Manager is the ultimate authority that arbitrates resources among all the applications in the system. The Node Manager is the per-machine framework agent responsible for containers: it monitors their resource usage (CPU, memory, disk, network) and reports it to the Resource Manager/Scheduler.
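The reporting relationship described above can be illustrated with a toy model. This is only a sketch in plain Python, not the YARN API: the class names mirror the roles in the text, while the fields, the `launch`, `heartbeat` and `receive` methods, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerUsage:
    """Resource usage of one container (hypothetical fields)."""
    container_id: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Toy per-machine agent that tracks the containers it runs."""
    node_id: str
    containers: dict = field(default_factory=dict)

    def launch(self, usage: ContainerUsage) -> None:
        self.containers[usage.container_id] = usage

    def heartbeat(self) -> dict:
        """Summarise current container usage for the Resource Manager."""
        return {
            "node_id": self.node_id,
            "memory_mb": sum(c.memory_mb for c in self.containers.values()),
            "vcores": sum(c.vcores for c in self.containers.values()),
            "containers": sorted(self.containers),
        }


@dataclass
class ResourceManager:
    """Toy global authority that keeps the latest report from each node."""
    reports: dict = field(default_factory=dict)

    def receive(self, report: dict) -> None:
        self.reports[report["node_id"]] = report

    def cluster_memory_in_use(self) -> int:
        return sum(r["memory_mb"] for r in self.reports.values())


rm = ResourceManager()
nm = NodeManager("node-1")
nm.launch(ContainerUsage("c1", memory_mb=1024, vcores=1))
nm.launch(ContainerUsage("c2", memory_mb=2048, vcores=2))
rm.receive(nm.heartbeat())
print(rm.cluster_memory_in_use())  # 3072
```

The point of the sketch is the direction of the data flow: usage is measured on each machine and only a summary travels to the central Resource Manager.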
The per-application Application Master is, in effect, a framework-specific library. It is tasked with negotiating resources from the Resource Manager and working with the Node Manager(s) to execute and monitor the tasks.
The Resource Manager has two main components: Scheduler and Applications Manager.
The Scheduler allocates resources to the various running applications, subject to familiar constraints of capacities, queues etc. It is a pure scheduler in the sense that it performs no monitoring or tracking of status for the application. It also offers no guarantees about restarting tasks that fail due to either application failure or hardware failure. The Scheduler performs its scheduling function based on the resource requirements of the applications. It does so using the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, network etc.
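To make the Container abstraction concrete, here is a minimal sketch of a scheduler that grants requests purely on whether the requested resources fit in what is free. It is a toy model, not YARN code: the `Resource` class, the `schedule` function, and the first-come-first-served order are assumptions made for illustration, and only memory and CPU are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Abstract container size (hypothetical: memory and vcores only)."""
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores

    def minus(self, other: "Resource") -> "Resource":
        return Resource(self.memory_mb - other.memory_mb,
                        self.vcores - other.vcores)


def schedule(requests, free):
    """Grant requests in arrival order while they fit in the free resources.

    The function only allocates: it does not monitor the granted
    containers or restart anything that later fails.
    """
    granted = []
    for app_id, ask in requests:
        if ask.fits_in(free):
            granted.append(app_id)
            free = free.minus(ask)
    return granted, free


requests = [
    ("app-1", Resource(2048, 1)),
    ("app-2", Resource(4096, 1)),  # too large once app-1 is granted
    ("app-3", Resource(1024, 2)),
]
granted, remaining = schedule(requests, Resource(4096, 4))
print(granted)    # ['app-1', 'app-3']
print(remaining)  # Resource(memory_mb=1024, vcores=1)
```

Note that `schedule` returns as soon as the allocation decision is made, which mirrors the "pure scheduler" idea: tracking what happens inside a container is left to other components.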
The Scheduler has a pluggable policy that is responsible for partitioning the cluster resources among the various queues, applications etc. The current schedulers, such as the Capacity Scheduler and the Fair Scheduler, are examples of such plug-ins.
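The idea of a pluggable partitioning policy can be sketched as two interchangeable functions with the same signature. These are deliberately simplified stand-ins and not the actual algorithms of the Capacity Scheduler or the Fair Scheduler; the queue names, the `capacity` and `running_apps` fields, and the numbers are hypothetical.

```python
def capacity_policy(total_mb, queues):
    """Split memory by each queue's configured capacity fraction."""
    return {name: int(total_mb * q["capacity"]) for name, q in queues.items()}


def fair_policy(total_mb, queues):
    """Split memory equally among queues that currently have running apps."""
    active = [name for name, q in queues.items() if q["running_apps"] > 0]
    share = total_mb // len(active) if active else 0
    return {name: (share if name in active else 0) for name in queues}


def partition(total_mb, queues, policy):
    """The caller picks the policy; the partitioning step itself is unchanged."""
    return policy(total_mb, queues)


queues = {
    "prod": {"capacity": 0.75, "running_apps": 2},
    "dev": {"capacity": 0.25, "running_apps": 0},
}
print(partition(8192, queues, capacity_policy))  # {'prod': 6144, 'dev': 2048}
print(partition(8192, queues, fair_policy))      # {'prod': 8192, 'dev': 0}
```

With the same cluster and queues, swapping the policy changes the outcome: the capacity-style split reserves memory for the idle `dev` queue, while the fair-style split gives everything to the only queue with running applications.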
The Applications Manager is responsible for accepting job submissions and negotiating the first container for executing the application-specific Application Master. It also provides the service for restarting the Application Master container on failure. The per-application Application Master is responsible for negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring their progress.
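The restart service mentioned above amounts to a bounded retry loop around launching the Application Master container. The following is a rough sketch under stated assumptions, not YARN's implementation: `run_application_master`, the `max_attempts` parameter, and the `flaky_am` launcher are hypothetical names, and a failure is modelled as a `RuntimeError`.

```python
def run_application_master(launch_am, max_attempts=2):
    """Launch the AM container, relaunching it if an attempt fails.

    Returns the number of the attempt that succeeded and raises
    RuntimeError once every allowed attempt has failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            launch_am(attempt)
            return attempt
        except RuntimeError:
            continue  # this attempt failed; try again if attempts remain
    raise RuntimeError(
        f"Application Master failed after {max_attempts} attempts")


def flaky_am(attempt):
    """Hypothetical launcher whose first attempt loses its container."""
    if attempt == 1:
        raise RuntimeError("container lost")


print(run_application_master(flaky_am))  # 2
```

Only the Application Master container is restarted here. Recovering the individual tasks of the application remains the job of the Application Master itself, in line with the division of responsibilities in the text.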
MapReduce in hadoop-2.x maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of YARN with just a recompile.