Hadoop Training in Hyderabad
Kosmik is one of the best IT training institutes in Hyderabad and provides Hadoop training in Hyderabad, with both online and classroom options. We provide lab facilities and complete real-time training. The course covers advanced concepts so that you can easily gain hands-on experience, and we offer 100% job assistance.
Course Content
- Introduction to Hadoop
 - Hadoop Availability
 - Advantages and disadvantages
 - Scaling
 - Introduction to Big Data
 - What is big data technology?
 - Big data opportunities and challenges
 - Characteristics of big data analytics
Introduction to the Hadoop Course
- Hadoop Distributed File System (HDFS)
 - Difference between Hadoop and SQL database
 - Industrial applications of Hadoop
 - Data locality concept
 - Hadoop architecture tutorial
 - Map Reduce and HDFS.
 - Using the Hadoop single node image
 - Hadoop Distributed File System
 - HDFS is designed for streaming data access
 - Data nodes, Name nodes, and Blocks
 - What is Hadoop Federation?
 - Hadoop commands with examples
 - Basic file system operations in Hadoop (see the sketch after this list)
 - Anatomy of File Read & write
 - Hadoop custom block placement
 - Configuration settings file extension
 - Difference between fsimage and edit log
 - How to add data nodes in Hadoop
 - How to decommission a Data Node dynamically
 - FSCK Utility
 - Overriding Logback configurations
 - HDFS Federation
 - Forcing leader election in ZooKeeper
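
As a first taste of the hands-on side of this module, here is a minimal sketch of basic HDFS file operations through the Java FileSystem API. It assumes a Hadoop configuration (core-site.xml) is on the classpath; the /tmp path and file contents are placeholders.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsBasics {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/tmp/hdfs-demo.txt"); // illustrative path

        // Write a small file (roughly: hdfs dfs -put)
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Read it back (roughly: hdfs dfs -cat)
        try (FSDataInputStream in = fs.open(file)) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }

        fs.delete(file, false); // clean up (hdfs dfs -rm)
    }
}
```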
Map Reduce
- Functional programming examples
 - Map Reduce explained simply (a word-count sketch follows this list)
 - Hadoop Map-Reduce architecture
 - Anatomy of a Map Reduce Job Run
 - Hadoop job status command line
 - Shuffling and Sorting
 - Splits, Partition, Record reader, Types of partitions and Combiner
 - Optimization techniques: speculative execution, slots
 - Types of counters and schedulers
 - Difference between Old API and New API at code and Architecture Level
 - Getting the data from RDBMS into HDFS using Custom data types
 - Distributed Cache and Hadoop Streaming
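
The word-count sketch referenced above, written against the org.apache.hadoop.mapreduce (new) API: the Mapper emits (word, 1) pairs, the same Reducer class doubles as a map-side Combiner, and the driver wires the job together. Input and output paths come from the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE); // emit (word, 1) for every token
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get(); // counts arrive grouped per word after shuffle and sort
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // combiner runs map-side
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```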
YARN
- SequenceFile and MapFile organization (see the sketch after this list)
 - Hadoop compression codec example
 - Map side Join with Distributed Cache
 - Types of Input and Output Formats
 - Handling small files using CombineFileInputFormat
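
A minimal sketch of SequenceFile organization in practice: writing a few key-value records and reading them back with the Java API. The /tmp path and record contents are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("/tmp/demo.seq"); // illustrative path

        // Write a few key-value records as a binary SequenceFile.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(IntWritable.class),
                SequenceFile.Writer.valueClass(Text.class))) {
            for (int i = 0; i < 5; i++) {
                writer.append(new IntWritable(i), new Text("record-" + i));
            }
        }

        // Read the records back in order.
        try (SequenceFile.Reader reader = new SequenceFile.Reader(conf,
                SequenceFile.Reader.file(path))) {
            IntWritable key = new IntWritable();
            Text value = new Text();
            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
        }
    }
}
```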
Map Reduce Programming in Java
- Sorting files using the Hadoop Configuration API
 - How to use the grep command in Hadoop
 - DBInputFormat example
 - Job dependency API discussion
 - InputFormat & Split API discussion
 - Custom comparators in Hadoop (sketched after this list)
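
A hedged sketch of the custom comparator topic: a WritableComparator that reverses the natural ordering of IntWritable keys so the largest key sorts first. A job would register it with job.setSortComparatorClass(DescendingIntComparator.class); the class name is our own.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// A custom sort comparator that orders IntWritable keys in descending order.
public class DescendingIntComparator extends WritableComparator {

    public DescendingIntComparator() {
        super(IntWritable.class, true); // true = instantiate keys for compare()
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        return -super.compare(a, b); // reverse the natural IntWritable ordering
    }
}
```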
NoSQL
- ACID vs BASE properties
 - CAP theorem examples
 - NoSQL database list
 - Columnar Databases in Detail
 - Bloom filters and compaction
HBase
- Install HBase on Hadoop cluster
 - HBase basic concepts
 - HBase vs relational database
 - Master and Region Servers
 - HBase overview and operations through the shell and programming
 - Catalog Tables
 - Block Cache and sharing
 - Splits
 - Data modeling
 - Java API and REST interface (Java API sketched after this list)
 - HBase counters & filters
 - Bulk loading and coprocessors
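
A minimal sketch of the HBase Java API in action, assuming a running cluster whose hbase-site.xml is on the classpath. The table name "users", the column family "info", and the sample values are hypothetical.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseDemo {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath for the ZooKeeper quorum etc.
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("users"))) { // hypothetical table

            // Put one cell: row "row1", column family "info", qualifier "name".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Ravi"));
            table.put(put);

            // Get it back.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println("name = " + Bytes.toString(name));
        }
    }
}
```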
Prerequisites for Hadoop training in Hyderabad
- Sound knowledge of core Java concepts makes it much easier to understand the foundations of Hadoop.
 - We cover the important Java concepts you need before getting into the actual Hadoop topics.
 - A strong Java foundation is very important for learning Hadoop technologies effectively.
 - A good grasp of Pig programming makes working with Hadoop easier, and Hive is useful for data warehousing tasks.
 - Basic knowledge of Unix commands is also needed for day-to-day work with the software.
Hive
- Installation
 - Introduction to HIVE
 - Hive Services, Hive Shell, HiveServer, and the Hive Web Interface (a JDBC sketch follows this list)
 - Metastore
 - OLTP vs OLAP
 - Working with Tables
 - Complex data types and Primitive data types
 - Working with Partitions
 - User Defined Functions
 - Hive bucketing without partition
 - Dynamic Partition
 - Differences between sort by, distribute by, and order by
 - Bucketing and Sorted Bucketing with Dynamic partition
 - RC file format
 - Views and indexes
 - Map side joins
 - Options for compressing data stored in Hive
 - Dynamic substitution in Hive and different ways of running Hive
 - Hive update example
 - Log analysis using Hive
 - Accessing HBase tables using Hive
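
One of the "different ways of running Hive" covered above is HiveServer2 over JDBC. A minimal sketch, assuming a HiveServer2 instance on localhost:10000 with no authentication and the hive-jdbc driver on the classpath; the table name and credentials are illustrative.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcDemo {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC URL; host, port, and database are illustrative.
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement()) {

            // Create a partitioned table stored as ORC.
            stmt.execute("CREATE TABLE IF NOT EXISTS demo_logs (ip STRING, url STRING) "
                    + "PARTITIONED BY (dt STRING) STORED AS ORC");

            // List the tables in the current database.
            try (ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}
```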
Pig
- Installation
 - Different types of executions
 - Grunt Shell
 - Pig Latin commands (see the embedded example after this list)
 - Data processing cycle
 - Schema-on-read tools
 - Map, Bag, and Tuple schemas
 - Loading and Storing
 - Filtering
 - Grouping and Joining
 - Debugging commands
 - Validations and types of casting in Pig
 - Working with Functions
 - User Defined Functions
 - Splits and Multi query execution
 - Error handling, flatten and order by
 - Parameter Substitution
 - Nested For Each
 - User Defined Functions, Dynamic Invokers, and Macros
 - How to access HBase using Pig
 - Pig JSON loader example
 - Piggy Bank
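
Pig Latin can also be run embedded in Java through the PigServer API, as in this local-mode word-count sketch; the input file name and field layout are illustrative.

```java
import java.util.Iterator;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

public class EmbeddedPigDemo {
    public static void main(String[] args) throws Exception {
        // Local mode; 'input.txt' is a placeholder file of plain text lines.
        PigServer pig = new PigServer(ExecType.LOCAL);
        pig.registerQuery("lines = LOAD 'input.txt' AS (line:chararray);");
        pig.registerQuery("words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;");
        pig.registerQuery("grouped = GROUP words BY word;");
        pig.registerQuery("counts = FOREACH grouped GENERATE group, COUNT(words);");

        // Execute the plan and print each (word, count) tuple.
        Iterator<Tuple> it = pig.openIterator("counts");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        pig.shutdown();
    }
}
```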
Sqoop
- Installation
 - Import Data (see the sketch after this list)
 - Incremental Import
 - Free Form Query Import
 - Export data to HBase, Hive, and RDBMS
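
Sqoop is normally driven from the command line, but the same arguments can be handed to the Sqoop class from Java. Below is a sketch of an incremental import; the JDBC URL, credentials, table, and column names are all hypothetical.

```java
import org.apache.sqoop.Sqoop;

public class SqoopImportDemo {
    public static void main(String[] args) {
        // Equivalent to running `sqoop import ...` on the command line;
        // every connection detail below is a placeholder.
        String[] importArgs = {
            "import",
            "--connect", "jdbc:mysql://dbhost:3306/shop",
            "--username", "sqoop_user",
            "--password-file", "/user/sqoop/.password",
            "--table", "orders",
            "--target-dir", "/data/orders",
            "--incremental", "append",   // incremental import
            "--check-column", "order_id",
            "--last-value", "0",
            "-m", "4"                    // four parallel mappers
        };
        int exitCode = Sqoop.runTool(importArgs);
        System.exit(exitCode);
    }
}
```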
HCatalog
- Installation
 - Overview of HCatalog
 - Using HCatalog with Map Reduce, Hive, and Pig
 - Hands-on Exercises
Flume
- Installation
 - Introduction to Flume
 - Flume Agents like Sources, Channels, and Sinks
 - Logging user information into HDFS and HBase from a Java program
 - Flume commands
More Ecosystems
- HUE

Oozie
- Workflow Schedulers, Coordinators, and Bundles.
 - Workflows that schedule Sqoop, Hive, Pig, and Map Reduce jobs (submission sketched after this list)
 - ZooKeeper
 - HBase integration with Hive and Pig
 - Phoenix
 - Proof of concept
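
A sketch of submitting a workflow with the Oozie Java client, assuming an Oozie server at the given URL and a workflow application already deployed at the given HDFS path; every host name and path here is hypothetical.

```java
import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

public class OozieSubmitDemo {
    public static void main(String[] args) throws Exception {
        // Oozie server URL and HDFS application path are placeholders.
        OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie");

        Properties conf = oozie.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/apps/demo-wf");
        conf.setProperty("nameNode", "hdfs://namenode:8020");
        conf.setProperty("jobTracker", "yarn-rm:8032");

        String jobId = oozie.run(conf); // submit and start the workflow
        System.out.println("Workflow job submitted: " + jobId);

        WorkflowJob job = oozie.getJobInfo(jobId);
        System.out.println("Status: " + job.getStatus());
    }
}
```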
Spark
- Introduction
 - Linking with Spark
 - Initializing Spark
 - Using the Shell
 - Resilient Distributed Datasets
 - Parallelized Collections
 - External Datasets
 - RDD Operations (see the sketch after this list)
 - Basics, Passing Functions to Spark
 - Working with Key-Value Pairs
 - Transformations
 - Actions
 - RDD Persistence
 - Which Storage Level to Choose?
 - Removing Data
 - Shared Variables
 - Broadcast Variables
 - Accumulators
 - Deploying to a Cluster
 - Unit Testing
 - Migrating from pre-1.0 versions of Spark
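
A minimal Java sketch tying several of the topics above together (parallelized collections, lazy transformations, actions), run in local mode.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkRddDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("rdd-demo").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {

            // A parallelized collection becomes a Resilient Distributed Dataset.
            List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
            JavaRDD<Integer> rdd = sc.parallelize(data);

            // Transformations are lazy; nothing runs until an action is called.
            JavaRDD<Integer> squares = rdd.map(x -> x * x);

            // reduce() is an action: it triggers the job and returns a value.
            int sum = squares.reduce(Integer::sum);
            System.out.println("sum of squares = " + sum); // 55
        }
    }
}
```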