YARN Architecture





Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology and one of the key features introduced in the second-generation Hadoop 2.0 version. It is part of the Apache Software Foundation's open-source distributed processing framework.

The idea of YARN is to split up resource management and job scheduling/monitoring into separate daemons: a global Resource Manager (RM) and a per-application Application Master (AM). An application is either a single job or a DAG of jobs.


Resource Manager



The Resource Manager and the Node Manager form the data-computation framework. The Resource Manager arbitrates resources among all the applications in the system. The Node Manager is the per-machine agent that handles containers, monitors their resource usage (CPU, memory, disk, network), and reports it to the Resource Manager/Scheduler.


Application Master



The per-application Application Master is, in effect, a framework-specific library tasked with negotiating resources from the Resource Manager and working with the Node Manager(s) to execute and monitor the tasks.




The Resource Manager has two main components: Scheduler and Applications Manager.




The Scheduler allocates resources to the various running applications subject to familiar constraints of capacities, queues, etc. It is a pure scheduler in the sense that it performs no monitoring or tracking of application status, and it offers no guarantees about restarting tasks that fail either due to application failure or hardware failures. The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so based on the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, and network.

The Scheduler also handles partitioning the cluster resources among the various queues, applications, etc. The current schedulers, such as the Capacity Scheduler and the Fair Scheduler, are examples of scheduling policies.
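The resource Container abstraction above can be pictured with a short sketch. The class and method names below are illustrative only, not YARN's actual org.apache.hadoop.yarn API; the sketch reduces a container request to memory and virtual cores and shows the fit check a scheduler conceptually performs against a node's free capacity.

```java
// Illustrative sketch of YARN's abstract resource Container notion
// (memory + virtual cores only). Names are hypothetical, not YARN's API.
public class ContainerSketch {
    static final class Resource {
        final int memoryMb;
        final int vcores;
        Resource(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
        // A request "fits" on a node if the node has at least as much
        // of every resource dimension as the request asks for.
        boolean fits(Resource available) {
            return memoryMb <= available.memoryMb && vcores <= available.vcores;
        }
    }

    // Can this node satisfy the container request?
    public static boolean canAllocate(int reqMemMb, int reqVcores,
                                      int freeMemMb, int freeVcores) {
        return new Resource(reqMemMb, reqVcores)
                .fits(new Resource(freeMemMb, freeVcores));
    }

    public static void main(String[] args) {
        System.out.println(canAllocate(1024, 1, 8192, 4)); // small request fits
        System.out.println(canAllocate(4096, 8, 8192, 4)); // too many vcores
    }
}
```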


Applications Manager


The Applications Manager is responsible for accepting job submissions and negotiating the first container for executing the application-specific Application Master, and it provides the service for restarting the Application Master container on failure. The per-application Application Master then has the responsibility of negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring progress.

MapReduce in hadoop-2.x maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of YARN with just a recompile.


Hadoop YARN Training Institute in Hyderabad Kphb



Learn Hadoop from certified Hadoop trainers. Advanced Hadoop training from real-time experts in Hyderabad. Kosmik Technologies provides online Hadoop YARN training in Hyderabad, KPHB.




Hive Architecture



Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis. Hive was developed by Facebook, and it is now used and developed by other companies such as Netflix.


Major Components of Hive





The UI (User Interface) is the interface through which users submit queries and other operations to the system.


The Driver receives the queries from the UI. This component handles sessions and provides execute and fetch APIs modeled on JDBC/ODBC interfaces.



The compiler is the component that parses the query. It does semantic analysis on the different query blocks and query expressions, and it generates an execution plan with the help of the table and partition metadata looked up from the Metastore.



The Metastore is the component that stores all the structure information of the various tables and partitions in the warehouse, including column and column-type information, the serializers and deserializers necessary to read and write data, and the corresponding HDFS files where the data is stored.

Execution Engine

The execution engine is the component which executes the execution plan created by the compiler. The plan is a DAG of stages. The execution engine manages the dependencies between these stages and executes them on the appropriate system components.
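The "DAG of stages" idea can be sketched in a few lines of Java. This is not Hive's actual execution engine; it is a generic topological ordering (Kahn's algorithm) that shows how an engine can run each stage only after the stages it depends on have completed. The stage names and the plan map are made up for illustration.

```java
import java.util.*;

// Sketch: run a DAG of stages in dependency order (Kahn's algorithm).
// A stage becomes runnable only once all its prerequisite stages finish.
public class StagePlan {
    // edges: stage -> stages that depend on it
    public static List<String> executionOrder(Map<String, List<String>> edges) {
        Map<String, Integer> indegree = new HashMap<>();
        for (String s : edges.keySet()) indegree.putIfAbsent(s, 0);
        for (List<String> outs : edges.values())
            for (String t : outs) indegree.merge(t, 1, Integer::sum);

        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : indegree.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String s = ready.poll();
            order.add(s); // "execute" the stage here
            for (String t : edges.getOrDefault(s, List.of()))
                if (indegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        plan.put("scan", List.of("mapjoin"));   // hypothetical stage names
        plan.put("mapjoin", List.of("reduce"));
        plan.put("reduce", List.of());
        System.out.println(executionOrder(plan)); // [scan, mapjoin, reduce]
    }
}
```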


Hadoop Hive Architecture





Step 1: 


The UI calls the execute interface of the Driver.

Step 2: 


The Driver creates a session handle for the query and sends the query to the compiler to generate an execution plan.

Step 3&4: 


The compiler needs the metadata, so it sends a request for the metadata to the Metastore and receives the metadata in response.

Step 5:


This metadata is used to type-check the expressions in the query tree and to prune partitions based on query predicates. The plan generated by the compiler is a DAG of stages, with each stage being either a map/reduce job, a metadata operation, or an operation on HDFS.

Step 6: 


The execution engine submits these stages to the appropriate components. In each task, the deserializer associated with the table or intermediate outputs is used to read the rows from the HDFS files.

Step 7&8&9:


The contents of the temporary file are read by the execution engine directly from HDFS as part of the fetch call from the Driver.


Hive Training centers in Hyderabad Kphb:

Learn Hadoop from certified Hadoop trainers. Advanced Hadoop training from real-time experts in Hyderabad. Kosmik Technologies provides online Hive training in Hyderabad, KPHB.


AutoIT with Selenium | Selenium Coaching in Kukatpally



Selenium is an open-source tool designed to automate web-based applications. But to handle Windows GUIs and non-HTML popups in an application, AutoIT is a must, as Selenium does not handle Windows-based activity.

AutoIt v3 is freeware. It uses a combination of mouse movement, keystrokes, and window/control manipulation to automate tasks that are not possible with Selenium WebDriver.

AutoIt Features


1) Easy to learn

2) Simulates keystrokes

3) Simulates mouse movements

4) Scripts can be compiled into standalone executables

5) Windows Management

6) Windows Controls

7) Detailed help file and large community

In short, any Windows, mouse, or keystroke simulation which we cannot handle with Selenium can be handled with AutoIt. All we need to do is use the script generated with the help of the AutoIt tool in Selenium.

Moving ahead, we will learn how to upload a file in Selenium WebDriver using AutoIT. We need three tools for this:

  • Selenium Web driver
  • AutoIT editor and element identifier
  • The window that you want to automate


How to Download and Install AutoIT


1): Go to this link.

2): Hover on the ‘Autoit’ and ‘Autoit Editor’ dropdowns.




 3) Click ‘AutoIT’ Downloads option.




4): Download “Autoit” by clicking on the ‘Download Autoit’ button.




5): Now download “Autoit editor” by clicking on ‘Downloads’ button.




6): Click on the link as displayed below.




After the download you will get two setup files as shown in the below screen; the first one is the AutoIt version 3 setup and the second one is SciTE4AutoIt3.




7): To install AutoIT, click on both AutoIT setup files one by one.


8): After successful installation – open up AutoIT Editor.




Go to ‘C:\Program Files


And click on the ‘SciTE.exe’ file; the AutoIT editor opens as shown in the below screen.




9): Now open the Element Identifier.


Go to ‘C:\Program Files (x86)\AutoIt3’






And click on the ‘Au3Info.exe’ file; the element identifier opens as shown in the below screen.



Importance of Hadoop EcoSystems | Hadoop Training in Hyderabad KPHB



The Hadoop ecosystem is quite interesting to envision, including how we could adopt it within the realms of DevOps. Hadoop, managed by the Apache Foundation, is a powerful open-source platform written in Java that is capable of processing large amounts of heterogeneous data sets. It is designed to scale up from a single server to thousands of machines, each offering local computation and storage, and has become an in-demand technical skill. Hadoop is an Apache top-level project built and used by a global community of contributors and users.

The following sections provide information on most popular components:

Map Reduce: A framework for easily writing applications which process big amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. In a program, there are two functions which are most common in MapReduce.

Map Task: The input is divided into smaller parts and distributed to worker nodes. All worker nodes solve their own small problem and give the answer to the master node.

Reduce Task: The master node combines all the answers coming from the worker nodes and forms some output, which is the answer to our big distributed problem.
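The two tasks described above can be simulated in plain Java. This toy word count is not Hadoop code; real Hadoop distributes both phases across nodes. The sketch just shows the data flow: the map step emits (word, 1) pairs per line, and the reduce step sums them per word.

```java
import java.util.*;

// Toy, single-process simulation of the MapReduce word-count data flow.
public class WordCountSketch {
    // Map phase: emit (word, 1) for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+"))
            if (!word.isEmpty())
                pairs.add(Map.entry(word, 1));
        return pairs;
    }

    // Reduce phase: sum the values for each key.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs)
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        return counts;
    }

    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) all.addAll(map(line)); // map each "split"
        return reduce(all);                              // combine the answers
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("big data big cluster", "data")));
        // {big=2, cluster=1, data=2}
    }
}
```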

HDFS: HDFS is a distributed file system that provides high-throughput access to data. When data is pushed to HDFS, it is split into blocks and stored and replicated across the cluster.


Here are the main components of HDFS


Name Node: It maintains the name system (directories and files) and manages the blocks which are present on the Data Nodes.

Data Nodes: They are the slaves which are deployed on each machine and provide the actual storage. They are responsible for serving read and write requests from clients.

Secondary Name Node: It is responsible for performing periodic checkpoints. In the event of Name Node failure, you can restart the Name Node using the checkpoint.

Hive: Hive is part of the Hadoop ecosystem and provides an SQL like interface to Hadoop.

The main building blocks of Hive are

Metastore: To store the metadata about columns, partitions, and the system catalogue.

Driver: To manage the lifecycle of a HiveQL statement

Query Compiler: To compile HiveQL into a directed acyclic graph.

Execution Engine: To execute the tasks produced by the compiler in proper order.

Hive Server: To provide a Thrift interface and a JDBC / ODBC server.

HBase: As said earlier, HDFS works on a write-once-read-many pattern, but this isn't always the case. HBase is built on top of HDFS and is a distributed, column-oriented database.


Here are the main components of HBase:


HBase Master: It is responsible for load balancing across all Region Servers and maintains the state of the cluster. It is not part of the actual data storage or retrieval path.



Region Server: It is deployed on each machine, hosts data, and processes I/O requests.


Sqoop Architecture | Hadoop Training Institute in KPHB Hyderabad


What is Sqoop?


Sqoop is a tool designed for efficiently transferring bulk data between Hadoop and external data stores such as relational databases and enterprise data warehouses.


It is used to import data from external data stores into the Hadoop Distributed File System and related systems like Hive and HBase. Similarly, Sqoop can also be used to extract data from Hadoop and export it to external data stores such as relational databases and enterprise data warehouses. It works with relational databases such as Teradata, Netezza, Oracle, MySQL, Postgres, etc.


Salient features include:

Full Load

Incremental Load

Parallel import/export

Import results of SQL query


Connectors for all major RDBMS Databases

Kerberos Security Integration

Load data direct into Hive/Hbase

Support for Accumulo


Sqoop Architecture


Sqoop provides a command-line interface to the end users; it can also be accessed using Java APIs. A Sqoop command submitted by the end user is parsed by Sqoop, which launches a Hadoop map-only job to import or export the data. Sqoop just imports and exports the data; it does not do any aggregations.


Sqoop parses the arguments provided in the command line and prepares the map job. The map job launches multiple mappers depending on the number defined by the user in the command line. For a Sqoop import, each mapper task is assigned a part of the data to import based on a key. Sqoop distributes the input data equally among the mappers to get high performance. Each mapper then creates a connection with the database using JDBC, fetches the part of data assigned by Sqoop, and writes it into HDFS based on the options provided in the command line.
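The way an import is divided among mappers can be sketched as range splitting over a numeric split-by key. The method below is an illustration only, not Sqoop's actual split logic (which lives in Sqoop's own code and also handles text and date keys): it divides the key range [min, max] into roughly equal sub-ranges, one per mapper.

```java
import java.util.*;

// Sketch of Sqoop-style parallel import: divide a numeric split-by key
// range [min, max] into roughly equal sub-ranges, one per mapper.
public class SplitSketch {
    // Returns range boundaries; mapper i covers [bounds[i], bounds[i+1]),
    // and the last mapper additionally includes max itself.
    public static long[] splitBounds(long min, long max, int numMappers) {
        long[] bounds = new long[numMappers + 1];
        double step = (double) (max - min) / numMappers;
        for (int i = 0; i < numMappers; i++)
            bounds[i] = min + Math.round(i * step);
        bounds[numMappers] = max;
        return bounds;
    }

    public static void main(String[] args) {
        // e.g. primary-key ids 1..100 split across 4 mappers
        System.out.println(Arrays.toString(splitBounds(1, 100, 4)));
    }
}
```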


Sqoop is a client-side tool that is tightly coupled with the Hadoop cluster. When a Sqoop command is initiated, the client fetches the metadata of tables, columns, and data types through the connectors. The import or export is translated into a map-only job program that loads the data in parallel between the databases and Hadoop. Clients should have the appropriate connector and driver for the execution of the process.


Report Parameters




Reporting Services uses report parameters to control report data, connect related reports together, and vary report presentation. You can use report parameters in paginated reports that you create in Report Builder and Report Designer, and also in mobile reports you create in SQL Server Mobile Report Publisher.


Control Paginated and Mobile Report Data



  • Filter paginated report data by writing dataset queries that contain variables.
  • Filter data from a shared dataset that you add to a paginated report; the dataset query itself does not change.
  • Create mobile reports in SQL Server Mobile Report Publisher for more information.
  • Enable users to specify values to customize the data in a paginated report.


Connect Related Reports


  • Use parameters to relate main reports to drill through reports to sub reports


Vary Report Presentation


  • Send commands to a report server to customize the rendering of a report, or create mobile reports in SQL Server Mobile Report Publisher.
  • Provide a Boolean parameter to specify whether to expand or collapse all nested row groups in a table.
  • Enable users to customize report data and appearance by including parameters in an expression.



Viewing a Report with Parameters



  1.  The report viewer toolbar displays a prompt and default value for each parameter.
  2.  The prompt Select the Date appears next to the text box.
  3. The parameter @Show All is data type Boolean. Use the radio buttons to specify True or False.
  4.  On the report viewer toolbar, click this arrow to show or hide the parameters pane.
  5.  The parameter @Category Quota is data type Float, so it takes a numeric value.
  6.  If all parameters have default values, the report runs on the first view.


MSBI Training Institute in Hyderabad Kphb:

Get the best online MSBI training in-depth from certified faculty. We offer classroom and online courses for MSBI. Kosmik Technologies provides MSBI training in Hyderabad, KPHB.

Creating Keyword & Frameworks with Selenium | Selenium testing KPHB


Creating Keyword & Hybrid Frameworks

Frameworks help to structure our code and make maintenance easy. Without frameworks we would place all our code and data in one place, which is neither reusable nor readable. Using frameworks produces beneficial outcomes like increased code reuse, higher portability, and reduced script maintenance cost.

There are three types of frameworks created with Selenium WebDriver to automate manual test cases:

Data Driven Test Framework

Keyword Driven Test Framework

Hybrid Test Framework


Data Driven Test Framework


In a data-driven framework, test data is produced from some external file, like Excel, CSV, XML, or some database table.
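As a minimal sketch of the data-driven idea, the Java below keeps the test data outside the test logic, with CSV text standing in for the Excel/CSV file. A real framework would read a .xlsx via Apache POI and drive real Selenium steps; everything here (the class name, the login stub) is hypothetical and only shows the pattern of one test run per data row.

```java
import java.util.*;

// Data-driven sketch: test data lives outside the test logic, and the
// same test runs once per data row. CSV text stands in for an Excel file.
public class DataDrivenSketch {
    // Parse "user,password" rows into test-data records.
    public static List<String[]> parseRows(String csv) {
        List<String[]> rows = new ArrayList<>();
        for (String line : csv.split("\n")) {
            line = line.trim();
            if (!line.isEmpty()) rows.add(line.split(","));
        }
        return rows;
    }

    // The "test" under data-driven control: a stub that reports which
    // credentials it would type into a login form.
    public static String loginAttempt(String user, String password) {
        return "login(" + user + ", " + password + ")";
    }

    public static void main(String[] args) {
        String csv = "kosmik,pass123\nadmin,secret"; // stand-in test data
        for (String[] row : parseRows(csv))
            System.out.println(loginAttempt(row[0], row[1]));
    }
}
```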


Keyword Driven Test Framework


In a keyword-driven test framework, all the operations and instructions are recorded in some external file, like an Excel worksheet. Here is how the complete framework looks:




As you can see, it is a framework of 5 steps. Let's go through it step by step in depth.


Step 1) The driver script Execute.java calls ReadKosmikExcelFile.java.


ReadKosmikExcelFile.java has POI scripts to read data from an Excel sheet.


Step 2) ReadKosmikExcelFile.java should read data from TestCase.xlsx


Here is how the sheet looks:




According to the keywords written in the Excel file, the framework performs the operation on the UI.


For example, say we want to click a button ‘Login’. Correspondingly, our Excel sheet should have a keyword ‘Click’. Now, the AUT may have hundreds of buttons on a page; to identify the Login button, in the Excel sheet we input the Object Name as loginButton and the Object Type as name. The Object Type could be xpath, name, css, or any other locator value.


Step 3) ReadKosmikExcelFile.java passes this data to the driver script Execute.java.


Step 4) For all our UI web elements we need to create an object repository where we place their element locators (like xpath, name, css path, class name, etc.).




Execute.java reads the entire object repository and stores it in a variable.


To read this object repository we need a ReadObject class, which has a getObjectRepository method to read it.




Step 5) The driver passes the data from the Excel sheet and the object repository to the UIOperation class.


The UIOperation class has functions to perform actions corresponding to keywords like CLICK, SETTEXT, etc., mentioned in the Excel sheet.


The UIOperation class is a Java class which has the actual implementation of the code to perform operations on web elements.
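The keyword-to-action dispatch that UIOperation performs can be sketched as a map from keywords to actions. The code below is a self-contained illustration, not the actual framework's class: real actions would call Selenium WebDriver (e.g. driver.findElement(...).click()), whereas these just record what they would do, so the dispatch logic stays testable without a browser.

```java
import java.util.*;
import java.util.function.BiConsumer;

// Sketch of the UIOperation idea: each keyword from the Excel sheet maps
// to an action. Actions record what they would do instead of driving a
// real browser, so the dispatch logic is self-contained.
public class KeywordEngine {
    private final List<String> log = new ArrayList<>();
    private final Map<String, BiConsumer<String, String>> actions = new HashMap<>();

    public KeywordEngine() {
        // keyword -> (locator, data) action
        actions.put("CLICK",   (locator, data) -> log.add("click " + locator));
        actions.put("SETTEXT", (locator, data) -> log.add("type '" + data + "' into " + locator));
    }

    // One row of the keyword sheet: keyword, object locator, optional data.
    public void perform(String keyword, String locator, String data) {
        BiConsumer<String, String> action = actions.get(keyword.toUpperCase());
        if (action == null)
            throw new IllegalArgumentException("Unknown keyword: " + keyword);
        action.accept(locator, data);
    }

    public List<String> getLog() { return log; }

    public static void main(String[] args) {
        KeywordEngine engine = new KeywordEngine();
        engine.perform("SETTEXT", "userName", "kosmik");
        engine.perform("CLICK", "loginButton", "");
        System.out.println(engine.getLog());
    }
}
```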




The complete project should look like:




Analysis Services Scripting Language



The Analysis Services Scripting Language is an extension to XMLA that adds an object definition language and a command language for creating and managing Analysis Services structures on the server. Custom applications can use it to communicate with Analysis Services over the XMLA protocol.

The Analysis Services Scripting Language DDL defines the structure of Analysis Services objects, such as cubes, dimensions, and mining models, and the bindings of Analysis Services objects to data sources. The DDL also persists the definition of Analysis Services objects. Applications use the DDL to create, alter, deploy, and describe Analysis Services objects.

Usage Scenarios




A developer designs a set of cubes by using the Development Studio design tools and saves the definition as part of a project. The developer is not confined to using the design tools, but can also open the cube definition files to edit the XML directly, which uses the format described in this section.




A database administrator (DBA) uses SQL Server Management Studio to edit XML to create and alter Analysis Services objects, in the same way the DBA uses SQL Server DDL to create and alter Microsoft SQL Server objects.




The schema specification uses the XML namespace.




The object definition language is defined as an XML Schema definition language (XSD) schema; the definitions of the schema elements describe how the Analysis Services object definition language works.




Extensibility of the object definition language schema is provided by means of an Annotation element that is included in all objects. This element can contain valid XML from any XML namespace, with the following restrictions:

  • The XML can contain only elements.
  • Each element must have a unique name; the value of Name should reference the target namespace.

The contents of the Annotation tag can be exposed as a set of Name/Value pairs.

Comments and white space within the Annotation tag that are not enclosed within a child element may not be preserved. All elements must be read-write; read-only elements are ignored.

The object definition language schema is closed, in that the server does not allow substitution of derived types for elements defined in the schema. So the server accepts only the set of elements defined here, and no other elements or attributes. Unknown elements cause the Analysis Services engine to raise an error.


SSAS Online Training Hyderabad :


Learn MSBI from certified MSBI trainers. Advanced MSBI training from real-time experts in Hyderabad. Kosmik Technologies provides online SSAS training in Hyderabad.


Tableau Data Joining & Data Blending | Tableau Training institute in KPHB



Data Joining


Data joining is a common need in any data analysis. We may need to join data from many sources, or join data from different tables in a single source. Tableau provides the facility to join tables by using the data pane, available under the Data menu in Edit Data Source.

Creating a Join

Let's consider the data source Sample Superstore to create a join between the Orders and Returns tables. Go to the Data menu and choose the option Edit Data Source, then drag the two tables, Orders and Returns, to the data pane.

The below diagram displays the creation of an inner join between Orders and Returns.



Editing Join Type

The type of join which Tableau creates automatically can be changed manually. Click on the middle of the two circles showing the join; a popup window appears which displays the types of joins available.

In the below diagram we see the inner and left outer joins as the available joins.
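The difference between the two join types can be illustrated with a toy in-memory example (this is not Tableau code; the row data and method names are made up): an inner join keeps only orders that have a matching return, while a left outer join keeps every order, with or without a match.

```java
import java.util.*;

// Toy illustration of inner vs. left outer join semantics, using
// in-memory rows keyed by a hypothetical Order ID.
public class JoinSketch {
    // Inner join: keep only orders that have a matching return.
    public static List<String> innerJoin(Map<Integer, String> orders,
                                         Set<Integer> returns) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, String> o : orders.entrySet())
            if (returns.contains(o.getKey()))
                out.add(o.getValue() + " [returned]");
        return out;
    }

    // Left outer join: keep every order, marking matches where they exist.
    public static List<String> leftOuterJoin(Map<Integer, String> orders,
                                             Set<Integer> returns) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, String> o : orders.entrySet())
            out.add(o.getValue() + (returns.contains(o.getKey())
                    ? " [returned]" : " [no return]"));
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, String> orders = new LinkedHashMap<>();
        orders.put(1, "order-1");
        orders.put(2, "order-2");
        Set<Integer> returns = Set.of(2);
        System.out.println(innerJoin(orders, returns));     // only order-2
        System.out.println(leftOuterJoin(orders, returns)); // both orders
    }
}
```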



Editing Join Fields

We can change the fields forming the join condition by clicking on the field names available in the join popup window. While selecting the field we can also search for the field we are looking for, using a search text box.




Data Blending

Data blending is a powerful feature in Tableau. It is used when there is related data in multiple data sources that we want to analyze together in a single view. As an example, suppose the actual sales data is in a relational database and the target sales data is in another source; to compare actual sales to target sales, we can blend the data based on common dimensions to get access to the Sales Target measure. The two sources involved in data blending are referred to as the primary and secondary data sources.

Preparing Data for Blending

Tableau comes with inbuilt data sources, such as Sample Superstore and the sample coffee chain database (an .mdb file), which we will use to illustrate data blending. Let's first load the sample coffee chain into Tableau and look at its metadata. Go to the Data menu, choose New Data Source, and browse for the sample coffee chain file, which is an MS Access database file. The below diagram displays the different tables and joins available in the file.



Adding Secondary Data Source

Next we add the secondary data source by again going to Data, choosing New Data Source, and selecting this data source. Both the data sources now appear in the Data window as shown below.



Blending the Data

Now we can blend the data from both the above sources based on a common dimension. Note that a small chain image appears next to the dimension named State; this indicates the common dimension between the two data sources.


We select the bullet chart option from Show Me to get the bullet chart shown below. It shows how the profit ratio varies for each state in both the superstore and the coffee chain shops.



Data Mining Extensions




Data Mining Extensions (DMX) is a query language for data mining models supported by Microsoft's SQL Server Analysis Services product. DMX is used to create and train data mining models, and to browse, manage, and predict against them.


Microsoft OLE DB for Data Mining Specification



The data mining features in Analysis Services follow the Microsoft OLE DB for Data Mining specification.

The Microsoft OLE DB for Data Mining specification defines the following

  • A structure to hold the information that defines a data mining model.
  • A language for creating and working with data mining models.

The specification defines the basis of data mining: the data mining model virtual object. The data mining model object encapsulates all that is known about a particular mining model. It is structured like an SQL table, with columns, data types, and meta-information that define the structure of the data mining model. This structure lets you use the DMX language, which is an extension of SQL, to create and work with models.


DMX Statements


The DMX statements can be used to create, process, delete, copy, browse, and predict against data mining models. There are two types of statements in DMX:


  • Data Definition Statements
  • Data Manipulation Statements

The sections below cover both, together with DMX query fundamentals.



The Data Definition Statements


The data definition statements in DMX are used to create and define new mining structures and models, to import and export mining models and mining structures, and to drop existing models from a database. Data definition statements in DMX are part of the data definition language (DDL). With them you can:

  • Create new data mining models and mining structures – CREATE MINING STRUCTURE, CREATE MINING MODEL
  • Delete existing data mining models and mining structures – DROP MINING STRUCTURE, DROP MINING MODEL
  • Export and import mining structures – EXPORT, IMPORT
  • Copy data from one mining model to another – SELECT INTO


Data Manipulation Statements


The data manipulation statements in DMX are used to work with existing mining models: to browse the models and to create predictions against them. Data manipulation statements in DMX are part of the data manipulation language (DML). With them you can:

  • Train mining models – INSERT INTO
  • Browse data in mining models – SELECT FROM
  • Make predictions using a mining model – SELECT FROM PREDICTION JOIN



DMX Query Fundamentals



The SELECT statement is the basis for most DMX queries. Depending on the clauses that you use with such statements, you can browse, copy, or predict against mining models. The prediction query uses a form of SELECT to create predictions based on existing mining models.


 Data Mining Training in Hyderabad Kphb:


Learn MSBI from certified MSBI trainers. Advanced MSBI training from real-time experts in Hyderabad. Kosmik Technologies provides online Data Mining training in Hyderabad, KPHB.