YARN Architecture

 


 

 

Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology and one of the key features of second-generation Hadoop 2.0. It is part of the Apache Software Foundation's open source distributed processing framework.

The idea of YARN is to split resource management and job scheduling/monitoring into separate daemons: a global Resource Manager (RM) and a per-application Application Master (AM). An application is either a single job or a DAG of jobs.

 

Resource Manager

 

 

The Resource Manager and the Node Manager form the data-computation framework. The Resource Manager arbitrates resources among all the applications in the system. The Node Manager is the per-machine agent that manages containers, monitors their resource usage (CPU, memory, disk, network), and reports it to the Resource Manager/Scheduler.

 

Application Master

 

 

The per-application Application Master is, in effect, a framework-specific library tasked with negotiating resources from the Resource Manager and working with the Node Manager(s) to execute and monitor the tasks.

 

 

 

The Resource Manager has two main components: Scheduler and Applications Manager.

 

Scheduler

 

The Scheduler allocates resources to the various running applications subject to familiar constraints of capacities, queues, etc. It is a pure scheduler in the sense that it performs no monitoring or tracking of application status, and it offers no guarantees about restarting failed tasks, whether they fail due to application errors or hardware failures. The Scheduler performs its scheduling function based on the resource requirements of the applications, using the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, and network.

The Scheduler has a pluggable policy for partitioning the cluster resources among the various queues, applications, etc. The current schedulers include the Capacity Scheduler and the Fair Scheduler.
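To make the Container idea concrete, here is a minimal Java sketch, assuming the Hadoop YARN client libraries (AMRMClient and friends) are on the classpath, of how an Application Master could ask the Scheduler for a container of a given size. It is an illustration only, not the full Application Master protocol; the memory and vcore values are placeholders.

```java
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerRequestSketch {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
        amClient.init(new YarnConfiguration());
        amClient.start();
        // (In a real Application Master you would first call registerApplicationMaster().)

        // A Container is an abstract bundle of resources: here 1024 MB of memory and 1 vcore.
        Resource capability = Resource.newInstance(1024, 1);
        Priority priority = Priority.newInstance(0);

        // No locality constraints (any node, any rack) in this minimal example.
        ContainerRequest request = new ContainerRequest(capability, null, null, priority);
        amClient.addContainerRequest(request);

        // A real Application Master would now call allocate() in a heartbeat loop
        // and launch its tasks in the containers the Scheduler grants.
        amClient.stop();
    }
}
```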

 

Applications Manager

 

The Applications Manager is responsible for accepting job submissions, negotiating the first container for executing the application-specific Application Master, and providing the service for restarting the Application Master container on failure. The per-application Application Master is responsible for negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring progress.

MapReduce in hadoop-2.x maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of YARN with just a recompile.
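As a minimal illustration of that compatibility, the driver below uses only the standard org.apache.hadoop.mapreduce API (with the library TokenCounterMapper and IntSumReducer classes) and is submitted to YARN exactly as it would have been on hadoop-1.x. The class name and input/output paths are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenCounterMapper.class);   // library mapper: emits (word, 1)
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```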

 

Hadoop YARN Training Institute in Hyderabad Kphb

 

 

Learn Hadoop from certified Hadoop trainers. Get advanced Hadoop training from real-time experts in Hyderabad. kosmik technologies provides online Hadoop YARN training in Hyderabad, KPHB.

 

 

 

Hive Architecture


 

Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis. Hive was developed by Facebook and is now used and developed by other companies such as Netflix.

 

Major Components of Hive

 

 

 UI:

 

The UI (User Interface) is where users submit queries and other operations to the system.

Driver:

The Driver receives the queries from the UI. This component handles sessions and provides execute and fetch APIs modeled on JDBC/ODBC interfaces.

Compiler:  

 

The component that parses the query, does semantic analysis on the different query blocks and query expressions, and generates an execution plan using the table and partition metadata looked up from the MetaStore.

MetaStore:

 

The component that stores all the structure information of the tables and partitions in the warehouse, including column and column type information, the serializers and deserializers needed to read and write data, and the HDFS locations where the data is stored.

Execution Engine:

The component that executes the execution plan created by the compiler. The plan is a DAG of stages; the execution engine manages the dependencies between these stages and executes them on the appropriate system components.

 

Hadoop Hive Architecture

 

[Figure: Hive architecture diagram]

 

 

Step 1: 

 

The UI calls the execute interface on the Driver.

Step 2: 

 

The Driver creates a session handle for the query and sends the query to the compiler to generate an execution plan.

Step 3&4: 

 

The compiler needs the metadata, so it sends a request to the MetaStore and receives the metadata in response.

Step 5:

 

This metadata is used to type-check the expressions in the query tree and to prune partitions based on query predicates. The plan generated by the compiler is a DAG of stages, with each stage being either a map/reduce job, a metadata operation, or an operation on HDFS.

Step 6: 

 

The execution engine submits these stages to the appropriate components. In each map/reduce task, the deserializer associated with the table or intermediate outputs is used to read the rows from HDFS files.

Step 7&8&9:

 

The contents of the temporary file are read by the execution engine directly from HDFS as part of the fetch call from the Driver.
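The walkthrough above describes what happens on the server side; a client typically just submits HiveQL through HiveServer2's JDBC interface. The sketch below assumes a HiveServer2 instance on localhost:10000 and a hypothetical sales table, so the host, credentials, and query are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // The Driver, compiler, and execution engine described above run server-side;
        // the client only sees the JDBC connection and result set.
        try (Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT category, COUNT(*) FROM sales GROUP BY category")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```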

 

Hive Training centers in Hyderabad Kphb:

Learn Hadoop from certified Hadoop trainers. Get advanced Hadoop training from real-time experts in Hyderabad. kosmik technologies provides online Hive training in Hyderabad, KPHB.

 

AutoIT with Selenium | Selenium Coaching in Kukatpally


 

Selenium is an open source tool designed to automate web-based applications. But to handle Windows GUI and non-HTML popups in an application, AutoIT is a must, because Selenium cannot handle Windows-based activity.

AutoIt v3 is freeware. It uses a combination of mouse movements, keystrokes, and window control manipulation to automate tasks that are not possible with Selenium WebDriver.

AutoIt Features

 

1) Easy to learn

2) Simulates keystrokes

3) Simulates mouse movements

4) Scripts can be compiled into standalone executables

5) Windows Management

6) Windows Controls

7) Detailed help file and large community

In short, any window, mouse, or keystroke simulation that we cannot handle with Selenium can be handled with AutoIt. All we need to do is call, from Selenium, the script generated with the help of the AutoIt tool.

Moving ahead, we will learn how to upload a file in Selenium WebDriver using AutoIT. We need three tools for this (a sketch of the Selenium side appears after the list).

  • Selenium Web driver
  • AutoIT editor and element identifier
  • The window that you want to automate
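Here is a rough Java sketch of driving such a file upload with Selenium plus a compiled AutoIt script. The URL, element id, and script path are placeholder assumptions; the AutoIt script is assumed to wait for the Windows "Open" dialog, type the file path, and press Open.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class UploadWithAutoIt {
    public static void main(String[] args) throws Exception {
        WebDriver driver = new ChromeDriver();
        driver.get("http://example.com/upload");

        // Start the compiled AutoIt script first; it sits waiting for the
        // native Windows dialog that Selenium itself cannot see.
        Process autoIt = Runtime.getRuntime().exec("C:\\scripts\\FileUpload.exe");

        // Clicking the upload control opens the native dialog, which the
        // AutoIt script then fills in and confirms.
        driver.findElement(By.id("uploadFile")).click();

        autoIt.waitFor();   // wait until AutoIt has finished handling the dialog
        driver.quit();
    }
}
```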

 

How to download and install AutoIT

 

1) Go to this link.

2) Hover over the 'Autoit' and 'Autoit Editor' dropdown options.

 


 

3) Click the 'AutoIT' Downloads option.

 


 

4) Download 'Autoit' by clicking on the 'Download Autoit' button.

 


 

5) Now download the 'Autoit editor' by clicking on the 'Downloads' button.

 


 

6) Click on the link as displayed below.

 


 

After downloading, you will get two setup files as shown in the screen below: the first one is the AutoIt version 3 setup and the second one is SciTE AutoIt3.

 


 

7) To install AutoIT, run both setup files one by one.

 

8) After successful installation, open the AutoIT editor.

 


 

Go to 'C:\Program Files

 

Click on the 'SciTE.exe' file; the AutoIT editor opens as shown in the screen below.

 


 

9) Now open the element identifier.

 

Go to 'C:\Program Files (x86)\AutoIt3'

 


 


 

Click on the 'Au3Info.exe' file; the element identifier opens as shown in the screen below.

 


Importance of Hadoop EcoSystems | Hadoop Training in Hyderabad KPHB


 

The Hadoop ecosystem is quite interesting to envision, including how we could adopt it within the realms of DevOps. Hadoop, managed by the Apache Foundation, is a powerful open source platform written in Java that is capable of processing large amounts of heterogeneous data. It is designed to scale up from a single server to thousands of machines, each offering local computation and storage, and has become an in-demand technical skill. Hadoop is an Apache top-level project built and used by a global community of contributors and users.

The following sections provide information on its most popular components:

Map Reduce: A framework for easily writing applications that process large amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. A Map Reduce program has two functions:

Map Task: Takes the input, divides it into smaller parts, and distributes them to worker nodes. Each worker node solves its own small problem and returns the answer to the master node.

Reduce Task: The master node combines all the answers that come from the worker nodes and forms the output, which is the answer to our big distributed problem.
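A classic illustration of these two functions is word count. The sketch below uses the standard Hadoop MapReduce API; the class names are placeholders and the bodies are kept minimal.

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);   // each worker emits partial (word, 1) results
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();             // combine answers from all workers
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
```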

HDFS: HDFS is a distributed file system that provides high-throughput access to data. When data is pushed to HDFS, it is split into blocks and replicated across the cluster (a short FileSystem API sketch follows the component list).

 

Here are the main components of HDFS

 

Name Node: It maintains the name system (directories and files) and manages the blocks that are present on the Data Nodes.

Data Nodes: They are the slaves, deployed on each machine, that provide the actual storage. They are responsible for serving read and write requests from clients.

Secondary Name Node: It is responsible for performing periodic checkpoints. In the event of a Name Node failure, you can restart the Name Node using the checkpoint.
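As a small illustration of how a client talks to HDFS through the Name Node and Data Nodes, the sketch below writes a file and reads it back with the Hadoop FileSystem API; the fs.defaultFS address and path are placeholder assumptions.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:9000");   // placeholder Name Node address
        try (FileSystem fs = FileSystem.get(conf)) {
            Path path = new Path("/tmp/hello.txt");

            // Write once: the Name Node allocates blocks, Data Nodes store them.
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
            }

            // Read many times: bytes are streamed back from the Data Nodes.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
                System.out.println(in.readLine());
            }
        }
    }
}
```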

Hive: Hive is part of the Hadoop ecosystem and provides an SQL-like interface to Hadoop.

The main building blocks of Hive are:

Metastore: Stores the metadata about columns, partitions, and the system catalogue.

Driver: Manages the lifecycle of a HiveQL statement.

Query Compiler: Compiles HiveQL into a directed acyclic graph of tasks.

Execution Engine: Executes the tasks produced by the compiler in the proper order.

Hive Server: Provides a Thrift interface and a JDBC/ODBC server.

HBase: As said earlier, HDFS works on a write-once, read-many-times pattern, but this isn't always the case. HBase is a distributed, column-oriented database built on top of HDFS.

 

Here are the main components of HBase:

 

HBase Master: It is responsible for negotiating load balancing across all Region Servers and maintains the state of the cluster. It is not part of the actual data storage or retrieval path.

 


Region Server: It is deployed on each machine, hosts data, and processes I/O requests.
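A minimal sketch of using the HBase Java client against a Region Server is shown below; the table name, column family, row key, and ZooKeeper quorum are placeholder assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "zk-host");   // placeholder quorum
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("users"))) {

            // Column-oriented write: row key -> column family -> qualifier -> value.
            Put put = new Put(Bytes.toBytes("user1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("city"), Bytes.toBytes("Hyderabad"));
            table.put(put);

            // Random read by row key, the access pattern HDFS alone does not serve well.
            Result result = table.get(new Get(Bytes.toBytes("user1")));
            System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("info"), Bytes.toBytes("city"))));
        }
    }
}
```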

 

Sqoop Architecture | Hadoop Training Institute in KPHB Hyderabad


What is Sqoop?

 

Sqoop is a tool designed for efficiently transferring bulk data between Hadoop and external data stores such as relational databases and enterprise data warehouses.

 

It imports data from external data stores into the Hadoop Distributed File System and related systems like Hive and HBase. Similarly, Sqoop can also be used to extract data from Hadoop and export it to external data stores such as relational databases and enterprise data warehouses. It works with relational databases such as Teradata, Netezza, Oracle, MySQL, Postgres, etc.

 

Salient features include:

Full Load

Incremental Load

Parallel import/export

Import results of SQL query

Compression

Connectors for all major RDBMS Databases

Kerberos Security Integration

Load data directly into Hive/HBase

Support for Accumulo

 

Sqoop Architecture

 

Sqoop provides a command line interface to end users and can also be accessed using Java APIs. The Sqoop command submitted by the end user is parsed by Sqoop, which then launches a Hadoop map-only job to import or export the data. Sqoop just imports and exports the data; it does not do any aggregations.

 

Sqoop parses the arguments provided on the command line and prepares the map job. The map job launches multiple mappers, depending on the number defined by the user on the command line. For a Sqoop import, each mapper task is assigned a part of the data to import based on a key, and Sqoop distributes the input data equally among the mappers to get high performance. Each mapper then creates a connection with the database using JDBC, fetches the part of the data assigned to it by Sqoop, and writes it into HDFS, Hive, or HBase based on the options provided on the command line.
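Sqoop is normally driven from its command line; the sketch below simply launches a hypothetical sqoop import from Java with ProcessBuilder so that the parallel-mapper options described above are visible. The connection string, table, credentials, and paths are placeholders.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class SqoopImportLauncher {
    public static void main(String[] args) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(
            "sqoop", "import",
            "--connect", "jdbc:mysql://dbhost:3306/sales",
            "--username", "etl_user",
            "--password-file", "/user/etl/.dbpass",
            "--table", "orders",
            "--split-by", "order_id",       // key used to divide data among mappers
            "--num-mappers", "4",           // number of parallel mappers
            "--target-dir", "/data/orders");
        pb.redirectErrorStream(true);
        Process process = pb.start();
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = out.readLine()) != null) {
                System.out.println(line);   // Sqoop / MapReduce progress output
            }
        }
        System.exit(process.waitFor());
    }
}
```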


Sqoop's architecture is that of a client-side tool, tightly coupled with the Hadoop cluster. When the Sqoop command is initiated, the client fetches the metadata of tables, columns, and data types according to the connectors. The import or export is translated into a map-only job program that loads the data in parallel between the databases and Hadoop. Clients should have the appropriate connector and driver for the execution of the process.

 

Report Parameters

 


 

Reporting Services uses report parameters in many ways. Report parameters enable you to control report data, connect related reports together, and vary report presentation. You can use report parameters in paginated reports that you create in Report Builder and Report Designer, and also in mobile reports that you create in SQL Server Mobile Report Publisher.

 

Control Paginated and Mobile Report Data

 

 

  • Filter paginated report data by writing dataset queries that contain variables.
  • Filter data from a shared dataset added to a paginated report; the shared dataset query itself does not change.
  • Create mobile reports in SQL Server Mobile Report Publisher and filter their data with parameters.
  • Enable users to specify values to customize the data in a paginated report.

 

Connect Related Reports

 

  • Use parameters to relate main reports to drillthrough reports and to subreports.

 

Vary Report Presentation

 

  • Send commands to a report server to vary the presentation of a report, including mobile reports you create in SQL Server Mobile Report Publisher.
  • Provide a Boolean parameter to indicate whether to expand or collapse all nested row groups in a table.
  • Enable users to customize report data and appearance by including parameters in an expression.

 

 

Viewing a Report with Parameters

 

 

  1. The report viewer toolbar displays a prompt and default value for each parameter.
  2. The prompt Select the Date appears next to the text box.
  3. The parameter @ShowAll is of data type Boolean; use the radio buttons to specify True or False.
  4. On the report viewer toolbar, click this arrow to show or hide the parameters pane.
  5. The parameter @CategoryQuota is of data type Float, so it takes a numeric value.
  6. If all parameters have default values, the report runs automatically on first view.

 

MSBI Training Institute in Hyderabad Kphb:

Get the best in-depth online MSBI training from certified faculty. We offer classroom and online courses for MSBI. kosmik technologies provides MSBI training in Hyderabad, KPHB.

Creating Keyword & Frameworks with Selenium | Selenium testing KPHB


Creating Keyword & Hybrid Frameworks

Frameworks help to structure our code and make maintenance easy. Without a framework, we would place all our code and data in the same place, which is neither reusable nor readable. Using a framework produces beneficial outcomes such as increased code reuse, higher portability, and reduced script maintenance cost.

There are three types of frameworks created with Selenium WebDriver to automate manual test cases:

Data Driven Test Framework

Keyword Driven Test Framework

Hybrid Test Framework

 

Data Driven Test Framework

 

In a data driven framework, test data is generated from external files like Excel, CSV, XML, or a database table.

 

Keyword Driven Test Framework

 

In a keyword driven test framework, all the operations and instructions are written in an external file, like an Excel worksheet. Here is how the completed framework looks:

 

[Figure: the five steps of the keyword driven framework]

 

As you can see, the framework has 5 steps. Let's go through them step by step in depth.


Step 1) The driver script Execute.java calls ReadKosmikExcelFile.java.

 

ReadKosmikExcelFile.java reads the data from the Excel sheet using the Apache POI API.
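A minimal sketch of what such a class could look like is shown below; it uses Apache POI to read every row of the keyword sheet. The method name and return type are illustrative assumptions, not taken from the original project.

```java
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadKosmikExcelFile {

    /** Reads every cell of the given sheet into a list of rows. */
    public List<List<String>> readSheet(String filePath, String sheetName) throws Exception {
        List<List<String>> rows = new ArrayList<>();
        try (FileInputStream fis = new FileInputStream(filePath);
             Workbook workbook = new XSSFWorkbook(fis)) {
            Sheet sheet = workbook.getSheet(sheetName);
            for (Row row : sheet) {
                List<String> cells = new ArrayList<>();
                for (Cell cell : row) {
                    cells.add(cell.toString());   // keyword, object name, object type, value...
                }
                rows.add(cells);
            }
        }
        return rows;
    }
}
```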

 

Step 2) ReadKosmikExcelFile.java reads the data from TestCase.xlsx.

 

Here is how the sheet looks:

 


 

Based on the keywords written in the Excel file, the framework performs the operations on the UI.

 

For example, suppose we want to click a button 'Login'. Correspondingly, our Excel sheet should have the keyword 'Click'. Now, the AUT may have hundreds of buttons on a page; to identify the Login button, in the Excel sheet we enter the Object Name as loginButton and the Object Type as name. The Object Type could be xpath, name, css, or any other value.

 

Step 3) ReadKosmikExcelFile.java passes this data to the driver script Execute.java.

 

Step 4) For all our UI web elements, we create an object repository where we place their element locators (like xpath, name, css path, class name, etc.).

 


 

Execute.java reads the entire object repository and stores it in a variable.

 

To read this object repository, we need a ReadObject class which has a getObjectRepository method to read it.
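A minimal sketch of such a ReadObject class is shown below; it assumes the object repository is a simple .properties file, and the file path is a placeholder.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class ReadObject {

    /** Loads element locators (xpath, name, css...) keyed by logical object name. */
    public Properties getObjectRepository() throws IOException {
        Properties properties = new Properties();
        try (FileInputStream stream =
                 new FileInputStream("src/main/resources/object.properties")) {
            properties.load(stream);
        }
        return properties;
    }
}
```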

 


 

Step 5) The driver passes the data from the Excel sheet and the object repository to the UIOperation class.

 

The UIOperation class has functions to perform actions corresponding to keywords like CLICK, SETTEXT, etc. mentioned in the Excel sheet.

 

UIOperation is a Java class that contains the actual implementation of the code to perform operations on web elements.
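Here is a simplified sketch of what such a class could look like. The keyword names follow the ones mentioned above, while the method signature and locator handling are illustrative assumptions.

```java
import java.util.Properties;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

public class UIOperation {
    private final WebDriver driver;

    public UIOperation(WebDriver driver) {
        this.driver = driver;
    }

    /** Maps a keyword from the Excel sheet onto the corresponding WebDriver action. */
    public void perform(Properties repository, String keyword,
                        String objectName, String objectType, String value) {
        switch (keyword.toUpperCase()) {
            case "CLICK":
                driver.findElement(locator(repository, objectName, objectType)).click();
                break;
            case "SETTEXT":
                driver.findElement(locator(repository, objectName, objectType)).sendKeys(value);
                break;
            default:
                throw new IllegalArgumentException("Unknown keyword: " + keyword);
        }
    }

    /** Builds a By locator from the object repository entry and its object type. */
    private By locator(Properties repository, String objectName, String objectType) {
        String locatorValue = repository.getProperty(objectName);
        switch (objectType.toLowerCase()) {
            case "xpath": return By.xpath(locatorValue);
            case "name":  return By.name(locatorValue);
            case "css":   return By.cssSelector(locatorValue);
            default:      return By.id(locatorValue);
        }
    }
}
```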

 


 

The complete project structure looks like this:

 


 

Analysis Services Scripting Language


 

The Analysis Services Scripting Language (ASSL) is an extension to XMLA that adds an object definition language and a command language for creating and managing Analysis Services structures on the server. Custom applications use it to communicate with Analysis Services over the XMLA protocol.

The Analysis Services Scripting Language DDL defines the structure of Analysis Services objects, such as cubes, dimensions, and mining models, and the bindings of those objects to data sources. The DDL also persists the definition of Analysis Services objects. Applications use the DDL to create, alter, deploy, and describe Analysis Services objects.

Usage Scenarios

 

Developer

 

A developer designs a set of cubes by using the Development Studio design tools and saves the definition as part of a project. The developer is not confined to using the design tools, but can also open the cube definition files and edit the XML directly, which uses the format described in this section.

 

Administrator

 

A database administrator (DBA) uses SQL Server Management Studio to edit XML for creating and altering Analysis Services objects, in the same way the DBA uses the SQL Server DDL to create and alter Microsoft SQL Server objects.

 

Namespace

 

The schema specification uses the XML namespace.

 

Schema

 

The object definition language is specified as an XML Schema definition language (XSD) schema, and the definitions of its schema elements determine how Analysis Services objects are described.

 

Extensibility

 

Extensibility of the object definition language schema is provided by means of an Annotation element that is included in all objects. This element can contain valid XML from any XML namespace, with the following restrictions:

  • The XML can contain only elements.
  • Each element must have a unique name; it is recommended that the value of Name reference the target namespace.

The contents of the Annotation tag can be exposed as a set of Name/Value pairs.

Comments and white space within the Annotation tag that are not enclosed in a child element may not be preserved. All elements must be read-write; read-only elements are ignored.

The object definition language schema is closed, in that the server does not allow substitution of derived types for elements defined in the schema. The server therefore accepts only the set of elements defined here, and no other elements or attributes. Unknown elements cause the Analysis Services engine to raise an error.

 

SSAS Online Training Hyderabad:

 

Learn MSBI from certified MSBI trainers. Get advanced MSBI training from real-time experts in Hyderabad. kosmik technologies provides online SSAS training in Hyderabad.

 

Tableau Data Joining & Data Blending | Tableau Training institute in KPHB


 

Data Joining

 

Data joining is a common need in any data analysis. We may need to join data from many sources, or join data from different tables in a single source. Tableau provides the facility to join tables by using the data pane available under Edit Data Source in the Data menu.

Creating a Join

Let's consider the data source Sample Superstore and create a join between the Orders and Returns tables. Go to the Data menu and choose the option Edit Data Source, then drag the two tables, Orders and Returns, to the data pane.

The diagram below displays the creation of an inner join between Orders and Returns.

[Figure: inner join between Orders and Returns]

 

Editing Join Type

The type of join that Tableau creates automatically can be changed manually. Click on the middle of the two circles showing the join; a popup window appears that displays the types of joins available.

In the diagram below, we see the inner join and the left outer join as the available joins.

[Figure: editing the join type]

 

Editing Join Fields

We can also change the fields forming the join condition by clicking on the Data Source option available in the join popup window. While selecting a field, we can search for the field we are looking for using a search text box.

[Figure: editing the join fields]

 

 

Data Blending

Data blending is a powerful feature in Tableau. It is used when there is related data in multiple data sources that you want to analyze together in a single view. As an example, suppose the present sales data is in a relational database and the target sales data is in another source; to compare actual sales to target sales, we can blend the data based on common dimensions to get access to the Sales Target measure. The two sources involved in data blending are referred to as the primary and secondary data sources.

Preparing Data for Blending

Tableau has two inbuilt data sources, named Sample Superstore and Sample Coffee Chain.mdb, which we will use to illustrate data blending. Let's first load the sample coffee chain into Tableau and look at its metadata. Go to the Data menu, choose New Data Source, and browse for the sample coffee chain file, which is an MS Access database file. The diagram below displays the different tables and joins available in the file.

[Figure: tables and joins in the sample coffee chain data source]

 

Adding Secondary Data Source

Next, we add the secondary data source, Sample Superstore, by again going to the Data menu, choosing New Data Source, and selecting this data source. Both data sources now appear in the Data window as shown below.

[Figure: both data sources in the Data window]

 

Blending the Data

Now we can integrate the data from both of the above sources based on a common dimension. Note that a small chain icon appears next to the dimension named State; this indicates the common dimension between the two data sources.


We select the bullet chart option from Show Me to get the bullet chart below. It shows how the profit ratio varies for each state in both the superstore and the coffee chain shops.

[Figure: bullet chart of profit ratio by state]

 

Data Mining Extensions

 


 

Data Mining Extensions (DMX) is a query language for data mining models supported by Microsoft's SQL Server Analysis Services product. DMX is used to create and train data mining models, and to browse, manage, and predict against them.

 

Microsoft OLE DB for Data Mining Specification

 

 

The data mining features in Analysis Services follow the Microsoft OLE DB for Data Mining specification.

The Microsoft OLE DB for Data Mining specification defines the following:

  • A structure to hold the information that defines a data mining model.
  • A language for creating and working with data mining models.

The specification defines the basis of data mining: the virtual object known as a data mining model. The data mining model object encapsulates everything that is known about a particular mining model and is structured like an SQL table, with columns, data types, and meta-information. This structure lets you use the DMX language, which is an extension of SQL, to create and work with models.

 

DMX Statements

 

You can use DMX statements to create, process, delete, copy, browse, and predict against data mining models. There are two types of statements in DMX, described below along with query fundamentals:

 

  • Data Definition Statements
  • Data Manipulation Statements
  • Query Fundamentals

 

 

Data Definition Statements

 

Data definition statements in DMX are used to create and define new mining structures and models, to import and export mining models and mining structures, and to drop existing models from a database. Data definition statements in DMX are part of the data definition language (DDL). You can use them to:

  • Create new data mining models and mining structures – CREATE MINING STRUCTURE, CREATE MINING MODEL
  • Delete existing data mining models and mining structures – DROP MINING STRUCTURE, DROP MINING MODEL
  • Export and import mining structures – EXPORT, IMPORT
  • Copy data from one mining model to another – SELECT INTO

 

Data Manipulation Statements

 

Data manipulation statements in DMX are used to work with existing mining models, to browse the models, and to create predictions against them. Data manipulation statements in DMX are part of the data manipulation language (DML). You can use them to:

  • Train mining models – INSERT INTO
  • Browse data in mining models – SELECT FROM
  • Make predictions using mining models – SELECT FROM PREDICTION JOIN

 

 

DMX Query Fundamentals

 

 

The SELECT statement is the basis for most DMX queries. Depending on the clauses that you use with such statements, you can browse, copy, or predict against mining models. The prediction query uses a form of SELECT to create predictions based on existing mining models.

 

 Data Mining Training in Hyderabad Kphb:

 

Learn MSBI from certified MSBI trainers. Get advanced MSBI training from real-time experts in Hyderabad. kosmik technologies provides online data mining training in Hyderabad, KPHB.