At iNternPEDIA, we train people in Big Data technology so that they can tame the enormous volume of data generated every day rather than be overwhelmed by it. We empower young minds to use the Big Data revolution to its fullest.

BIG DATA / HADOOP Training

 

BIG DATA / HADOOP PROGRAM

 

Introduction to Big Data

  • Characteristics of Big Data
  • Big data collection and cleanup
  • Why analyze big data
  • Why parallel computing is important
  • Various products for handling big data

Introducing Hadoop

  • Hadoop Stack
  • Components of Hadoop
  • Starting Hadoop
  • Various Hadoop processes

Working with HDFS

  • Basic file commands
  • Reading & writing to files
  • Run a word count on a large text file
  • Web-based UI
  • View job status from the Hadoop command line
  • View job status on the web UI
  • High availability

YARN

  • Resource Manager
  • YARN hands-on

 

Installation & Configuring Hadoop

  • Types of installation (standalone, distributed)
  • Hadoop distributions (Apache, Cloudera and Hortonworks)
  • Set up Linux for Hadoop installation (Java and SSH)
  • Hadoop directory structure
  • XML configuration files, and the masters and slaves files
  • Checking system health
  • Checking file system health
  • Block size, replication factor and block health monitoring
  • Benchmarking cluster
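
As a rough illustration of the block size and replication factor topic above, the sketch below estimates how many blocks a file occupies and how much raw disk it consumes across the cluster. It assumes the common defaults of a 128 MB block size and a replication factor of 3; the function name `hdfs_block_usage` is hypothetical and is not part of any Hadoop API.

```python
def hdfs_block_usage(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication=3):
    """Estimate (blocks, block replicas, raw bytes stored) for one file.

    Assumed defaults: 128 MB blocks and replication factor 3.
    """
    # Ceiling division: a partially filled last block still counts as a block.
    blocks = -(-file_size_bytes // block_size_bytes)
    # Every block is stored `replication` times on different DataNodes.
    replicas = blocks * replication
    # HDFS does not pad the last block, so raw usage is file size times replication.
    raw_bytes = file_size_bytes * replication
    return blocks, replicas, raw_bytes


if __name__ == "__main__":
    one_gib = 1024 ** 3
    blocks, replicas, raw = hdfs_block_usage(one_gib)
    print(f"1 GiB file: {blocks} blocks, {replicas} replicas, {raw / one_gib:.0f} GiB raw")
```

Lowering the replication factor reduces raw disk usage but also reduces the number of node failures a block can survive, which is why block health monitoring is covered alongside these settings.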

Advanced administration activities

  • Secure Mode
  • Adding and de-commissioning nodes
  • Secondary NameNode
  • Manage quotas
  • Enabling Trash
  • Hands-on

Monitoring Hadoop Cluster

  • Hadoop infrastructure monitoring
  • Hadoop specific monitoring
  • Install and configure Nagios / Ganglia
  • Capture metrics

Other Components of Hadoop Ecosystem

  • Discuss Hive, Sqoop, Pig, HBase, Flume
  • Use cases of each
  • Use Hadoop streaming to write code in Perl / Python
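
To give a flavor of the Hadoop Streaming topic above, here is a minimal word count sketch in Python. Hadoop Streaming runs any executable that reads records from standard input and writes tab-separated key/value lines to standard output, and it sorts mapper output by key before the reducer sees it. The script name `wordcount_streaming.py`, the function names, and the `map` / `reduce` argument convention are assumptions made for this illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_records(records):
    """Reducer: sum the counts per word. Input must already be sorted by key."""
    pairs = (rec.rstrip("\n").split("\t", 1) for rec in records if rec.strip())
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def main(argv):
    # Hypothetical convention: the first argument selects the stage.
    if len(argv) < 2 or argv[1] not in ("map", "reduce"):
        return
    stage = map_lines if argv[1] == "map" else reduce_records
    for record in stage(sys.stdin):
        print(record)


if __name__ == "__main__":
    main(sys.argv)
```

The same script can be tried locally by piping a text file through the map stage, then `sort`, then the reduce stage. On a cluster it would be supplied through the streaming jar's `-mapper` and `-reducer` options, with the jar location depending on the installed distribution.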