Big Data & Hadoop and Cloud Computing:-
Project Based Summer Training Program will be conducted on Big Data-Hadoop technology. Fifteen days in- depth hand-on training program includes what is Big Data and virtualization, difference between Virtualization and Cloud Computing, advantages and disadvantages of cloud computing, Cloud Industry Standard Architecture, salesforce.com model and structure, Private cloud, Public cloud, Installation and working of Hadoop, Hive, and Pig. How do we implement a private cloud like Eucalyptus, its architecture with the help of euca-tolls and instances.
Private Cloud : Eucalyptus, CloudStack, OpenStack, Open Nebula
Public Cloud : Amazon Web Services, Rackspace, GoGrid
A key aspect of the resiliency of hadoop clusters comes from the software ability to deduct and handle failures at the application layer. Hadoop has two main subprojects: first MapReduce, the framework that understand and assigns work to the nodes in a cluster and secondly, HDFS, a distributed file system that spans all the nodes in a Hadoop cluster for data storage.
Hadoop is an open source software project that enables the distributed processing of large data sets across clusters of servers. It is designed to scale up from a single server to thousands of machine, with a very high degree of fault tolerance. Hadoop was derived from Google’s MapReduce and the Google file system. Yahoo! was the originator and has beena major contributor and uses Hadoop across its business. Other major users include Facebook, IBM, Twitter, American Airlines, LinkedIn, The New York Times and many more.