01 Introduction to Big Data
1.1 Introduction to Big Data, Big Data characteristics, types of Big Data,
Traditional vs. Big Data business approach, Case Study of Big Data
Solutions.
02 Introduction to Hadoop
2.1 What is Hadoop? Core Hadoop Components; Hadoop Ecosystem;
Physical Architecture; Hadoop limitations.
03 NoSQL
3.1 What is NoSQL? NoSQL business drivers; NoSQL case studies.
3.2 NoSQL data architecture patterns: Key-value stores, Graph stores,
Column family (Bigtable) stores, Document stores, Variations of NoSQL
3.3 Using NoSQL to manage big data: What is a big data NoSQL solution?
Understanding the types of big data problems; Analyzing big data with a
shared-nothing architecture; Choosing distribution models: master-slave
versus peer-to-peer; Four ways that NoSQL systems handle big data
04 MapReduce and the New Software Stack
4.1 Distributed File Systems: Physical Organization of Compute Nodes,
Large-Scale File-System Organization.
4.2 MapReduce: The Map Tasks, Grouping by Key, The Reduce Tasks,
Combiners, Details of MapReduce Execution, Coping With Node Failures.
4.3 Algorithms Using MapReduce: Matrix-Vector Multiplication by MapReduce,
Relational-Algebra Operations, Computing Selections by MapReduce,
Computing Projections by MapReduce, Union, Intersection, and Difference by
MapReduce, Computing Natural Join by MapReduce, Grouping and
Aggregation by MapReduce, Matrix Multiplication, Matrix Multiplication with
One MapReduce Step.
05 Finding Similar Items
5.1 Applications of Near-Neighbor Search, Jaccard Similarity of Sets,
Similarity of Documents, Collaborative Filtering as a Similar-Sets
5.2 Distance Measures: Definition of a Distance Measure, Euclidean
Distances, Jaccard Distance, Cosine Distance, Edit Distance, Hamming
06 Mining Data Streams
6.1 The Stream Data Model: A Data-Stream-Management System,
Examples of Stream Sources, Stream Queries, Issues in Stream Processing.
6.2 Sampling Data in a Stream: Obtaining a Representative Sample, The
General Sampling Problem, Varying the Sample Size.
6.3 Filtering Streams:
The Bloom Filter, Analysis.
6.4 Counting Distinct Elements in a Stream
The Count-Distinct Problem, The Flajolet-Martin Algorithm, Combining
Estimates, Space Requirements.
6.5 Counting Ones in a Window:
The Cost of Exact Counts, The Datar-Gionis-Indyk-Motwani Algorithm,
Query Answering in the DGIM Algorithm, Decaying Windows.
07 Link Analysis
7.1 PageRank Definition, Structure of the Web, Dead Ends, Using PageRank
in a Search Engine, Efficient Computation of PageRank: PageRank
Iteration Using MapReduce, Use of Combiners to Consolidate the Result
7.2 Topic-Sensitive PageRank, Link Spam, Hubs and Authorities.
08 Frequent Itemsets
8.1 Handling Larger Datasets in Main Memory
Algorithm of Park, Chen, and Yu, The Multistage Algorithm, The Multihash
8.2 The SON Algorithm and MapReduce
8.3 Counting Frequent Items in a Stream
Sampling Methods for Streams, Frequent Itemsets in Decaying Windows.
09 Clustering
9.1 CURE Algorithm, Stream-Computing, A Stream-Clustering Algorithm,
Initializing & Merging Buckets, Answering Queries
10 Recommendation Systems
10.1 A Model for Recommendation Systems, Content-Based
Recommendations, Collaborative Filtering.
11 Mining Social-Network Graphs
11.1 Social Networks as Graphs, Clustering of Social-Network Graphs, Direct
Discovery of Communities, SimRank, Counting triangles using Map-
01 Title: Study of Cloud Computing & Architecture.
Concept: Cloud Computing & Architecture.
Objective: To provide students with an overview of cloud computing and
its architecture, and of the different types of cloud computing.
Scope: Cloud Computing & Architecture, Types of Cloud Computing.
Technology: -
02 Title: Virtualization in Cloud.
Objective: In this module, students will learn virtualization basics,
the objectives of virtualization, and the benefits of virtualization in the cloud.
Scope: Creating and running virtual machines on an open-source OS.
Technology: KVM, VMware.
03 Title: Study and implementation of Infrastructure as a Service.
Concept: Infrastructure as a Service.
Objective: In this module, students will learn about Infrastructure as a
Service and implement it using OpenStack.
Scope: Installing OpenStack and using it as Infrastructure as a Service.
Technology: Quanta Plus / Aptana / Kompozer
04 Title: Study and installation of Storage as a Service.
Concept: Storage as a Service (SaaS)
Objective: Students must be able to understand the concept of SaaS
and how it is implemented using ownCloud, which gives
universal access to files through a web interface.
Scope: Installing ownCloud and understanding its features as
05 Title: Implementation of identity management.
Concept: Identity Management in cloud
Objective: This lab introduces identity management in the
cloud and simulates it using OpenStack.
Scope: Installing and using the identity management feature of OpenStack.
06 Title: Write a program for a web feed.
Concept: Web feed and RSS
Objective: To understand the concept of form and control
Scope: Writing a program for a web feed.
Technology: PHP, HTML
07 Title: Study and implementation of Single Sign-On.
Concept: Single Sign-On (SSO), OpenID
Objective: To understand the concept of access control in the cloud and
single sign-on (SSO), including how to use SSO and its advantages. Students
should also be able to implement it.
Scope: Installing and using JOSSO.
Technology: JOSSO
08 Title: Securing Servers in Cloud.
Concept: Cloud Security
Objective: To understand how to secure a web server and a
data directory, with an introduction to encryption for ownCloud.
Scope: Installing and using the security features of ownCloud.
09 Title: User Management in Cloud.
Concept: Administrative features of Cloud Management, User
Objective: To understand how to create and manage a user and a group of
Scope: Installing and using the administrative features of ownCloud.
10 Title: Case study on Amazon EC2.
Concept: Amazon EC2
Objective: In this module, students will learn about Amazon EC2.
Amazon Elastic Compute Cloud is a central part of Amazon.com’s
cloud computing platform, Amazon Web Services. EC2 allows users to
rent virtual computers on which to run their own computer applications.
11 Title: Case study on Microsoft Azure.
Concept: Microsoft Azure
Objective: Students will learn about Microsoft Azure, a cloud
computing platform and infrastructure created by Microsoft for
building, deploying, and managing applications and services through a
global network of Microsoft-managed datacenters, including how it works
and the different services it provides.
Technology: Microsoft Azure
12 Title: Mini project.
Concept: Using different features of cloud computing to create an own
cloud for an institute, organization, etc.
Objective: Students must be able to create their own cloud using the different
features learned in the previous labs.
Scope: Creating a cloud, such as a social site, for an institute.
Technology: Any open system used for cloud.