Computer Engineering Semester 8



Data Warehousing and Mining
01 Introduction to Data Warehousing
1.1 The Need for Data Warehousing; Increasing Demand for Strategic
Information; Inability of Past Decision Support System; Operational V/s
Decisional Support System; Data Warehouse Defined; Benefits of Data
Warehousing; Features of a Data Warehouse; The Information Flow
Mechanism; Role of Metadata; Classification of Metadata; Data Warehouse
Architecture; Different Types of Architecture; Data Warehouse and Data
Marts; Data Warehousing Design Strategies.
02 Dimensional Modeling
2.1 Data Warehouse Modeling Vs Operational Database Modeling; Dimensional
Model Vs ER Model; Features of a Good Dimensional Model; The Star
Schema; How Does a Query Execute? The Snowflake Schema; Fact Tables
and Dimension Tables; The Factless Fact Table; Updates To Dimension
Tables: Slowly Changing Dimensions, Type 1 Changes, Type 2 Changes,
Type 3 Changes, Large Dimension Tables, Rapidly Changing or Large
Slowly Changing Dimensions, Junk Dimensions, Keys in the Data
Warehouse Schema, Primary Keys, Surrogate Keys & Foreign Keys;
Aggregate Tables; Fact Constellation Schema or Families of Star.
03 ETL Process
3.1 Challenges in ETL Functions; Data Extraction; Identification of Data
Sources; Extracting Data: Immediate Data Extraction, Deferred Data
Extraction; Data Transformation: Tasks Involved in Data Transformation,
Data Loading: Techniques of Data Loading, Loading the Fact Tables and
Dimension Tables; Data Quality; Issues in Data Cleansing.
04 Online Analytical Processing (OLAP)
4.1 Need for Online Analytical Processing; OLTP V/s OLAP; OLAP and
Multidimensional Analysis; Hypercubes; OLAP Operations in
Multidimensional Data Model; OLAP Models: MOLAP, ROLAP, HOLAP,
DOLAP.
05 Introduction to data mining
5.1 What is Data Mining; Knowledge Discovery in Database (KDD), What
Kinds of Data can be Mined, Related Concepts to Data Mining, Data Mining
Techniques, Applications and Issues in Data Mining

06 Data Exploration
6.1 Types of Attributes; Statistical Description of Data; Data Visualization;
Measuring similarity and dissimilarity.

07 Data Preprocessing
7.1 Why Preprocessing? Data Cleaning; Data Integration; Data Reduction:
Attribute subset selection, Histograms, Clustering and Sampling; Data
Transformation & Data Discretization: Normalization, Binning, Histogram
Analysis and Concept hierarchy generation.
08 Classification
8.1 Basic Concepts; Classification methods:
1. Decision Tree Induction: Attribute Selection Measures, Tree
pruning.
2. Bayesian Classification: Naïve Bayes’ Classifier.
8.2 Prediction: Structure of regression models; Simple linear regression,
Multiple linear regression.
8.3 Model Evaluation & Selection: Accuracy and Error measures, Holdout,
Random Sampling, Cross Validation, Bootstrap; Comparing Classifier
performance using ROC Curves.
8.4 Combining Classifiers: Bagging, Boosting, Random Forests.

09 Clustering
9.1 What is clustering? Types of data, Partitioning Methods (K-Means, K-Medoids),
Hierarchical Methods (Agglomerative, Divisive, BIRCH),
Density-Based Methods (DBSCAN, OPTICS)

10 Mining Frequent Pattern and Association Rule
10.1 Market Basket Analysis, Frequent Itemsets, Closed Itemsets, and
Association Rules; Frequent Pattern Mining, Efficient and Scalable Frequent
Itemset Mining Methods, The Apriori Algorithm for finding Frequent
Itemsets Using Candidate Generation, Generating Association Rules from
Frequent Itemsets, Improving the Efficiency of Apriori, A pattern growth
approach for mining Frequent Itemsets; Mining Frequent itemsets using
vertical data formats; Mining closed and maximal patterns; Introduction to
Mining Multilevel Association Rules and Multidimensional Association
Rules; From Association Mining to Correlation Analysis, Pattern Evaluation
Measures; Introduction to Constraint-Based Association Mining.
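As an informal illustration of the Apriori algorithm named in this unit, the following Python sketch finds frequent itemsets by level-wise candidate generation (join, prune, count). The function name `apriori`, the `min_support` count threshold and the sample baskets are assumptions made for this example only; they are not prescribed by the syllabus.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    transactions: iterable of item collections (hypothetical input format).
    min_support: minimum number of transactions that must contain an itemset.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: scan the transactions for each surviving candidate.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "diapers", "beer", "eggs"},
        {"milk", "diapers", "beer", "cola"},
        {"bread", "milk", "diapers", "beer"},
        {"bread", "milk", "diapers", "cola"},
    ]
    for itemset, support in sorted(apriori(baskets, 3).items(),
                                   key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), support)
```

Association rules can then be generated from the returned itemsets by comparing the support of an itemset with the support of its subsets.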

Term Work:
Term work should consist of at least the following:
1. One case study on a data mart/data warehouse, given to a group of 3/4 students.
a. Write a detailed problem statement and create the dimensional model
(star and snowflake schema)
b. Implementation of all dimension tables and fact tables
c. Implementation of OLAP operations.
2. Implementation of classifiers like Decision Tree, Naïve Bayes, Random Forest
using any language like Java
3. Use WEKA to implement classifiers like Decision Tree, Naïve Bayes, Random Forest
4. Implementation of clustering algorithms like K-means, K-Medoids,
Agglomerative, Divisive using any language like Java, C#, etc.
5. Use WEKA to implement the following clustering algorithms: K-means,
Agglomerative, Divisive.
6. Implementation of association mining like Apriori, FPM using languages like Java,
C#, etc.
7. Use WEKA to implement association mining like Apriori, FPM.
8. Use R tool to implement Clustering/Association Rule/Classification Algorithms.
9. Detailed study of any one BI tool like Oracle BI, SPSS, Clementine, XLMiner, etc.
(paper assignment)
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is
for 20 marks and is to be based on a minimum of 40% of the syllabus. Test 2 is
also for 20 marks and is to be based on the remaining syllabus. Test 2 may be
either a class test or assignment on live problems or course project
Text Books:
1) Han, Kamber, “Data Mining Concepts and Techniques”, Morgan Kaufmann, 3rd Edition
2) Paulraj Ponniah, “Data Warehousing: Fundamentals for IT Professionals”, Wiley India
3) Reema Thareja, “Data Warehousing”, Oxford University Press.
4) M.H. Dunham, “Data Mining Introductory and Advanced Topics”, Pearson Education
Human Machine Interaction
Detailed Contents Hrs.
01 Introduction
1.1 Introduction to Human Machine Interface, Hardware, software and
operating environment to use HMI in various fields.
1.2 The psychopathology of everyday things – complexity of modern devices;
human-centered design; fundamental principles of interaction; Psychology
of everyday actions- how people do things; the seven stages of action and
three levels of processing; human error;

02 Understanding goal directed design
2.1 Goal directed design; Implementation models and mental models;
Beginners, experts and intermediates – designing for different experience
levels; Understanding users; Modeling users – personas and goals.

03 GUI
3.1 benefits of a good UI; popularity of graphics; concept of direct
manipulation; advantages and disadvantages; characteristics of GUI;
characteristics of Web UI; General design principles.

04 Design guidelines
4.1 perception, Gestalt principles, visual structure, reading is unnatural, color,
vision, memory, six behavioral patterns, recognition and recall, learning,
factors affecting learning, time.

05 Interaction styles
5.1 menus; windows; device based controls, screen based controls.

06 Communication
6.1 text messages; feedback and guidance; graphics, icons and images;
colours.

Term Work:
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments/case studies): ………….. (15) Marks.
· Assignment:….….……………………………………… (05) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is
for 20 marks and is to be based on a minimum of 40% of the syllabus. Test 2 is
also for 20 marks and is to be based on the remaining syllabus. Test 2 may be
either a class test or assignment on live problems or course project
Text Books:
1. Alan Dix, J. E. Finlay, G. D. Abowd, R. Beale “Human Computer Interaction”,
Prentice Hall.
2. Wilbert O. Galitz, “The Essential Guide to User Interface Design”, Wiley publication.
3. Alan Cooper, Robert Reimann, David Cronin, “About Face 3: Essentials of Interaction
Design”, Wiley publication.
4. Jeff Johnson, “Designing with the Mind in Mind”, Morgan Kaufmann Publication.
5. Donald A. Norman, “The Design of Everyday Things”, Basic Books; Reprint edition
2002.

 

Parallel and Distributed Systems
Detailed Contents Hrs.
01 Introduction
1.1 Parallel Computing, Parallel Architecture, Architectural Classification
Scheme, Performance of Parallel Computers, Performance Metrics for
Processors, Parallel Programming Models, Parallel Algorithms.

02 Pipeline Processing
2.1 Introduction, Pipeline Performance, Arithmetic Pipelines, Pipelined
Instruction Processing, Pipeline Stage Design, Hazards, Dynamic
Instruction Scheduling.

03 Synchronous Parallel Processing
3.1 Introduction, Example-SIMD Architecture and Programming Principles,
SIMD Parallel Algorithms, Data Mapping and memory in array
processors, Case studies of SIMD parallel Processors

04 Introduction to Distributed Systems
4.1 Definition, Issues, Goals, Types of distributed systems, Distributed
System Models, Hardware concepts, Software Concept, Models of
Middleware, Services offered by middleware, Client Server model.

05 Communication
5.1 Layered Protocols, Remote Procedure Call, Remote Object Invocation,
Message Oriented Communication, Stream Oriented Communication

06 Resource and Process Management
6.1 Desirable Features of global Scheduling algorithm, Task assignment
approach, Load balancing approach, load sharing approach, Introduction
to process management, process migration, Threads, Virtualization,
Clients, Servers, Code Migration

07 Synchronization
7.1 Clock Synchronization, Logical Clocks, Election Algorithms, Mutual
Exclusion, Distributed Mutual Exclusion-Classification of mutual
Exclusion Algorithm, Requirements of Mutual Exclusion Algorithms,
Performance measure, Non Token based Algorithms: Lamport Algorithm,
Ricart–Agrawala’s Algorithm, Maekawa’s Algorithm
7.2 Token Based Algorithms: Suzuki-Kasami’s Broadcast Algorithms,
Singhal’s Heuristic Algorithm, Raymond’s Tree based Algorithm,
Comparative Performance Analysis.
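To make the "Logical Clocks" topic of this unit concrete, here is a minimal Python sketch of a Lamport logical clock. The class name `LamportClock` and its method names are hypothetical, chosen only for illustration; the update rules (increment on a local event or send, take the maximum plus one on receive) are the standard ones.

```python
class LamportClock:
    """Minimal Lamport logical clock for one process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the clock by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned value is the
        # timestamp to attach to the outgoing message.
        return self.tick()

    def receive(self, message_time):
        # On receipt, jump past both the local clock and the timestamp
        # carried by the message.
        self.time = max(self.time, message_time) + 1
        return self.time


if __name__ == "__main__":
    p, q = LamportClock(), LamportClock()
    p.tick()                 # local event on P
    stamp = p.send()         # P sends a message to Q
    print(q.receive(stamp))  # Q's clock moves past the message timestamp
```

Timestamps of this kind are what the non-token-based mutual exclusion algorithms listed above use to order competing requests.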
08 Consistency and Replication
8.1 Introduction, Data-Centric and Client-Centric Consistency Models,
Replica Management.
Distributed File Systems
8.2 Introduction, good features of DFS, File models, File Accessing models,
File-Caching Schemes, File Replication, Network File System(NFS),
Andrew File System(AFS), Hadoop Distributed File System and Map
Reduce.

Term Work:
Term work should consist of at least 10 experiments, 2 assignments based on above theory
syllabus.
The final certification and acceptance of term work ensures satisfactory performance of
laboratory work and minimum passing marks in term work.
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments): ……………………….. (15) Marks.
· Assignments: …………………………………………… (05) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is for 20 marks
and is to be based on a minimum of 40% of the syllabus. Test 2 is also for 20 marks and is to be
based on the remaining syllabus. Test 2 may be either a class test or assignment on live problems
or course project
Text Books
1. M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers
2009.
2. Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and
Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5
Elective-III Machine Learning
01 Introduction to Machine Learning
1.1 What is Machine Learning?, Key Terminology, Types of Machine
Learning, Issues in Machine Learning, Application of Machine Learning,
How to choose the right algorithm, Steps in developing a Machine
Learning Application.

02 Learning with Regression
2.1 Linear Regression, Logistic Regression.

03 Learning with trees
3.1 Using Decision Trees, Constructing Decision Trees, Classification and
Regression Trees (CART).

04 Support Vector Machines(SVM)
4.1 Maximum Margin Linear Separators, Quadratic Programming solution to
finding maximum margin separators, Kernels for learning non-linear
functions.

05 Learning with Classification
5.1 Rule based classification, classification by backpropagation, Bayesian
Belief networks, Hidden Markov Models.

06 Dimensionality Reduction
6.1 Dimensionality Reduction Techniques, Principal Component Analysis,
Independent Component Analysis.

07 Learning with Clustering
7.1 K-means clustering, Hierarchical clustering, Expectation Maximization
Algorithm, Supervised learning after clustering, Radial Basis functions.
08 Reinforcement Learning
8.1 Introduction, Elements of Reinforcement Learning, Model based learning,
Temporal Difference Learning, Generalization, Partially Observable
States.

Term Work:
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments): ………..……………… (15) Marks.
· Assignments:……….………………………………… (05) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is for 20 marks and is to
be based on a minimum of 40% of the syllabus. Test 2 is also for 20 marks and is to be based on the
remaining syllabus. Test 2 may be either a class test or assignment on live problems or course project
Text Books:
1. Peter Harrington “Machine Learning In Action”, DreamTech Press
2. Ethem Alpaydın, “Introduction to Machine Learning”, MIT Press
3. Tom M.Mitchell “Machine Learning” McGraw Hill
4. Stephen Marsland, “Machine Learning An Algorithmic Perspective” CRC Press
Elective-III Embedded Systems
01 Introduction to computational technologies
1.1 Review of computation technologies (ARM, RISC, CISC, PLD, SOC),
architecture, event managers, hardware multipliers, pipelining.
Hardware/Software co-design. Embedded systems architecture and design
process.

02 Program Design and Analysis
2.1 Integrated Development Environment (IDE), assembler, linking and
loading. Program-level performance analysis and optimization, energy
and power analysis and program size optimization, program validation
and testing. Embedded Linux, kernel architecture, GNU cross platform
tool chain. Programming with Linux environment.

03 Process Models and Product development life cycle management
3.1 State machine models: finite-state machines (FSM), finite-state machines
with data-path model (FSMD), hierarchical/concurrent state machine
model (HCFSM), program-state machine model (PSM), concurrent
process model. Unified Modeling Language (UML), applications of UML
in embedded systems. IP-cores, design process model. Hardware software
co-design, embedded product development life cycle management.
04 High Performance 32-bit RISC Architecture
4.1 ARM processor family, ARM architecture, instruction set, addressing
modes, operating modes, interrupt structure, and internal peripherals.
ARM coprocessors, ARM Cortex-M3.

05 Processes and Operating Systems
5.1 Introduction to Embedded Operating System, multiple tasks and multiple
processes. Multi rate systems, preemptive real-time operating systems,
priority-based scheduling, inter-process communication mechanisms.
Operating system performance and optimization strategies. Examples of
real-time operating systems.

06 Real-time Digital Signal Processing (DSP)
6.1 Introduction to Real-time simulation, numerical solution of the mathematical
model of physical system. DSP on ARM, SIMD techniques. Correlation,
Convolution, DFT, FIR filter and IIR Filter implementation on ARM. Open
Multimedia Applications Platform (OMAP)

Term Work:
Term work should consist of at least 10 practicals and one mini project. Objective type term work
test shall be conducted with a weightage of 10 marks.
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments/projects): ……….…….. (10) Marks.
· Mini project: …………………………………………… (10) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is for 20
marks and is to be based on a minimum of 40% of the syllabus. Test 2 is also for 20 marks
and is to be based on the remaining syllabus. Test 2 may be either a class test or assignment
on live problems or course project
Text Books:
1. Embedded Systems an Integrated Approach – Lyla B Das, Pearson
2. Computers as Components – Marilyn Wolf, Third Edition Elsevier
3. Embedded Systems Design: A Unified Hardware/Software Introduction – Frank
Vahid and Tony Givargis, John Wiley & Sons
4. An Embedded Software Primer – David E. Simon – Pearson Education South Asia
5. ARM System Developer’s Guide: Designing and Optimizing System Software –
Andrew N. Sloss, Dominic Symes and Chris Wright – Elsevier Inc.
Elective-III Adhoc Wireless Networks
01 Introduction
1.1 Introduction to wireless Networks. Characteristics of Wireless channel,
Issues in Ad hoc wireless networks, Adhoc Mobility Models:- Indoor
and outdoor models.
1.2 Adhoc Networks: Introduction to adhoc networks – definition,
characteristic features, applications.

02 MAC Layer
2.1 MAC Protocols for Ad hoc wireless Networks: Introduction, Issues in
designing a MAC protocol for Ad hoc wireless Networks, Design goals
and Classification of a MAC protocol, Contention based protocols with
reservation mechanisms.
2.2 Scheduling algorithms, protocols using directional antennas. IEEE
standards: 802.11a, 802.11b, 802.11g, 802.15, 802.16, HIPERLAN.

03 Network Layer
3.1 Routing protocols for Ad hoc wireless Networks: Introduction, Issues in
designing a routing protocol for Ad hoc wireless Networks,
Classification of routing protocols, Table driven routing protocol,
On-demand routing protocol.
3.2 Proactive Vs reactive routing, Unicast routing algorithms, Multicast
routing algorithms, hybrid routing algorithm, Energy aware routing
algorithm, Hierarchical Routing, QoS aware routing.

04 Transport Layer
4.1 Transport layer protocols for Ad hoc wireless Networks: Introduction,
Issues in designing a transport layer protocol for Ad hoc wireless
Networks, Design goals of a transport layer protocol for Ad hoc wireless
Networks, Classification of transport layer solutions, TCP over Ad hoc
wireless Networks, Other transport layer protocols for Ad hoc wireless
Networks.
05 Security
5.1 Security: Security in Ad hoc wireless Networks, Network
security requirements, Issues & challenges in security provisioning,
Network security attacks, Key management, Secure routing in Ad hoc
wireless Networks.

06 QoS
6.1 Quality of service in Ad hoc wireless Networks: Introduction, Issues and
challenges in providing QoS in Ad hoc wireless Networks, Classification
of QoS solutions, MAC layer solutions, network layer solutions.

Term Work:
· Term work should consist of at least 12 experiments.
· Journal must include at least 2 assignments.
· The final certification and acceptance of term work indicates that performance in
laboratory work is satisfactory and minimum passing marks may be given in term work.
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments): …………..……….. (15) Marks.
· Assignment:………….……………………………… (05) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is for 20 marks
and is to be based on a minimum of 40% of the syllabus. Test 2 is also for 20 marks and is to be
based on the remaining syllabus. Test 2 may be either a class test or assignment on live problems
or course project
Text Books:
1. Siva Ram Murthy and B.S.Manoj, “Ad hoc Wireless Networks Architectures and protocols”,
2nd edition, Pearson Education, 2007
2. Charles E. Perkins, “Adhoc Networking”, Addison – Wesley, 2000
3. C. K. Toh, “Adhoc Mobile Wireless Networks”, Pearson Education, 2002
Elective-III Digital Forensics
01 Introduction
1.1 Introduction of Cybercrime: Types, The Internet spawns crime, Worms
versus viruses, Computers’ roles in crimes, Introduction to digital
forensics, Introduction to Incident – Incident Response Methodology –
Steps – Activities in Initial Response, Phase after detection of an incident.

02 Initial Response and forensic duplication
2.1 Initial Response & Volatile Data Collection from Windows system –
Initial Response & Volatile Data Collection from Unix system – Forensic
Duplication: Forensic Duplicates as Admissible Evidence, Forensic
Duplication Tool Requirements.
2.2 Creating a Forensic Duplicate/Qualified Forensic Duplicate of a Hard Drive.

03 Preserving and Recovering Digital Evidence
3.1 File Systems: FAT, NTFS – Forensic Analysis of File Systems – Storage
Fundamentals: Storage Layer, Hard Drives Evidence Handling: Types of
Evidence, Challenges in evidence handling, Overview of evidence
handling procedure.

04 Network Forensics
4.1 Intrusion detection; Different Attacks in network, analysis Collecting
Network Based Evidence – Investigating Routers – Network Protocols –
Email Tracing- Internet Fraud.

05 System investigation
5.1 Data Analysis Techniques – Investigating Live Systems (Windows &
Unix).
5.2 Investigating Hacker Tools – Ethical Issues – Cybercrime.

06 Bodies of law
6.1 Constitutional law, Criminal law, Civil law, Administrative regulations,
Levels of law: Local laws, State laws, Federal laws, International laws,
Levels of culpability: Intent, Knowledge, Recklessness, Negligence,
Level and burden of proof: Criminal versus civil cases, Vicarious
liability, Laws related to computers: CFAA, DMCA, CAN-SPAM, etc.

Term Work:
· Term work should consist of at least 12 experiments.
· Journal must include at least 2 assignments.
· The final certification and acceptance of term work indicates that performance in
laboratory work is satisfactory and minimum passing marks may be given in term
work.
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments): ……………………….. (15) Marks.
· Assignment: …………………………………………… (05) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test, is for 20 marks
and is to be based on a minimum of 40% of the syllabus. Test 2 is also for 20 marks and is to
be based on the remaining syllabus. Test 2 may be either a class test or assignment on live
problems or course project.
Text Books:
1. Kevin Mandia, Chris Prosise, “Incident Response and computer forensics”, Tata
McGrawHill, 2006
2. Peter Stephenson, “Investigating Computer Crime: A Handbook for Corporate
Investigations”, Sept 1999
3. Eoghan Casey, “Handbook of Computer Crime Investigation: Forensic Tools and
Technology”, Academic Press, 1st Edition, 2001
Elective III – Big Data Analytics
01 Introduction to Big Data
1.1 Introduction to Big Data, Big Data characteristics, types of Big Data,
Traditional vs. Big Data business approach, Case Study of Big Data
Solutions.

02 Introduction to Hadoop
2.1 What is Hadoop? Core Hadoop Components; Hadoop Ecosystem;
Physical Architecture; Hadoop limitations.

03 NoSQL
3.1 What is NoSQL? NoSQL business drivers; NoSQL case studies;
3.2 NoSQL data architecture patterns: Key-value stores, Graph stores,
Column family (Bigtable) stores, Document stores, Variations of NoSQL
architectural patterns;
3.3 Using NoSQL to manage big data: What is a big data NoSQL solution?
Understanding the types of big data problems; Analyzing big data with a
shared-nothing architecture; Choosing distribution models: master-slave
versus peer-to-peer; Four ways that NoSQL systems handle big data
problems

04 MapReduce and the New Software Stack
4.1 Distributed File Systems : Physical Organization of Compute Nodes, Large-
Scale File-System Organization.
4.2 MapReduce: The Map Tasks, Grouping by Key, The Reduce Tasks,
Combiners, Details of MapReduce Execution, Coping With Node Failures.
4.3 Algorithms Using MapReduce: Matrix-Vector Multiplication by MapReduce ,
Relational-Algebra Operations, Computing Selections by MapReduce,
Computing Projections by MapReduce, Union, Intersection, and Difference by
MapReduce, Computing Natural Join by MapReduce, Grouping and
Aggregation by MapReduce, Matrix Multiplication, Matrix Multiplication with
One MapReduce Step.
05 Finding Similar Items
5.1 Applications of Near-Neighbor Search, Jaccard Similarity of Sets,
Similarity of Documents, Collaborative Filtering as a Similar-Sets
Problem.
5.2 Distance Measures: Definition of a Distance Measure, Euclidean
Distances, Jaccard Distance, Cosine Distance, Edit Distance, Hamming
Distance.
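A few of the distance measures listed in 5.2 can be sketched in Python as follows. The function names are hypothetical. Cosine distance is computed here as the angle between the vectors in degrees, which is an assumption about the intended definition; some libraries instead use one minus the cosine similarity.

```python
import math


def euclidean_distance(x, y):
    """L2 distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def jaccard_distance(s, t):
    """1 minus |intersection| / |union| of two sets."""
    s, t = set(s), set(t)
    if not s and not t:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(s & t) / len(s | t)


def cosine_distance(x, y):
    """Angle between two non-zero vectors, in degrees."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.degrees(math.acos(cos))


def hamming_distance(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)


if __name__ == "__main__":
    print(euclidean_distance([0, 0], [3, 4]))
    print(jaccard_distance({1, 2, 3}, {2, 3, 4}))
    print(cosine_distance([1, 0], [0, 1]))
    print(hamming_distance("10110", "11100"))
```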

06 Mining Data Streams
6.1 The Stream Data Model: A Data-Stream-Management System,
Examples of Stream Sources, Stream Queries, Issues in Stream Processing.
6.2 Sampling Data in a Stream: Obtaining a Representative Sample, The
General Sampling Problem, Varying the Sample Size.
6.3 Filtering Streams:
The Bloom Filter, Analysis.
6.4 Counting Distinct Elements in a Stream
The Count-Distinct Problem, The Flajolet-Martin Algorithm, Combining
Estimates, Space Requirements.
6.5 Counting Ones in a Window:
The Cost of Exact Counts, The Datar-Gionis-Indyk-Motwani Algorithm,
Query Answering in the DGIM Algorithm, Decaying Windows.
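The Bloom filter named in 6.3 can be illustrated with a short Python sketch. The class name `BloomFilter`, the default sizes and the use of seeded SHA-256 digests as hash functions are all assumptions made for this example; any family of independent hash functions would serve.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter
        # would pack the bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    allowed = BloomFilter()
    allowed.add("alice@example.com")
    print(allowed.might_contain("alice@example.com"))
    print(allowed.might_contain("mallory@example.com"))
```

An item that was added is always reported as possibly present; an item that was never added is usually rejected, with a false-positive rate that depends on the number of bits, hash functions and stored items.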

07 Link Analysis
7.1 PageRank Definition, Structure of the web, dead ends, Using Page rank
in a search engine, Efficient computation of Page Rank: PageRank
Iteration Using MapReduce, Use of Combiners to Consolidate the Result
Vector.
7.2 Topic sensitive Page Rank, link Spam, Hubs and Authorities.
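The PageRank iteration covered in this unit can be sketched on a single machine in Python as below. The function name `pagerank`, the `links` dictionary format and the damping value `beta=0.85` are assumptions for illustration. Dead ends are handled here by spreading their rank evenly over all pages, which is one common choice and not necessarily the treatment intended by the syllabus.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Power iteration with taxation (illustrative sketch).

    links: dict mapping each page to the list of pages it links to.
    beta: probability of following a link rather than teleporting.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(iterations):
        # Teleport share: every page receives (1 - beta) / n.
        new_rank = {u: (1.0 - beta) / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            if outs:
                # Distribute this page's rank equally over its out-links.
                share = beta * rank[u] / len(outs)
                for v in outs:
                    new_rank[v] += share
            else:
                # Dead end (assumption): spread its rank over all pages.
                for v in nodes:
                    new_rank[v] += beta * rank[u] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, score in sorted(pagerank(web).items()):
        print(page, round(score, 4))
```

In the MapReduce formulation, the inner loop over out-links corresponds to the map phase and the accumulation into `new_rank` to the reduce phase.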

08 Frequent Itemsets
8.1 Handling Larger Datasets in Main Memory
Algorithm of Park, Chen, and Yu, The Multistage Algorithm, The Multihash
Algorithm.
8.2 The SON Algorithm and MapReduce
8.3 Counting Frequent Items in a Stream
Sampling Methods for Streams, Frequent Itemsets in Decaying Windows.
09 Clustering
9.1 CURE Algorithm, Stream-Computing, A Stream-Clustering Algorithm,
Initializing & Merging Buckets, Answering Queries

10 Recommendation Systems
10.1 A Model for Recommendation Systems, Content-Based
Recommendations, Collaborative Filtering.

11 Mining Social-Network Graphs
11.1 Social Networks as Graphs, Clustering of Social-Network Graphs, Direct
Discovery of Communities, SimRank, Counting triangles using Map-
Reduce

Term Work:
Assign a case study for group of 2/3 students and each group to perform the following
experiments on their case-study; Each group should perform the exercises on a large dataset
created by them.
The distribution of marks for term work shall be as follows:
· Programming Exercises: ………………………..….. (10) Marks.
· Mini project: ………………………………………… (10) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: ……………………………………………………. (25) Marks.
Internal Assessment:
Internal Assessment consists of two tests. Test 1, an Institution level central test,
is for 20 marks and is to be based on a minimum of 40% of the syllabus. Test 2 is
also for 20 marks and is to be based on the remaining syllabus. Test 2 may be
either a class test or assignment on live problems or course project.
Text Books:
1. Anand Rajaraman and Jeff Ullman “Mining of Massive Datasets”, Cambridge
University Press,
2. Alex Holmes “Hadoop in Practice”, Manning Press, Dreamtech Press.
3. Dan McCreary and Ann Kelly “Making Sense of NoSQL” – A guide for managers
and the rest of us, Manning Press.
Cloud Computing Laboratory
01 Title: Study of Cloud Computing & Architecture.
Concept: Cloud Computing & Architecture.
Objective: Objective of this module is to provide students an overview
of the Cloud Computing and Architecture and different types of Cloud
Computing
Scope: Cloud Computing & Architecture, Types of Cloud Computing.
Technology: —

02 Title: Virtualization in Cloud.
Concept: Virtualization
Objective: In this module students will learn, Virtualization Basics,
Objectives of Virtualization, and Benefits of Virtualization in cloud.
Scope: Creating and running virtual machines on open source OS.
Technology: KVM, VMware.

03 Title: Study and implementation of Infrastructure as a Service.
Concept: Infrastructure as a Service.
Objective: In this module student will learn Infrastructure as a Service
and implement it by using OpenStack.
Scope: Installing OpenStack and using it as Infrastructure as a Service.
Technology: Quanta Plus /Aptana /Kompozer

04 Title: Study and installation of Storage as Service.
Concept: Storage as Service (SaaS)
Objective: Students must be able to understand the concept of
SaaS and how it is implemented using ownCloud, which gives
universal access to files through a web interface.
Scope: Installing ownCloud and understanding its features as
SaaS.
Technology: ownCloud
05 Title: Implementation of identity management.
Concept: Identity Management in cloud
Objective: This lab gives an introduction to identity management in
cloud and simulates it using OpenStack.
Scope: Installing and using the identity management feature of OpenStack
Technology: OpenStack

06 Title: Write a program for web feed.
Concept: Web feed and RSS
Objective: this lab is to understand the concept of form and control
validation
Scope: Write a program for web feed
Technology: PHP, HTML

07 Title: Study and implementation of Single Sign-On.
Concept: Single Sign-On (SSO), OpenID
Objective: To understand the concept of access control in cloud and
single sign-on (SSO), the use and advantages of SSO; students
should also be able to implement it.
Scope: Installing and using JOSSO
Technology: JOSSO
08 Title: Securing Servers in Cloud.
Concept: Cloud Security
Objective: To understand how to secure a web server, how to secure
the data directory, and an introduction to encryption for ownCloud.

Scope: Installing and using security feature of ownCloud
Technology: ownCloud
09 Title: User Management in Cloud.
Concept: Administrative features of Cloud Management, User
Management
Objective: To understand how to create and manage user accounts and
groups of users.
Scope: Installing and using Administrative features of ownCloud
Technology: ownCloud

10 Title: Case study on Amazon EC2.
Concept: Amazon EC2
Objective: in this module students will learn about Amazon EC2.
Amazon Elastic Compute Cloud is a central part of Amazon.com’s
cloud computing platform, Amazon Web Services. EC2 allows users to
rent virtual computers on which to run their own computer applications

11 Title: Case study on Microsoft azure.
Concept: Microsoft Azure
Objective: Students will learn about Microsoft Azure, a cloud
computing platform and infrastructure created by Microsoft for
building, deploying and managing applications and services through a
global network of Microsoft-managed datacenters: how it works and
the different services it provides.
Technology: Microsoft Azure
12 Title: Mini project.
Concept: Using different features of cloud computing to create an own
cloud for an institute, organization, etc.
Objective: Students must be able to create their own cloud using the
features learned in previous practicals.
Scope: Creating a cloud like a social site for an institute.
Technology: any open system used for cloud

Term Work:
· Term work should consist of at least 6 experiments and a mini project.
· Journal must include at least 2 assignments.
· The final certification and acceptance of term work indicates that performance in
laboratory work is satisfactory and minimum passing marks may be given in term
work.
The distribution of marks for term work shall be as follows:
· Laboratory work (experiments): ……………………….. (15) Marks.
· Mini project presentation: ………………………………… (05) Marks.
· Attendance ………………………………………. (05) Marks
TOTAL: …………………………………………… (25) Marks.
Text Books:
1. Enterprise Cloud Computing by Gautam Shroff, Cambridge,2010
2. Cloud Security by Ronald Krutz and Russell Dean Vines, Wiley – India, 2010 ,
ISBN:978-0-470-58987-8
3. Getting Started with OwnCloud by Aditya Patawar , Packt Publishing Ltd, 2013
4. www.openstack.org