Syllabus: Data Warehousing and Mining [M.Sc. I] @ Mumbai Uni
Posted by csrins on July 26, 2006
Paper IV
Section-II
Objectives of the course: The data warehousing part of the module aims to give students a good overview of the ideas and techniques behind recent developments in data warehousing and online analytical processing (OLAP), in terms of data models, query languages, conceptual design methodologies, and storage techniques. The data mining part of the module aims to motivate, define, and characterize data mining as a process, and to motivate, define, and characterize data mining applications.
Data Warehousing:
- Overview And Concepts: Need for data warehousing, Basic elements of data warehousing, Trends in data warehousing.
- Planning And Requirements: Project planning and management, Collecting the requirements.
- Architecture And Infrastructure: Architectural components, Infrastructure and metadata.
- Data Design And Data Representation: Principles of dimensional modeling, Dimensional modeling advanced topics, data extraction, transformation and loading, data quality.
- Information Access And Delivery: Matching information to classes of users, OLAP in data warehouse, Data warehousing and the web.
- Implementation And Maintenance: Physical design process, data warehouse deployment, growth and maintenance.
Data Mining:
- Introduction: Basics of data mining, related concepts, Data mining techniques.
- Data Mining Algorithms: Classification, Clustering, Association rules.
- Knowledge Discovery: KDD Process
- Web Mining: Web Content Mining, Web Structure Mining, Web Usage mining.
- Advanced Topics: Spatial mining, Temporal mining.
- Visualisation: Data generalization and summarization-based characterization, Analytical characterization: analysis of attribute relevance, Mining class comparisons: Discriminating between different classes, Mining descriptive statistical measures in large databases
- Data Mining Primitives, Languages, and System Architectures: Data mining primitives, Query language, Designing GUI based on a data mining query language, Architectures of data mining systems
- Application and Trends in Data Mining: Applications, Systems products and research prototypes, Additional themes in data mining, Trends in data mining
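For the association rules topic listed above, the two standard measures (support and confidence) can be illustrated with a minimal Python sketch. This is not part of the official syllabus; the function names and the sample baskets are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Hypothetical market-basket data, made up for this example.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

if __name__ == "__main__":
    print(support(BASKETS, {"bread", "milk"}))       # share of baskets with both items
    print(confidence(BASKETS, {"bread"}, {"milk"}))  # how often milk accompanies bread
```

Algorithms such as Apriori build on these measures by searching only for itemsets whose support exceeds a chosen threshold.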
Text Books:
- Paulraj Ponniah, “Data Warehousing Fundamentals”, John Wiley.
- M.H. Dunham, “Data Mining Introductory and Advanced Topics”, Pearson Education.
- Han, Kamber, “Data Mining Concepts and Techniques”, Morgan Kaufmann.
- Pieter Adriaans, Dolf Zantinge, “Data Mining”, Pearson Education Asia.
References:
- Ralph Kimball, “The Data Warehouse Lifecycle Toolkit”, John Wiley.
- M. Berry and G. Linoff, “Mastering Data Mining”, John Wiley.
- W.H. Inmon, “Building the Data Warehouse”, Wiley Dreamtech.
- R. Kimball, “The Data Warehouse Toolkit”, John Wiley.
- E.G. Mallach, “Decision Support and Data Warehouse systems”, TMH.
Practicals
Section II
Software used: Microsoft SQL Server 2000/7.0
1. Create a warehouse in MS SQL Server 2000 and import various databases from external sources such as Access/Excel/Text File using the Data Transformation Services (DTS) tool.
2. Create and schedule a DTS Package using the Data Transformation Services (DTS) tool. Run at least 5 queries on the database.
3. Create a Database using Analysis Manager and create a Single-Dimensional OLAP cube using a Star schema.
4. Create a Database using Analysis Manager and create a Multi-Dimensional OLAP cube using a Snowflake schema.
5. Create a Mining Model by using Relational Data.
6. Create a Mining Model by using OLAP Data.
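
The practicals above are meant to be done in SQL Server's Analysis Manager, which is not shown here. As a rough stand-in for the star schema idea, the following Python sketch uses the standard `sqlite3` module to build one fact table with two dimension tables and run a roll-up query of the kind an OLAP cube would answer. All table names, column names, and sample rows are hypothetical.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table referencing two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id    INTEGER REFERENCES dim_date(date_id),
            amount     REAL);
    """)
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Pen", "Stationery"), (2, "Notebook", "Stationery"),
         (3, "Lamp", "Furniture")])
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, 2005, 4), (2, 2006, 1)])
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 1, 10.0), (2, 1, 25.0), (3, 1, 40.0),
         (1, 2, 15.0), (3, 2, 60.0)])


def sales_by_category_and_year(conn):
    """Roll up the fact table along the product and date dimensions."""
    return conn.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date    AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_star_schema(conn)
    for row in sales_by_category_and_year(conn):
        print(row)
```

A snowflake schema, as in practical 4, would further normalize a dimension, for example by moving `category` out of `dim_product` into its own table.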

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.