ITOM Graduate Data Mining Certificate Program

 

Draft Curriculum

 

Session

Topic

 

1

Introduction to Data Mining

  • What data can you mine?
  • What patterns can you mine?
  • Types of Data Mining Systems
  • Major Issues in Data Mining
  • Applications of Data Mining (CRM, EIS)

 

2

Introduction to SAS and Enterprise Miner

Overview of SPSS

 

3

Preparing Data for Data Mining

  • Data Cleaning
  • Data Integration and Transformation

Lab exercise on data preparation

 

4

Preparing Data for Data Mining (2)

  • Data Reduction
  • Discretization and concept hierarchies

Lab exercise on data preparation

 

5

Review of Statistical Concepts and Methods

  • Probability
  • Sampling
  • Distributions and confidence intervals
  • Correlation, variance and covariance

 

6

Data Warehousing and OLAP

  • Multidimensional databases
  • Data Warehouse Architecture
  • OLAP for data analysis

From DW/OLAP to Data Mining

 

7

Concept Description: Characterization and Comparison

  • Generalization and Summarization
  • Analysis of attribute relevance
  • Discriminating between different classes
  • Mining statistical measures in large databases

Lab exercise on concept formation and description

 

8

Mining Association Rules

  • Market basket analysis
  • Boolean association rules
  • Multilevel association rules
  • From association rules to correlation analysis

Lab exercise

 

9

Classification and Prediction

  • Classification by decision tree induction
  • Integrating decision trees with data warehouses
  • Bayesian Classification

 

10

Neural Networks for Classification and Prediction

  • Backpropagation
  • Other NN models

 

11

Other Methods for Classification

  • Genetic algorithms
  • Case-based reasoning
  • K-nearest neighbor classifiers

Classifier Accuracy Estimation

 

12 & 13

Prediction Using Traditional Statistics

  • Linear and multiple regression
  • Nonlinear regression
  • Other regression and prediction models

 

14

Cluster Analysis

  • Classical partitioning methods
  • Hierarchical methods
  • Density-based methods
  • Grid-based methods

 

15

Outlier Analysis

  • Statistical outlier analysis
  • Distance-based outlier analysis
  • Deviation-based outlier analysis

 

16

Issues in Mining Complex Data

  • Spatial Data
  • Multimedia data
  • Object data

 

17

Issues in Mining Complex Data (2)

  • Time-series and sequential data
  • Mining text data
  • Mining Web data

 

18

Social Implications of Data Mining

  • Security and privacy concerns
  • Implications for personalization

 

19

Survival Data Mining

  • Exploratory data analysis using survival curves
  • Handling time-dependent variables
  • Modeling multiple and repeated events

 

20

Trends in Data Mining