Data Mining with Matlab
(C) 2007,2008 Chase Krumpelman, the IDEAL Laboratory, and UT Austin
This course provides an introduction to the theory and application of data mining techniques. Each section explains the concepts and math behind a particular technique and provides examples of how to perform the technique using the Matlab numerical computation environment. As you progress through the tutorials, make sure that you understand both the theory and the implementation. In real-world data-mining problems (and in your research and class problems), practitioners who understand the theory can quickly devise sound strategies for analysis, while those who don't are forced to hack away at their code without a clear picture of what they're doing.
  1. Introduction to Matlab, Review of Linear Algebra and Probability
    1. Matlab Command Line Interface, Basic Operations
    2. Linear Algebra Review
    3. Eigenvectors and Eigenvalues
    4. Probability Review
    5. Loading Data into Matlab
    6. Matlab Functions and Scripts
  2. Classification
    1. Introduction to Classification
    2. Nearest Neighbor Classification
    3. Bayes Optimal Classification, The Bayes Error Rate, and The Naïve Bayes Classifier
    4. Decision Trees
    5. Support Vector Machines
    6. Artificial Neural Networks
    7. Logistic Regression
    8. Classification Ensembles
  3. Regression
    1. Linear Regression
    2. Data Mining Concepts in Linear Regression
    3. Cross Validation
    4. Linear Basis Function Regression: Sigmoids and RBFs
    5. Online Regression and Gradient Descent
    6. Multiple Outputs
    7. Multiple Inputs
    8. Artificial Neural Networks for Regression
    9. PCA and Demixing
  4. Clustering
    1. Hierarchical Agglomerative
    2. k-means
    3. Gaussian Mixture Models and Expectation Maximization
    4. Clustering, Compression, and Communication
    5. Density Based Clustering
    6. Similarity-Based Clustering and Graph Partitioning
    7. Clustering Ensembles
    8. Self-Organizing Maps
  5. Data Preparation
    1. Data Acquisition and Munging
      1. Dealing with Missing Values
    2. Feature Extraction and Dimensionality Reduction
      1. Principal Component Analysis
      2. Multidimensional Scaling
      3. Independent Component Analysis
    3. Nonlinear Dimensionality Reduction
      1. Self-Organizing Maps
      2. Isomap
  6. Frequent Itemset (Market Basket) Analysis
    1. The APRIORI Algorithm
  7. Sampling
    1. Rejection Sampling
    2. The Gibbs Sampler