Learn how to take data (consumers, genes, stores, ...) and organise them into homogeneous groups. Cluster analysis is a collection of techniques used in applications such as market analysis and biomedical data analysis, and as a pre-processing step for many data mining tasks. It is also a very active field of research in statistics and data mining, and the module introduces some of its newer techniques.
Upon completion of this module, participants will be able to:
- Understand the context of use for cluster analysis
- Understand the principles behind the clustering techniques: distance measures, agglomeration methods, and so on
- Understand the differences between the clustering techniques covered
- Select an appropriate clustering technique based on the study objective and the type of the classification variables
- Interpret statistical software output
- Determine the most likely number of clusters to retain
- Validate and interpret the clusters formed
- Appreciate the limitations and difficulties associated with cluster analysis
This module is intended for all scientific staff who collect large datasets and who wish to graphically summarise them as well as identify groups of objects or individuals with similar characteristics.
This workshop introduces the important concepts in statistics and data analysis. It assumes that participants have no previous knowledge of statistics or that they have not used it for a long time.
- Introduction to Cluster Analysis
- Context of Use, Objective, Terminology
- Principle of Hierarchical Methods: Determining the Distance Between Objects & Linking Clusters
- Modeling Techniques
- Optimization Methods
- Other Methods: Fuzzy Clustering
- Use and Interpretation of Clusters
- Software Packages for Cluster Analysis
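To give a flavour of the hierarchical principle listed in the outline (measure the distance between objects, then repeatedly link the two closest clusters), here is a minimal sketch in plain Python. It is an illustration only, not part of the course materials: the function names, the toy data, and the choice of Euclidean distance are all assumptions, and the statistical packages used in the course implement this far more efficiently.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linkage_distance(c1, c2, points, method="single"):
    """Distance between two clusters (lists of point indices).

    single   -> distance between the two closest members
    complete -> distance between the two farthest members
    average  -> mean of all pairwise member distances
    """
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    if method == "single":
        return min(dists)
    if method == "complete":
        return max(dists)
    if method == "average":
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, n_clusters, method="single"):
    """Start with one cluster per object and merge the two closest
    clusters until only n_clusters remain (hypothetical helper)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(
                clusters[p[0]], clusters[p[1]], points, method
            ),
        )
        # a < b always holds, so deleting index b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (made up for illustration): two tight groups and one outlier.
points = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.8),
          (5.0, 5.0), (5.0, 5.6), (9.0, 1.0)]
groups = agglomerate(points, n_clusters=3)
print(groups)  # [[0, 1, 2], [3, 4], [5]]
```

Changing `method` to `"complete"` or `"average"` changes how the distance between clusters is defined. On this toy data the three groups stay the same, but on less well-separated data the choice of linkage can change the result noticeably.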
Recommended Duration: 1-1.5 days
- Course notes on statistical techniques
- Datasets to illustrate specific statistical concepts
posted by RD Reeleder
This course was excellent value for the money. Well-structured and with plenty of hands-on opportunities, it is suited both to beginners and to those with some experience in the technique. The instructors were familiar with all the software packages used by the students and were able to offer practical advice on getting the desired output. A very practical course, loaded with information I could put to use right away. Highly recommended.
posted by Ping Qiu
This course is very well structured and instructed. I attended both the PCA and cluster analysis sessions, followed by the workshop. The instructor (Natalie) is very knowledgeable and very good at explaining difficult statistical problems in a simple way. This course is especially suitable for non-statisticians who need to perform hands-on data analysis. It also exposed students to many different popular statistics packages, so you can get a flavor of each of them, which helps me a lot in choosing tools for my future research.