Regression Models for Categorical Data

Linear regression is inappropriate for modeling binary responses such as pass/fail or alive/dead. Learn the principles of logistic regression, its similarities to linear regression, and its specific tools. Good practices for model building are also presented.

Course Details

Learning Objectives

Upon completion of this module, participants will be able to:

  • Understand the contexts in which logistic regression is used
  • Understand why ordinary regression fails when modeling categorical response variables
  • Construct a logistic regression model
  • Assess the goodness-of-fit of the model to the data
  • Identify common issues in logistic regression, diagnose problems and fix them
  • Interpret statistical software output

Target Audience

This module is aimed at all scientific staff who collect categorical data and who must make decisions based on them. The regression techniques covered in this session will be particularly useful for people who deal with qualitative response variables (measurements) in finance, epidemiology, medicine, genetics, social sciences, econometrics and marketing.

Prerequisite

Participants must have attended the prerequisite modules or possess an equivalent background.

Course Outline

Introduction to Logistic Regression

  • Goal: To Study the Relationship between a Categorical Variable and a Set of Explanatory Variables
  • Why Does Ordinary Multiple Linear Regression Fail for the Analysis of a Categorical Response Variable?
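The failure of ordinary linear regression on a binary response can be illustrated with a minimal sketch. The data below are made up for illustration, and the helper `fit_simple_ols` is a hypothetical name, not part of the course material. A straight line is fitted by least squares to a 0/1 outcome, and the fitted values fall outside the [0, 1] range, so they cannot be read as probabilities.

```python
# Illustrative data (invented for this sketch): x is an explanatory
# variable, y is a binary response coded 0 (fail) or 1 (pass).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]

# The smallest fitted value is negative and the largest exceeds 1,
# so the fitted line cannot be interpreted as a probability.
print("smallest fitted value:", min(fitted))
print("largest fitted value:", max(fitted))
```

Other problems with the linear model in this setting, such as non-constant variance of the errors, are not shown in this sketch.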

Refresher on Multiple Linear Regression

  • Definition and Estimation of the Model
  • Interpretation of Model Coefficients
  • Goodness-of-Fit and Validation Techniques

Classical Case: a Binary Response Variable

  • Basic Principle: Modeling the Probability of Observing a Given Value of the Response Variable
  • Example
  • Interpretation of Statistical Software Output: Coefficients and Mathematical Transformations, Odds Ratios, Statistical Testing of Model Coefficients
  • Comparison of Logistic Regression Software Output with Multiple Linear Regression
  • Goodness-of-Fit Measures: Nested Models, Cross-Validation Techniques
  • Using the Model for Predictive Purposes
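The binary case outlined above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data are invented, `fit_logistic` is a hypothetical helper, and the Newton-Raphson fitting loop is one common way to obtain maximum likelihood estimates, not necessarily the procedure used in the course or in any particular statistical package. The sketch models the probability that the response equals 1, then derives the odds ratio for the predictor and the predicted probabilities.

```python
import math

# Illustrative data (invented for this sketch): the two outcome groups
# overlap, so the maximum likelihood estimates are finite.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def predict_probability(b0, b1, xi):
    """Logistic model: P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson maximum likelihood fit for a single predictor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = predict_probability(b0, b1, xi)
            w = p * (1.0 - p)
            # Gradient of the log-likelihood
            g0 += yi - p
            g1 += (yi - p) * xi
            # Information matrix (negative Hessian)
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system for the Newton step
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(x, y)

# exp(b1) is the odds ratio: the multiplicative change in the odds of
# y = 1 for a one-unit increase in x.
odds_ratio = math.exp(b1)
probs = [predict_probability(b0, b1, xi) for xi in x]

print("intercept:", b0)
print("slope:", b1)
print("odds ratio:", odds_ratio)
```

Unlike the fitted values of a linear model, every entry of `probs` lies strictly between 0 and 1, which is what makes the model usable for prediction.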

Overview of the Case of Ordinal and Nominal Response Variables

Practical Considerations

  • Procedures Available in Statistical Software
  • Implementation and Interpretation

Practical Info

Recommended Duration: 1 day

Course materials:

  • Course notes on the statistical techniques
  • Sample datasets

Related Sessions

  • An applied set of modules focusing on the most widely used multivariate methods and their applications in several fields. Learn about the principles of the methods, the data they require, and the information they provide.

  • Learn about preference mapping techniques to explore and understand consumer preferences. Applications dealing with segmentation and the identification of niche markets are discussed. Focus on pitfalls and good practices.

  • Predictive analytics (PA) is on everyone's lips. But what is it really all about? Discover its principles, implementation, typical pitfalls and good practices. Learn about data wrangling and munging, a crucial step in predictive analytics. An overview of the most commonly used models is also presented.

  • The primary goal of this method is to discover which variables discriminate best between two or more known groups in your data. Discriminant analysis may also be used to build predictive analytics models.