Linear regression is not appropriate for modeling binary responses such as pass/fail or alive/dead. Learn the principle of logistic regression, its similarities with linear regression and its specific tools. Good practices for model building are also presented.
Regression Models for Categorical Data
Upon completion of this module, participants will be able to:
- Understand the context of use of logistic regression
- Understand why ordinary regression fails for the modeling of categorical variables
- Construct a logistic regression model
- Assess the goodness-of-fit of the model to the data
- Identify common issues in logistic regression, diagnose problems and fix them
- Interpret statistical software output
This module is aimed at all scientific staff who collect categorical data and who must make decisions based on them. The regression techniques covered in this session will be particularly useful for people who deal with qualitative response variables (measurements) in finance, epidemiology, medicine, genetics, social sciences, econometrics and marketing.
Participants must have attended the following modules.
or possess an equivalent background.
Introduction to Logistic Regression
- Goal: To Study the Relationship between a Categorical Variable and a Set of Explanatory Variables
- Why Does Ordinary Multiple Linear Regression Fail for the Analysis of a Categorical Response Variable?
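As an illustration of the second point, the following minimal sketch fits an ordinary least-squares line to a binary response. The data set (hours of study versus exam outcome) is hypothetical and invented for this example; it is not part of the course materials. The fitted values at both ends of the range fall outside the interval [0, 1], so they cannot be read as probabilities.

```python
# Hypothetical data (assumption): hours of study (x) and exam outcome (y, 1 = pass, 0 = fail).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares for a single explanatory variable.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Fitted values of the straight line at each observed x.
fitted = [intercept + slope * xi for xi in x]

# Keep the fitted values that are not valid probabilities.
out_of_range = [(xi, round(fi, 3)) for xi, fi in zip(x, fitted) if fi < 0 or fi > 1]
print(out_of_range)
```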
Refresher on Multiple Linear Regression
- Definition and Estimation of the Model
- Interpretation of model coefficients
- Goodness-of-Fit and Validation Techniques
Classical Case: a Binary Response Variable
- Basic Principle: Modeling the Probability of Observing a Given Value of the Response Variable
- Interpretation of Statistical Software Output: Coefficients and Mathematical Transformations, Odds Ratios, Statistical Testing of Model Coefficients
- Comparison of Logistic Regression Software Output with Multiple Linear Regression
- Goodness-of-Fit Measures: Nested Models, Cross-Validation Techniques
- Using the Model for Predictive Purposes
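The topics above can be sketched on a small example. The code below is one possible illustration, not the method used in the course: it fits a logistic regression with a single explanatory variable by Newton-Raphson, converts the slope into an odds ratio, compares the fitted model with the nested intercept-only model through a likelihood-ratio test, and predicts a probability for a new observation. The data set and all names (`fit_logistic`, `log_likelihood`, `new_x`) are hypothetical and chosen for this example only.

```python
import math

# Hypothetical data (assumption): hours of study (x) and exam outcome (y, 1 = pass, 0 = fail).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Inverse of the logit transformation: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, iterations=25):
    """Maximum-likelihood fit of logit(p) = b0 + b1 * x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), 2 x 2 and symmetric.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(b0, b1, x, y):
    """Bernoulli log-likelihood of the model at (b0, b1)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(b0 + b1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


b0, b1 = fit_logistic(x, y)

# exp(b1) is the multiplicative change in the odds of passing per extra hour.
odds_ratio = math.exp(b1)

# Nested-model comparison: the intercept-only model predicts the overall pass rate.
p_null = sum(y) / len(y)
ll_null = sum(yi * math.log(p_null) + (1 - yi) * math.log(1.0 - p_null) for yi in y)
ll_full = log_likelihood(b0, b1, x, y)
lr_stat = 2.0 * (ll_full - ll_null)
# Upper tail of a chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

# Prediction for a new, hypothetical observation.
new_x = 6.5
predicted_probability = sigmoid(b0 + b1 * new_x)

print(f"intercept={b0:.3f} slope={b1:.3f} odds ratio={odds_ratio:.3f}")
print(f"LR statistic={lr_stat:.3f} p-value={p_value:.4f}")
print(f"P(pass | x={new_x}) = {predicted_probability:.3f}")
```

Unlike the straight-line fit, every predicted value here lies strictly between 0 and 1, which is why the coefficients are interpreted on the log-odds scale and reported as odds ratios.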
Overview of the Case of Ordinal and Nominal Response Variables
Practical Considerations
- Procedures Available in Statistical Software
- Implementation and Interpretation
Recommended Duration: 1 day
Course materials: