Predictive multivariate analysis techniques
Aims
This course covers predictive modelling with the SAS/STAT software, with particular emphasis on PROC REG for multiple regression analysis.
The ultimate aim is to build a complete predictive process for a continuous target variable, showing how to correctly identify and define the target, select the explanatory variables, analyse multicollinearity among the regressors, assess models, treat missing values and manage large volumes of data.
Who should attend
Statistical analysts, data mining experts and business users. The topics covered focus on marketing databases, credit risk assessment, default detection and predictive modelling applications in general.
Prerequisites
Basic experience with the SAS language and a grounding in the fundamentals of statistics are required. Basic data analysis experience is recommended.
Course outline
Database creation
- Defining the phenomenon to be analysed (time window of the analysis)
- Identifying data sources
- Customer table design and construction
- TARGET variable creation
- Development sample determination (TRAINING/VALIDATION)
- Feature analysis (missing values, outliers, etc.)
Multiple linear regression
- Underlying model assumptions
- Parameter estimates
- Model significance
- Significance of individual regressors
- Fit diagnostics
- Residual analysis
- Influence analysis
- Variable interactions
- Multicollinearity
- Selection procedures
Duration
The course lasts 2 days.