Module Code - Title:

MS6022 - STATISTICAL LEARNING

Year Last Offered:

2025/6

Hours Per Week:

Lecture: 2
Lab: 1
Tutorial: 0
Other: 0
Private: 7

Credits:

6

Grading Type:

N

Prerequisite Modules:

Rationale and Purpose of the Module:

To ground students in the applied multivariate analysis techniques used in data science and statistical learning. The module introduces the mathematical and statistical ideas behind state-of-the-art techniques used in data science. These include supervised and unsupervised learning techniques such as classification methods (discriminant analysis, k-nearest neighbours, classification trees, logistic regression, support vector machines), clustering methods (hierarchical clustering, k-means clustering, model-based clustering) and dimension reduction methods (principal components analysis, factor analysis). Students will learn how to implement these techniques in R through practical applications. They will also develop their skills in writing reports for both technical and non-technical audiences.

Syllabus:

1. Introduction to multivariate data
2. Mathematical necessities
3. Discriminant analysis (linear and quadratic, k-NN)
4. Logistic regression
5. Classification and regression trees
6. Principal components analysis
7. Factor analysis
8. Multidimensional scaling
9. Cluster analysis (hierarchical, k-means, model-based)
10. Non-linear modelling

Learning Outcomes:

Cognitive (Knowledge, Understanding, Application, Analysis, Evaluation, Synthesis)

On completion of this module students will be able to:
1. Recognise the underlying structure of complex multivariate data sets in data science applications.
2. Demonstrate an understanding of the mathematical and statistical concepts underpinning key multivariate analysis techniques in data science and statistical learning.
3. Identify the most appropriate method to implement in a wide variety of settings.
4. Critically appraise and demonstrate an understanding of the strengths and limitations of each method.
5. Apply each method to real-world data using R and RStudio.
6. Compile data analysis reports, interpreting the results and presenting key findings to technical and non-technical stakeholders in industrial or commercial settings.

Affective (Attitudes and Values)

On completion of this module students will:
1. Display sharp critical appraisal skills with an appreciation of the appropriate analytics tools to use in a variety of data science applications.
2. Formulate a well-constructed rationale to defend and justify any decisions made.
3. Challenge incorrect applications of methods.

Psychomotor (Physical Skills)

NA

How the Module will be Taught and what will be the Learning Experiences of the Students:

The module will be taught through lectures and computer labs. It will contribute towards graduating MSc students who are knowledgeable (able to bring their discipline knowledge to bear on real-world problems), responsible (able to challenge and question the appropriate use of data and statistical analysis methods), proactive (making active use of data and research to drive improvements and positive change) and articulate (able to report their findings to stakeholders with technical and non-technical backgrounds).

Research Findings Incorporated into the Syllabus (If Relevant):

Prime Texts:

James, G., Witten, D., Hastie, T., Tibshirani, R. (2018) An Introduction to Statistical Learning with Applications in R, Springer
Hastie, T., Tibshirani, R., Friedman, J. (2008) The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer

Other Relevant Texts:

Programme(s) in which this Module is Offered:

Semester(s) Module is Offered:

Spring

Module Leader:

shirin.moghaddam@ul.ie