Essential Statistical Data Analysis

for Doctoral Studies Across Disciplines

Author

Alfonso Iodice D’Enza

Published

May 4, 2026

Course description

This course introduces PhD students to an applied statistical learning workflow that moves from data exploration to clustering and classification.

The PhD programmes involved are heterogeneous, spanning engineering, economics and management, education and sport, and the humanities, so the course does not aim to develop statistical theory in full generality. Instead, it focuses on helping students acquire a practical and transferable workflow for importing, manipulating, visualising, transforming, modelling, and critically interpreting data.
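As a rough illustration of the manipulate-and-summarise part of this workflow, the sketch below uses the `penguins` data from the palmerpenguins package, one of the packages installed for the course. The choice of variables and the object name `mass_by_species` are assumptions made for this example, not part of the course material.

```r
library(dplyr)

# Use the palmerpenguins copy of the data explicitly
penguins <- palmerpenguins::penguins

mass_by_species <- penguins %>%
  filter(!is.na(body_mass_g)) %>%       # drop rows with missing body mass
  group_by(species) %>%                 # one group per penguin species
  summarise(
    n = n(),                            # number of penguins in the group
    mean_mass_g = mean(body_mass_g)     # average body mass in grams
  )

mass_by_species
```

The result is a small table with one row per species, which is the kind of summary that usually precedes any plotting or modelling step.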

The main theme of this edition is the transition from data exploration to clustering and classification. Students first work with ordinary rectangular datasets and then with non-standard data such as text, to see how raw information can be transformed into analysable variables. Clustering is introduced as a way to explore hidden structure in data when no labels are available, while classification is introduced as a supervised learning task in which known labels are used to build predictive rules.
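One way to picture how text becomes analysable variables is to split documents into words and count them. The sketch below does this with `unnest_tokens()` from tidytext on two made-up sentences. The toy data and the names `docs`, `word_counts` and `doc_term` are hypothetical and serve only as an illustration.

```r
library(dplyr)
library(tidytext)

# Two toy documents (invented for this example)
docs <- tibble(
  doc_id = c("a", "b"),
  text = c(
    "Clustering groups similar observations.",
    "Classification predicts known groups."
  )
)

# One row per document-word pair, with its frequency
word_counts <- docs %>%
  unnest_tokens(word, text) %>%   # lower-cases and strips punctuation
  count(doc_id, word)

# Wide layout: one row per document, one column per word
doc_term <- word_counts %>%
  tidyr::pivot_wider(names_from = word, values_from = n, values_fill = 0)

doc_term
```

Each document is now a row of numeric word counts, so it can be handled like any ordinary rectangular dataset.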

The course combines conceptual lectures with computer-lab sessions. Each meeting includes approximately two hours of theory and two hours of hands-on work in R.

Course schedule

Date	Time	Room	Main topic
Monday, 4 May 2026	11:00–13:00 and 14:00–16:00	B.0012A	Data exploration, manipulation, visualisation, and non-standard data
Monday, 18 May 2026	11:00–13:00 and 14:00–16:00	B.01.08	Clustering: discovering structure without labels
Monday, 25 May 2026	11:00–13:00 and 14:00–16:00	B.01.08	Classification: predicting known groups

Software environment

The course uses R together with the RStudio IDE, which is developed by Posit.

Students can download R from CRAN and RStudio Desktop from Posit.

Before the first lab

Please install R and RStudio before the first meeting.

In R, install the main packages used in the course by running:

install.packages(c(
  "tidyverse",
  "palmerpenguins",
  "tidytext"
))
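To confirm that the installation worked, something along the following lines can be run afterwards. This is a minimal check written for this page; the variable names `pkgs`, `installed` and `missing_pkgs` are assumptions of the sketch.

```r
# Packages listed in the installation step
pkgs <- c("tidyverse", "palmerpenguins", "tidytext")

# TRUE for each package that R can find, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed

# Report anything that still needs installing
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All course packages are available.")
}
```

If any package is reported as missing, re-running `install.packages()` for that package is usually enough.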