(see, e.g., Murugesan, Cho, and Tortora, 2021, for a comparative review)
association-based measures aim to go beyond match/mismatch of categories
a non-parametric approach is suitable for non-convex/oddly shaped clusters
computationally demanding (but it can be made bearable)
convex combination of the continuous and categorical distances
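As a minimal sketch of the convex combination above: the two distance matrices are computed separately, rescaled, and mixed with a single weight. The specific choices here are assumptions for illustration and are not taken from the slides: Euclidean distance for the continuous block, simple matching for the categorical block, rescaling by the maximum entry, and the hypothetical names `mixed_distance` and `alpha`.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two continuous rows (assumed choice)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def simple_matching(x, y):
    """Share of mismatching categories between two rows (assumed choice)."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


def pairwise(rows, dist):
    """Full n x n matrix of pairwise distances."""
    n = len(rows)
    return [[dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def rescale(D):
    """Divide by the largest entry so both blocks live on [0, 1]."""
    m = max(max(row) for row in D)
    return [[d / m for d in row] for row in D] if m > 0 else D


def mixed_distance(cont_rows, cat_rows, alpha=0.5):
    """Convex combination: alpha * D_cont + (1 - alpha) * D_cat.

    alpha is a hypothetical tuning weight in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    D_cont = rescale(pairwise(cont_rows, euclidean))
    D_cat = rescale(pairwise(cat_rows, simple_matching))
    n = len(D_cont)
    return [
        [alpha * D_cont[i][j] + (1 - alpha) * D_cat[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy data: three observations, two continuous and two categorical variables.
cont = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
cat = [["a", "x"], ["b", "y"], ["a", "y"]]
D = mixed_distance(cont, cat, alpha=0.5)
```

With `alpha=1` only the continuous block contributes, and with `alpha=0` only the categorical one. An association-based categorical distance could replace `simple_matching` without changing the rest. The resulting matrix can then be turned into an affinity matrix (a Gaussian kernel is one common choice) before the spectral step.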
Le, S. Q. and T. B. Ho (2005). “An association-based dissimilarity measure for categorical data”. In: Pattern Recognition Letters 26.16, pp. 2549-2557.
Mbuga, F. and C. Tortora (2021). “Spectral Clustering of Mixed-Type Data”. In: Stats 5.1, pp. 1-11.
Murugesan, N., I. Cho, and C. Tortora (2021). “Benchmarking in cluster analysis: a study on spectral clustering, DBSCAN, and K-Means”. In: Conference of the International Federation of Classification Societies. Springer, pp. 175-185.
Ng, A., M. Jordan, and Y. Weiss (2001). “On spectral clustering: analysis and an algorithm”. In: Advances in Neural Information Processing Systems 14, p. 849.
van de Velden, M., A. Iodice D’Enza, A. Markos, et al. (2023). “A general framework for implementing distances for categorical variables”. Submitted to Pattern Recognition, pp. 1-21.
van de Velden, M., A. Iodice D’Enza, and F. Palumbo (2017). “Cluster correspondence analysis”. In: Psychometrika 82.1, pp. 158-185.
slides available at https://alfonsoiodicede.github.io/
SIS 2023 - Statistical Learning, Sustainability and Impact Evaluation