New Chemometrics Project

I will be using multivariate data analysis (mostly PCA and factor analysis) to study toxins in floor dust samples and their health impact, based on data from a birth cohort study of rural children. The data set is a matrix of 228 observations of 379 variables, mostly concentrations of metabolites from bacterial and fungal species.

The main challenges are that many of the concentrations are below the detection limit and that the data set includes some dichotomous variables that cannot easily be transformed for PCA.

I will be experimenting with substituting half the limit of detection (1/2 LOD; as a baseline only) and with Kaplan-Meier treatment (better!) of censored data (see Helsel, 2005) to replace the zeros initially recorded for non-detects with estimates.
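A minimal sketch of the two treatments for a single variable, assuming NumPy is available. The function names, the `detected` flag array and the toy numbers are hypothetical and not part of the project data. The Kaplan-Meier part follows the usual "flipping" trick for left-censored data (subtract every value from a constant so that non-detects become right-censored) and returns only the estimated mean; it uses Efron's convention, which effectively places the lowest non-detects at their detection limit, so the result is biased slightly upward.

```python
import numpy as np

def half_lod_substitute(values, detected, lod):
    """Baseline: replace every non-detect with LOD/2.

    values   -- measured concentrations (entries for non-detects are ignored)
    detected -- boolean flags, True where the value is above the LOD
    lod      -- detection limit (scalar, or array broadcastable to values)
    """
    values = np.asarray(values, dtype=float)
    detected = np.asarray(detected, dtype=bool)
    return np.where(detected, values, np.asarray(lod, dtype=float) / 2.0)

def km_mean_left_censored(values, detected):
    """Kaplan-Meier estimate of the mean for left-censored data.

    values   -- concentration for detects, the detection limit for non-detects
    detected -- boolean flags, True where the value is a real measurement
    """
    values = np.asarray(values, dtype=float)
    detected = np.asarray(detected, dtype=bool)

    # Flip: left-censored concentrations become right-censored "times".
    flip = values.max() + 1.0  # any constant above the maximum works
    t = flip - values

    surv = 1.0    # current value of the survival function
    prev_t = 0.0
    area = 0.0    # area under the survival curve = mean of flipped data
    for u in np.unique(t[detected]):
        area += surv * (u - prev_t)
        at_risk = np.sum(t >= u)
        events = np.sum((t == u) & detected)
        surv *= 1.0 - events / at_risk
        prev_t = u

    # If the largest flipped value is censored, some mass is left over;
    # Efron's convention integrates it up to the largest flipped value.
    area += surv * (t.max() - prev_t)
    return flip - area

# Toy example: two non-detects at an assumed LOD of 0.5, three detects.
conc = np.array([0.5, 0.5, 1.0, 2.0, 4.0])
flag = np.array([False, False, True, True, True])
baseline = half_lod_substitute(conc, flag, lod=0.5)
km_mean = km_mean_left_censored(conc, flag)
```

With no non-detects at all, the Kaplan-Meier mean reduces to the ordinary sample mean, which is a convenient sanity check.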

Furthermore, I need to study how to transform the non-numeric part of my data matrix (e.g., using the Filmer-Pritchett procedure or a polychoric PCA) so that my dichotomous variables are treated properly.
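As a rough illustration of the Filmer-Pritchett idea (run an ordinary PCA on standardized 0/1 indicators and use the first principal component as an index), here is a minimal NumPy sketch. The function name and the toy matrix are hypothetical. A polychoric PCA would differ in one step: the Pearson correlation matrix computed below would be replaced by a tetrachoric/polychoric correlation matrix, which is not shown here.

```python
import numpy as np

def first_pc_index(binary):
    """First-principal-component index from a matrix of 0/1 indicators.

    binary -- array of shape (n_samples, n_variables) containing 0/1 values
    Returns (scores, weights, explained_fraction).
    """
    X = np.asarray(binary, dtype=float)
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    if np.any(sd == 0):
        raise ValueError("constant columns cannot be standardized")

    Z = (X - mu) / sd                        # standardize each indicator
    corr = (Z.T @ Z) / (X.shape[0] - 1)      # Pearson correlation matrix

    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    weights = eigvecs[:, -1]                 # loading vector of the first PC
    if weights.sum() < 0:                    # fix the arbitrary sign
        weights = -weights

    scores = Z @ weights                     # one index value per observation
    explained = eigvals[-1] / eigvals.sum()
    return scores, weights, explained

# Toy data: six observations of three dichotomous variables (made up).
toy = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
])
scores, weights, explained = first_pc_index(toy)
```

Because the Pearson correlation between two dichotomous variables is limited by their marginal frequencies, this version tends to understate the underlying associations, which is the usual argument for the polychoric alternative.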

Published by

greg

Atmospheric chemistry researcher and university teacher. Data analysis/chemometrics specialist (PCA, PCR, Cluster analysis, SOM)
