Application of Clustering Techniques to the Classification of Marine Phytoplankton

  • Samantha Ann Hardy

    Student thesis: Master's Thesis

    Abstract

    An introduction to classification methods is given, with sections on flow cytometry, clustering, and related topics. This includes a literature survey of research into fuzzy clustering algorithms, with a section specifically related to flow cytometry. Details are given about the data sets, the software used, and the clustering algorithms investigated. The flow cytometry data was collected for marine phytoplankton. Two groups of data are used, one containing species that overlapped each other and one containing independent (non-overlapped) species. Ten clusters of 1000 records each are collated for each group, each record comprising seven variables. Six clustering algorithms (Fuzzy K-Means, Adaptive Distances, Generalised Distances, Maximum Likelihood, Minimum Total Volume, and Sum of all Normalised Determinants) are used to cluster the flow cytometry data. The results are compared for each group of data and each algorithm, based on the number of clusters produced and the relationships of the phytoplankton species placed in each cluster. Conclusions are drawn about the suitability of each algorithm for clustering phytoplankton flow cytometry data, followed by a discussion of some flow cytometry data issues. Various algorithms that could be investigated in future research are also discussed.
    Date of Award: Jul 2004
    Original language: English
    Supervisor: Colin Morris
