Cluster Analysis by descriptors
A small modification of the workflow from the previous post, this time to k-means cluster the chemicals.
These nodes are carried over from the previous workflow:
File Reader: Read csv file
Column Filter: Filter out non-SMILES parameters
RDKit Descriptor Calculation: Used All Descriptors!!!!
Normalizer: Z-score (Gaussian) normalization, though min-max (0 to 1) could be good too
PCA: Reduce to 3 dimensions
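To make the Normalizer choice a bit more concrete, here is a minimal Python sketch of the two options mentioned above: Gaussian (z-score) scaling and 0-to-1 (min-max) scaling of a single descriptor column. This is not what KNIME runs internally. The function names, the use of the population standard deviation, and the `mol_wt` values are all assumptions made up for illustration.

```python
from statistics import mean, pstdev

def z_score(values):
    """Rescale a column to mean 0 and standard deviation 1 (Gaussian-style)."""
    mu = mean(values)
    sigma = pstdev(values)  # population std dev; an assumption, KNIME may differ
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale a column linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical descriptor column (made-up numbers, not from the real data set).
mol_wt = [100.0, 200.0, 300.0, 400.0, 500.0]

z = z_score(mol_wt)   # centred on 0, spread measured in standard deviations
mm = min_max(mol_wt)  # squeezed into the range 0 to 1
```

Either way the point is the same: descriptors with large raw values (like molecular weight) should not dominate the PCA just because of their units.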
And two additions:
k-Means: Into 10 clusters for a trial
Joiner: Join by row ID so that I can add colour to 3D plot
Then set the 3D plot to colour by the Cluster column
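For anyone curious what the k-Means and Joiner steps boil down to, here is a rough pure-Python sketch. It assumes the clustering runs on the 3D PCA coordinates (the workflow description does not say which table feeds k-Means), uses the first k rows as starting centroids (a simplification; KNIME's initialisation may differ), and works on six made-up rows with k=2 instead of the real data with 10 clusters. The names `kmeans`, `pca_rows` and `joined` are hypothetical.

```python
def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    # Assumption: start from the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up PCA output: row ID -> (PC1, PC2, PC3).
pca_rows = {
    "Row0": (0.0, 0.0, 0.0),
    "Row1": (5.0, 5.0, 5.0),
    "Row2": (0.2, 0.1, 0.0),
    "Row3": (5.1, 4.9, 5.0),
    "Row4": (0.1, 0.0, 0.2),
    "Row5": (4.9, 5.0, 5.2),
}

row_ids = list(pca_rows)
labels = kmeans([pca_rows[r] for r in row_ids], k=2)
cluster_by_row = dict(zip(row_ids, labels))

# "Joiner" step: inner join on row ID, so each 3D point carries its cluster
# label and the plot can be coloured by it.
joined = {
    rid: (*pca_rows[rid], f"cluster_{cluster_by_row[rid]}")
    for rid in pca_rows
    if rid in cluster_by_row
}
```

The join matters because the clustering output and the plotting table are separate; matching on row ID is what lets the plot pick up the Cluster column.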
Hmm... the clusters seem to group points that sit near each other. Not bad again.
I thought PCA might cause some weird things, but it looks like it didn't. A nice, simple clustering workflow, I guess... KNIME is so cool.