Cluster Analysis by descriptors

A small modification to the workflow from the previous post, to k-means cluster the chemicals.

 

These nodes are carried over from the previous workflow:

File Reader: Read the CSV file

Column Filter: Filter out the non-SMILES parameters

RDKit Descriptor Calculation: Used all descriptors!

Normalizer: Gaussian (z-score), but min-max scaling to 0-1 could be good too

PCA: Reduce to 3 dimensions
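
As a rough illustration of the two Normalizer options mentioned above, here is a minimal Python sketch of z-score (Gaussian) and min-max scaling on a single descriptor column. This is not what KNIME runs internally. The column values are made up, and using the population standard deviation is an assumption (KNIME may use a different convention).

```python
from statistics import mean, pstdev


def z_score(column):
    """Scale a column to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(x - mu) / sigma for x in column]


def min_max(column):
    """Scale a column linearly so its values fall between 0 and 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical descriptor column (made-up values, for illustration only)
descriptor_column = [46.07, 78.11, 180.16, 151.16]

z = z_score(descriptor_column)
mm = min_max(descriptor_column)
```

Either way, the point is to stop descriptors with large numeric ranges from dominating the PCA and the distance calculations that follow.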

 

And two additions:

k-Means: 10 clusters as a first trial

Joiner: Join by row ID so that I can add colour to the 3D plot
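
For anyone curious what these two added steps amount to, here is a small Python sketch: a plain k-means (Lloyd's algorithm) on 3D points, followed by a join on row ID. It is only an illustration and not KNIME's implementation. The row IDs, coordinates, the `k_means` helper and the `cluster_N` label format are all hypothetical, and it uses 2 clusters on 4 points just to stay short.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    assignments = None
    for _ in range(iterations):
        # Assign each point to its nearest centroid
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the clustering has converged
        assignments = new_assignments
        # Move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments


# Hypothetical PCA output keyed by row ID (made-up coordinates)
pca_rows = {
    "Row0": (0.1, 0.2, 0.0),
    "Row1": (0.0, 0.1, 0.1),
    "Row2": (5.0, 5.1, 4.9),
    "Row3": (5.2, 4.9, 5.0),
}

row_ids = list(pca_rows)
labels = k_means([pca_rows[r] for r in row_ids], k=2)

# "Joiner" step: attach the cluster label to each PCA row via its row ID
clusters = dict(zip(row_ids, labels))
joined = {r: pca_rows[r] + (f"cluster_{clusters[r]}",) for r in row_ids}
```

The joined rows carry both the 3D coordinates and a cluster label, which is what the plot needs in order to colour points by cluster.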

 

Then set the 3D plot to colour by the Cluster column

 

[Figure: 3D plot coloured by cluster]

 

Hmm... points in the same cluster seem to sit close to each other in the plot. Not bad again.

I thought PCA might cause some weird things, but it looks like it didn't. A nice, simple clustering workflow, I guess... KNIME is so cool.