cluster.method.base

class cluster.method.base.BaseClusterMethod(input, distance_function, progress_callback=None)

Bases: object

The base class of all clustering methods.

Parameters:
 input – a list of objects.
 distance_function – a function returning the distance, or the opposite of similarity (distance = -similarity), of two items from the input. In other words, the more closely two items are related, the smaller this value needs to be, with 0 meaning they are exactly the same.

Note

The distance function should always return the absolute distance between two given items of the list, so it must be symmetric: swapping the arguments must not change the result. For example:

distance(input[1], input[4]) = distance(input[4], input[1])

This is very important for the clustering algorithm to work! Naturally, the data returned by the distance function MUST be a comparable datatype, so that two results can be compared with < or >. The simplest examples are floats or ints, but any comparable type is fine.
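
The requirements above can be checked before any clustering is attempted. The following is only a minimal sketch and does not use the library itself: the names absolute_difference and is_symmetric, and the sample values, are hypothetical helpers introduced for illustration. It assumes numeric input items.

```python
from itertools import combinations


def absolute_difference(x, y):
    """Hypothetical distance function: 0 for equal numbers, larger when further apart."""
    return abs(x - y)


def is_symmetric(items, distance_function):
    """Hypothetical helper: True if distance(a, b) == distance(b, a) for every pair."""
    return all(
        distance_function(a, b) == distance_function(b, a)
        for a, b in combinations(items, 2)
    )


# Sample input chosen for illustration only.
values = [12, 34, 23, 32, 46, 96, 13]

# abs(x - y) does not depend on argument order, so the check passes.
print(is_symmetric(values, absolute_difference))

# A signed difference changes sign when the arguments are swapped, so it fails.
print(is_symmetric(values, lambda x, y: x - y))
```

A function such as absolute_difference returns ints or floats, which are comparable with < and >, so it also meets the datatype requirement. A signed difference would not be a suitable distance function because it is not symmetric.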

data

Returns the data that is currently being processed.

raw_data

Returns the raw data (the data before it is clustered).

topo()

Returns the structure (topology) of the cluster.

See topology() for more information.