API Reference

class kluster_fudge.InitMethod(value)

Bases: Enum

Enumeration of the supported centroid initialization methods.

CAO = 2
HUANG = 1
RAND = 0

class kluster_fudge.KModes(n_clusters: int = 8, n_init: int = 10, max_iter: int = 100, init_method: str = 'cao', dist_metric: str = 'hamming', random_state: int = 42)

Bases: object

fit(X: numpy.typing.ArrayLike) → None

Fit the model to the input data.

Parameters:

X – (npt.ArrayLike) Data array (n_samples, n_features)

Returns:

None

fit_predict(X: numpy.typing.ArrayLike) → numpy.typing.NDArray[numpy.int64]

Fit the model to the input data and return the cluster labels.

Parameters:

X – (npt.ArrayLike) Data array (n_samples, n_features)

Returns:

(npt.NDArray[np.int64]) Labels array (n_samples,)

predict(X: numpy.typing.ArrayLike) → numpy.typing.NDArray[numpy.int64]

Predict the cluster labels for the input data.

Parameters:

X – (npt.ArrayLike) Data array (n_samples, n_features)

Returns:

(npt.NDArray[np.int64]) Labels array (n_samples,)
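To make the fit / predict contract above concrete, here is a minimal NumPy sketch of a k-modes loop. It is an illustration only, not the kluster_fudge implementation: the names kmodes_sketch and column_modes are hypothetical, random row selection stands in for the documented 'cao' default, and raw Hamming mismatch counts are assumed as the distance.

```python
import numpy as np


def column_modes(rows):
    """Return the most frequent value in each column of a 2-D integer array."""
    modes = np.empty(rows.shape[1], dtype=rows.dtype)
    for j in range(rows.shape[1]):
        values, counts = np.unique(rows[:, j], return_counts=True)
        modes[j] = values[np.argmax(counts)]  # ties go to the smallest value
    return modes


def kmodes_sketch(X, n_clusters=2, max_iter=100, random_state=42):
    """Illustrative k-modes loop (hypothetical helper, not the library code)."""
    X = np.asarray(X, dtype=np.int64)
    rng = np.random.default_rng(random_state)
    # Assumption: random initialization, picking distinct row indices as modes.
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    labels = np.full(len(X), -1, dtype=np.int64)
    for _ in range(max_iter):
        # Hamming distance: count mismatched features per (sample, centroid).
        dist = (X[:, None, :] != centroids[None, :, :]).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members):  # keep the old mode if a cluster is empty
                centroids[k] = column_modes(members)
    return labels, centroids


X = [[0, 0, 0], [0, 0, 1], [0, 0, 0], [5, 5, 5], [5, 5, 4], [5, 5, 5]]
labels, modes = kmodes_sketch(X, n_clusters=2)
print(labels.shape, modes.shape)
```

The sketch returns a labels array of shape (n_samples,), matching the return shape documented for fit_predict and predict. A single random start can settle in a poor partition, which is presumably what the n_init parameter is meant to mitigate.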

class kluster_fudge.KModesGPU(n_clusters: int = 8, n_init: int = 10, max_iter: int = 100, init_method: str = 'cao', dist_metric: str = 'hamming', random_state: int = 42, device: str | None = None)

Bases: KModes

fit(X: numpy.typing.ArrayLike) → None

Fit the model to the input data using GPU acceleration.

Parameters:

X – (npt.ArrayLike) Data array (n_samples, n_features)

Returns:

None

Distance Metrics

class kluster_fudge.dist.DistanceMetrics(value)

Bases: Enum

Enumeration of the supported distance metrics.

HAMMING = 'hamming'
JACCARD = 'jaccard'
NG = 'ng'

kluster_fudge.dist.distance(X: np.ndarray, centroids: np.ndarray, metric: DistanceMetrics, labels: npt.NDArray[np.int64] | None = None) → npt.NDArray[np.float64]

Compute distance between X and centroids using the specified metric.

Parameters:
  • X – (npt.NDArray[np.int64]) Data array (n_samples, n_features)

  • centroids – (npt.NDArray[np.int64]) Centroids array (n_clusters, n_features)

  • metric – (DistanceMetrics) Distance metric to use

  • labels – (npt.NDArray[np.int64] | None) Labels array (n_samples,), used by the 'ng' distance metric

Returns:

(npt.NDArray[np.float64]) Distance matrix (n_samples, n_clusters)
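The shape contract above can be illustrated with a short NumPy sketch. It is not the library's code: DistanceMetrics here is a local stand-in that mirrors the documented member values, distance_sketch is a hypothetical name, only the Hamming case is covered, and Hamming distance is assumed to be the raw count of mismatched features.

```python
from enum import Enum

import numpy as np


class DistanceMetrics(Enum):
    """Local stand-in mirroring the documented member values."""
    HAMMING = "hamming"
    JACCARD = "jaccard"
    NG = "ng"


def distance_sketch(X, centroids, metric):
    """Return an (n_samples, n_clusters) float64 matrix; Hamming only."""
    if metric is not DistanceMetrics.HAMMING:
        raise NotImplementedError(f"{metric.value!r} is not covered by this sketch")
    # Broadcast (n_samples, 1, n_features) against (1, n_clusters, n_features).
    mismatches = X[:, None, :] != centroids[None, :, :]
    return mismatches.sum(axis=2).astype(np.float64)


X = np.array([[0, 1, 2], [1, 1, 1]], dtype=np.int64)
centroids = np.array([[0, 1, 2], [1, 1, 2]], dtype=np.int64)
metric = DistanceMetrics("hamming")  # look a member up by its string value
D = distance_sketch(X, centroids, metric)
nearest = D.argmin(axis=1)  # index of the closest centroid per sample
print(D.shape, nearest)
```

Taking argmin along axis 1 of the returned matrix gives one cluster index per sample, which is the shape of the labels array used elsewhere in this reference.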