Technical docs
Here is the functional diagram of the main objects of this component, as well as their technical documentation:
Functional diagram of the topological anomaly detection scheme.
- class tdaad.anomaly_detectors.TopologicalAnomalyDetector(window_size: int = 100, step: int = 5, tda_max_dim: int = 1, n_centers_by_dim: int = 5, support_fraction: float | None = None, contamination: float = 0.1, random_state: int | RandomState | None = 42)
Anomaly detection for multivariate time series using topological embeddings and robust covariance estimation.
This detector extracts topological features from sliding windows of time series data and uses a robust Mahalanobis distance (via EllipticEnvelope) to score anomalies.
Read more in the User Guide.
- Parameters:
window_size (int, default=100) – Sliding window size for extracting time series subsequences.
step (int, default=5) – Step size between consecutive windows.
tda_max_dim (int, default=1) – Maximum homology dimension used for topological feature extraction.
n_centers_by_dim (int, default=5) – Number of k-means centers per topological dimension (for vectorization).
support_fraction (float or None, default=None) – Proportion of data to use for robust covariance estimation. If None, it is computed automatically.
contamination (float, default=0.1) – Expected proportion of anomalies in the data, used to compute the decision threshold.
random_state (int, RandomState instance, or None, default=42) – Controls the randomness of the topological embedding and the robust estimator.
- topological_embedding_
TopologicalEmbedding transformer object, fitted when fit is called.
- Type:
object
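The scoring and thresholding described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: it swaps the robust covariance estimate used by EllipticEnvelope for the plain empirical mean and covariance, and the feature matrix is hypothetical random data standing in for the topological embedding. It follows the scikit-learn convention in which scores are negative squared Mahalanobis distances and the offset is the `contamination` percentile of the training scores.

```python
import numpy as np


def mahalanobis_scores(features, mean, cov):
    """Negative squared Mahalanobis distance (higher means more normal)."""
    centered = features - mean
    precision = np.linalg.inv(cov)
    # Row-wise quadratic form: centered[i] @ precision @ centered[i]
    d2 = np.einsum("ij,jk,ik->i", centered, precision, centered)
    return -d2


rng = np.random.default_rng(0)
# Hypothetical embedding: 200 windows x 4 features (an assumption for illustration).
features = rng.normal(size=(200, 4))
features[-5:] += 8.0  # shift the last five windows far from the rest

# Plain (non-robust) estimates, used here instead of the robust estimator.
mean = features.mean(axis=0)
cov = np.cov(features, rowvar=False)

scores = mahalanobis_scores(features, mean, cov)

contamination = 0.1
# The offset is the contamination percentile of the scores.
offset = np.percentile(scores, 100.0 * contamination)
decision = scores - offset               # negative values are flagged
labels = np.where(decision < 0, -1, 1)   # -1 = anomaly, 1 = normal
```

With `contamination=0.1`, about 10% of the windows receive the label -1, including the five shifted ones. A robust estimator would be less influenced by those shifted windows when estimating the covariance, which is the motivation for using one.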
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from tdaad.anomaly_detectors import TopologicalAnomalyDetector
>>> n_timestamps = 1000
>>> n_sensors = 20
>>> timestamps = pd.to_datetime('2024-01-01', utc=True) + pd.Timedelta(1, 'h') * np.arange(n_timestamps)
>>> X = pd.DataFrame(np.random.random(size=(n_timestamps, n_sensors)), index=timestamps)
>>> X.iloc[n_timestamps//2:,:10] = -X.iloc[n_timestamps//2:,10:20]
>>> detector = TopologicalAnomalyDetector(n_centers_by_dim=2, tda_max_dim=1).fit(X)
>>> anomaly_scores = detector.score_samples(X)
>>> decision = detector.decision_function(X)
>>> anomalies = detector.predict(X)
- class tdaad.topological_embedding.TopologicalEmbedding(window_size: int = 40, step: int = 5, tda_max_dim: int = 2, n_centers_by_dim: int = 5)
Topological embedding for multiple time series.
Slices time series into smaller time series windows, forms an affinity matrix on each window and applies a Rips procedure to produce persistence diagrams for each affinity matrix. Then uses Atol [ref:Atol] on each dimension through the gudhi.representation.Archipelago representation to produce topological vectorization.
Read more in the User Guide.
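As a rough illustration of the first two steps (affinity matrix, then Rips persistence), the sketch below works on a single window using only NumPy. Two points are assumptions, not taken from the documentation above: the affinity is taken to be the correlation-based dissimilarity `1 - corr` between sensors, and only dimension 0 is computed. In a Rips filtration every point is born at 0 and connected components merge along minimum-spanning-tree edges, so the finite 0-dimensional death times are the MST edge weights. Higher dimensions require a dedicated library such as gudhi and are not covered here.

```python
import numpy as np


def correlation_dissimilarity(window):
    """Pairwise sensor dissimilarity 1 - corr for a (time, sensors) window."""
    corr = np.corrcoef(window, rowvar=False)
    dist = 1.0 - corr
    np.fill_diagonal(dist, 0.0)
    return dist


def h0_deaths(dist):
    """Finite death times of the 0-dimensional Rips classes (all born at 0).

    Uses Kruskal's algorithm with a union-find structure: each edge that
    merges two components kills one class at that edge's weight.
    """
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(float(weight))
    return deaths  # n - 1 values; one class never dies


rng = np.random.default_rng(0)
window = rng.random(size=(40, 5))                      # hypothetical window: 40 steps, 5 sensors
window[:, 1] = window[:, 0] + 0.01 * rng.random(40)    # sensors 0 and 1 nearly identical

dist = correlation_dissimilarity(window)
deaths = h0_deaths(dist)
```

Here the first death time is close to 0 because sensors 0 and 1 are almost perfectly correlated and merge immediately, while the remaining, nearly uncorrelated sensors merge at dissimilarities close to 1.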
- Parameters:
window_size (int, default=40) – Size of the sliding window used to extract subsequences as input to named_pipeline.
step (int, default=5) – Step size between consecutive windows.
n_centers_by_dim (int, default=5) – The number of centroids to generate per dimension for vectorizing topological features. The resulting embedding will have total dimension <= tda_max_dim * n_centers_by_dim; it may be smaller because of the KMeans algorithm in the Archipelago step.
tda_max_dim (int, default=2) – The maximum homology dimension used for topological feature extraction.
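To make the roles of `window_size` and `step` concrete, here is a small NumPy sketch of sliding-window extraction. The helper `sliding_windows` is hypothetical and assumes that only full windows are kept; the library may handle the series boundaries or timestamps differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_windows(X, window_size=40, step=5):
    """Return an array of shape (n_windows, window_size, n_sensors)."""
    # sliding_window_view appends the window axis last:
    # shape (n_timestamps - window_size + 1, n_sensors, window_size)
    views = sliding_window_view(X, window_shape=window_size, axis=0)
    # Keep every `step`-th window and reorder axes to (window, time, sensor).
    return views[::step].transpose(0, 2, 1)


X = np.arange(100 * 5, dtype=float).reshape(100, 5)  # 100 timestamps, 5 sensors
windows = sliding_windows(X, window_size=40, step=5)

# Number of full windows: (n_timestamps - window_size) // step + 1
n_windows = (X.shape[0] - 40) // 5 + 1  # 13
```

Under these assumptions, a series of 100 timestamps yields 13 windows, so the embedding would produce one feature vector per window rather than one per timestamp.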
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from tdaad.topological_embedding import TopologicalEmbedding
>>> n_timestamps = 100
>>> n_sensors = 5
>>> timestamps = pd.to_datetime('2024-01-01', utc=True) + pd.Timedelta(1, 'h') * np.arange(n_timestamps)
>>> X = pd.DataFrame(np.random.random(size=(n_timestamps, n_sensors)), index=timestamps)
>>> TopologicalEmbedding(n_centers_by_dim=2, tda_max_dim=1).fit_transform(X)