tdaad package
Submodules
tdaad.anomaly_detectors module
Topological Anomaly Detectors.
- class tdaad.anomaly_detectors.TopologicalAnomalyDetector(window_size: int = 100, step: int = 5, tda_max_dim: int = 1, n_centers_by_dim: int = 5, support_fraction: float | None = None, contamination: float = 0.1, random_state: int | RandomState | None = 42)
Bases: EllipticEnvelope, TransformerMixin
Anomaly detection for multivariate time series using topological embeddings and robust covariance estimation.
This detector extracts topological features from sliding windows of time series data and uses a robust Mahalanobis distance (via PandasEllipticEnvelope) to score anomalies.
Read more in the User Guide.
- Parameters:
window_size (int, default=100) – Sliding window size for extracting time series subsequences.
step (int, default=5) – Step size between windows.
tda_max_dim (int, default=1) – Maximum homology dimension used for topological feature extraction.
n_centers_by_dim (int, default=5) – Number of k-means centers per topological dimension (for vectorization).
support_fraction (float or None, default=None) – Proportion of data to use for robust covariance estimation. If None, computed automatically.
contamination (float, default=0.1) – Proportion of anomalies in the data, used to compute decision threshold.
random_state (int, RandomState instance, or None, default=42) – Controls randomness of the topological embedding and robust estimator.
- topological_embedding_
TopologicalEmbedding transformer object that is fitted at fit.
- Type:
object
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from tdaad.anomaly_detectors import TopologicalAnomalyDetector
>>> n_timestamps = 1000
>>> n_sensors = 20
>>> timestamps = pd.to_datetime('2024-01-01', utc=True) + pd.Timedelta(1, 'h') * np.arange(n_timestamps)
>>> X = pd.DataFrame(np.random.random(size=(n_timestamps, n_sensors)), index=timestamps)
>>> X.iloc[n_timestamps//2:,:10] = -X.iloc[n_timestamps//2:,10:20]
>>> detector = TopologicalAnomalyDetector(n_centers_by_dim=2, tda_max_dim=1).fit(X)
>>> anomaly_scores = detector.score_samples(X)
>>> decision = detector.decision_function(X)
>>> anomalies = detector.predict(X)
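The role of contamination in Mahalanobis-based scoring can be illustrated with a minimal NumPy sketch. It is not the detector's implementation: it assumes a plain empirical mean and covariance in place of the robust estimate, and the helper names `mahalanobis_scores` and `threshold_from_contamination` are hypothetical. The random feature matrix stands in for the per-window topological embedding.

```python
import numpy as np


def mahalanobis_scores(features):
    """Return negative squared Mahalanobis distances (higher means more normal)."""
    features = np.asarray(features, dtype=float)
    center = features.mean(axis=0)
    # Assumption: plain empirical covariance. A robust estimator would
    # down-weight outliers before estimating location and scatter.
    cov = np.cov(features, rowvar=False)
    precision = np.linalg.pinv(cov)
    centered = features - center
    dist_sq = np.einsum("ij,jk,ik->i", centered, precision, centered)
    return -dist_sq


def threshold_from_contamination(scores, contamination=0.1):
    """Score below which roughly a `contamination` fraction of samples falls."""
    return np.percentile(scores, 100.0 * contamination)


rng = np.random.default_rng(42)
features = rng.normal(size=(200, 3))  # stand-in for per-window embeddings
features[:5] += 8.0                   # five shifted rows act as anomalous windows

scores = mahalanobis_scores(features)
offset = threshold_from_contamination(scores, contamination=0.1)
labels = np.where(scores < offset, -1, 1)  # -1 = anomaly, +1 = normal
```

With contamination=0.1, about 10% of the rows are flagged regardless of how many are truly anomalous, so the parameter is best read as a prior on the anomaly rate rather than something learned from the data.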
- fit(X, y=None)
Fit the TopologicalAnomalyDetector model.
- Parameters:
X ({array-like, sparse matrix} of shape (n_timestamps, n_sensors)) – Multiple time series to transform, where n_timestamps is the number of timestamps in the series X, and n_sensors is the number of sensors.
y (Ignored) – Not used, present for API consistency by convention.
- Returns:
self – Returns the instance itself.
- Return type:
object
- tdaad.anomaly_detectors.score_flat_fast_remapping(scores, window_size, stride, padding_length=0)
Remap window-level anomaly scores to a flat sequence of per-time-step scores.
- Parameters:
scores (array-like of shape (n_windows,)) – Anomaly scores for each window. Can be a pandas Series or NumPy array.
window_size (int) – Size of the sliding window.
stride (int) – Step size between windows.
padding_length (int, optional (default=0)) – Extra length to pad the output array (typically at the end of a signal).
- Returns:
remapped_scores – Flattened anomaly scores with per-timestep resolution. NaN values (from positions not covered by any window) are replaced with 0.
- Return type:
np.ndarray of shape (n_timestamps + padding_length,)
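One plausible way to remap window-level scores to per-timestep scores is sketched below. It rests on assumptions the reference above does not confirm: each timestep receives the mean score of all windows covering it, and the series length is inferred as (n_windows - 1) * stride + window_size. The function name `remap_window_scores` is hypothetical and is not the library's implementation.

```python
import numpy as np


def remap_window_scores(scores, window_size, stride, padding_length=0):
    """Average overlapping window scores onto individual timesteps."""
    scores = np.asarray(scores, dtype=float)
    n_windows = scores.shape[0]
    # Assumption: the last window ends exactly at the last timestep.
    n_timestamps = (n_windows - 1) * stride + window_size
    total = np.zeros(n_timestamps + padding_length)
    counts = np.zeros(n_timestamps + padding_length)
    for i, score in enumerate(scores):
        start = i * stride
        total[start:start + window_size] += score
        counts[start:start + window_size] += 1
    # Uncovered positions divide 0 by 0 and become NaN, then are set to 0.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / counts
    return np.nan_to_num(mean, nan=0.0)


remapped = remap_window_scores([1.0, 3.0], window_size=4, stride=2, padding_length=2)
# remapped is [1, 1, 2, 2, 3, 3, 0, 0]
```

The two windows overlap on timesteps 2 and 3, which therefore get the mean of both scores, while the two padded positions are covered by no window and end up as 0.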
tdaad.topological_embedding module
Topological Embedding Transformers.
- class tdaad.topological_embedding.SlidingWindowTransformer(window_size=40, step=5)
Bases: BaseEstimator, TransformerMixin
Slice a 2D numpy array into overlapping windows.
Output: list of 2D numpy arrays, one per window.
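The slicing behaviour can be sketched in a few lines of NumPy. This is an illustration only: `sliding_windows` is a hypothetical helper, and dropping a trailing partial window is an assumption, not something the reference states.

```python
import numpy as np


def sliding_windows(X, window_size=40, step=5):
    """Return a list of overlapping (window_size, n_columns) slices of X."""
    X = np.asarray(X)
    n_rows = X.shape[0]
    # Assumption: only full windows are kept, so a trailing partial one is dropped.
    starts = range(0, n_rows - window_size + 1, step)
    return [X[start:start + window_size] for start in starts]


X = np.arange(200).reshape(100, 2)  # 100 timestamps, 2 sensors
windows = sliding_windows(X, window_size=40, step=5)
n_windows = len(windows)            # (100 - 40) // 5 + 1 = 13
```

Under these assumptions, a series of n rows yields (n - window_size) // step + 1 windows, so smaller steps increase overlap and the number of windows to embed.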
- class tdaad.topological_embedding.TopologicalEmbedding(window_size: int = 40, step: int = 5, tda_max_dim: int = 2, n_centers_by_dim: int = 5, filter_nan: bool = True, output: str = 'pandas')
Bases: BaseEstimator, TransformerMixin
Topological embedding for multivariate time series using sliding windows, persistent homology (Rips), and ATOL vectorization.
- Pipeline:
Sliding windows -> similarity -> RipsPersistence -> ColumnTransformer(Atol)
- Parameters:
window_size (int) – Number of rows per sliding window.
step (int) – Step size between windows.
tda_max_dim (int) – Maximum homology dimension for RipsPersistence.
n_centers_by_dim (int) – Number of centroids per homology dimension in ATOL.
filter_nan (bool) – Whether to filter NaNs in similarity matrices.
output (str, default="pandas") – "pandas" returns a DataFrame with proper index and column names. "numpy" returns a numpy array.
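The first two pipeline stages (sliding windows, then similarity) can be sketched with NumPy alone. Several things here are assumptions made for illustration, since the reference does not specify them: similarity is taken to be the Pearson correlation between sensors, it is converted to a dissimilarity as 1 - |corr|, and NaN entries (which arise for constant sensors) are replaced by the maximum dissimilarity. The helper `window_dissimilarities` is hypothetical.

```python
import numpy as np


def window_dissimilarities(X, window_size=40, step=5, filter_nan=True):
    """Return one sensor-by-sensor dissimilarity matrix per sliding window."""
    X = np.asarray(X, dtype=float)
    matrices = []
    for start in range(0, X.shape[0] - window_size + 1, step):
        window = X[start:start + window_size]
        # A constant sensor has zero variance, which produces NaN correlations.
        with np.errstate(invalid="ignore", divide="ignore"):
            corr = np.corrcoef(window, rowvar=False)
        dist = 1.0 - np.abs(corr)
        if filter_nan:
            # Assumption: treat undefined correlations as maximally dissimilar.
            dist = np.nan_to_num(dist, nan=1.0)
        np.fill_diagonal(dist, 0.0)
        matrices.append(dist)
    return matrices


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[:, 3] = 1.0  # a constant sensor, to exercise the NaN handling
matrices = window_dissimilarities(X, window_size=40, step=5)
```

Each matrix could then serve as a precomputed distance matrix for a Rips persistence computation, with the resulting diagrams vectorized per homology dimension, which corresponds to the RipsPersistence and Atol stages named in the pipeline.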
Module contents
Topological Data Analysis module for Anomaly Detection in Time Series
tdaad is a Python module integrating TDA tools from gudhi into learning algorithms designed to detect anomalies in Multiple/Multivariate Time Series.