📖 Guidelines

Install from PyPI (recommended):

pip install tdaad

Or install from source:

git clone https://github.com/IRT-SystemX/tdaad.git
cd tdaad
pip install .

Requirements:

  • Python ≥ 3.7

  • See requirements.txt for full dependency list

🚀 Quickstart

Here's a minimal example using TopologicalAnomalyDetector:

import numpy as np
from tdaad.anomaly_detectors import TopologicalAnomalyDetector

# Example multivariate time series with shape (n_samples, n_features)
X = np.random.randn(1000, 3)

# Initialize and fit the detector
detector = TopologicalAnomalyDetector(window_size=100, n_centers_by_dim=3)
detector.fit(X)

# Compute anomaly scores
scores = detector.score_samples(X)

You can also use a pandas.DataFrame instead of a NumPy array; column names will be preserved in the output.

For more advanced usage (e.g. custom embeddings, parameter tuning), see the examples folder or the API documentation.
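Since the detector expects an input of shape (n_samples, n_features) and targets multivariate series, it can help to check the shape before fitting. The snippet below is a minimal sketch: check_multivariate is a hypothetical helper written for illustration, not part of the tdaad API, and it assumes "univariate" means fewer than two feature columns.

```python
import numpy as np


def check_multivariate(X):
    """Hypothetical helper (not part of tdaad): return X's shape if it looks
    like a multivariate series of shape (n_samples, n_features)."""
    shape = np.shape(X)  # works for NumPy arrays and pandas DataFrames alike
    if len(shape) != 2:
        raise ValueError(
            f"expected a 2D input (n_samples, n_features), got {len(shape)}D"
        )
    if shape[1] < 2:
        # Assumption: a single feature column counts as univariate data.
        raise ValueError("expected at least 2 features (univariate input)")
    return shape


X = np.random.randn(1000, 3)
print(check_multivariate(X))  # (1000, 3)
```

The helper only inspects the shape and leaves X untouched, so a DataFrame keeps its column names.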

📌 Usage Notes

  • TDAAD is designed for multivariate time series (2D inputs); univariate data is not supported.

  • The core detection method relies on sliding-window embeddings and persistent homology to identify structural changes in the signal.

  • The key parameters that impact results and runtime are:

    • window_size controls the time resolution: larger windows capture slower anomalies, smaller ones detect more localized changes.

    • n_centers_by_dim controls the number of reference shapes used per homology dimension (e.g. connected components in H0, loops in H1, …). Increasing this improves sensitivity but adds computation time.

    • tda_max_dim sets the maximum topological feature dimension computed (0 = connected components, 1 = loops, 2 = voids, …). Higher values increase runtime and memory usage.

  • Inputs can be numpy.ndarray or pandas.DataFrame. Column names are preserved in the output when using DataFrames.

⚙️ You can typically handle ~100 sensors and a few hundred time steps per window on a modern machine.
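To make the role of window_size more concrete, here is a minimal NumPy sketch of cutting a multivariate series into sliding windows. It illustrates the general idea only and is not TDAAD's internal implementation; the sliding_windows function and its step parameter are assumptions introduced for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_windows(X, window_size, step=1):
    """Illustrative helper (not from tdaad): split X of shape
    (n_samples, n_features) into overlapping windows."""
    # sliding_window_view appends the window axis last:
    # result shape is (n_windows, n_features, window_size).
    windows = sliding_window_view(X, window_shape=window_size, axis=0)
    # Reorder to (n_windows, window_size, n_features), then keep every
    # `step`-th window.
    return windows.transpose(0, 2, 1)[::step]


X = np.random.randn(1000, 3)
W = sliding_windows(X, window_size=100)
print(W.shape)  # (901, 100, 3)
```

Each window is a point cloud of window_size points in n_features dimensions, the kind of object on which persistent homology is computed.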

🧮 Basic Complexity of Persistent Homology in TDAAD

  • Total complexity scales as $O(N \times (w \times p)^{d+2})$, where $w$ is the time resolution (window_size, the number of time steps per window), $p$ is the number of variables (features/sensors), $d$ is the maximum homology dimension (tda_max_dim), and $N$ is the total number of sliding windows.

  • Increasing the maximum homology dimension $d$ raises the exponent, so the cost grows exponentially in $d$. The number of centers n_centers_by_dim, which is used after the PH computation, does not significantly affect the overall complexity.
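The scaling above can be turned into a rough back-of-the-envelope comparison. The sketch below simply evaluates $N \times (w \times p)^{d+2}$ with constant factors ignored; relative_ph_cost is a hypothetical name, and the result is only meaningful as a ratio between two configurations, not as a runtime.

```python
def relative_ph_cost(n_windows, window_size, n_features, max_dim):
    """Evaluate N * (w * p)^(d + 2); a unitless scaling estimate only."""
    return n_windows * (window_size * n_features) ** (max_dim + 2)


# Same data and windows, tda_max_dim raised from 1 to 2.
base = relative_ph_cost(n_windows=901, window_size=100, n_features=3, max_dim=1)
higher = relative_ph_cost(n_windows=901, window_size=100, n_features=3, max_dim=2)
print(higher // base)  # 300
```

Under this estimate, each extra homology dimension multiplies the cost by $w \times p$ (300 here), while doubling window_size at $d = 1$ multiplies the per-window cost by $2^3 = 8$.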