tadkit.catalog package

Submodules

tadkit.catalog.rawtowideformatter module

class tadkit.catalog.rawtowideformatter.RawToWideFormatter(data: DataFrame | ndarray, timestamps: Sequence | None = None, columns: Sequence[str] | None = None, backend: str = 'numpy')[source]

Bases: Formatter

A Formatter that supports both pandas DataFrame and NumPy array outputs.

Parameters:

data (pd.DataFrame or np.ndarray) – Input data.
timestamps (sequence, optional) – Timestamps for the rows of data. Required if data is a NumPy array.
columns (sequence of str, optional) – Column names to use when data is a NumPy array.
backend (str, default='numpy') – Output backend, either 'pandas' or 'numpy'.

format(target_period=None, target_space=None, resample=False, resample_freq: float = 1.0)[source]

Slice the data and optionally resample it, using backend-specific resampling.
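The slicing and resampling that format describes can be illustrated with a small NumPy-only sketch. It does not use tadkit, and several points are assumptions: that target_period is a (start, end) pair, that target_space is a list of column names, that resample_freq is the spacing of the output grid, and that resampling is linear interpolation. The helper name slice_and_resample is hypothetical.

```python
import numpy as np


def slice_and_resample(data, timestamps, columns, target_period=None,
                       target_space=None, resample=False, resample_freq=1.0):
    """Hypothetical helper (not part of tadkit) mirroring format()'s arguments."""
    data = np.asarray(data, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    columns = list(columns)

    # Assumption: target_period is a closed interval (start, end) on the timestamps.
    if target_period is not None:
        start, end = target_period
        keep = (timestamps >= start) & (timestamps <= end)
        data, timestamps = data[keep], timestamps[keep]

    # Assumption: target_space is a list of column names to keep, in that order.
    if target_space is not None:
        idx = [columns.index(name) for name in target_space]
        data, columns = data[:, idx], list(target_space)

    # Assumption: resample_freq is the step of a regular grid; values are
    # linearly interpolated onto that grid, one column at a time.
    if resample:
        grid = np.arange(timestamps[0], timestamps[-1] + resample_freq / 2,
                         resample_freq)
        data = np.column_stack(
            [np.interp(grid, timestamps, data[:, j]) for j in range(data.shape[1])]
        )
        timestamps = grid

    return timestamps, data, columns


# Two signals sampled every 2 time units.
raw_t = [0.0, 2.0, 4.0, 6.0]
raw = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0], [6.0, 40.0]]

ts, values, cols = slice_and_resample(
    raw, raw_t, ["a", "b"],
    target_period=(2.0, 6.0), target_space=["b"],
    resample=True, resample_freq=1.0,
)
```

Here the result keeps only column "b" between times 2 and 6, interpolated onto a unit-spaced grid. A pandas backend would more likely rely on DataFrame.resample; that is also an assumption about the implementation.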

tadkit.catalog.registry_init module

tadkit.catalog.sklearners module

class tadkit.catalog.sklearners.CustomScoreOutlierDetector(score_func: Callable[[ndarray], ndarray], contamination: float = 0.1)[source]

Bases: BaseDensityOutlierDetector

Parameters:

score_func (callable) – Function mapping X to scores, where higher scores indicate inliers. Must accept a 2D array and return a 1D array.
contamination (float, default=0.1) – Expected proportion of outliers. Must be in (0, 0.5).

score_func: Callable[[ndarray], ndarray]
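As a rough illustration of the score_func contract, the sketch below defines a scoring function that takes a 2D array and returns one score per row, with higher scores for inliers. The thresholding step is an assumption: this page does not say how the detector converts scores into labels, so the sketch simply flags the lowest-scoring contamination fraction. Both flag_outliers and neg_distance_to_median are hypothetical names, not tadkit functions.

```python
import numpy as np


def neg_distance_to_median(X):
    """Example score_func: 2D array in, 1D scores out (higher = more inlier-like)."""
    X = np.asarray(X, dtype=float)
    return -np.linalg.norm(X - np.median(X, axis=0), axis=1)


def flag_outliers(X, score_func, contamination=0.1):
    """Hypothetical stand-in for the detector's decision rule (an assumption)."""
    X = np.asarray(X, dtype=float)
    scores = np.asarray(score_func(X))
    if scores.ndim != 1 or scores.shape[0] != X.shape[0]:
        raise ValueError("score_func must return one score per row of X")
    # Flag the points whose score falls below the contamination quantile.
    threshold = np.quantile(scores, contamination)
    return scores < threshold


rng = np.random.default_rng(0)
X_custom = np.vstack([
    rng.normal(size=(95, 2)),  # inliers around the origin
    [[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0], [0.0, 10.0]],  # far points
])
custom_flags = flag_outliers(X_custom, neg_distance_to_median, contamination=0.1)
```

With the real class, the same function would presumably be passed as score_func=neg_distance_to_median.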

class tadkit.catalog.sklearners.GMMOutlierDetector(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10, contamination: float = 0.1)[source]

Bases: BaseDensityOutlierDetector

Density-based outlier detection using GaussianMixture.

Parameters:

n_components (int, default=1) – The number of mixture components.
covariance_type ({'full', 'tied', 'diag', 'spherical'}, default='full') – Type of covariance parameters to use.
tol (float, default=1e-3) – Convergence threshold.
reg_covar (float, default=1e-6) – Non-negative regularization added to the diagonal of covariance matrices.
max_iter (int, default=100) – The number of EM iterations to perform.
n_init (int, default=1) – The number of initializations to perform. The best result is kept.
init_params ({'kmeans', 'random'}, default='kmeans') – Method used to initialize the weights, means, and precisions.
weights_init (array-like of shape (n_components,), default=None) – The user-provided initial weights.
means_init (array-like of shape (n_components, n_features), default=None) – The user-provided initial means.
precisions_init (array-like, default=None) – The user-provided initial precisions.
random_state (int, RandomState instance, default=None) – Controls the random seed.
warm_start (bool, default=False) – If True, reuse the solution of the last fitting.
verbose (int, default=0) – Enable verbose output.
verbose_interval (int, default=10) – Number of iteration steps between printing progress.
contamination (float, default=0.1) – Proportion of outliers in the dataset.
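The idea behind a GaussianMixture-based density detector can be sketched with scikit-learn alone. The snippet does not call GMMOutlierDetector; it assumes, without confirmation from this page, that points are flagged when their log-likelihood under the fitted mixture falls below the contamination quantile.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X_gmm = np.vstack([
    rng.normal(size=(95, 2)),  # inliers around the origin
    [[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0], [0.0, 10.0]],  # far points
])

contamination = 0.1
gmm = GaussianMixture(n_components=1, covariance_type="full", random_state=0)
gmm.fit(X_gmm)

# Per-sample log-likelihood: low values mean low estimated density.
gmm_log_density = gmm.score_samples(X_gmm)

# Assumed decision rule: flag the lowest `contamination` fraction of scores.
gmm_threshold = np.quantile(gmm_log_density, contamination)
gmm_flags = gmm_log_density < gmm_threshold
```

Raising n_components lets the density follow multi-modal data, at the cost of more parameters to estimate.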

class tadkit.catalog.sklearners.KDEOutlierDetector(bandwidth=1.0, algorithm='auto', kernel='gaussian', metric='euclidean', atol=0, rtol=0, breadth_first=True, leaf_size=40, metric_params=None, contamination: float = 0.1)[source]

Bases: BaseDensityOutlierDetector

Density-based outlier detection using KernelDensity.

Parameters:

bandwidth (float, default=1.0) – The bandwidth of the kernel.
algorithm ({'kd_tree', 'ball_tree', 'auto'}, default='auto') – The tree algorithm to use.
kernel ({'gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', 'cosine'}, default='gaussian') – The kernel to use.
metric (str, default='euclidean') – The distance metric to use.
atol (float, default=0) – The desired absolute tolerance of the result.
rtol (float, default=0) – The desired relative tolerance of the result.
breadth_first (bool, default=True) – If True, use a breadth-first approach to the problem. Otherwise, use a depth-first approach.
leaf_size (int, default=40) – Leaf size passed to BallTree or KDTree.
metric_params (dict, default=None) – Additional parameters for the metric function.
contamination (float, default=0.1) – Proportion of outliers in the dataset.
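A comparable sketch for the kernel-density case uses scikit-learn's KernelDensity directly rather than KDEOutlierDetector. The decision rule is again an assumption: points are flagged when their estimated log-density falls below the contamination quantile.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X_kde = np.vstack([
    rng.normal(size=(95, 2)),  # inliers around the origin
    [[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0], [0.0, 10.0]],  # far points
])

contamination = 0.1
kde = KernelDensity(bandwidth=1.0, kernel="gaussian").fit(X_kde)

# Log-density at each training point; isolated points get the lowest values.
kde_log_density = kde.score_samples(X_kde)

# Assumed decision rule: flag the lowest `contamination` fraction of scores.
kde_threshold = np.quantile(kde_log_density, contamination)
kde_flags = kde_log_density < kde_threshold
```

The bandwidth matters most here: a value that is too large smooths isolated points into the bulk of the data, while one that is too small makes every point look isolated.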

Module contents