tadkit.catalog package
Submodules
tadkit.catalog.rawtowideformatter module
- class tadkit.catalog.rawtowideformatter.RawToWideFormatter(data: DataFrame | ndarray, timestamps: Sequence | None = None, columns: Sequence[str] | None = None, backend: str = 'numpy')
Bases: Formatter
A Formatter that supports both pandas DataFrame and NumPy array outputs.
- Parameters:
data (pd.DataFrame or np.ndarray) – Input data.
backend (str, default='numpy') – Output backend, either 'pandas' or 'numpy'.
timestamps (np.ndarray, optional) – Required if data is a NumPy array.
columns (list[str], optional) – Column names for NumPy arrays.
tadkit.catalog.registry_init module
tadkit.catalog.sklearners module
- class tadkit.catalog.sklearners.CustomScoreOutlierDetector(score_func: Callable[[ndarray], ndarray], contamination: float = 0.1)
Bases: BaseDensityOutlierDetector
- Parameters:
score_func (callable) – Function mapping X to scores, where higher scores indicate inliers. Must accept a 2D array and return a 1D array.
contamination (float, default=0.1) – Proportion of outliers. Must be in (0, 0.5).
- score_func: Callable[[ndarray], ndarray]
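The score_func contract described above (2D array in, 1D array out, higher scores for inliers) can be illustrated with a minimal NumPy-only sketch. The function name neg_distance_to_mean is hypothetical, and the quantile-based threshold is an assumption about how a contamination value is typically applied, not a statement of what tadkit does internally.

```python
import numpy as np


def neg_distance_to_mean(X: np.ndarray) -> np.ndarray:
    """Hypothetical score function: 2D array in, 1D scores out (higher = inlier)."""
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)
    # Negate the distance so that points near the center get the highest scores.
    return -np.linalg.norm(X - center, axis=1)


rng = np.random.default_rng(0)
# 95 points around the origin plus 5 points placed far away.
X = np.vstack([rng.normal(size=(95, 2)), rng.normal(loc=8.0, size=(5, 2))])

scores = neg_distance_to_mean(X)

# Assumption: the lowest `contamination` fraction of scores is flagged as outliers.
contamination = 0.05
threshold = np.quantile(scores, contamination)
is_outlier = scores < threshold
```

A function with this shape could presumably be passed as score_func=neg_distance_to_mean, together with a contamination value, when constructing the detector.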
- class tadkit.catalog.sklearners.GMMOutlierDetector(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weights_init=None, means_init=None, precisions_init=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10, contamination: float = 0.1)
Bases: BaseDensityOutlierDetector
Density-based outlier detection using GaussianMixture.
- Parameters:
n_components (int, default=1) – The number of mixture components.
covariance_type ({'full', 'tied', 'diag', 'spherical'}, default='full') – Type of covariance parameters to use.
tol (float, default=1e-3) – Convergence threshold.
reg_covar (float, default=1e-6) – Non-negative regularization added to the diagonal of covariance matrices.
max_iter (int, default=100) – The number of EM iterations to perform.
n_init (int, default=1) – The number of initializations to perform. The best result is kept.
init_params ({'kmeans', 'random'}, default='kmeans') – Method used to initialize the weights, means, and precisions.
weights_init (array-like of shape (n_components,), default=None) – The user-provided initial weights.
means_init (array-like of shape (n_components, n_features), default=None) – The user-provided initial means.
precisions_init (array-like, default=None) – The user-provided initial precisions.
random_state (int, RandomState instance, default=None) – Controls the random seed.
warm_start (bool, default=False) – If True, reuse the solution of the last fitting.
verbose (int, default=0) – Enable verbose output.
verbose_interval (int, default=10) – Number of iteration steps between printing progress.
contamination (float, default=0.1) – Proportion of outliers in the dataset.
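As a rough illustration of the idea behind density-based detection with a Gaussian mixture, the sketch below uses scikit-learn's GaussianMixture directly rather than the tadkit class. The thresholding rule (flag the lowest contamination fraction of log-densities) is an assumption for illustration. The exact rule used by GMMOutlierDetector is not stated here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# 190 points around the origin plus 10 points placed far away.
X = np.vstack([rng.normal(size=(190, 2)), rng.normal(loc=10.0, size=(10, 2))])

# Fit a single-component mixture, mirroring the documented defaults.
gmm = GaussianMixture(n_components=1, covariance_type="full", random_state=0)
gmm.fit(X)

# Per-sample log-likelihood under the fitted mixture (higher = denser region).
log_density = gmm.score_samples(X)

# Assumption: samples below the `contamination` quantile are treated as outliers.
contamination = 0.05
threshold = np.quantile(log_density, contamination)
is_outlier = log_density < threshold
```

Raising n_components lets the density model follow multi-modal data, at the cost of more parameters to estimate.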
- class tadkit.catalog.sklearners.KDEOutlierDetector(bandwidth=1.0, algorithm='auto', kernel='gaussian', metric='euclidean', atol=0, rtol=0, breadth_first=True, leaf_size=40, metric_params=None, contamination: float = 0.1)
Bases: BaseDensityOutlierDetector
Density-based outlier detection using KernelDensity.
- Parameters:
bandwidth (float, default=1.0) – The bandwidth of the kernel.
algorithm ({'kd_tree', 'ball_tree', 'auto'}, default='auto') – The tree algorithm to use.
kernel (str, default='gaussian') – The kernel to use. Valid kernels are ['gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', 'cosine'].
metric (str, default='euclidean') – The distance metric to use.
atol (float, default=0) – The desired absolute tolerance of the result.
rtol (float, default=0) – The desired relative tolerance of the result.
breadth_first (bool, default=True) – If True, use a breadth-first approach to the problem.
leaf_size (int, default=40) – Leaf size passed to BallTree or KDTree.
metric_params (dict, default=None) – Additional parameters for the metric function.
contamination (float, default=0.1) – Proportion of outliers in the dataset.
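The same idea can be sketched with a kernel density estimate. The example below calls scikit-learn's KernelDensity directly, fits on reference data, and scores two new points. Deriving the threshold from the contamination quantile of the training log-densities is an assumption made for illustration, not a description of the KDEOutlierDetector internals.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
# Reference data: 200 points drawn around the origin.
X_train = rng.normal(size=(200, 2))

# Gaussian kernel with the documented default bandwidth.
kde = KernelDensity(bandwidth=1.0, kernel="gaussian").fit(X_train)

# Assumption: the threshold is the `contamination` quantile of training log-densities.
contamination = 0.1
threshold = np.quantile(kde.score_samples(X_train), contamination)

# One point in the dense region and one far from all training data.
X_new = np.array([[0.0, 0.0], [6.0, 6.0]])
log_density = kde.score_samples(X_new)
is_outlier = log_density < threshold
```

A smaller bandwidth makes the estimate more local and tends to flag more isolated points. A larger one smooths the density and is more tolerant.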