tadkit.base package

Submodules

tadkit.base.basedensitydetector module

class tadkit.base.basedensitydetector.BaseDensityOutlierDetector(contamination: float = 0.1)[source]

Bases: BaseEstimator, OutlierMixin

Base class for density-based outlier detection.

Subclasses must implement:

_fit_density(X)
_score_density(X)

Accepts pandas DataFrame/Series input but works internally with NumPy arrays. When the input is a pandas object, results are returned with the same index as the input.

contamination: float

decision_function(X: Any) → ndarray | Series[source]

fit(X: Any, y=None)[source]

offset_: float | None

predict(X: Any) → ndarray | Series[source]

score_samples(X: Any) → ndarray | Series[source]
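
The two-hook contract above (_fit_density and _score_density) can be illustrated with a small, dependency-free sketch. It does not import tadkit. SketchDensityDetector is a hypothetical stand-in for the base class, GaussianDetector is a hypothetical subclass, the input is a plain list of floats, and the rule that sets offset_ at the contamination quantile of the training scores is an assumption rather than something this page states.

```python
import math
from statistics import mean, pstdev


class SketchDensityDetector:
    """Hypothetical stand-in for the base class; NOT tadkit's implementation."""

    def __init__(self, contamination=0.1):
        self.contamination = contamination
        self.offset_ = None  # set by fit()

    def _fit_density(self, X):
        raise NotImplementedError

    def _score_density(self, X):
        raise NotImplementedError

    def fit(self, X, y=None):
        self._fit_density(X)
        scores = sorted(self._score_density(X))
        # Assumption: the threshold is the `contamination` quantile of the
        # training scores, so roughly that fraction is flagged as outliers.
        k = int(self.contamination * len(scores))
        self.offset_ = scores[k]
        return self

    def score_samples(self, X):
        return self._score_density(X)

    def decision_function(self, X):
        # Negative values fall below the threshold (outliers).
        return [s - self.offset_ for s in self.score_samples(X)]

    def predict(self, X):
        return [1 if d >= 0 else -1 for d in self.decision_function(X)]


class GaussianDetector(SketchDensityDetector):
    """Hypothetical subclass: scores points by a 1-D Gaussian log-density."""

    def _fit_density(self, X):
        self.mu_ = mean(X)
        self.sigma_ = pstdev(X) or 1.0  # guard against zero variance

    def _score_density(self, X):
        norm = math.log(self.sigma_ * math.sqrt(2 * math.pi))
        return [-0.5 * ((x - self.mu_) / self.sigma_) ** 2 - norm for x in X]


train = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.05, -0.05, 8.0]
detector = GaussianDetector(contamination=0.1).fit(train)
labels = detector.predict(train)  # +1 for inliers, -1 for outliers
```

In this sketch the subclass only supplies the density model; thresholding and labeling stay in the base class, which is the division of labor the two required hooks suggest.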

tadkit.base.dataframe_type module

class tadkit.base.dataframe_type.DataFrameType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Data type of datasets: long (asynchronous) vs wide (synchronous).

ASYNCHRONOUS = 'asynchronous'

SYNCHRONOUS = 'synchronous'

static from_text(name: str | None)[source]

Convert a string to a DataFrameType enum.

static infer_from_df(df: DataFrame) → DataFrameType[source]
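
As a rough illustration of the string-to-enum conversion, the sketch below defines a hypothetical SketchFrameType enum with the same two members. How the real from_text treats None, letter case, or unknown names is not documented here, so the choices below (None passes through, matching is case-insensitive) are assumptions.

```python
from enum import Enum
from typing import Optional


class SketchFrameType(Enum):
    """Hypothetical stand-in mirroring the two documented members."""

    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"

    @staticmethod
    def from_text(name: Optional[str]) -> Optional["SketchFrameType"]:
        # Assumption: None is passed through unchanged.
        if name is None:
            return None
        # Assumption: matching ignores case and surrounding whitespace.
        # Enum lookup by value raises ValueError for unknown names.
        return SketchFrameType(name.strip().lower())


kind = SketchFrameType.from_text("Synchronous")
missing = SketchFrameType.from_text(None)
```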

tadkit.base.formatter module

class tadkit.base.formatter.Formatter[source]

Bases: ABC

Abstract base class for all formatters. Provides an array-agnostic interface for ML pipelines.

add_property(name: str)[source]

add_query_description(name: str, param_info: Dict[str, Any])[source]

property available_properties: List[str]

default_query() → Dict[str, Any][source]

Return default query parameters based on query_description.

abstract format(**query) → ndarray | DataFrame[source]

Transform raw data into a standard array-like format. The return type depends on the backend (NumPy array, pandas DataFrame, etc.).

property query_description: Dict[str, Any]

remove_property(name: str)[source]
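
The relationship between query_description, default_query() and format() might look like the following sketch. SketchFormatter and ListFormatter are hypothetical names, the data is a plain list of rows, and the idea that each param_info dict carries a "default" entry is an assumption; the page does not specify the structure of param_info.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchFormatter(ABC):
    """Hypothetical stand-in; NOT tadkit's Formatter implementation."""

    def __init__(self) -> None:
        self._query_description: Dict[str, Any] = {}

    def add_query_description(self, name: str, param_info: Dict[str, Any]) -> None:
        self._query_description[name] = param_info

    @property
    def query_description(self) -> Dict[str, Any]:
        return self._query_description

    def default_query(self) -> Dict[str, Any]:
        # Assumption: each param_info dict has a "default" key.
        return {
            name: info.get("default")
            for name, info in self._query_description.items()
        }

    @abstractmethod
    def format(self, **query) -> List[List[float]]:
        """Turn raw data into an array-like structure."""


class ListFormatter(SketchFormatter):
    """Hypothetical formatter that returns the first n_rows rows."""

    def __init__(self, rows: List[List[float]]) -> None:
        super().__init__()
        self.rows = rows
        self.add_query_description(
            "n_rows", {"description": "Number of rows to keep", "default": 2}
        )

    def format(self, **query) -> List[List[float]]:
        n_rows = query.get("n_rows", self.default_query()["n_rows"])
        return self.rows[:n_rows]


formatter = ListFormatter([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
default_output = formatter.format(**formatter.default_query())
custom_output = formatter.format(n_rows=1)
```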

tadkit.base.registry module

class tadkit.base.registry.Registry[source]

Bases: object

Registry for dynamically tracking and matching learner classes.

list_learners() → List[str][source]

match_learners(formatter: Any | None = None) → List[Type][source]

Return learner classes compatible with the given formatter.

print_catalog_classes(detailed=False)[source]

static print_compliance_miss(learner)[source]

register_learner(name: str, learner: Type | str, condition: Callable[[Any], bool], optional: bool = False)[source]

Register a learner with a compatibility condition.

Parameters:

name (str) – Display name for the learner.
learner (class or str) – The learner class OR an import path (“module.submodule.ClassName”).
condition (callable(formatter) -> bool) – Determines whether this learner is compatible.
optional (bool) – If True, missing imports are ignored instead of raising.
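
The parameters above can be read as the following minimal sketch of a condition-based registry. SketchRegistry and WideOnlyLearner are hypothetical, the formatter objects are simple namespaces with an invented layout attribute, and collections.Counter is used only because it is an import path that always resolves; none of this is tadkit's actual implementation.

```python
import importlib
from types import SimpleNamespace
from typing import Any, Callable, Dict, List, Tuple, Type, Union


class SketchRegistry:
    """Hypothetical stand-in; NOT tadkit's Registry implementation."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Type, Callable[[Any], bool]]] = {}

    def register_learner(
        self,
        name: str,
        learner: Union[Type, str],
        condition: Callable[[Any], bool],
        optional: bool = False,
    ) -> None:
        if isinstance(learner, str):
            # Resolve "module.submodule.ClassName" into a class object.
            module_path, _, class_name = learner.rpartition(".")
            try:
                learner = getattr(importlib.import_module(module_path), class_name)
            except (ImportError, AttributeError):
                if optional:
                    return  # missing optional dependency: skip silently
                raise
        self._entries[name] = (learner, condition)

    def list_learners(self) -> List[str]:
        return list(self._entries)

    def match_learners(self, formatter: Any = None) -> List[Type]:
        return [cls for cls, cond in self._entries.values() if cond(formatter)]


class WideOnlyLearner:
    """Hypothetical learner that only supports wide-layout data."""


registry = SketchRegistry()
registry.register_learner(
    "wide-only",
    WideOnlyLearner,
    condition=lambda f: getattr(f, "layout", None) == "wide",
)
# Registered by import path; the class is resolved at registration time.
registry.register_learner("counter", "collections.Counter", condition=lambda f: True)
# The module does not exist, but optional=True means no exception is raised.
registry.register_learner(
    "missing", "no_such_package_xyz.Learner", condition=lambda f: True, optional=True
)

matches_wide = registry.match_learners(SimpleNamespace(layout="wide"))
matches_long = registry.match_learners(SimpleNamespace(layout="long"))
```

Resolving the import path inside register_learner is one possible design; a registry could equally defer the import until match_learners is called.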

tadkit.base.tadlearner module

class tadkit.base.tadlearner.TADLearner(*args, **kwargs)[source]

Bases: Protocol

Protocol describing a Time Anomaly Detection learner (model).

Avoid inheriting from this class explicitly. Because it is a Protocol, any class that defines the required methods and class attributes conforms to it implicitly.

fit()[source]: Fit the learner on input data.

score_samples()[source]: The measure of normality of an observation according to the fitted model. The lower, the more abnormal.

predict()[source]: Predict if a particular sample is an outlier or not. For each observation, tells whether or not (+1 or -1) it should be considered as an inlier according to the fitted model.

Class attributes:

params_description: Description of the arguments of the __init__ method. See examples in the catalog.
required_properties: The properties that the input data must satisfy. See examples in the catalog.

Example

>>> assert isinstance(MyLearner, TADLearner)
>>> MyLearner.required_properties  # The required property of input data
>>> MyLearner.params_description  # The description of the params
>>> params = ...  # Params to initiate learner
>>> learner = MyLearner(**params)
>>> learner.fit(X)  # X, y must satisfy MyLearner.required_properties
>>> score_sample_pred = learner.score_samples(X_test)
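
To make the "no explicit inheritance" advice concrete, here is a self-contained sketch of structural typing with typing.Protocol. SketchLearner is a hypothetical protocol modeled on the methods and class attributes listed above, and MedianDistanceLearner is a hypothetical learner; whether TADLearner itself is decorated with runtime_checkable is an assumption suggested by the isinstance call in the example.

```python
from statistics import median
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class SketchLearner(Protocol):
    """Hypothetical protocol mirroring the documented members."""

    required_properties: List[str]
    params_description: Dict[str, Any]

    def fit(self, X, y=None): ...
    def score_samples(self, X): ...
    def predict(self, X): ...


class MedianDistanceLearner:  # note: no base class
    """Hypothetical learner: scores points by distance to the training median."""

    required_properties = ["univariate"]
    params_description = {
        "threshold": {"description": "Maximum inlier distance", "default": 3.0}
    }

    def __init__(self, threshold: float = 3.0) -> None:
        self.threshold = threshold

    def fit(self, X, y=None):
        self.median_ = median(X)
        return self

    def score_samples(self, X):
        # The lower the score, the more abnormal the observation.
        return [-abs(x - self.median_) for x in X]

    def predict(self, X):
        # +1 for inliers, -1 for outliers.
        return [1 if s >= -self.threshold else -1 for s in self.score_samples(X)]


learner = MedianDistanceLearner(threshold=3.0).fit([1.0, 2.0, 3.0])
conforms = isinstance(learner, SketchLearner)  # structural check only
predictions = learner.predict([2.5, 10.0])
```

The runtime check only verifies that the named members exist; it does not validate signatures or return types.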

fit(X: ndarray | list | DataFrame, y: ndarray | list | DataFrame | None = None) → TADLearner[source]

predict(X: ndarray | list | DataFrame) → ndarray | list | DataFrame[source]

Predict if a particular sample is an outlier or not. Scikit-learn compatible.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.
Returns: is_inlier – For each observation, tells whether or not (+1 or -1) it should be considered as an inlier according to the fitted model.
Return type: ndarray of shape (n_samples,)

score_samples(X: ndarray | list | DataFrame) → ndarray | list | DataFrame[source]

The measure of normality of an observation according to the fitted model. Scikit-learn compatible.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.
Returns: scores – The anomaly score of the input samples. The lower, the more abnormal.
Return type: ndarray of shape (n_samples,)