tadkit.base package

Submodules

tadkit.base.basedensitydetector module

class tadkit.base.basedensitydetector.BaseDensityOutlierDetector(contamination: float = 0.1)[source]

Bases: BaseEstimator, OutlierMixin

Base class for density-based outlier detection.

Subclasses must implement:
  • _fit_density(X)

  • _score_density(X)

Accepts pandas DataFrame/Series input but works internally with NumPy arrays. If the input is a pandas object, results are returned with the same index as the input.

contamination: float
decision_function(X: Any) → ndarray | Series[source]
fit(X: Any, y=None)[source]
offset_: float | None
predict(X: Any) → ndarray | Series[source]
score_samples(X: Any) → ndarray | Series[source]
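
As an illustration of the contract described above, here is a minimal, self-contained sketch of how the two hooks and the public methods might fit together. It does not use tadkit: GaussianDensitySketch is a hypothetical stand-in that works on plain lists. The rule that sets offset_ from the contamination quantile of the training scores is an assumption borrowed from the usual scikit-learn convention and is not confirmed by this page.

```python
import math
import statistics


class GaussianDensitySketch:
    """Hypothetical stand-in for illustration; not tadkit's implementation."""

    def __init__(self, contamination=0.1):
        self.contamination = contamination
        self.offset_ = None

    # The two hooks a subclass is asked to provide.
    def _fit_density(self, X):
        self.mean_ = statistics.fmean(X)
        self.std_ = statistics.pstdev(X) or 1.0  # avoid division by zero

    def _score_density(self, X):
        # Gaussian log-density: higher means more normal.
        return [
            -0.5 * ((x - self.mean_) / self.std_) ** 2
            - math.log(self.std_ * math.sqrt(2 * math.pi))
            for x in X
        ]

    def fit(self, X, y=None):
        self._fit_density(X)
        scores = sorted(self._score_density(X))
        # Assumed rule: roughly a `contamination` fraction of the training
        # scores falls strictly below offset_.
        k = int(self.contamination * len(scores))
        self.offset_ = scores[k]
        return self

    def score_samples(self, X):
        return self._score_density(X)

    def decision_function(self, X):
        # Negative values lie below the learned threshold.
        return [s - self.offset_ for s in self.score_samples(X)]

    def predict(self, X):
        # +1 for inliers, -1 for outliers.
        return [1 if d >= 0 else -1 for d in self.decision_function(X)]


train = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 25.0]
detector = GaussianDensitySketch(contamination=0.1).fit(train)
labels = detector.predict(train)
```

With ten training points and contamination=0.1, only the lowest-scoring point (25.0) ends up below the threshold in this sketch.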

tadkit.base.dataframe_type module

class tadkit.base.dataframe_type.DataFrameType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Data type of datasets: long (asynchronous) vs wide (synchronous).

ASYNCHRONOUS = 'asynchronous'
SYNCHRONOUS = 'synchronous'
static from_text(name: str | None)[source]

Convert a string to a DataFrameType enum.

static infer_from_df(df: DataFrame) → DataFrameType[source]
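
A string-to-enum conversion of this kind could look like the sketch below. DataFrameTypeSketch is a hypothetical stand-in, and its handling of None and of letter case is an assumption; the page does not specify how the real from_text treats either.

```python
from enum import Enum
from typing import Optional


class DataFrameTypeSketch(Enum):
    """Hypothetical stand-in with the same two members as listed above."""

    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"

    @staticmethod
    def from_text(name: Optional[str]) -> Optional["DataFrameTypeSketch"]:
        # Assumption: None passes through unchanged, and matching ignores
        # case and surrounding whitespace.
        if name is None:
            return None
        # Enum lookup by value raises ValueError for unknown names.
        return DataFrameTypeSketch(name.strip().lower())


kind = DataFrameTypeSketch.from_text(" Synchronous ")
missing = DataFrameTypeSketch.from_text(None)
```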

tadkit.base.formatter module

class tadkit.base.formatter.Formatter[source]

Bases: ABC

Abstract base class for all formatters. Provides an array-agnostic interface for ML pipelines.

add_property(name: str)[source]
add_query_description(name: str, param_info: Dict[str, Any])[source]
property available_properties: List[str]
default_query() → Dict[str, Any][source]

Return default query parameters based on query_description.

abstract format(**query) → ndarray | DataFrame[source]

Transform raw data into a standard array-like format. The return type depends on the backend (NumPy array, pandas DataFrame, etc.).

property query_description: Dict[str, Any]
remove_property(name: str)[source]
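
To show how these members might work together, here is a small sketch of a formatter-like class and one concrete subclass. It does not import tadkit: FormatterSketch and SliceFormatter are hypothetical, the "default" key inside each param_info dict is an assumption about how defaults are stored, and the subclass returns a plain list rather than a NumPy array or DataFrame to keep the example dependency-free.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class FormatterSketch(ABC):
    """Hypothetical stand-in mirroring the member names listed above."""

    def __init__(self) -> None:
        self._properties: List[str] = []
        self._query_description: Dict[str, Any] = {}

    @property
    def available_properties(self) -> List[str]:
        return list(self._properties)

    def add_property(self, name: str) -> None:
        if name not in self._properties:
            self._properties.append(name)

    def remove_property(self, name: str) -> None:
        if name in self._properties:
            self._properties.remove(name)

    @property
    def query_description(self) -> Dict[str, Any]:
        return dict(self._query_description)

    def add_query_description(self, name: str, param_info: Dict[str, Any]) -> None:
        self._query_description[name] = param_info

    def default_query(self) -> Dict[str, Any]:
        # Assumption: each param_info dict keeps its default under "default".
        return {k: info.get("default") for k, info in self._query_description.items()}

    @abstractmethod
    def format(self, **query: Any) -> Any:
        ...


class SliceFormatter(FormatterSketch):
    """Returns a slice of an in-memory series, selected by the query."""

    def __init__(self, values):
        super().__init__()
        self._values = list(values)
        self.add_property("univariate")
        self.add_query_description("start", {"default": 0})
        self.add_query_description("stop", {"default": len(self._values)})

    def format(self, **query):
        # Caller-supplied values override the defaults.
        params = {**self.default_query(), **query}
        return self._values[params["start"]:params["stop"]]


formatter = SliceFormatter([3, 1, 4, 1, 5, 9])
full = formatter.format()
tail = formatter.format(start=4)
```

Merging default_query() with the caller's keyword arguments lets format() be called with no arguments at all, which is one plausible reason for exposing the defaults separately.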

tadkit.base.registry module

class tadkit.base.registry.Registry[source]

Bases: object

Registry for dynamically tracking and matching learner classes.

list_learners() → List[str][source]
match_learners(formatter: Any) → List[Type][source]

Return learner classes compatible with the given formatter.

print_catalog_classes(detailed=False)[source]
static print_compliance_miss(learner)[source]
register_learner(name: str, learner: Type | str, condition: Callable[[Any], bool], optional: bool = False)[source]

Register a learner with a compatibility condition.

Parameters:
  • name (str) – Display name for the learner.

  • learner (class or str) – The learner class or an import path (“module.submodule.ClassName”).

  • condition (callable(formatter) -> bool) – Determines whether this learner is compatible.

  • optional (bool) – If True, missing imports are ignored instead of raising.
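
The registration parameters above can be sketched with a small stand-in registry. Everything here is hypothetical and independent of tadkit: RegistrySketch, FakeFormatter and MeanLearner are illustrative names, the import-path resolution via importlib is an assumption about how a string learner might be loaded, and collections.Counter is used only as a convenient importable class.

```python
import importlib
from typing import Any, Callable, Dict, List, Tuple, Type, Union


class RegistrySketch:
    """Hypothetical stand-in; not tadkit's Registry."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Type, Callable[[Any], bool]]] = {}

    def register_learner(
        self,
        name: str,
        learner: Union[Type, str],
        condition: Callable[[Any], bool],
        optional: bool = False,
    ) -> None:
        if isinstance(learner, str):
            # Resolve "module.submodule.ClassName" to the class object.
            module_path, _, class_name = learner.rpartition(".")
            try:
                learner = getattr(importlib.import_module(module_path), class_name)
            except (ImportError, AttributeError):
                if optional:
                    return  # silently skip learners that cannot be imported
                raise
        self._entries[name] = (learner, condition)

    def list_learners(self) -> List[str]:
        return list(self._entries)

    def match_learners(self, formatter: Any) -> List[Type]:
        return [cls for cls, cond in self._entries.values() if cond(formatter)]


class FakeFormatter:
    available_properties = ["univariate"]


class MeanLearner:
    pass


registry = RegistrySketch()
registry.register_learner(
    "mean", MeanLearner, lambda f: "univariate" in f.available_properties
)
registry.register_learner(
    "counter", "collections.Counter", lambda f: "multivariate" in f.available_properties
)
registry.register_learner(
    "missing", "no_such_package_for_sketch.Learner", lambda f: True, optional=True
)

names = registry.list_learners()
matches = registry.match_learners(FakeFormatter())
```

In this sketch the optional entry is dropped because its module cannot be imported, and only the learner whose condition accepts the formatter is returned by match_learners.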

tadkit.base.tadlearner module

class tadkit.base.tadlearner.TADLearner(*args, **kwargs)[source]

Bases: Protocol

Abstract class of Time Anomaly Detection Learner (model).

Avoid inheriting from this class explicitly. Because it is a Protocol, a class conforms implicitly by providing the methods and attributes described here.

fit()[source]

Fit the learner on input data.

score_samples()[source]

The measure of normality of an observation according to the fitted model. The lower, the more abnormal.

predict()[source]

Predict whether each sample is an inlier or an outlier. For each observation, returns +1 if it should be considered an inlier according to the fitted model and -1 otherwise.

Class attributes:

  • params_description: Description of the arguments of the __init__ method. See examples in the catalog.

  • required_properties: The properties that the input data must satisfy. See examples in the catalog.

Example

>>> assert isinstance(MyLearner, TADLearner)
>>> MyLearner.required_properties  # The required property of input data
>>> MyLearner.params_description  # The description of the params
>>> params = ...  # Params to initiate learner
>>> learner = MyLearner(**params)
>>> learner.fit(X)  # X, y must satisfy MyLearner.required_properties
>>> score_sample_pred = learner.score_samples(X_test)
fit(X: ndarray | list | DataFrame, y: ndarray | list | DataFrame | None = None) → TADLearner[source]
predict(X: ndarray | list | DataFrame) → ndarray | list | DataFrame[source]

Predict if a particular sample is an outlier or not. Scikit-learn compatible.

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.

Returns:

is_inlier – For each observation, +1 if it should be considered an inlier according to the fitted model, -1 otherwise.

Return type:

ndarray of shape (n_samples,)

score_samples(X: ndarray | list | DataFrame) → ndarray | list | DataFrame[source]

The measure of normality of an observation according to the fitted model. Scikit-learn compatible.

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.

Returns:

scores – The anomaly score of the input samples. The lower, the more abnormal.

Return type:

ndarray of shape (n_samples,)
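
The implicit conformance recommended for this protocol can be illustrated with a self-contained sketch. LearnerProtocolSketch is a hypothetical stand-in for TADLearner, and marking it runtime_checkable is an assumption made so that isinstance works here. ThresholdLearner and the layout of its params_description are likewise invented for illustration; the real format is shown in the catalog, not on this page.

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class LearnerProtocolSketch(Protocol):
    """Hypothetical stand-in for the protocol described above."""

    def fit(self, X: Any, y: Any = None) -> Any: ...
    def score_samples(self, X: Any) -> Any: ...
    def predict(self, X: Any) -> Any: ...


class ThresholdLearner:
    """Conforms structurally: it does not inherit from the protocol."""

    # Hypothetical layouts, for illustration only.
    required_properties: List[str] = ["univariate"]
    params_description: Dict[str, Any] = {
        "max_distance": {"description": "Largest accepted distance to the mean.", "default": 3.0}
    }

    def __init__(self, max_distance: float = 3.0) -> None:
        self.max_distance = max_distance

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def score_samples(self, X):
        # The lower the score, the more abnormal the observation.
        return [-abs(x - self.mean_) for x in X]

    def predict(self, X):
        # +1 for inliers, -1 for outliers.
        return [1 if s >= -self.max_distance else -1 for s in self.score_samples(X)]


learner = ThresholdLearner(max_distance=3.0).fit([1.0, 1.2, 0.9, 1.1])
conforms = isinstance(learner, LearnerProtocolSketch)
predictions = learner.predict([1.0, 9.0])
```

Note that a runtime isinstance check against a protocol only verifies that the method names exist, not their signatures or return types.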

tadkit.base.typing module

Module contents