tadkit.base package

Submodules

tadkit.base.dataframe_type module

class tadkit.base.dataframe_type.DataFrameType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Data type of Dataplatform datasets.

ASYNCHRONOUS = 'asynchronous': Each series has its own “timestamp” x-axis; sensor names are in the “sensor” column and values are in the “data” column of the provided dataframe.

SYNCHRONOUS = 'synchronous': All series share a common “timestamp” x-axis, and each sensor is a column of the provided dataframe. Reserved columns such as “id”, “filename”, “minio” and “timestamp” are not sensors; the full list of reserved columns is still to be defined with DataPlatform.

static from_text(name: str)[source]: Convert the string representation to the corresponding enum member.
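The two layouts and the string conversion can be illustrated with a minimal sketch. It uses a stand-in enum that mirrors the two documented members rather than the tadkit source; the case-insensitive lookup in from_text and the helper to_synchronous are assumptions made for illustration only.

```python
from enum import Enum


class DataFrameType(Enum):
    """Stand-in mirroring the two documented members (not the tadkit source)."""

    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"

    @staticmethod
    def from_text(name: str) -> "DataFrameType":
        # Assumption: lookup is by member value, ignoring case and padding.
        return DataFrameType(name.strip().lower())


def to_synchronous(rows):
    """Pivot asynchronous (long) rows into synchronous (wide) rows.

    Hypothetical helper: each input row is a dict with "timestamp",
    "sensor" and "data" keys; each output row has one key per sensor.
    """
    by_timestamp = {}
    for row in rows:
        wide = by_timestamp.setdefault(row["timestamp"], {"timestamp": row["timestamp"]})
        wide[row["sensor"]] = row["data"]
    return [by_timestamp[t] for t in sorted(by_timestamp)]


long_rows = [
    {"timestamp": 0, "sensor": "s1", "data": 1.0},
    {"timestamp": 0, "sensor": "s2", "data": 5.0},
    {"timestamp": 1, "sensor": "s1", "data": 1.5},
]
wide_rows = to_synchronous(long_rows)
kind = DataFrameType.from_text("Synchronous")
```

In this sketch a sensor with no reading at a given timestamp is simply absent from the wide row; a real pipeline would have to decide how to fill such gaps.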

tadkit.base.formalizer module

class tadkit.base.formalizer.Formalizer[source]

Bases: ABC

Abstract base class for data formalizers (providers). A formalizer transforms data from the Confiance DataProvider into standard data for ML pipelines.

formalize()[source]: Take a data query and return associated data.

no_data_leakage(): Check if no leakage from a first data query to a second.

Properties:

query_description: Get the description of a data query.

available_properties: Get the properties that the formalized data satisfies.

Example of usage:

>>> assert issubclass(MyFormalizer, Formalizer)
>>> formalizer = MyFormalizer(**args_init)
>>> formalizer.available_properties  # Properties that the formalized data provides
>>> formalizer.query_description  # The description of the queries
>>> query_train = ...  # Query to create data, following the query description
>>> query_test = ...
>>> X_test = formalizer.formalize(**query_test)  # formalize takes the query as keyword arguments
>>> X_train = formalizer.formalize(**query_train)

abstract property available_properties: Sequence[str]

default_query()[source]

abstract formalize(**query: Dict[str, Number | str | datetime | Sequence[Number] | Sequence[str] | Sequence[datetime]]) → Array | Sequence[Array][source]

abstract property query_description: Dict[str, Dict[str, Number | str | datetime | Sequence[Number] | Sequence[str] | Sequence[datetime]]]
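As an illustration of the interface above, here is a minimal sketch of a concrete formalizer. The Formalizer class below is a stand-in written for this example, not the tadkit source. RangeFormalizer, the "univariate" property name, the shape of the query_description entries, the behaviour of default_query and the signature of no_data_leakage are all assumptions.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence


class Formalizer(ABC):
    """Stand-in for the documented interface (not the tadkit source)."""

    @property
    @abstractmethod
    def available_properties(self) -> Sequence[str]: ...

    @property
    @abstractmethod
    def query_description(self) -> Dict[str, dict]: ...

    @abstractmethod
    def formalize(self, **query): ...


class RangeFormalizer(Formalizer):
    """Hypothetical formalizer serving slices of an in-memory series."""

    def __init__(self, values: Sequence[float]):
        self._values = list(values)

    @property
    def available_properties(self) -> Sequence[str]:
        return ["univariate"]  # hypothetical property name

    @property
    def query_description(self) -> Dict[str, dict]:
        # Hypothetical schema: one entry per query parameter.
        return {
            "start": {"description": "First index (inclusive)", "default": 0},
            "stop": {"description": "Last index (exclusive)", "default": len(self._values)},
        }

    def default_query(self) -> Dict[str, int]:
        # Assumption: the default query collects each parameter's default.
        return {name: spec["default"] for name, spec in self.query_description.items()}

    def formalize(self, **query) -> List[List[float]]:
        params = {**self.default_query(), **query}
        # One row per sample, one column per feature.
        return [[v] for v in self._values[params["start"]:params["stop"]]]

    def no_data_leakage(self, query_a, query_b) -> bool:
        # Assumption: two queries leak when their index ranges overlap.
        a = {**self.default_query(), **query_a}
        b = {**self.default_query(), **query_b}
        return a["stop"] <= b["start"] or b["stop"] <= a["start"]


formalizer = RangeFormalizer([0.1, 0.2, 0.15, 0.12, 3.0, 0.11])
query_train = {"start": 0, "stop": 4}
query_test = {"start": 4, "stop": 6}
X_train = formalizer.formalize(**query_train)
X_test = formalizer.formalize(**query_test)
```

Keeping the defaults inside query_description lets default_query be derived from it, so the two cannot drift apart; this is a design choice of the sketch, not a documented requirement.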

tadkit.base.tadlearner module

class tadkit.base.tadlearner.TADLearner(*args, **kwargs)[source]

Bases: Protocol

Protocol describing a Time Anomaly Detection learner (model).

Avoid inheriting explicitly from this class. Any class that implements the methods and attributes listed below satisfies the protocol implicitly.
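Implicit conformance can be sketched with a small stand-in protocol. The Learner protocol below is not the tadkit class: it only lists the three documented methods, and it assumes the protocol is decorated with runtime_checkable so that isinstance checks work. ConstantLearner and NotALearner are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Learner(Protocol):
    """Stand-in protocol with the three documented methods."""

    def fit(self, X, y=None): ...
    def score_samples(self, X): ...
    def predict(self, X): ...


class ConstantLearner:
    """Hypothetical learner; note that it does not inherit from Learner."""

    def fit(self, X, y=None):
        return self

    def score_samples(self, X):
        return [0.0 for _ in X]

    def predict(self, X):
        return [1 for _ in X]


class NotALearner:
    """Missing score_samples and predict, so it does not conform."""

    def fit(self, X, y=None):
        return self


# isinstance only checks that the method names exist, not their signatures.
is_learner = isinstance(ConstantLearner(), Learner)
is_not_learner = isinstance(NotALearner(), Learner)
```

Because the check is structural, a third-party model with the same method names conforms without importing anything from the package.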

fit()[source]: Fit the learner on input data.

score_samples()[source]: The measure of normality of an observation according to the fitted model. The lower, the more abnormal.

predict()[source]: Predict if a particular sample is an outlier or not. For each observation, tells whether or not (+1 or -1) it should be considered as an inlier according to the fitted model.

Class attributes:

params_description: Description of the arguments of the __init__ method. See examples in the catalog.

required_properties: Get the properties that the input data must satisfy. See examples in the catalog.

Example

>>> assert isinstance(MyLearner, TADLearner)
>>> MyLearner.required_properties  # The required property of input data
>>> MyLearner.params_description  # The description of the params
>>> params = ...  # Params to initiate learner
>>> learner = MyLearner(**params)
>>> learner.fit(X)  # X, y must satisfy MyLearner.required_properties
>>> score_sample_pred = learner.score_samples(X_test)

fit(X: Array, y: Array | None = None) → TADLearner[source]

params_description: Dict[str, Dict[str, Number | str | datetime | Sequence[Number] | Sequence[str] | Sequence[datetime]]] = {}

predict(X: Array) → Array[source]

Predict if a particular sample is an outlier or not. Scikit-learn compatible.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.
Returns: is_inlier – For each observation, +1 if it should be considered an inlier according to the fitted model, -1 otherwise.
Return type: ndarray of shape (n_samples,)

required_properties: Sequence[str] = []

score_samples(X: Array) → Array[source]

The measure of normality of an observation according to the fitted model. Scikit-learn compatible.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.
Returns: scores – The anomaly score of the input samples. The lower, the more abnormal.
Return type: ndarray of shape (n_samples,)
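The scoring and labelling conventions can be sketched with a toy detector. ZScoreLearner is hypothetical and not part of tadkit: it scores a univariate sample by its negated absolute z-score, so lower scores are more abnormal, and labels samples +1 or -1 by comparing the score to a threshold. The shape of params_description is an assumption, and plain lists stand in for arrays.

```python
from statistics import mean, pstdev


class ZScoreLearner:
    """Hypothetical univariate detector; not part of tadkit."""

    required_properties = []  # no constraint on the input data
    params_description = {
        "threshold": {"description": "Score below which a sample is an outlier", "default": -3.0},
    }  # hypothetical schema

    def __init__(self, threshold=-3.0):
        self.threshold = threshold

    def fit(self, X, y=None):
        values = [row[0] for row in X]
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid dividing by zero
        return self

    def score_samples(self, X):
        # Negated absolute z-score: the lower, the more abnormal.
        return [-abs(row[0] - self.mean_) / self.std_ for row in X]

    def predict(self, X):
        # +1 for inliers, -1 for outliers.
        return [1 if s >= self.threshold else -1 for s in self.score_samples(X)]


X_train = [[0.10], [0.20], [0.15], [0.12], [0.18]]
X_test = [[0.14], [3.00]]
learner = ZScoreLearner(threshold=-3.0).fit(X_train)
scores = learner.score_samples(X_test)
labels = learner.predict(X_test)
```

Here the second test sample lies far from the training mean, so it receives the lower score and the label -1, while the first is labelled +1.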

tadkit.base.typing module

Module contents