tadkit.base package
Submodules
tadkit.base.dataframe_type module
- class tadkit.base.dataframe_type.DataFrameType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
Data type of DataPlatform datasets.
- ASYNCHRONOUS = 'asynchronous'
Each data series has its own “timestamp” x-axis; sensor names are found in the “sensor” column and values in the “data” column of the provided dataframe.
- SYNCHRONOUS = 'synchronous'
All data share a common “timestamp” x-axis; sensor names are the columns of the provided dataframe (except for “id”, “filename”, “minio”, “timestamp” and possibly others, to be specified more precisely with DataPlatform).
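The two layouts can be illustrated with a minimal sketch. It re-declares the enum locally so that it runs without tadkit installed, and it uses plain lists of dicts in place of real dataframes; the sample rows and the `sensors_of` helper are hypothetical, not part of the package.

```python
from enum import Enum


class DataFrameType(Enum):
    """Local stand-in mirroring the two documented members."""
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"


# Asynchronous layout: one row per measurement, each with its own timestamp.
async_rows = [
    {"timestamp": 0.0, "sensor": "temp", "data": 20.1},
    {"timestamp": 0.5, "sensor": "pressure", "data": 1.01},
    {"timestamp": 1.0, "sensor": "temp", "data": 20.4},
]

# Synchronous layout: one row per shared timestamp, one column per sensor.
sync_rows = [
    {"timestamp": 0.0, "temp": 20.1, "pressure": 1.00},
    {"timestamp": 1.0, "temp": 20.4, "pressure": 1.02},
]


def sensors_of(rows, kind):
    """Return the sorted sensor names found in `rows` for the given layout."""
    if kind is DataFrameType.ASYNCHRONOUS:
        return sorted({row["sensor"] for row in rows})
    # Metadata columns named in the description above are not sensors.
    reserved = {"id", "filename", "minio", "timestamp"}
    return sorted({key for row in rows for key in row if key not in reserved})
```

Both calls, `sensors_of(async_rows, DataFrameType.ASYNCHRONOUS)` and `sensors_of(sync_rows, DataFrameType.SYNCHRONOUS)`, yield the same sensor names; only the place where those names live differs between the layouts.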
tadkit.base.formalizer module
- class tadkit.base.formalizer.Formalizer[source]
Bases:
ABC
Abstract class of data formalizer (provider). Transforms Data from Confiance DataProvider into standard Data for ML pipelines.
- no_data_leakage()
Check that no data leaks from a first data query into a second.
- Properties:
query_description: Get the description of a data query.
available_properties: Get the properties that the formalized data satisfies.
- Example of usage:
>>> assert issubclass(MyFormalizer, Formalizer)
>>> formalizer = MyFormalizer(**args_init)
>>> formalizer.available_properties  # The provided property of the formalized data
>>> formalizer.query_description  # The description of the queries
>>> query_train = ...  # Query to create data, following the query description
>>> query_test = ...
>>> X_test = formalizer.formalize(query_test)
>>> X_train = formalizer.formalize(query_train)
- abstract property available_properties: Sequence[str]
- abstract formalize(**query: Dict[str, Number | str | datetime | Sequence[Number] | Sequence[str] | Sequence[datetime]]) Array | Sequence[Array] [source]
- abstract property query_description: Dict[str, Dict[str, Number | str | datetime | Sequence[Number] | Sequence[str] | Sequence[datetime]]]
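As a rough illustration of how the abstract members fit together, here is a minimal sketch of a concrete formalizer. The `Formalizer` base below is a local stand-in that only mirrors the documented abstract members (so the snippet runs without tadkit), and `RangeFormalizer`, its `start`/`stop` query parameters, the `"univariate"` property name, and the layout of the description dictionary are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence


class Formalizer(ABC):
    """Local stand-in for the documented abstract interface."""

    @property
    @abstractmethod
    def available_properties(self) -> Sequence[str]: ...

    @property
    @abstractmethod
    def query_description(self) -> Dict[str, dict]: ...

    @abstractmethod
    def formalize(self, **query): ...


class RangeFormalizer(Formalizer):
    """Hypothetical formalizer serving slices of an in-memory series."""

    def __init__(self, values: Sequence[float]):
        self._values = list(values)

    @property
    def available_properties(self) -> Sequence[str]:
        return ["univariate"]  # hypothetical property name

    @property
    def query_description(self) -> Dict[str, dict]:
        # Hypothetical schema: one entry per accepted query parameter.
        return {
            "start": {"description": "First index (inclusive)", "default": 0},
            "stop": {"description": "Last index (exclusive)",
                     "default": len(self._values)},
        }

    def formalize(self, **query) -> List[List[float]]:
        start = query.get("start", 0)
        stop = query.get("stop", len(self._values))
        # One row per sample, one column per feature.
        return [[v] for v in self._values[start:stop]]


formalizer = RangeFormalizer([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X_train = formalizer.formalize(start=0, stop=4)
X_test = formalizer.formalize(start=4, stop=6)
```

Because `formalize` is declared with `**query`, the sketch passes query parameters as keyword arguments. The train and test queries here cover disjoint index ranges, which is the kind of condition a leakage check between two queries would presumably verify.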
tadkit.base.tadlearner module
- class tadkit.base.tadlearner.TADLearner(*args, **kwargs)[source]
Bases:
Protocol
Abstract class of Time Anomaly Detection Learner (model).
Avoid inheriting from this class explicitly. Since it is a Protocol, it is better to satisfy it implicitly by implementing its methods and class attributes.
- score_samples()[source]
The measure of normality of an observation according to the fitted model. The lower, the more abnormal.
- predict()[source]
Predict if a particular sample is an outlier or not. For each observation, returns +1 if it should be considered an inlier according to the fitted model, and -1 otherwise.
- Class attributes:
params_description: Description of the arguments of the __init__ method. See examples in the catalog.
required_properties: Get the properties that the input data must satisfy. See examples in the catalog.
Example
>>> assert isinstance(MyLearner, TADLearner)
>>> MyLearner.required_properties  # The required property of input data
>>> MyLearner.params_description  # The description of the params
>>> params = ...  # Params to initiate learner
>>> learner = MyLearner(**params)
>>> learner.fit(X)  # X, y must satisfy MyLearner.required_properties
>>> score_sample_pred = learner.score_samples(X_test)
- fit(X: Array, y: Array | None = None) TADLearner [source]
- params_description: Dict[str, Dict[str, Number | str | datetime | Sequence[Number] | Sequence[str] | Sequence[datetime]]] = {}
- predict(X: Array) Array [source]
Predict if a particular sample is an outlier or not. Scikit-learn compatible.
- Parameters:
X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.
- Returns:
is_inlier – For each observation, +1 if it should be considered an inlier according to the fitted model, and -1 otherwise.
- Return type:
ndarray of shape (n_samples,)
- required_properties: Sequence[str] = []
- score_samples(X: Array) Array [source]
The measure of normality of an observation according to the fitted model. Scikit-learn compatible.
- Parameters:
X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The input samples.
- Returns:
scores – The anomaly score of the input samples. The lower, the more abnormal.
- Return type:
ndarray of shape (n_samples,)
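The protocol described above can be satisfied without inheritance, as the following minimal sketch suggests. The `TADLearner` protocol here is a local, runtime-checkable stand-in listing only the documented methods, so the snippet runs without tadkit; `ZScoreLearner`, its `threshold` parameter, and the layout of `params_description` are hypothetical, and plain lists stand in for arrays.

```python
import statistics
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class TADLearner(Protocol):
    """Local stand-in for the documented protocol (methods only)."""

    def fit(self, X, y=None): ...
    def score_samples(self, X): ...
    def predict(self, X): ...


class ZScoreLearner:
    """Hypothetical learner; note that it does not inherit from TADLearner."""

    required_properties: Sequence[str] = []
    params_description = {
        "threshold": {"description": "Max |z-score| for an inlier",
                      "default": 3.0},
    }

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold

    def fit(self, X, y=None):
        values = [row[0] for row in X]
        self.mean_ = statistics.fmean(values)
        # Guard against a zero standard deviation on constant data.
        self.std_ = statistics.pstdev(values) or 1.0
        return self

    def score_samples(self, X):
        # Negated |z-score|: the lower the score, the more abnormal the sample.
        return [-abs(row[0] - self.mean_) / self.std_ for row in X]

    def predict(self, X):
        # +1 for inliers, -1 for outliers.
        return [1 if s >= -self.threshold else -1
                for s in self.score_samples(X)]


learner = ZScoreLearner(threshold=3.0).fit([[9.0], [10.0], [11.0], [10.0]])
labels = learner.predict([[10.5], [25.0]])
```

Here `isinstance(learner, TADLearner)` holds purely because the three methods exist, and `predict` is derived from `score_samples` by thresholding, which keeps the "lower is more abnormal" convention consistent between the two methods.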