uqmodels.preprocessing package

Submodules

uqmodels.preprocessing.Custom_Preprocessor module

class uqmodels.preprocessing.Custom_Preprocessor.Generic_Features_processor(name='Generic_Features_processor', cache=None, structure=None, update_query=None, list_params_features=[], list_fit_features=[], list_compute_features=[], list_update_params_features=None, list_params_targets=[], list_fit_targets=[], list_compute_targets=[], list_update_params_targets=None, normalise_data=False, normalise_context=False, dataset_formalizer=None, min_size=1, concat_features=False, concat_targets=True, **kwargs)[source]

Bases: Preprocessor

fit(data, query={}, **kwargs)[source]

Fit Preprocessing using data and fit_function procedure

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data

  • save_formaliser (bool, optional) – boolean flag indicating whether to save the preprocessor

fit_transform(data, query={}, **kwarg)[source]

Fit Processor and apply it on data

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data.

Return

data : Preprocessed data

transform(data, query={}, training=True, **kwarg)[source]

Apply transform_function to data

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data

Return

data : Preprocessed data

class uqmodels.preprocessing.Custom_Preprocessor.dict_to_TS_Dataset(name='dict_to_TS_Dataset')[source]

Bases: Preprocessor

fit(data, query)[source]

Do nothing

transform(data, query)[source]

Provide the dataset as a list of arrays: [X, y, context, train, test, X_split]

uqmodels.preprocessing.Custom_Preprocessor.init_Features_processor(name='Features_processor', dict_params_FE_ctx=None, dict_params_FE_dyn=None, dict_params_FE_targets=None, update_params_FE_ctx=None, update_params_FE_dyn=None, update_params_FE_targets=None, normalise_data=False, normalise_context=False, dataset_formalizer=None, min_size=1, structure=None, cache=None)[source]

uqmodels.preprocessing.Preprocessor module

class uqmodels.preprocessing.Preprocessor.Generic_Preprocessor(name='Generic_preprocessor', cache=None, structure=None, update_query=None, fit_function=<function fit_default>, transform_function=<function transform_default>, **kwargs)[source]

Bases: Preprocessor

fit(data, query={})[source]
Apply fit_function on data with query as query and self.structure as metadata.
If query has a "source" attribute:

try to access the corresponding substructure with structure.get_structure(query[source])

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data

  • save_formaliser (bool, optional) – boolean flag indicating whether to save the preprocessor

transform(data, query={}, **kwarg)[source]
Apply transform_function on data with query as query and self.structure as metadata.
If query has a "source" attribute:

try to access the corresponding substructure with structure.get_structure(query[source])

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data

Return

data : Preprocessed data
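The delegation pattern described for Generic_Preprocessor (user-supplied fit and transform functions called with the data, the query and the structure) can be illustrated with a stand-in class. This is a minimal sketch that does not import uqmodels: `SketchPreprocessor`, `fit_center`, `transform_center` and the `params` attribute are hypothetical names, the default transform is assumed to return the data unchanged, and caching and sub-structure lookup are left out.

```python
# Stand-in sketch only: mimics the documented call pattern, not the uqmodels source.

def fit_default(self, data, query=None, structure=None):
    """Fit function that does nothing."""
    return None


def transform_default(self, data, query=None, structure=None):
    """Transform function assumed to return the data unchanged."""
    return data


class SketchPreprocessor:
    def __init__(self, name="sketch", structure=None,
                 fit_function=fit_default, transform_function=transform_default):
        self.name = name
        self.structure = structure
        self.fit_function = fit_function
        self.transform_function = transform_function
        self.params = {}  # hypothetical storage for fitted values

    def fit(self, data, query=None):
        # Delegate to the user function, passing the structure as metadata.
        self.fit_function(self, data, query, self.structure)

    def transform(self, data, query=None):
        return self.transform_function(self, data, query, self.structure)

    def fit_transform(self, data, query=None):
        self.fit(data, query)
        return self.transform(data, query)


# Hypothetical user functions: learn the mean in fit, subtract it in transform.
def fit_center(self, data, query=None, structure=None):
    self.params["mean"] = sum(data) / len(data)


def transform_center(self, data, query=None, structure=None):
    return [x - self.params["mean"] for x in data]


proc = SketchPreprocessor(fit_function=fit_center, transform_function=transform_center)
centered = proc.fit_transform([1.0, 2.0, 3.0])
print(centered)  # -> [-1.0, 0.0, 1.0]
```

Passing the instance as the first argument lets the user functions store fitted state on the preprocessor, which matches the `self` parameter in the documented fit_default and transform_default signatures.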

class uqmodels.preprocessing.Preprocessor.Preprocessor(name='formaliser', cache=None, structure=None, update_query=None, **kwargs)[source]

Bases: Processor

default_update_query(query, name)[source]
fit(data=None, query={}, save_preprocessor=False)[source]

Fit Preprocessing using data

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data

  • save_formaliser (bool, optional) – boolean flag indicating whether to save the preprocessor

fit_transform(data=None, query={})[source]

Fit Processor and apply it on data

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data.

Return

data : Preprocessed data

get(keys, default_value=None)[source]

Get obj from structure using structure.get

Parameters:
  • keys (_type_) – key or list of keys related to attributes to get

  • default_value (_type_, optional) – default_value if no attribute. Defaults to None.

load(query={}, name='data')[source]

Load the Preprocessor stored at the query+name location using cache_manager and use its parameters

Parameters:
  • query (dict, optional) – query parameters. Defaults to None.

  • name (_type_, optional) – filename of obj to load. Defaults to None.

save(query={}, object=None, name='data')[source]

Save method to store object at query+name location using cache_manager

Parameters:
  • query (dict, optional) – dict_query that generated the data.

  • object (obj, optional) – object to store. Defaults to None.

  • name (_type_, optional) – filename of obj to store. Defaults to None.

set(key, obj)[source]

Set obj in structure using structure.set

Parameters:
  • key (_type_) – key of the attribute to set

  • obj (_type_) – _description_

transform(data=None, query={})[source]

Apply Preprocessor to data

Parameters:
  • data (obj, optional) – data. Defaults to None.

  • query – dict_query that generated the data

Return

data : Preprocessed data

update_query(query={})[source]

Apply the update_query_function provided at init to update query

Parameters:

query (dict) – dict_query that generated the data.

Returns:

updated query

Return type:

new_query

use_cache(query={})[source]

Use the cache manager to check whether a cache entry is linked to data already processed

Parameters:

query (dict) – dict_query that generated the data.

Raises:

FileNotFoundError – cache not found; the error is caught by the method that called use_cache

Returns:

the cached data if the file is found; otherwise an error is raised

Return type:

data
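The cache lookup contract above (return the data if a cached file exists, otherwise raise FileNotFoundError for the caller to handle) can be sketched with the standard library. This is an illustration under assumptions, not the uqmodels cache manager: the file layout in `cache_path`, the use of pickle, and the `load_or_compute` helper are all hypothetical.

```python
import os
import pickle
import tempfile


def cache_path(root, query, name="data"):
    # Hypothetical layout: one pickle file per (query, name) pair.
    key = "_".join(f"{k}-{query[k]}" for k in sorted(query))
    return os.path.join(root, f"{key}_{name}.pkl")


def use_cache(root, query, name="data"):
    """Return cached data for the query, or raise FileNotFoundError."""
    path = cache_path(root, query, name)
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    with open(path, "rb") as handle:
        return pickle.load(handle)


def load_or_compute(root, query, compute):
    """Caller side: catch the cache miss, compute, then store the result."""
    try:
        return use_cache(root, query)
    except FileNotFoundError:
        data = compute(query)
        with open(cache_path(root, query), "wb") as handle:
            pickle.dump(data, handle)
        return data


calls = []


def compute(query):
    calls.append(query["source"])  # track how often the computation really runs
    return [1, 2, 3]


root = tempfile.mkdtemp()
first = load_or_compute(root, {"source": "sensor_a"}, compute)
second = load_or_compute(root, {"source": "sensor_a"}, compute)
print(first == second, len(calls))  # -> True 1
```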

uqmodels.preprocessing.Preprocessor.fit_default(self, data, query={}, structure=None)[source]

Fit function that does nothing

Parameters:
  • data (obj) – data

  • query (dict) – dict_query that generated the data.

  • structure (structure obj, optional) – structure object that provide all meta information about data.

uqmodels.preprocessing.Preprocessor.transform_default(self, data, query={}, structure=None)[source]

Transform function that does nothing

Parameters:
  • data (obj) – data

  • query (dict) – dict_query that generated the data.

  • structure (structure obj, optional) – structure object that provide all meta information about data.

uqmodels.preprocessing.data_loader module

class uqmodels.preprocessing.data_loader.TS_csv_Data_loader(data_loader_api=<function read>)[source]

Bases: Data_loader

load(dict_query)[source]

Load from a dict_query that will be provided to the data_loader_api function

Parameters:

dict_query (dict) – query as a dict that contains the arguments of self.data_loader_api

Raises:

FileNotFoundError – error if file not found

Returns:

selected_data loaded by the data_loader_api function from the dict_query

Return type:

selected_data

uqmodels.preprocessing.features_processing module

Data preprocessing module.

uqmodels.preprocessing.features_processing.automatic_periods_detection(array)[source]
uqmodels.preprocessing.features_processing.build_window_representation(y, step=1, window=10)[source]
uqmodels.preprocessing.features_processing.check_transform_input_to_panda(input, name='')[source]
Check if input is dataframe.

If it's a np.ndarray, turn it into a dataframe; otherwise raise an error.

Parameters:
  • input (_type_) – input to check or transform

  • name (str, optional) – name of input

Raises:

TypeError – Input has the wrong type

Returns:

pd.DataFrame

Return type:

input

uqmodels.preprocessing.features_processing.compute_FE_by_estimator(data, context, ind_data=None, ind_context=None, estimator=None, estimator_params={}, data_lag=[1], params_=None, **kwargs)[source]
uqmodels.preprocessing.features_processing.compute_MV_features(data, context, ind_data=None, ind_context=None, focus=None, n_components=3, n_neighboor=4, lags=[0], derivs=[0], windows=[1], params_=None, **kwargs)[source]

Naive FE function: fit function that selects the features most strongly correlated with the targets, then computes a PCA synthesis of them

Parameters:
  • data (_type_) – _description_

  • context (_type_) – _description_

  • ind_data (_type_, optional) – _description_. Defaults to None.

  • ind_context (_type_, optional) – _description_. Defaults to None.

  • focus (_type_, optional) – _description_. Defaults to None.

  • n_components (int, optional) – _description_. Defaults to 3.

  • n_neighboor (int, optional) – _description_. Defaults to 4.

  • lags (list, optional) – _description_. Defaults to [0].

  • derivs (list, optional) – _description_. Defaults to [0].

  • windows (list, optional) – _description_. Defaults to [1].

  • params_ (_type_, optional) – _description_. Defaults to None.

Returns:

_description_

Return type:

_type_

uqmodels.preprocessing.features_processing.compute_ctx_features(data, context, ind_data=None, ind_context=None, n_components=3, lag=0, params_=None, **kwargs)[source]

Produce contextual information by applying a PCA on ctx_measure + nan_series if provided

Parameters:
  • list_channels (_type_) – ctx_sources to synthesize 2D (times, features) array

  • nan_series (_type_, optional) – nan_series: location of sensor issues.

  • list_target_channels (list, optional) – Defaults to [0].

Returns:

X_ctx

uqmodels.preprocessing.features_processing.compute_feature_engeenering(data, context=None, dict_FE_params={}, params_=None)[source]
uqmodels.preprocessing.features_processing.compute_pca(data, context=None, n_components=3, data_lag=1, ind_data=None, ind_context=None, params_=None, **kwargs)[source]

Fit & compute for PCA feature generation: compute PCA from the selected data & context and from params_, which contains the fitted PCA.

If params_ is None, call fit_pca to get a fitted PCA_model

Parameters:
  • data (ndarray) – data

  • context (ndarray, optional) – context_data. Defaults to None.

  • n_components (int, optional) – n_components of pca. Defaults to 3.

  • ind_data (ind_array, optional) – selected data. Defaults to None: all dims are picked

  • ind_context (ind_array, optional) – selected context data. Defaults to None: all dims are picked if there is context

Returns:

data_reduced,PCA_model

uqmodels.preprocessing.features_processing.compute_tsfresh_feature_engeenering(data, context=None, window=10, step=10, ind_data=None, ind_context=None, params_=None, **kwargs)[source]
uqmodels.preprocessing.features_processing.fit_FE_by_estimator(data, context, ind_data=None, ind_context=None, estimator=None, estimator_params={}, data_lag=[1], **kwargs)[source]
uqmodels.preprocessing.features_processing.fit_MV_features(data, context, ind_data=None, ind_context=None, focus=None, n_components=3, n_neighboor=4, lags=[0], derivs=[0], windows=[1], **kwargs)[source]

Naive FE function: fit function that selects the features most strongly correlated with the targets, then computes a PCA synthesis of them

Parameters:
  • data (_type_) – _description_

  • context (_type_) – _description_

  • ind_data (_type_, optional) – _description_. Defaults to None.

  • ind_context (_type_, optional) – _description_. Defaults to None.

  • focus (_type_, optional) – _description_. Defaults to None.

  • n_components (int, optional) – _description_. Defaults to 3.

  • n_neighboor (int, optional) – _description_. Defaults to 4.

  • lags (list, optional) – _description_. Defaults to [0].

  • derivs (list, optional) – _description_. Defaults to [0].

  • windows (list, optional) – _description_. Defaults to [1].

Returns:

_description_

Return type:

_type_

uqmodels.preprocessing.features_processing.fit_compute_MA_derivate(data, context=None, ind_data=None, ind_context=None, windows=[1], lags=[0], derivs=[0], params=None, **kwargs)[source]

Compute the moving average (MA) of the last `window` values, then apply lags, then derivatives, and return the values. A 1-lag is applied by default.
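The chain described for fit_compute_MA_derivate (moving average, then lag, then differentiation) can be sketched on a plain list. This is an assumption-laden illustration rather than the library code: `moving_average`, `shift`, `derivate` and `ma_lag_derivate` are hypothetical helpers, the early moving-average windows are truncated, lagged values are padded by repeating the first value, and the derivative is a first difference padded with 0.0.

```python
def moving_average(y, window):
    """Trailing mean over the last `window` values (shorter at the start)."""
    out = []
    for t in range(len(y)):
        chunk = y[max(0, t - window + 1):t + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def shift(y, lag):
    """Delay the series by `lag` steps, repeating the first value as padding."""
    return [y[max(t - lag, 0)] for t in range(len(y))]


def derivate(y, order):
    """Apply `order` successive first differences, padding with 0.0."""
    out = list(y)
    for _ in range(order):
        out = [0.0] + [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


def ma_lag_derivate(y, window=1, lag=0, deriv=0):
    # Order of operations follows the description: MA, then lag, then derivative.
    return derivate(shift(moving_average(y, window), lag), deriv)


y = [1.0, 3.0, 5.0, 7.0]
result = ma_lag_derivate(y, window=2, lag=1, deriv=1)
print(result)  # -> [0.0, 0.0, 1.0, 2.0]
```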

uqmodels.preprocessing.features_processing.fit_compute_lag(data, context=None, lag=[0], delay=0, ind_data=None, ind_context=None, params=None, **kwargs)[source]

Create lag features from a numerical array

Parameters:
  • Y (float array) – Target from which to extract lag features

  • lag (int, optional) – Lag number. Defaults to 3.

  • delay (int, optional) – Delay before the first lag feature. Defaults to 0.
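As a rough illustration of lag features with a delay, the sketch below builds one column per lag from a 1D series. The helper name `lag_features` is hypothetical, and two points are assumptions rather than documented behavior: each lag is offset by `delay`, and rows without enough history are padded with NaN.

```python
def lag_features(y, lags=(1, 2), delay=0):
    """Return one row per time step and one column per lag."""
    nan = float("nan")
    rows = []
    for t in range(len(y)):
        row = []
        for lag in lags:
            src = t - lag - delay  # index of the past value used as feature
            row.append(y[src] if src >= 0 else nan)
        rows.append(row)
    return rows


y = [10.0, 11.0, 12.0, 13.0, 14.0]
features = lag_features(y, lags=(1, 2))
print(features[2:])  # -> [[11.0, 10.0], [12.0, 11.0], [13.0, 12.0]]
```

With `delay=1` and `lags=(1,)`, the feature at step 3 would be `y[1]`, that is, the value two steps back.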

uqmodels.preprocessing.features_processing.fit_compute_lag_values(data, context=None, ind_data=None, ind_context=None, derivs=[0], windows=[1], lag=[0], delay=0, params=None, **kwargs)[source]

Turn step_scale context array into cos/sin periodic features

Parameters:
  • context (_type_) – context_data

  • ind_context (_type_) – ind of step_scale

  • modularity (_type_) – modularity of data

  • freq (list, optional) – frequency of sin/cos. Defaults to [1].

uqmodels.preprocessing.features_processing.fit_compute_periods(data, context=None, ind_data=None, ind_context=None, periodicities=[1], freqs=[1], params_=None, **kwargs)[source]

Turn step_scale context array into cos/sin periodic features

Parameters:
  • context (_type_) – context_data

  • ind_context (_type_) – ind of step_scale

  • modularity (_type_) – modularity of data

  • freq (list, optional) – frequency of sin/cos. Defaults to [1].
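Turning a step scale into cos/sin features can be sketched as follows. The exact encoding used by fit_compute_periods is not shown in this reference, so the formula below (angle = 2π · freq · (t mod period) / period, with one cos/sin pair per periodicity and frequency) is an assumption, and `periodic_features` is a hypothetical helper.

```python
import math


def periodic_features(steps, periodicities=(24,), freqs=(1,)):
    """Encode each step as cos/sin pairs, one pair per (period, freq)."""
    rows = []
    for t in steps:
        row = []
        for period in periodicities:
            for freq in freqs:
                angle = 2.0 * math.pi * freq * (t % period) / period
                row.extend([math.cos(angle), math.sin(angle)])
        rows.append(row)
    return rows


# Hourly steps with a daily period: step 24 maps to the same point as step 0.
features = periodic_features([0, 6, 12, 24], periodicities=(24,), freqs=(1,))
for row in features:
    print([round(v, 3) for v in row])
```

The cos/sin pair keeps the end of a period close to its start, which a raw step counter or a modulo value would not.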

uqmodels.preprocessing.features_processing.fit_ctx_features(data, context, ind_data=None, ind_context=None, n_components=3, lags=[0], **kwargs)[source]

Produce contextual information by applying a PCA on ctx_measure + nan_series if provided

Parameters:
  • list_channels (_type_) – ctx_sources to synthesize 2D (times, features) array

  • nan_series (_type_, optional) – nan_series: location of sensor issues.

  • list_target_channels (list, optional) – Defaults to [0].

Returns:

X_ctx

uqmodels.preprocessing.features_processing.fit_feature_engeenering(data, context=None, dict_FE_params={}, **kwargs)[source]
uqmodels.preprocessing.features_processing.fit_pca(data, context=None, n_components=3, data_lag=1, ind_data=None, ind_context=None, **kwargs)[source]

Fit&Compute for PCA features generation: fit PCA from selected data & context.

Parameters:
  • data (ndarray) – data

  • context (ndarray, optional) – context_data. Defaults to None.

  • n_components (int, optional) – n_components of pca. Defaults to 3.

  • ind_data (ind_array, optional) – selected data. Defaults to None: all dims are picked

  • ind_context (ind_array, optional) – selected context data. Defaults to None: all dims are picked if there is context

uqmodels.preprocessing.features_processing.fit_tsfresh_feature_engeenering(data, context=None, window=10, step=None, ts_fresh_params=None, ind_data=None, ind_context=None, **kwargs)[source]
uqmodels.preprocessing.features_processing.get_FE_params(delta=None)[source]

Provide default parameters for feature engineering

Parameters:

delta (_type_, optional) – resample step parameters

uqmodels.preprocessing.features_processing.mask_corr_feature_target(X, y, v_seuil=0.05)[source]
uqmodels.preprocessing.features_processing.normalise_panda(dataframe, mode, scaler=None)[source]

Apply normalisation on a dataframe

Parameters:
  • dataframe (_type_) – _description_

  • mode (_type_) – _description_

Returns:

_description_

Return type:

_type_

uqmodels.preprocessing.features_processing.select_data(data, context=None, ind_data=None, **kwargs)[source]

Select data using the ind_data index array

Parameters:
  • data (ndarray) – data

  • ind_data (ind_array, optional) – selected data. Defaults to None: all dims are picked

Returns:

Ndarray that contains np.concatenation of all selected features

Return type:

data_selected

uqmodels.preprocessing.features_processing.select_data_and_context(data, context=None, ind_data=None, ind_context=None, **kwargs)[source]

Select data and context using ind_data & ind_context.

Parameters:
  • data (ndarray) – data

  • context (ndarray, optional) – context_data. Defaults to None.

  • n_components (int, optional) – n_components of pca. Defaults to 3.

  • ind_data (ind_array, optional) – selected data. Defaults to None: all dims are picked

  • ind_context (ind_array, optional) – selected context data. Defaults to None: all dims are picked if there is context

Returns:

Ndarray that contains np.concatenation of all selected features

Return type:

data_selected

uqmodels.preprocessing.features_processing.select_features_from_FI(X, y, model='RF', threesold=0.01, **kwargs)[source]
uqmodels.preprocessing.features_processing.select_tsfresh_params(list_keys=['variance', 'skewness', 'fft', 'cwt', 'fourrier', 'meantrend'])[source]

uqmodels.preprocessing.preprocessing module

Data preprocessing module.

uqmodels.preprocessing.preprocessing.Past_Moving_window_mapping(array, deta, window_size=None)[source]
uqmodels.preprocessing.preprocessing.Regular_Moving_window_mapping(array, deta, window_size, mode='left', **kwargs)[source]
uqmodels.preprocessing.preprocessing.add_row(df, date_pivot, mode='first')[source]

Add a first or last np.nan row to df with date_pivot as the index value.

Parameters:
  • df (_type_) – dataframe

  • date_pivot (_type_) – index

  • mode (str, optional) – 'first' or 'last'. Defaults to 'first'.

Returns:

dataframe augmented with one row

Return type:

df

uqmodels.preprocessing.preprocessing.auto_corr_reduce(set_)[source]
uqmodels.preprocessing.preprocessing.check_is_pd_date(date)[source]
uqmodels.preprocessing.preprocessing.compute_corr_and_filter(data)[source]
uqmodels.preprocessing.preprocessing.corrcoef_reduce(set_)[source]
uqmodels.preprocessing.preprocessing.dataset_generator_from_array(X, y, context=None, objective=None, sk_split=TimeSeriesSplit(gap=0, max_train_size=None, n_splits=5, test_size=None), repetition=1, remove_from_train=None, attack_name='', cv_list_name=None)[source]

Produce data_generator (iterable [X, y, X_split, context, objective, name]) from arrays

Parameters:
  • X (array) – Inputs.

  • y (array or None) – Targets.

  • context (array or None) – Additional information.

  • objective (array or None) – Ground truth (Unsupervised task).

  • sk_split (split strategy) – Sklearn split strategy.

uqmodels.preprocessing.preprocessing.df_interpolation_and_fusion(list_df, target_index_scale, dtype='datetime64[s]')[source]

Interpolation of all sources onto the same temporal reference

Parameters:
  • list_df (list of 2D array) – List of dataframes

  • target_index_scale (_type_) – Index of sensors

  • dtype

Returns:

List of interpolated array

Return type:

interpolated_data

uqmodels.preprocessing.preprocessing.df_selection(df, start_date=None, end_date=None)[source]

Format the dataframe to obtain a new version that starts at start_date and ends at end_date

Parameters:
  • df (_type_) – dataframe

  • start_date (_type_, optional) – start_date or None. Defaults to None: do nothing

  • end_date (_type_, optional) – end_date or None. Defaults to None: do nothing

Returns:

Time-formatted dataframe

Return type:

dataframe

uqmodels.preprocessing.preprocessing.downscale_series(dataframe, delta, offset='-1ms', start_date=None, end_date=None, mode='mean', dtype='datetime64[s]', **kwargs)[source]
uqmodels.preprocessing.preprocessing.entropy(y, set_val, v_bins=100)[source]

Compute a naive entropy score of y by tokenizing values into at most v_bins bins
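A binned entropy score of this kind can be sketched as a Shannon entropy over a histogram. This is one plausible reading, not the library implementation: `binned_entropy` is a hypothetical helper, equal-width bins and the natural logarithm are assumptions, and the `set_val` argument of the documented function is not modelled because its role is not described here.

```python
import math
from collections import Counter


def binned_entropy(y, v_bins=100):
    """Shannon entropy (in nats) of y after equal-width binning."""
    lo, hi = min(y), max(y)
    if hi == lo:
        return 0.0  # a constant series carries no information
    # Never use more bins than distinct values, and at most v_bins.
    n_bins = min(v_bins, len(set(y)))
    width = (hi - lo) / n_bins
    # Map each value to a bin token; the maximum goes into the last bin.
    tokens = [min(int((v - lo) / width), n_bins - 1) for v in y]
    counts = Counter(tokens)
    n = len(y)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


print(binned_entropy([1.0, 1.0, 1.0]))       # -> 0.0
print(binned_entropy([0.0, 0.0, 1.0, 1.0]))  # close to log(2)
```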

uqmodels.preprocessing.preprocessing.extract_sensors_errors(series, type_sensor_error=[])[source]

Extract list of non floating values

Parameters:
  • series (_type_) – series of sensor_values

  • type_sensor_error (list, optional) – list of others errors.

uqmodels.preprocessing.preprocessing.fft_reduce(set_)[source]
uqmodels.preprocessing.preprocessing.get_event_into_series(list_events, index_scale, n_type_event, dtype='datetime64[s]')[source]

Locate sensor error flags in a regular time reference

uqmodels.preprocessing.preprocessing.get_k_composants(mat_corr, n_cible)[source]
uqmodels.preprocessing.preprocessing.handle_nan(y)[source]

Replace nan values by last values
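Replacing NaN values by the last observed value is a forward fill. The sketch below shows the idea on a plain list; `fill_with_last` is a hypothetical helper, and leaving leading NaN values untouched (since no previous value exists) is an assumption about edge-case behavior.

```python
import math


def fill_with_last(y):
    """Forward-fill NaN entries with the most recent non-NaN value."""
    out = []
    last = None
    for v in y:
        if isinstance(v, float) and math.isnan(v):
            # Keep the NaN if nothing has been observed yet.
            out.append(last if last is not None else v)
        else:
            out.append(v)
            last = v
    return out


nan = float("nan")
filled = fill_with_last([1.0, nan, nan, 4.0, nan])
print(filled)  # -> [1.0, 1.0, 1.0, 4.0, 4.0]
```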

uqmodels.preprocessing.preprocessing.identity(x, **kwargs)[source]
uqmodels.preprocessing.preprocessing.identity_split(X_fit, y_fit, X_calib, y_calib)[source]

Identity splitter that wraps an already existing data assignment

uqmodels.preprocessing.preprocessing.interpolate(x, y, xnew=None, time_structure=None, type_interpolation='linear', fill_values=None, moving_average=False)[source]
Drop nan values and perform interpolation (of kind type_interpolation) from [x, y] to [xnew, ynew]

If xnew is None, compute xnew from time_structure

If moving_average=True, perform an "interpolated moving average": with M = int(len(xnew)/len(x)), take the mean of M evenly distributed interpolated points for each step.

Parameters:
  • x (array) – X_axis

  • y (array) – Y_axis (values)

  • xnew (array) – new X_axis

  • moving_average (bool, optional) – Perform moving average 'interpolation'.

Returns:

new interpolated Y_axis

Return type:

ynew
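The basic path through this function (drop NaN samples, then interpolate linearly onto a new axis) can be sketched without dependencies. `linear_interpolate` is a hypothetical helper, not the library routine: it assumes x is sorted in increasing order, supports only linear interpolation, and clamps queries outside the known range to the nearest known value; the moving-average variant and time_structure handling are omitted.

```python
import math
from bisect import bisect_right


def linear_interpolate(x, y, xnew):
    """Drop NaN samples, then linearly interpolate y onto xnew."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if not math.isnan(yi)]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    ynew = []
    for xq in xnew:
        if xq <= xs[0]:
            ynew.append(ys[0])       # clamp on the left
        elif xq >= xs[-1]:
            ynew.append(ys[-1])      # clamp on the right
        else:
            i = bisect_right(xs, xq)  # xs[i - 1] <= xq < xs[i]
            x0, x1 = xs[i - 1], xs[i]
            y0, y1 = ys[i - 1], ys[i]
            ynew.append(y0 + (y1 - y0) * (xq - x0) / (x1 - x0))
    return ynew


x = [0.0, 1.0, 2.0, 4.0]
y = [0.0, float("nan"), 4.0, 8.0]
ynew = linear_interpolate(x, y, [0.0, 1.0, 3.0, 5.0])
print(ynew)  # -> [0.0, 2.0, 6.0, 8.0]
```

The NaN at x = 1.0 is dropped first, so the value returned there comes from the segment between x = 0.0 and x = 2.0.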

uqmodels.preprocessing.preprocessing.kfold_random_split(K, random_state=None)[source]

Splitter that randomly assigns data to K folds
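Random assignment of samples to K folds can be sketched with the standard library. The return format of kfold_random_split is not documented here, so the helpers below (`kfold_random_assignment`, `iterate_folds`) are hypothetical and only illustrate the idea: give each sample a fold label, shuffle the labels with a seeded generator, and hold out one fold at a time.

```python
import random


def kfold_random_assignment(n_samples, K, random_state=None):
    """Return a fold label in [0, K) for every sample, in random order."""
    rng = random.Random(random_state)
    folds = [i % K for i in range(n_samples)]  # balanced labels
    rng.shuffle(folds)
    return folds


def iterate_folds(folds, K):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for k in range(K):
        train = [i for i, f in enumerate(folds) if f != k]
        test = [i for i, f in enumerate(folds) if f == k]
        yield train, test


folds = kfold_random_assignment(10, K=5, random_state=0)
for train, test in iterate_folds(folds, K=5):
    print(test, "held out,", len(train), "samples for training")
```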

uqmodels.preprocessing.preprocessing.last_reduce(set_)[source]
uqmodels.preprocessing.preprocessing.locate_near_index_values(index_scale, index_val)[source]
uqmodels.preprocessing.preprocessing.map_reduce(data, map_=<function identity>, map_paramaters={}, reduce=<function identity>)[source]
uqmodels.preprocessing.preprocessing.mean_reduce(set_)[source]
uqmodels.preprocessing.preprocessing.process_irregular_data(self, data, query, structure)[source]

Apply interpolation & statistics extraction on data using query parameters, with metadata stored in structure. ['start_date', 'end_date', 'delta'] of structure are used to specify the start, the end and the statistics_step_synthesis. ['window_size', 'begin_by_interpolation'] of query are used to specify the final step (delta*window_size) and whether there is a pre-interpolation step.

Parameters:
  • data (_type_) – _description_

  • query (_type_) – _description_

  • structure (_type_) – _description_

Returns:

_description_

Return type:

_type_

uqmodels.preprocessing.preprocessing.process_label(label_df, sources_selection, start_date, end_date, delta=1, dtype='datetime64[s]')[source]

Process an anomaly label dataframe with (start: datetime64[s], end: datetime64[s], source) into a ground truth matrix with a regular step scale of delta that starts at start_date and ends at end_date

uqmodels.preprocessing.preprocessing.process_raw_source(self, data, query, structure)[source]
uqmodels.preprocessing.preprocessing.random_split(ratio)[source]

Random splitter that assigns samples given a ratio

uqmodels.preprocessing.preprocessing.raw_analysis(raw_series, time_structure)[source]
uqmodels.preprocessing.preprocessing.remove_rows(df, date_pivot, mode='first')[source]

Remove rows smaller/greater than date_pivot, then apply add_row

Parameters:
  • df (_type_) – dataframe

  • date_pivot (_type_) – index_pivot

  • mode (str, optional) – 'first' or 'last'. Defaults to 'first'.

Returns:

dataframe with values removed and a new boundary row

Return type:

df

uqmodels.preprocessing.preprocessing.rolling_statistics(data, delta, step=None, reduc_functions=['mean'], reduc_names=['mean'], **kwargs)[source]

Compute rolling statistics from a dataframe

Parameters:
  • data (pd.DataFrame) – dataframe (times,sources)

  • delta (int or timedelta64) – size of rolling window

  • step (int) – Evaluate the window at every step result

  • reduc_functions (_type_) – str of pandas window function (fast) or custom set->stat function (slow)

  • reduc_names (_type_) – name stored in stat_dataframe

  • time_mask (_type_, optional) – time_mask. Defaults to None.

  • **kwargs – other parameters provided to DataFrame.rolling

Returns:

_description_

Return type:

_type_
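The parameters above (window size `delta`, evaluation `step`, named reducers) can be illustrated on a plain list. This is a dependency-free sketch rather than the pandas-based routine: `rolling_stats` is a hypothetical helper, `delta` is taken as a number of samples, only full windows are evaluated, and reducers are passed as a name-to-function mapping instead of the two parallel lists in the documented signature.

```python
from statistics import mean


def rolling_stats(values, delta, step=1, reducers=None):
    """Apply each reducer to trailing windows of `delta` samples."""
    reducers = reducers or {"mean": mean}
    out = {name: [] for name in reducers}
    # `end` is the exclusive end index of each full window.
    for end in range(delta, len(values) + 1, step):
        window = values[end - delta:end]
        for name, func in reducers.items():
            out[name].append(func(window))
    return out


stats = rolling_stats([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], delta=3, step=2,
                      reducers={"mean": mean, "max": max})
print(stats)  # -> {'mean': [2.0, 4.0], 'max': [3.0, 5.0]}
```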

uqmodels.preprocessing.preprocessing.select_best_representant(mat_corr, list_components)[source]
uqmodels.preprocessing.preprocessing.select_signal(data, n_cible=None)[source]
class uqmodels.preprocessing.preprocessing.splitter(X_split)[source]

Bases: object

Generic data-set provider (Iterable)

split(X)[source]
uqmodels.preprocessing.preprocessing.upscale_series(dataframe, delta, offset=None, start_date=None, end_date=None, mode='time', max_time_jump=10, replace_val=None, **kwargs)[source]

Upsample series using pandas interpolation function

Parameters:
  • dataframe (_type_) – data to resample

  • delta (_type_) – Timedelta

  • offset (str, optional) – _description_. Defaults to '-1ms'.

  • origin (str, optional) – _description_. Defaults to 'start_day'.

  • mode (str, optional) – _description_. Defaults to 'time'.

  • max_time_jump (int, optional) – _description_. Defaults to 10.

  • replace_val (_type_, optional) – _description_. Defaults to None.

Returns:

_description_

Return type:

_type_

uqmodels.preprocessing.structure module

Specification of structure object representing operation knowledge about specific data structure.

class uqmodels.preprocessing.structure.Irregular_time(name, start_date, date_init='1970-01-01 00:00:00.000000', dtype='datetime64[s]', **kwargs)[source]

Bases: Structure

get_date(step)[source]
get_step(date)[source]
class uqmodels.preprocessing.structure.Multi_source(regular_sub_structure=True, name='Multi_sources', **kwargs)[source]

Bases: Structure

get(keys, default_value=None, query={})[source]

Get the list of objects related to keys (or the object related to key if it is not a list); return the default value if a key is not found

Parameters:
  • keys (str or list of str) – key or list of keys

  • default_value (_type_, optional) – default values if key not found. Defaults to None.

Returns:

list of obj or a obj

Return type:

objs

get_structure(str_key)[source]
class uqmodels.preprocessing.structure.Regular_time(name, start_date, delta=numpy.timedelta64(1, 's'), window_size=None, date_init='1970-01-01 00:00:00.000000', dtype='datetime64[s]', **kargs)[source]

Bases: Structure

get_date(step)[source]
get_step(date)[source]
get_step_scale(start_date, end_date)[source]

Generate step_scale using specification

Returns:

Numeric time regular array

Return type:

step_scale

class uqmodels.preprocessing.structure.Structure(name, **kwargs)[source]

Bases: object

get(keys, default_value=None, **kwarg)[source]

Get the list of objects related to keys (or the object related to key if it is not a list); return the default value if a key is not found

Parameters:
  • keys (str or list of str) – key or list of keys

  • default_value (_type_, optional) – default values if key not found. Defaults to None.

Returns:

list of obj or a obj

Return type:

objs

get_structure(str_key, **kargs)[source]
set(key, obj)[source]

self[key] = obj using setattr function

Parameters:
  • key (str) – key of attribute

  • obj (obj) – attribute to store

toJSON()[source]
uqmodels.preprocessing.structure.check_date(date, dtype='datetime64[s]')[source]
uqmodels.preprocessing.structure.check_delta(delta, dtype='datetime64[s]')[source]
uqmodels.preprocessing.structure.date_to_step(date, delta=1, dtype='datetime64[s]', date_init=None)[source]
Transform date or date_array into a step using datetime64[s] format and delta + d_init information

date format: "%Y-%m-%d %H:%M:%S.%f" depending on precision. step = (date).astype(dtype).tofloat * delta + (date_init).astype(dtype).to_float

Parameters:
  • date (date or np.array(date)) – datetime64 or str_date format: "%Y-%m-%d %H:%M:%S.%f"

  • delta (int, optional) – delta between two steps. Defaults to 1.

  • dtype (str, optional) – dtype of date. Defaults to 'datetime64[s]'.

  • date_init (str, optional) – str_date of first step. Defaults to None.

Returns:

step in float representation

Return type:

step or np.array(step)

uqmodels.preprocessing.structure.get_date_mask(date, date_min, date_max, out_of_mask=True, delta=1, dtype='datetime64[s]', date_init=None)[source]
uqmodels.preprocessing.structure.get_regular_step_scale(delta, range_temp, time_offset=0, **kwarg)[source]

Generate regular step_scale with delta

Parameters:
  • delta (int) – size of unitary delta between windows

  • range_temp (int) – temporal range

  • padding (int) – Initial_state

  • mode (str) – linspace or arange

Returns:

Numeric regular time scale

Return type:

step_scale

uqmodels.preprocessing.structure.get_step_mask(step, step_min, step_max, out_of_mask=True)[source]

Compute mask of step_scale array from time boundary

Parameters:
  • time (array) – step_scale

  • x_min (float) – Minimal considered step

  • x_max (float) – Maximal considered steps

  • out_of_mask (bool, optional) – if true, include the previous and the next out-of-boundary steps.

Returns:

_description_

Return type:

_type_
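The effect of out_of_mask can be illustrated with a small boolean mask over a sorted step scale. The sketch below is an interpretation of the description, not the library code: `step_mask` is a hypothetical helper, the boundaries are assumed inclusive, and "previous and next out-of-boundary step" is read as extending the mask by one step on each side.

```python
def step_mask(step, step_min, step_max, out_of_mask=True):
    """Boolean mask of steps in [step_min, step_max], optionally widened."""
    mask = [step_min <= s <= step_max for s in step]
    if out_of_mask and any(mask):
        first = mask.index(True)
        last = len(mask) - 1 - mask[::-1].index(True)
        if first > 0:
            mask[first - 1] = True   # previous out-of-boundary step
        if last < len(mask) - 1:
            mask[last + 1] = True    # next out-of-boundary step
    return mask


steps = [0, 1, 2, 3, 4, 5]
print(step_mask(steps, 2, 3, out_of_mask=False))
# -> [False, False, True, True, False, False]
print(step_mask(steps, 2, 3, out_of_mask=True))
# -> [False, True, True, True, True, False]
```

Keeping one step on each side is useful when the selected values are later interpolated, since both boundaries then have a neighbor outside the range.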

uqmodels.preprocessing.structure.get_unit(dtype)[source]
uqmodels.preprocessing.structure.regular_date_scale(start, end=None, periods=None, delta=1, dtype='datetime64[s]')[source]

Create a regular date scale of dtype using pd.date_range, starting at the start date and ending at the end date or at start + range * freq

Parameters:
  • start (str or date) – start date

  • end (str or date or None, optional) – end date. Defaults to None : use start + range*freq

  • periods (int, optional) – number of period. Defaults to 1000.

  • delta (int or timedelta, optional) – delta of scale.

  • dtype (str, optional) – dtype. Defaults to "datetime64[s]".

Returns:

_description_

Return type:

_type_

uqmodels.preprocessing.structure.regular_representation(list_output, list_delta, delta_target, dim_t=0)[source]

Resample list of ndarray using np.repeat according to time representation parameters of each source

Parameters:
  • list_output (_type_) – list of models output for each source

  • list_step_scale (_type_) – list of times parameters for each source

Returns:

list_output with same length (using duplication)
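Resampling by duplication can be sketched as repeating each value of a coarse source until all sources share the target resolution. This is a list-based stand-in for the np.repeat approach named above: `repeat_to_target` is a hypothetical helper, each source delta is assumed to be an integer multiple of delta_target, and the dim_t argument is not modelled.

```python
def repeat_to_target(list_output, list_delta, delta_target):
    """Repeat each value delta // delta_target times, source by source."""
    resampled = []
    for output, delta in zip(list_output, list_delta):
        factor = delta // delta_target  # duplicates needed per value
        resampled.append([v for v in output for _ in range(factor)])
    return resampled


# Source A has one value every 2 steps, source B one value per step.
aligned = repeat_to_target([[1, 2], [1, 2, 3, 4]], list_delta=[2, 1], delta_target=1)
print(aligned)  # -> [[1, 1, 2, 2], [1, 2, 3, 4]]
```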

uqmodels.preprocessing.structure.step_to_date(step, delta=1, dtype='datetime64[s]', date_init=None)[source]
Transform float_step or float_step_array into a date using datetime64[s] format and delta + d_init information

date format: "%Y-%m-%d %H:%M:%S.%f" depending on precision. date = (step/delta-d_init).astype(dtype).tostr()

Parameters:
  • step (float or np.array(float)) – float representing step

  • delta (int, optional) – delta between two steps. Defaults to 1.

  • dtype (str, optional) – dtype of date. Defaults to 'datetime64[s]'.

  • date_init (str, optional) – str_date of first step. Defaults to None.

Returns:

date that can be cast as a string using date.astype(str)

Return type:

date or np.array(date)

uqmodels.preprocessing.structure.str_to_date(str, dtype='datetime64[s]')[source]
uqmodels.preprocessing.structure.time_selection(x, y, x_min, x_max, out_of_mask, mode='step')[source]
uqmodels.preprocessing.structure.window_expansion(step, n_expend=5, delta=1)[source]

Module contents