Base modules

BaseDataset

class ice.base.BaseDataset(num_chunks=None, force_download=False)

Bases: ABC

Base class for datasets.

Parameters:

num_chunks (int) – If given, download only num_chunks chunks of data. Used for testing purposes.
force_download (bool) – If True, download the dataset even if it exists.

set_name_public_link()

Set name and public link. This method has to be implemented by all children.
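The abstract-method contract described above can be sketched with the standard `abc` module. This is a stand-in that mirrors the pattern, not the ice source: the class names `SketchBaseDataset` and `ToyDataset`, the attribute names `name` and `public_link`, and the placeholder URL are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class SketchBaseDataset(ABC):
    """Hypothetical stand-in for a dataset base class; not the ice source."""

    def __init__(self, num_chunks=None, force_download=False):
        self.num_chunks = num_chunks
        self.force_download = force_download
        # Attribute names below are assumptions for this sketch.
        self.name = None
        self.public_link = None
        # Each child fills in its own name and link.
        self.set_name_public_link()

    @abstractmethod
    def set_name_public_link(self):
        """Set self.name and self.public_link."""


class ToyDataset(SketchBaseDataset):
    def set_name_public_link(self):
        self.name = "toy"  # hypothetical dataset name
        self.public_link = "https://example.com/toy"  # placeholder URL


dataset = ToyDataset(num_chunks=1)
print(dataset.name, dataset.public_link)
```

Because `set_name_public_link` is abstract, instantiating the base class directly raises `TypeError`, which is what forces every child to implement it.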

BaseModel

class ice.base.BaseModel(window_size: int, stride: int, batch_size: int, lr: float, num_epochs: int, device: str, verbose: bool, name: str, random_seed: int, val_ratio: float, save_checkpoints: bool)

Bases: ABC

Base class for all models.

Parameters:

window_size (int) – The window size to train the model.
stride (int) – The time interval between first points of consecutive sliding windows in training.
batch_size (int) – The batch size to train the model.
lr (float) – The learning rate to train the model.
num_epochs (int) – The number of epochs to train the model.
device (str) – The name of a device to train the model. cpu and cuda are possible.
verbose (bool) – If true, show the progress bar in training.
name (str) – The name of the model for artifact storing.
random_seed (int) – Seed for random number generation to ensure reproducible results.
val_ratio (float) – Proportion of the dataset used for validation, between 0 and 1.
save_checkpoints (bool) – If true, store checkpoints.
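To make the relationship between window_size and stride concrete, the sketch below lists the first index of each sliding window over a single run. It assumes that only full windows are kept and that windows start at index 0; the helper name `window_starts` is hypothetical, and the library may handle run boundaries differently.

```python
def window_starts(series_length, window_size, stride):
    """Return the first index of every full window in one run."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    # The last window must still fit entirely inside the series.
    last_start = series_length - window_size
    return list(range(0, last_start + 1, stride))


# A run of 10 samples, windows of 4 samples, starts 3 samples apart.
starts = window_starts(series_length=10, window_size=4, stride=3)
print(starts)  # [0, 3, 6]
```

With stride equal to 1 every possible window is used; a larger stride reduces the overlap between consecutive windows and therefore the number of training samples.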

evaluate(df: DataFrame, target: Series) → dict

Evaluate the metrics: accuracy.

Parameters:

df (pandas.DataFrame) – A dataframe with sensor data. Index has two columns: run_id and sample. All other columns are sensor values.
target (pandas.Series) – A series with target values. Index has two columns: run_id and sample.

Returns:

A dictionary with metrics where keys are names of metrics and values are values of metrics.

Return type:

dict
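The shape of the returned dictionary can be illustrated with a small accuracy computation. This is a minimal sketch, not the library's evaluation code: the helper name `accuracy_metrics` and the exact key `"accuracy"` are assumptions based on the description above.

```python
def accuracy_metrics(predictions, targets):
    """Return a metrics dictionary mapping metric names to metric values."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    # Count the positions where the predicted label equals the true label.
    correct = sum(p == t for p, t in zip(predictions, targets))
    return {"accuracy": correct / len(targets)}


metrics = accuracy_metrics([0, 1, 1, 0], [0, 1, 0, 0])
print(metrics)  # {'accuracy': 0.75}
```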

fit(df: DataFrame, target: Optional[Series] = None, epochs: Optional[int] = None, save_path: Optional[str] = None, trial: Optional[Trial] = None, force_model_ctreation: bool = False)

Fit (train) the model by a given dataset.

Parameters:

df (pandas.DataFrame) – A dataframe with sensor data. Index has two columns: run_id and sample. All other columns are sensor values.
target (pandas.Series) – A series with target values. Index has two columns: run_id and sample. Omit it for anomaly detection tasks.
epochs (int) – The number of epochs for training step. If None, self.num_epochs parameter is used.
save_path (str) – Path to save checkpoints. If None, the path is created automatically.
trial (optuna.Trial, None) – optuna.Trial object created by optimize method.
force_model_ctreation (bool) – If True, force fit to create the model for an optimization study.
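The input layout described for df and target can be built as follows. This sketch assumes the two-column index is a pandas MultiIndex with levels named run_id and sample; the sensor column names and all values are made up for illustration.

```python
import pandas as pd

# Two runs with three samples each; level names follow the parameter description.
index = pd.MultiIndex.from_product(
    [[1, 2], [0, 1, 2]], names=["run_id", "sample"]
)

# Every remaining column holds the readings of one sensor (hypothetical names).
df = pd.DataFrame(
    {
        "sensor_1": [0.1, 0.2, 0.3, 1.1, 1.2, 1.3],
        "sensor_2": [5.0, 5.1, 5.2, 7.0, 7.1, 7.2],
    },
    index=index,
)

# One target value per (run_id, sample) pair, sharing the same index.
target = pd.Series([0, 0, 0, 1, 1, 1], index=index, name="target")

print(df)
print(target)
```

Going by the signature above, a concrete model would then receive these objects as `fit(df, target)`, or as `fit(df)` alone for anomaly detection.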

classmethod from_config(cfg: Config)

Create instance of the model class with parameters from config.

Parameters:

cfg (Config) – A config with model’s parameters.

Returns:

Instance of BaseModel child class initialized with parameters from config.

Return type:

BaseModel

load_checkpoint(checkpoint_path: str)

Load checkpoint.

Parameters:

checkpoint_path (str) – Path to load checkpoint.

model_param_estimation()

Calculate the number of self.model parameters and the mean and standard deviation of the inference time.

Returns:

A tuple containing the number of parameters in the model and the mean and standard deviation of model inference time.

Return type:

tuple
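The two quantities named above can be estimated along these lines. This is a sketch under stated assumptions, not the library's implementation: the helpers `count_parameters` and `time_inference` are hypothetical, the model is represented only by the shapes of its weight arrays, and the number of timing runs is an arbitrary choice.

```python
import math
import statistics
import time


def count_parameters(shapes):
    """Sum the element counts of all weight arrays, given their shapes."""
    return sum(math.prod(shape) for shape in shapes)


def time_inference(predict_fn, sample, n_runs=10):
    """Return (mean, std) of the wall-clock time of predict_fn(sample), in seconds."""
    if n_runs < 2:
        raise ValueError("n_runs must be at least 2 to compute a standard deviation")
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        predict_fn(sample)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


# Hypothetical model: one 5x16 weight matrix plus a 16-element bias.
num_params = count_parameters([(5, 16), (16,)])
# The built-in sum stands in for a model's forward pass.
mean_s, std_s = time_inference(sum, list(range(1000)))
print(num_params, mean_s, std_s)
```

Timings of this kind depend on the machine and its current load, so the mean and standard deviation are best compared only between runs on the same hardware.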

optimize(df: DataFrame, target: Optional[Series] = None, optimize_parameter: str = 'batch_size', optimize_range: tuple = (128, 256), direction: str = 'minimize', n_trials: int = 5, epochs: Optional[int] = None, optimize_metric: Optional[str] = None)

Run an optuna study and return the best hyperparameter value on the validation dataset.

Parameters:

df (pd.DataFrame) – DataFrame passed to the fit method.
optimize_parameter (str, optional) – Model parameter to optimize. Defaults to ‘batch_size’.
optimize_range (tuple, optional) – Range of the model parameter for optuna trials. Defaults to (128, 256).
n_trials (int, optional) – Number of trials. Defaults to 5.
target (pd.Series, optional) – Target series passed to the fit method. Defaults to None.
epochs (int, optional) – Number of epochs passed to the fit method. Defaults to None.
optimize_metric (str, optional) – Metric on the validation dataset to use as the target for hyperparameter optimization. Defaults to None.
direction (str, optional) – Whether to “minimize” or “maximize” the target for hyperparameter optimization. Defaults to “minimize”.
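How optimize_range, direction, and n_trials interact can be shown with a plain random search. This is a deliberately simplified stand-in, not optuna and not the library's code: it samples integers uniformly, whereas an optuna study uses its own sampler. The helper `random_search`, the `seed` argument, and the toy objective are assumptions for illustration.

```python
import random


def random_search(objective, optimize_range=(128, 256), direction="minimize",
                  n_trials=5, seed=0):
    """Try n_trials integer values from optimize_range and return the best one."""
    if direction not in ("minimize", "maximize"):
        raise ValueError("direction must be 'minimize' or 'maximize'")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    low, high = optimize_range
    trials = []
    for _ in range(n_trials):
        value = rng.randint(low, high)  # both bounds are inclusive
        trials.append((objective(value), value))
    pick = min if direction == "minimize" else max
    best_score, best_value = pick(trials, key=lambda trial: trial[0])
    return best_value, best_score


# Toy objective standing in for a validation metric: distance from 200.
best_batch_size, best_loss = random_search(lambda b: abs(b - 200))
print(best_batch_size, best_loss)
```

In the real method the objective would presumably be the value of optimize_metric on the validation split after calling fit with the sampled parameter.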

predict(sample: Tensor) → Tensor

Make a prediction for a given batch of samples.

Parameters:

sample (torch.Tensor) – A tensor of the shape (B, L, C) where B is the batch size, L is the sequence length, C is the number of sensors.

Returns:

A tensor with predictions of the shape (B,).

Return type:

torch.Tensor
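The shape contract of predict, a (B, L, C) batch in and a (B,) vector out, is illustrated below. NumPy arrays stand in for torch.Tensor so the sketch needs no deep learning dependency, and `dummy_predict` with its threshold rule is a made-up placeholder rather than any real model.

```python
import numpy as np


def dummy_predict(sample):
    """Map a (B, L, C) batch to (B,) integer labels with a toy threshold rule."""
    if sample.ndim != 3:
        raise ValueError("expected an array of shape (B, L, C)")
    # Average over the time axis (L) and the sensor axis (C): one score per window.
    scores = sample.mean(axis=(1, 2))
    return (scores > 0).astype(np.int64)


# B=8 windows, L=32 time steps, C=5 sensors.
batch = np.zeros((8, 32, 5))
batch[:4] += 1.0  # make the first four windows positive
predictions = dummy_predict(batch)
print(predictions.shape)  # (8,)
```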

save_checkpoint(save_path: Optional[str] = None)

Save checkpoint.

Parameters:

save_path (str) – Path to save checkpoint.