Base modules

BaseDataset

class ice.base.BaseDataset(num_chunks=None, force_download=False)

Bases: ABC

Base class for datasets.

Parameters:
  • num_chunks (int) – If given, download only num_chunks chunks of data. Used for testing purposes.

  • force_download (bool) – If True, download the dataset even if it exists.

This method must be implemented by all subclasses. It sets the dataset name and public link.
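As a rough illustration of the subclassing contract described above, the following sketch uses a stand-in base class rather than ice.base.BaseDataset itself. The method name set_config, the attributes name and public_link, and the URL are assumptions made for illustration; the text above only states that subclasses must set a name and a public link.

```python
from abc import ABC, abstractmethod


class SketchBaseDataset(ABC):
    """Stand-in for a dataset base class; not the real ice.base.BaseDataset."""

    def __init__(self, num_chunks=None, force_download=False):
        self.num_chunks = num_chunks
        self.force_download = force_download
        self.name = None
        self.public_link = None
        # Let the subclass fill in its name and public link.
        self.set_config()

    @abstractmethod
    def set_config(self):
        """Hypothetical hook: set self.name and self.public_link."""


class ToyDataset(SketchBaseDataset):
    def set_config(self):
        self.name = "toy_dataset"  # hypothetical dataset name
        self.public_link = "https://example.com/toy_dataset"  # placeholder URL


dataset = ToyDataset(num_chunks=1)
```

Because the hook is abstract, instantiating SketchBaseDataset directly raises TypeError, which is how the ABC mechanism enforces that every child provides it.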

BaseModel

class ice.base.BaseModel(window_size: int, stride: int, batch_size: int, lr: float, num_epochs: int, device: str, verbose: bool, name: str, random_seed: int, val_ratio: float, save_checkpoints: bool)

Bases: ABC

Base class for all models.

Parameters:
  • window_size (int) – The window size to train the model.

  • stride (int) – The time interval between first points of consecutive sliding windows in training.

  • batch_size (int) – The batch size to train the model.

  • lr (float) – The learning rate to train the model.

  • num_epochs (int) – The number of epochs to train the model.

  • device (str) – The name of the device on which to train the model. Possible values are cpu and cuda.

  • verbose (bool) – If true, show the progress bar in training.

  • name (str) – The name of the model for artifact storing.

  • random_seed (int) – Seed for random number generation to ensure reproducible results.

  • val_ratio (float) – Proportion of the dataset used for validation, between 0 and 1.

  • save_checkpoints (bool) – If true, store checkpoints.
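To make window_size, stride and val_ratio concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the helper names are hypothetical, and holding out the last windows for validation is an assumption (the library may instead split at random using random_seed).

```python
def window_starts(series_length, window_size, stride):
    """Return the first index of every full sliding window."""
    if window_size > series_length:
        return []
    return list(range(0, series_length - window_size + 1, stride))


def split_train_val(items, val_ratio):
    """Hold out the last val_ratio share of items for validation (assumed rule)."""
    n_val = int(len(items) * val_ratio)
    split = len(items) - n_val
    return items[:split], items[split:]


# A series of 10 points with window_size=4 and stride=2 yields windows
# starting at 0, 2, 4 and 6; consecutive windows overlap by 2 points.
starts = window_starts(series_length=10, window_size=4, stride=2)
train_starts, val_starts = split_train_val(starts, val_ratio=0.25)
```

A larger stride produces fewer, less overlapping windows, which shortens each epoch at the cost of fewer training examples.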

evaluate(df: DataFrame, target: Series) → dict

Evaluate the metrics: accuracy.

Parameters:
  • df (pandas.DataFrame) – A dataframe with sensor data. The index has two levels: run_id and sample. All other columns contain sensor values.

  • target (pandas.Series) – A series with target values. The index has two levels: run_id and sample.

Returns:

A dictionary with metrics where keys are names of metrics and values are values of metrics.

Return type:

dict
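The shape of the returned dictionary can be sketched as follows. This is an illustrative stand-in that works on plain lists of labels; the function name evaluate_accuracy is hypothetical, and the real method derives its predictions from the dataframe rather than taking them as an argument.

```python
def evaluate_accuracy(predictions, targets):
    """Return a metrics dictionary keyed by metric name."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("targets must not be empty")
    # Accuracy is the share of predictions that match the target labels.
    correct = sum(p == t for p, t in zip(predictions, targets))
    return {"accuracy": correct / len(targets)}


# Three of the four predictions match the targets.
metrics = evaluate_accuracy([0, 1, 1, 2], [0, 1, 2, 2])
```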

fit(df: DataFrame, target: Optional[Series] = None, epochs: Optional[int] = None, save_path: Optional[str] = None, trial: Optional[Trial] = None, force_model_ctreation: bool = False)

Fit (train) the model by a given dataset.

Parameters:
  • df (pandas.DataFrame) – A dataframe with sensor data. The index has two levels: run_id and sample. All other columns contain sensor values.

  • target (pandas.Series) – A series with target values. The index has two levels: run_id and sample. Omit it for anomaly detection tasks.

  • epochs (int) – The number of epochs for training step. If None, self.num_epochs parameter is used.

  • save_path (str) – Path to save checkpoints. If None, the path is created automatically.

  • trial (optuna.Trial, None) – optuna.Trial object created by optimize method.

  • force_model_ctreation (bool) – If True, force fit to create the model for an optimization study.
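The two-level index (run_id, sample) suggests that each run is a separate time series. The sketch below mimics that layout with a plain dictionary keyed by (run_id, sample) tuples and builds windows that never cross a run boundary. Both the helper name windows_per_run and the assumption that windows stay within a single run are illustrative, not taken from the library.

```python
from collections import defaultdict

# Rows keyed by (run_id, sample), mirroring the two-level index;
# each value is a list of sensor readings for that time step.
rows = {
    (1, 1): [0.1, 0.5],
    (1, 2): [0.2, 0.4],
    (1, 3): [0.3, 0.3],
    (2, 1): [0.9, 0.1],
    (2, 2): [0.8, 0.2],
}


def windows_per_run(rows, window_size, stride):
    """Group samples by run_id and slide a window inside each run."""
    runs = defaultdict(list)
    for run_id, sample in sorted(rows):
        runs[run_id].append(sample)
    windows = []
    for run_id, samples in runs.items():
        for start in range(0, len(samples) - window_size + 1, stride):
            windows.append((run_id, samples[start:start + window_size]))
    return windows


windows = windows_per_run(rows, window_size=2, stride=1)
```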

classmethod from_config(cfg: Config)

Create an instance of the model class with parameters from a config.

Parameters:

cfg (Config) – A config with model’s parameters.

Returns:

Instance of BaseModel child class initialized with parameters from config.

Return type:

BaseModel
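The classmethod pattern behind from_config can be sketched as below. A plain dictionary stands in for the Config object, whose real structure is not described here, and SketchModel and ChildModel are hypothetical classes. The point is only that calling the classmethod on a child class returns an instance of that child.

```python
class SketchModel:
    """Stand-in model with a few of the documented constructor parameters."""

    def __init__(self, window_size, stride, batch_size, lr):
        self.window_size = window_size
        self.stride = stride
        self.batch_size = batch_size
        self.lr = lr

    @classmethod
    def from_config(cls, cfg):
        # cls is the class the method was called on, so children
        # get instances of their own type.
        return cls(**cfg)


class ChildModel(SketchModel):
    pass


# Assumed config layout: a flat mapping of parameter names to values.
cfg = {"window_size": 32, "stride": 1, "batch_size": 128, "lr": 0.001}
model = ChildModel.from_config(cfg)
```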

load_checkpoint(checkpoint_path: str)

Load checkpoint.

Parameters:

checkpoint_path (str) – Path to load checkpoint.

model_param_estimation()

Calculate the number of parameters in self.model and the mean and standard deviation of its inference time.

Returns:

A tuple containing the number of parameters in the model and the mean and standard deviation of model inference time.

Return type:

tuple
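A minimal sketch of this kind of estimate, using a toy model made of nested lists instead of self.model: it counts scalar weights and times repeated forward passes. The helper names, the toy model and the default of 20 timing runs are all assumptions for illustration.

```python
import statistics
import time


def count_parameters(layers):
    """Count scalar weights in a list of 2-D weight matrices."""
    return sum(len(row) for matrix in layers for row in matrix)


def forward(layers, x):
    """Apply each weight matrix in turn as a matrix-vector product."""
    for matrix in layers:
        x = [sum(w * v for w, v in zip(row, x)) for row in matrix]
    return x


def estimate(layers, sample, n_runs=20):
    """Return (number of parameters, mean inference time, std of inference time)."""
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        forward(layers, sample)
        timings.append(time.perf_counter() - start)
    return count_parameters(layers), statistics.mean(timings), statistics.stdev(timings)


layers = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],  # 2x3 matrix: 6 weights
    [[1.0, -1.0]],  # 1x2 matrix: 2 weights
]
num_params, mean_time, std_time = estimate(layers, [1.0, 2.0, 3.0])
```

Timings are in seconds and vary between machines and runs, so only the parameter count is deterministic.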

optimize(df: DataFrame, target: Optional[Series] = None, optimize_parameter: str = 'batch_size', optimize_range: tuple = (128, 256), direction: str = 'minimize', n_trials: int = 5, epochs: Optional[int] = None, optimize_metric: Optional[str] = None)

Run an optuna study to find the best hyperparameter value on the validation dataset.

Parameters:
  • df (pd.DataFrame) – DataFrame passed to the fit method.

  • optimize_parameter (str, optional) – Model parameter to optimize. Defaults to ‘batch_size’.

  • optimize_range (tuple, optional) – Model parameter range for optuna trials. Defaults to (128, 256).

  • n_trials (int, optional) – Number of trials. Defaults to 5.

  • target (pd.Series, optional) – Target series passed to the fit method. Defaults to None.

  • epochs (int, optional) – Number of epochs passed to the fit method. Defaults to None.

  • optimize_metric (str, optional) – Metric on the validation dataset to use as the target for hyperparameter optimization. Defaults to None.

  • direction (str, optional) – Whether to “minimize” or “maximize” the target for hyperparameter optimization. Defaults to ‘minimize’.
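How optimize_range, direction and n_trials interact can be sketched with a plain random search. This is deliberately not Optuna: it is a standard-library stand-in for the study loop, the objective toy_validation_loss is made up, and treating optimize_range as an inclusive integer range is an assumption.

```python
import random


def toy_validation_loss(batch_size):
    """Hypothetical objective: pretend the loss is lowest near batch_size 192."""
    return abs(batch_size - 192) / 192


def search(objective, optimize_range=(128, 256), direction="minimize", n_trials=5, seed=0):
    """Sample n_trials integer values and return the best (value, score) pair."""
    if direction not in ("minimize", "maximize"):
        raise ValueError("direction must be 'minimize' or 'maximize'")
    rng = random.Random(seed)
    low, high = optimize_range
    pick = min if direction == "minimize" else max
    trials = []
    for _ in range(n_trials):
        value = rng.randint(low, high)  # inclusive on both ends
        trials.append((objective(value), value))
    best_score, best_value = pick(trials)
    return best_value, best_score


best_batch_size, best_loss = search(toy_validation_loss)
worst_batch_size, worst_loss = search(toy_validation_loss, direction="maximize")
```

With the same seed both calls sample the same candidate values, so the two directions simply pick opposite ends of the same set of trials.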

predict(sample: Tensor) → Tensor

Make a prediction for a given batch of samples.

Parameters:

sample (torch.Tensor) – A tensor of the shape (B, L, C) where B is the batch size, L is the sequence length, C is the number of sensors.

Returns:

A tensor with predictions of the shape (B,).

Return type:

torch.Tensor
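The shape contract, (B, L, C) in and (B,) out, can be illustrated with nested lists in place of tensors. The decision rule in toy_predict (pick the sensor with the largest mean over the window) is invented purely to produce one value per sample and has nothing to do with how a real model predicts.

```python
def shape_of(batch):
    """Return (B, L, C) for a batch stored as nested lists."""
    return (len(batch), len(batch[0]), len(batch[0][0]))


def toy_predict(batch):
    """Return one label per sample: the index of the sensor with the largest mean."""
    predictions = []
    for window in batch:  # each window has shape (L, C)
        num_sensors = len(window[0])
        means = [
            sum(step[c] for step in window) / len(window)
            for c in range(num_sensors)
        ]
        predictions.append(means.index(max(means)))
    return predictions


# B = 2 samples, L = 3 time steps, C = 2 sensors.
batch = [
    [[0.0, 1.0], [0.0, 3.0], [0.0, 2.0]],
    [[5.0, 1.0], [4.0, 1.0], [6.0, 1.0]],
]
predictions = toy_predict(batch)
```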

save_checkpoint(save_path: Optional[str] = None)

Save checkpoint.

Parameters:

save_path (str) – Path to save checkpoint.
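A save and load round trip might look like the sketch below. It uses pickle and a stand-in class, so the file format, the default path rule and the returned path are assumptions; the real methods may store different state in a different format.

```python
import os
import pickle
import tempfile


class SketchCheckpointModel:
    """Stand-in model whose whole state is a list of weights."""

    def __init__(self, name):
        self.name = name
        self.weights = [0.0, 0.0]

    def save_checkpoint(self, save_path=None):
        # Assumed fallback: derive a path from the model name.
        if save_path is None:
            save_path = os.path.join(tempfile.gettempdir(), f"{self.name}_checkpoint.pkl")
        with open(save_path, "wb") as f:
            pickle.dump({"weights": self.weights}, f)
        return save_path

    def load_checkpoint(self, checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            state = pickle.load(f)
        self.weights = state["weights"]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pkl")
    trained = SketchCheckpointModel("toy")
    trained.weights = [0.5, -1.5]
    trained.save_checkpoint(path)

    # A fresh instance recovers the saved weights from the file.
    restored = SketchCheckpointModel("toy")
    restored.load_checkpoint(path)
```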