Logistic Regression

class heaan_sdk.ml.models.linear_model.logistic_regression.LogisticRegression(context: Context, unit_shape: Tuple[int, int], num_feature: int, classes: List[int], initializer: str = 'kaiming_he', path: str | Path | None = None)
__init__(context: Context, unit_shape: Tuple[int, int], num_feature: int, classes: List[int], initializer: str = 'kaiming_he', path: str | Path | None = None) None
Parameters:
  • context (heaan_sdk.Context) – Context of the model.

  • unit_shape (Tuple[int, int]) – Unit shape of single block for encoding parameter matrix.

  • num_feature (int) – Number of features of the model.

  • classes (List[int]) – List of class labels. From 0 to (number of classes - 1).

  • initializer (str) – initializer for the model parameter. Defaults to “kaiming_he”.

  • path (Optional[Path], optional) – path of the model. Defaults to None.

evaluate(mat: HEMatrix, param: HESubMatrix | None = None, last_activation: bool = False, training: bool = False) HEMatrix

Evaluate (infer) class labels for samples in the matrix.

Parameters:
  • mat (HEMatrix) – Input data on which to evaluate the model.

  • param (Optional[HESubMatrix], optional) – Model parameter to use. If SGD with nesterov acceleration is used for training, then this is the auxiliary parameter. Otherwise, it is just the model parameter theta. Defaults to None, which then uses theta for evaluation.

  • last_activation (bool, optional) – Whether to apply softmax as the final step. Defaults to False.

  • training (bool, optional) – Whether evaluate is used for training or only for inference. For training this would check and bootstrap the model’s parameter if needed. Defaults to False.
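As a plaintext reference for what last_activation=True is described as doing, the sketch below applies a numerically stable softmax to a small array of logits and then picks the most likely class. It uses plain NumPy only and does not call heaan_sdk; the homomorphic version presumably relies on an approximation, so treat this as an illustration of the target computation rather than the library's implementation.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax on a (num_data, num_classes) array of logits."""
    # Subtract the row maximum first so np.exp cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical logits for two samples and three classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)        # each row sums to 1
labels = probs.argmax(axis=1)  # index of the most likely class per sample
```

Because softmax is monotonic, taking argmax over the raw logits gives the same labels; the activation only matters when calibrated probabilities are needed.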

fit(data_set: DataSet, num_epoch: int = 10, lr: float = 0.01, batch_size: int = 128, optimizer: str = 'sgd', lr_scheduler: str = 'constant', nesterov: bool = True, activation: str = 'auto', regularizer: str = 'none', regularize_coeff: float = 1.0, **kwargs) None

Fit the model according to the given data_set.

Parameters:
  • data_set (DataSet) – data_set to train the model.

  • num_epoch (int) – number of epochs. Defaults to 10.

  • lr (float) – learning rate. Defaults to 0.01.

  • batch_size (int) – size of each minibatch. Must be a power of 2. Defaults to 128.

  • optimizer (str, {'sgd', 'adam', 'adagrad', 'rmsprop', ...}) – optimizer of the model. Defaults to ‘sgd’.

  • nesterov (bool) – whether to apply Nesterov acceleration. Defaults to True.

  • multi_class (str, {'auto', 'multinomial', 'ovr'}) – If the option chosen is ‘ovr’, then a binary problem is fit for each label. For ‘multinomial’ the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. ‘auto’ selects ‘ovr’ if the data is binary, and otherwise selects ‘multinomial’.

  • activation (str, {'auto', 'sigmoid', 'sigmoid_wide', 'softmax', 'softmax_wide'}) – activation function of the model. If the option is ‘auto’, the activation function is selected according to the number of classes: ‘sigmoid’ for binary classification and ‘softmax’ for multi-class classification. If the input features are not properly scaled, ‘sigmoid_wide’ or ‘softmax_wide’ is recommended instead of ‘sigmoid’ or ‘softmax’. Defaults to ‘auto’.

  • regularizer (str) – Regularization option. Currently, the valid options are:
      - “none”: no regularization
      - “l2”: L2 regularization (ridge regression, weight decay)
      - “l1”: L1 regularization (Lasso regression)
      - “elasticnet”: combination of L1 and L2 regularization
    Defaults to “none”.

  • regularize_coeff (float) – Regularization coefficient. Defaults to 1.0.
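Two of the rules above are easy to check before calling fit: batch_size must be a power of 2, and activation='auto' resolves by the number of classes. The following is a minimal pure-Python sketch of both rules; is_power_of_two and resolve_activation are hypothetical helper names introduced for illustration and are not part of heaan_sdk.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and (n & (n - 1)) == 0

def resolve_activation(activation: str, num_classes: int) -> str:
    """Mirror the documented 'auto' rule: sigmoid for binary, softmax otherwise."""
    if activation != "auto":
        return activation  # an explicit choice is kept unchanged
    return "sigmoid" if num_classes == 2 else "softmax"

batch_ok = is_power_of_two(128)            # the documented default
batch_bad = is_power_of_two(100)           # not a power of two
binary_act = resolve_activation("auto", 2)
multi_act = resolve_activation("auto", 5)
```

Running a check like this on the client side can catch an invalid batch_size before any encrypted computation starts.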

static from_ndarray(context: Context, array: ndarray, unit_shape: Tuple[int, int], path: Path | None = None) LogisticRegression

Load a model from an ndarray of weights and return it.

Parameters:
  • context (Context) – Context of the model.

  • array (np.ndarray) – ndarray of weights of shape (num_classes, num_features + 1), where the rightmost column is a bias column.

  • unit_shape (Tuple[int, int]) – Unit shape for packing.

  • path (Optional[Path], optional) – Path of the model. Defaults to None.

Returns:

Multinomial logistic regression model with loaded weights.

Return type:

LogisticRegression
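The expected layout of array, shape (num_classes, num_features + 1) with the bias in the rightmost column, can be assembled from separate coefficient and intercept arrays. The sketch below uses plain NumPy with made-up values; coef and intercept are hypothetical names chosen here (they correspond in shape to scikit-learn's coef_ and intercept_ attributes for a multi-class model), and the final from_ndarray call is omitted because it needs a real Context.

```python
import numpy as np

# Hypothetical plaintext parameters: 3 classes, 4 features.
coef = np.array([[0.2, -0.1, 0.4, 0.0],
                 [0.1, 0.3, -0.2, 0.5],
                 [-0.3, 0.0, 0.1, 0.2]])   # shape (num_classes, num_features)
intercept = np.array([0.5, -1.0, 0.25])    # shape (num_classes,)

# Append the intercept as the rightmost column to get the documented layout.
weights = np.hstack([coef, intercept[:, np.newaxis]])
# weights now has shape (num_classes, num_features + 1) == (3, 5)
```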

static from_sklearn(context: Context, sklearn_model: LogisticRegression, unit_shape: Tuple[int, int], path: Path | None = None) LogisticRegression

Transform (trained) sklearn logistic regression model into HE model.

Parameters:
  • context (Context) – Context of the model.

  • sklearn_model (SKLogisticRegression) – Scikit-learn logistic regression model.

  • unit_shape (Tuple[int, int]) – Unit shape for packing.

  • path (Optional[Path], optional) – Path of the model. Defaults to None.

Returns:

Logistic regression model with loaded weights.

Return type:

LogisticRegression

predict(mat: HEMatrix) HEMatrix

Model inference. Returns an HEMatrix that contains logits and has shape (batch_size, num_classes).

Parameters:

mat (HEMatrix) – Input matrix of shape (num_data, num_features), without bias column.

Returns:

Output HEMatrix. Dot-product of the input matrix and the model parameter.

Return type:

HEMatrix
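For comparison against decrypted results, the logits can be computed in plaintext. The sketch below assumes the weight layout described for from_ndarray (one row per class, bias in the rightmost column); that the encrypted model stores its parameter in this same layout is an assumption, and plain_logits is a hypothetical helper, not a heaan_sdk function.

```python
import numpy as np

def plain_logits(X: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Plaintext logits for X of shape (num_data, num_features).

    weights has shape (num_classes, num_features + 1), with the bias in
    the rightmost column (assumed layout, see the lead-in).
    """
    coef, bias = weights[:, :-1], weights[:, -1]
    # (num_data, num_features) @ (num_features, num_classes) + (num_classes,)
    return X @ coef.T + bias

weights = np.array([[1.0, -1.0, 0.5],
                    [2.0, 0.0, -1.0]])   # 2 classes, 2 features + bias
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [0.0, 0.0]])               # 3 samples, no bias column
logits = plain_logits(X, weights)        # shape (3, 2)
```

Encrypted arithmetic is approximate, so a comparison with decrypted output should use a tolerance (for example np.allclose) rather than exact equality.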

to_dataframe() DataFrame

Transform HE logistic regression model into pandas DataFrame. Model needs to be decrypted before transformation.

Returns:

Pandas DataFrame model.

Return type:

pd.DataFrame

to_sklearn() LogisticRegression

Transform HE logistic regression model into sklearn model. Model needs to be decrypted before transformation.

Returns:

Scikit-learn logistic regression model with loaded weights.

Return type:

SKLogisticRegression

Copyright (C) 2023, CryptoLab Inc. All rights reserved.

This software and/or source code may be commercially used and/or disseminated only with the written permission of CryptoLab Inc, or in accordance with the terms and conditions stipulated in the agreement/contract under which the software and/or source code has been supplied by CryptoLab Inc. Any unauthorized commercial use and/or dissemination of this file is strictly prohibited and will constitute an infringement of copyright.

heaan_sdk.ml.preprocessing.encode_train_data(context: Context, X: ndarray, y: ndarray, unit_shape: Tuple[int, int], dtype: str = 'classification', scale_type: str | Dict[str | int, str] = 'none', path: str | Path | None = None) DataSet

Return a dataset from ndarray.

Parameters:
  • context (Context) – Context.

  • X (np.ndarray) – The input features, 2d array.

  • y (np.ndarray) – The input labels, 1d array. When there are c classes, the array should contain every integer from 0 to c-1 and no other values; the trained model parameters will be indexed in the same order.

  • unit_shape (Tuple[int, int]) – Unit encoding shape of matrix.

  • dtype (str, {'regression', 'classification'}) – type of target. Defaults to ‘classification’.

  • scale_type (Union[str, Dict[Union[str, int], str]], optional) – Type of the scaler. Defaults to ‘none’.

  • path (Optional[Path], optional) – path of the dataset.
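Raw labels often do not satisfy the requirement on y (every integer from 0 to c-1 and nothing else). The sketch below shows one way to check and remap labels with NumPy before encoding; labels_are_contiguous and remap_labels are hypothetical helpers written for this example and are not provided by heaan_sdk.

```python
import numpy as np

def labels_are_contiguous(y: np.ndarray) -> bool:
    """True if the distinct values of y are exactly 0, 1, ..., c-1."""
    uniq = np.unique(y)
    return np.array_equal(uniq, np.arange(len(uniq)))

def remap_labels(y: np.ndarray):
    """Map arbitrary labels to 0..c-1, keeping the sorted original labels."""
    classes, y_indexed = np.unique(y, return_inverse=True)
    return y_indexed, classes

y_raw = np.array([10, 30, 10, 20])   # hypothetical raw labels
y, classes = remap_labels(y_raw)     # y holds indices into classes
```

Keeping classes around lets predicted indices be translated back to the original labels with classes[predicted_index].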

class heaan_sdk.matrix.matrix.HEMatrix(context: Context, **kwargs)
__init__(context: Context, **kwargs)
decode() ndarray

Decode a HEMatrix as a numpy array.

Returns:

Decoded numpy array.

Return type:

np.ndarray

decrypt_decode() ndarray

Decrypt and decode a HEMatrix as a numpy array.

Returns:

Decrypted and decoded numpy array.

Return type:

np.ndarray

static encode(context: Context, array: ndarray, unit_shape: Tuple[int, int]) HEMatrix

Encode a numpy array as a HEMatrix. Array should be 1-dim or 2-dim. If array is 1-dimensional, it will be repeated along axis = 0 and encoded. If array is 2-dimensional, its number of rows should be smaller than or equal to unit_shape[0].

Parameters:
  • context (Context) – Context.

  • array (np.ndarray) – Input array.

  • unit_shape (Tuple[int, int]) – Unit encoding shape for matrix.

Returns:

Encoded HEMatrix.

Return type:

HEMatrix
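The input rules for encode can be mirrored in plaintext to validate an array before encoding. The sketch below is an illustration only: plain_layout is a hypothetical helper, and the number of repetitions for a 1-dimensional input (taken here as unit_shape[0]) is an assumption, since the description only states that the array is repeated along axis 0.

```python
import numpy as np

def plain_layout(array, unit_shape):
    """Plaintext view of the documented input rules for encode."""
    array = np.asarray(array)
    if array.ndim == 1:
        # ASSUMPTION: the row is repeated unit_shape[0] times along axis 0.
        return np.tile(array, (unit_shape[0], 1))
    if array.ndim == 2:
        if array.shape[0] > unit_shape[0]:
            raise ValueError("number of rows must be <= unit_shape[0]")
        return array
    raise ValueError("array must be 1-dimensional or 2-dimensional")

row = np.array([1.0, 2.0, 3.0])
tiled = plain_layout(row, (4, 8))             # four identical rows
small = plain_layout(np.zeros((2, 3)), (4, 8))  # accepted unchanged
```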

static encode_encrypt(context: Context, array: ndarray, unit_shape: Tuple[int, int], target_level: int | None = None) HEMatrix

Encode and encrypt a numpy array as a HEMatrix. Array should be 1-dim or 2-dim. If array is 1-dimensional, it will be repeated along axis = 0 and encoded. If array is 2-dimensional, its number of rows should be smaller than or equal to unit_shape[0].

Parameters:
  • context (Context) – Context.

  • array (np.ndarray) – Input array.

  • unit_shape (Tuple[int, int]) – Unit encoding shape for matrix.

  • target_level (Optional[int], optional) – If not None, encrypt the HESubMatrix at the specified level. Defaults to None.

Returns:

Encoded and encrypted HEMatrix.

Return type:

HEMatrix