FAQ

How do I apply L2 regularization?

To apply L2 regularization (aka weight decay), pass the weight_decay parameter to the optimizer; PyTorch's optimizers accept it directly. In skorch, use the double-underscore notation with the optimizer prefix:

net = NeuralNet(
    ...,
    optimizer__weight_decay=0.01,
)
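
As a rough illustration of what this parameter does, the following sketch works through a single plain SGD step by hand. It assumes the usual formulation in which weight decay adds weight_decay * w to the gradient, which for plain SGD matches an L2 penalty of (weight_decay / 2) * w**2. The function sgd_step and all numbers are made up for illustration and are not part of skorch or PyTorch; adaptive optimizers may treat weight decay differently.

```python
def sgd_step(w, grad, lr, weight_decay=0.0):
    """One plain SGD step; weight decay adds weight_decay * w to the gradient."""
    return w - lr * (grad + weight_decay * w)


w = 2.0      # current value of a single weight (made-up)
grad = 0.5   # gradient of the unregularized loss at w (made-up)
lr = 0.1     # learning rate (made-up)

plain = sgd_step(w, grad, lr)                        # no regularization
decayed = sgd_step(w, grad, lr, weight_decay=0.01)   # with weight decay

# the decayed update pulls the weight slightly closer to zero
print(plain, decayed)
```

With a positive weight, the regularized step ends up a little smaller than the plain one, which is the shrinking effect that weight decay is meant to have.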

How can I continue training my model?

By default, calling fit() more than once re-initializes the net, so training starts from scratch rather than resuming where it left off. This is in line with sklearn’s behavior but not always desired. To continue training, use partial_fit() instead of fit(). Alternatively, set the warm_start argument, which is False by default, to True; fit() then continues training from the current state.
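
The difference between these options can be sketched with a toy stand-in that only counts epochs. ToyNet is hypothetical and is not skorch code; it merely mimics the behavior described above, assuming that each call trains for max_epochs epochs and that every epoch adds one entry to the history.

```python
class ToyNet:
    """Toy stand-in that only counts epochs; NOT skorch's implementation."""

    def __init__(self, max_epochs=2, warm_start=False):
        self.max_epochs = max_epochs
        self.warm_start = warm_start
        self.history = []

    def partial_fit(self, X, y):
        # continue from the current state: append to the existing history
        start = len(self.history)
        for epoch in range(start, start + self.max_epochs):
            self.history.append(epoch + 1)
        return self

    def fit(self, X, y):
        # without warm_start, the training state is reset first
        if not self.warm_start:
            self.history = []
        return self.partial_fit(X, y)


X, y = [[0.0], [1.0]], [0, 1]  # dummy data, ignored by the toy

cold = ToyNet(max_epochs=2)
cold.fit(X, y)
cold.fit(X, y)                    # starts over, history is reset
epochs_after_refit = len(cold.history)
cold.partial_fit(X, y)            # continues where training stopped
epochs_after_partial = len(cold.history)

warm = ToyNet(max_epochs=2, warm_start=True)
warm.fit(X, y)
warm.fit(X, y)                    # warm_start=True: fit() continues too
epochs_warm = len(warm.history)

print(epochs_after_refit, epochs_after_partial, epochs_warm)  # 2 4 4
```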

How do I shuffle my train batches?

skorch uses DataLoader from PyTorch under the hood. This class takes a couple of arguments, for instance shuffle. We therefore need to pass the shuffle argument to DataLoader, which we achieve by using the double-underscore notation (as known from sklearn):

net = NeuralNet(
    ...,
    iterator_train__shuffle=True,
)

Note that we have an iterator_train for the training data and an iterator_valid for validation and test data. In general, you only want to shuffle the train data, which is what the code above does.
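
To give an idea of how the double-underscore notation routes arguments to the right component, here is a simplified sketch in plain Python. The helper extract_prefixed is hypothetical and only illustrates the principle; it is not skorch's actual implementation. num_workers is used merely as a second example of a DataLoader argument.

```python
def extract_prefixed(prefix, kwargs):
    """Collect entries named '<prefix>__<name>' and return them as {name: value}."""
    marker = prefix + '__'
    return {
        key[len(marker):]: value
        for key, value in kwargs.items()
        if key.startswith(marker)
    }


# arguments as they might be passed to the net (values are made up)
net_kwargs = {
    'max_epochs': 10,
    'optimizer__weight_decay': 0.01,
    'iterator_train__shuffle': True,
    'iterator_train__num_workers': 2,
}

# only the arguments meant for the training iterator are picked out
print(extract_prefixed('iterator_train', net_kwargs))
# {'shuffle': True, 'num_workers': 2}
```

Since nothing in this example carries the iterator_valid prefix, the validation iterator would receive no shuffle argument and keep its default.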

How do I use sklearn GridSearchCV when my data is in a dictionary?

skorch supports dicts as input but sklearn doesn’t. To get around that, wrap your dictionary in a SliceDict (available from skorch.helper). This is a data container that behaves partly like a dict and partly like an ndarray. For more details on how to do this, have a look at the corresponding data section in the notebook.

I want to use sample_weight, how can I do this?

Some scikit-learn models support passing a sample_weight argument to fit calls as part of the fit_params. This allows you to give different samples different weights in the final loss calculation.

In general, skorch supports fit_params, but unfortunately just calling net.fit(X, y, sample_weight=sample_weight) is not enough, because the fit_params are not split into train and valid, and are not batched, resulting in a mismatch with the training batches.

Fortunately, skorch supports passing dictionaries as arguments, which are actually split into train and valid and then batched. Therefore, the best solution is to pass the sample_weight with X as a dictionary. The following example shows how to achieve this:

from skorch import NeuralNet
from torch import nn

X, y = get_data()
# put your X into a dict if not already a dict
X = {'data': X}
# add sample_weight to the X dict
X['sample_weight'] = sample_weight

class MyModule(nn.Module):
    ...
    def forward(self, data, sample_weight):
        # when X is a dict, its keys are passed as kwargs to forward, thus
        # our forward has to have the arguments 'data' and 'sample_weight';
        # usually, sample_weight can be ignored here
        ...

class MyNet(NeuralNet):
    def get_loss(self, y_pred, y_true, X, *args, **kwargs):
        # override get_loss to use the sample_weight from X
        loss_unreduced = super().get_loss(y_pred, y_true, X, *args, **kwargs)
        # move the weights to the device of the loss, in case they differ
        sample_weight = X['sample_weight'].to(loss_unreduced.device)
        loss_reduced = (sample_weight * loss_unreduced).mean()
        return loss_reduced

# make sure to pass reduction='none' to your criterion, since we need the
# loss for each sample so that it can be weighted (the older reduce=False
# argument is deprecated in PyTorch)
net = MyNet(MyModule, ..., criterion__reduction='none')
net.fit(X, y)
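
To see why the criterion has to return one loss per sample, consider the following plain-Python sketch with made-up numbers (no skorch or PyTorch involved). If the loss were already averaged, the weights could only rescale that single number and could no longer emphasize individual samples.

```python
per_sample_loss = [0.2, 1.0, 0.4, 2.0]  # made-up unreduced losses, one per sample
sample_weight = [1.0, 0.5, 2.0, 0.0]    # made-up weights, aligned with the batch


def mean(values):
    return sum(values) / len(values)


# what the overridden get_loss computes: weight each sample, then average
weighted = mean([w * l for w, l in zip(sample_weight, per_sample_loss)])

# if the criterion had already averaged, only one number is left, and the
# weights merely scale it by their own mean
already_reduced = mean(per_sample_loss)
rescaled = mean([w * already_reduced for w in sample_weight])

print(weighted, rescaled)  # the two values differ
```

In this example the last sample has weight 0.0 and therefore contributes nothing to the weighted loss, whereas it still dominates the already-reduced value.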