FAQ
How do I apply L2 regularization?
To apply L2 regularization (aka weight decay), PyTorch provides
the weight_decay parameter, which must be passed to the
optimizer. To set this parameter in skorch, use the
double-underscore notation with the optimizer prefix:
net = NeuralNet(
    ...,
    optimizer__weight_decay=0.01,
)
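To make the effect of weight_decay concrete, here is a minimal pure-Python sketch of a single update step with an L2 penalty. It assumes plain SGD without momentum, where the term weight_decay * w is added to the gradient before the update. The function name sgd_step_with_weight_decay and all numbers are hypothetical, and other optimizers may apply the decay differently.

```python
def sgd_step_with_weight_decay(weights, grads, lr=0.1, weight_decay=0.01):
    """One plain SGD step with an L2 penalty (illustrative sketch only)."""
    updated = []
    for w, g in zip(weights, grads):
        # The L2 penalty contributes weight_decay * w to the gradient.
        g_total = g + weight_decay * w
        updated.append(w - lr * g_total)
    return updated


# With zero data gradients, only the decay term acts on the weights.
weights = [1.0, -2.0]
grads = [0.0, 0.0]
new_weights = sgd_step_with_weight_decay(weights, grads)
```

With a zero data gradient, each weight is simply shrunk towards zero by the factor 1 - lr * weight_decay, which is the "decay" the name refers to.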
How can I continue training my model?
By default, when you call fit() more than
once, the model is re-initialized and training starts from scratch
instead of continuing from where it left off.
This is in line with sklearn’s behavior but not always desired. If
you would like to continue training, use
partial_fit() instead of
fit(). Alternatively, set the
warm_start argument, which is False by default, to
True; subsequent calls to fit() then continue training the existing model.
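The difference between the three options can be illustrated with a toy stand-in. ToyNet below is hypothetical and is not skorch code. It only assumes the contract described above: fit() re-initializes unless warm_start is set, while partial_fit() keeps the existing state.

```python
class ToyNet:
    """Toy stand-in (not skorch code) for the fit/partial_fit contract."""

    def __init__(self, warm_start=False):
        self.warm_start = warm_start
        self.initialized = False
        self.epochs_trained = 0

    def initialize(self):
        # Discards all learned state, like re-creating module and optimizer.
        self.epochs_trained = 0
        self.initialized = True
        return self

    def partial_fit(self, epochs=1):
        # Initialize only if that never happened; otherwise keep the state.
        if not self.initialized:
            self.initialize()
        self.epochs_trained += epochs
        return self

    def fit(self, epochs=1):
        # Start over unless warm_start is set and a model already exists.
        if not self.warm_start or not self.initialized:
            self.initialize()
        return self.partial_fit(epochs)


cold = ToyNet().fit(5).fit(5)                  # second fit starts over
resumed = ToyNet().fit(5).partial_fit(5)       # continues after the first fit
warm = ToyNet(warm_start=True).fit(5).fit(5)   # fit continues as well
```

In this sketch, cold ends up with 5 trained epochs, whereas resumed and warm both end up with 10.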
How do I shuffle my train batches?
skorch uses DataLoader from PyTorch under
the hood. This class takes a couple of arguments, for instance
shuffle. We therefore need to pass the shuffle argument to
DataLoader, which we achieve by using the
double-underscore notation (as known from sklearn):
net = NeuralNet(
    ...,
    iterator_train__shuffle=True,
)
Note that we have an iterator_train for the training data and an
iterator_valid for validation and test data. In general, you only
want to shuffle the train data, which is what the code above does.
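The idea behind the double-underscore notation can be sketched in a few lines of plain Python: every keyword argument of the form prefix__name is collected and handed, without the prefix, to the component that the prefix refers to. The helper params_for_prefix is a hypothetical name used for illustration, and this is a simplified sketch rather than skorch's actual routing code.

```python
def params_for_prefix(prefix, kwargs):
    """Collect 'prefix__name' entries from kwargs and strip the prefix."""
    marker = prefix + "__"
    return {
        key[len(marker):]: value
        for key, value in kwargs.items()
        if key.startswith(marker)
    }


# Hypothetical keyword arguments as they might be passed to the net.
net_kwargs = {
    "iterator_train__shuffle": True,
    "iterator_train__num_workers": 2,
    "iterator_valid__shuffle": False,
    "optimizer__weight_decay": 0.01,
}

# Each component only sees the arguments addressed to it.
train_loader_kwargs = params_for_prefix("iterator_train", net_kwargs)
valid_loader_kwargs = params_for_prefix("iterator_valid", net_kwargs)
```

Here, train_loader_kwargs only contains shuffle and num_workers, so the training DataLoader is shuffled while the validation one is left untouched.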
How do I use sklearn GridSearchCV when my data is in a dictionary?
skorch supports dicts as input but sklearn doesn’t. To get around
that, wrap your dictionary in a SliceDict. This is
a data container that behaves partly like a dict and partly like an
ndarray. For more details on how to do this, have a look at the
corresponding data section
in the notebook.
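To see why such a container helps, recall that sklearn's cross-validation needs to know the number of samples and to select samples by index. The toy class below is a hypothetical, simplified illustration built on plain lists; it is not skorch's SliceDict, which should be used in practice.

```python
class ToySliceDict(dict):
    """Toy dict of equal-length lists that can be sliced like an array."""

    def __len__(self):
        # Report the number of samples, not the number of keys.
        lengths = {len(v) for v in self.values()}
        if len(lengths) > 1:
            raise ValueError("All values must have the same length.")
        return lengths.pop() if lengths else 0

    def __getitem__(self, key):
        if isinstance(key, str):
            # Behave like a normal dict for string keys.
            return super().__getitem__(key)
        if isinstance(key, int):
            raise TypeError("Use a slice or a list of indices, not an int.")
        if isinstance(key, slice):
            return ToySliceDict({k: v[key] for k, v in self.items()})
        # Otherwise assume an iterable of indices, as produced by CV splits.
        return ToySliceDict({k: [v[i] for i in key] for k, v in self.items()})


X = ToySliceDict(a=[1, 2, 3, 4], b=[10, 20, 30, 40])
first_three = X[:3]      # slicing is applied to every value
fold = X[[0, 2]]         # index lists select the same rows from every value
```

String keys still return the stored values, while slices and index lists return a new container with the same keys and fewer samples.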
I want to use sample_weight, how can I do this?
Some scikit-learn models support passing a sample_weight argument
to fit calls as part of the fit_params. This allows you to
give different samples different weights in the final loss
calculation.
In general, skorch supports fit_params, but unfortunately just
calling net.fit(X, y, sample_weight=sample_weight) is not enough,
because the fit_params are not split into train and valid, and are
not batched, resulting in a mismatch with the training batches.
Fortunately, skorch supports passing dictionaries as arguments, which
are actually split into train and valid and then batched. Therefore,
the best solution is to pass the sample_weight with X as a
dictionary. Below, there is example code on how to achieve this:
from torch import nn
from skorch import NeuralNet
from skorch.utils import to_tensor

X, y = get_data()
# put your X into a dict if not already a dict
X = {'data': X}
# add sample_weight to the X dict
X['sample_weight'] = sample_weight

class MyModule(nn.Module):
    ...
    def forward(self, data, sample_weight):
        # when X is a dict, its keys are passed as kwargs to forward, thus
        # our forward has to have the arguments 'data' and 'sample_weight';
        # usually, sample_weight can be ignored here
        ...

class MyNet(NeuralNet):
    def get_loss(self, y_pred, y_true, X, *args, **kwargs):
        # override get_loss to use the sample_weight from X
        loss_unreduced = super().get_loss(y_pred, y_true, X, *args, **kwargs)
        # move the weights to the same device as the loss
        sample_weight = to_tensor(X['sample_weight'], device=self.device)
        loss_reduced = (sample_weight * loss_unreduced).mean()
        return loss_reduced

# make sure to pass reduction='none' to your criterion (the older reduce=False
# is deprecated in PyTorch), since we need the loss for each sample so that
# it can be weighted
net = MyNet(MyModule, ..., criterion__reduction='none')
net.fit(X, y)
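The reduction performed in get_loss can be checked by hand with a small pure-Python sketch. The function weighted_mean_loss and the numbers are hypothetical; the sketch assumes one unreduced loss value and one weight per sample in the batch, and it also shows why weights that are not batched together with the data cannot work.

```python
def weighted_mean_loss(per_sample_losses, sample_weights):
    """Weight each per-sample loss, then average over the batch."""
    if len(per_sample_losses) != len(sample_weights):
        # This is the mismatch that occurs when the weights are not batched.
        raise ValueError("Need exactly one weight per sample in the batch.")
    weighted = [w * l for w, l in zip(sample_weights, per_sample_losses)]
    return sum(weighted) / len(weighted)


batch_losses = [0.5, 2.0, 1.0]    # unreduced losses for a batch of 3 samples
batch_weights = [1.0, 0.0, 2.0]   # the second sample is ignored entirely
loss = weighted_mean_loss(batch_losses, batch_weights)
```

Note that the mean divides by the batch size and not by the sum of the weights, which matches the (sample_weight * loss_unreduced).mean() expression in the example.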