Utilities related to Chainer

Activation functions

class researchutils.chainer.functions.activation.grad_clip_lstm.GradClipLSTM(clip_min, clip_max)

Long short-term memory unit with a forget gate and gradient clipping before each gate. It has two inputs (c, x) and two outputs (c, h), where c is the cell state. x must have four times as many channels as the number of units.

Gradient clipping is performed during the backward pass, not just before the gradients are applied to the weights.

See: https://arxiv.org/abs/1308.0850
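As a rough illustration of clipping inside the backward pass, the sketch below backpropagates through a single scalar sigmoid gate and clamps the gradient with respect to the gate's pre-activation before it flows further back. It is not the library's implementation: `clip_gradient`, `sigmoid`, and `backward_through_gate` are hypothetical helpers, and a real LSTM works on arrays and has several gates.

```python
import math


def clip_gradient(grad, clip_min, clip_max):
    """Clamp a scalar gradient into the range [clip_min, clip_max]."""
    return max(clip_min, min(clip_max, grad))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def backward_through_gate(pre_activation, grad_output, clip_min, clip_max):
    """Backpropagate through a sigmoid gate (hypothetical helper).

    The gradient w.r.t. the gate's pre-activation is clipped here, inside
    the backward pass, rather than later at weight-update time.
    """
    s = sigmoid(pre_activation)
    grad_pre = grad_output * s * (1.0 - s)  # chain rule through the sigmoid
    return clip_gradient(grad_pre, clip_min, clip_max)


# A large upstream gradient is clamped; a small one passes through unchanged.
clipped = backward_through_gate(0.0, 100.0, clip_min=-1.0, clip_max=1.0)
unclipped = backward_through_gate(0.0, 2.0, clip_min=-1.0, clip_max=1.0)
```

Because the clamp happens at the gate, every earlier layer and time step receives the already-bounded gradient, which is the distinction from clipping only at the optimizer.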

backward(inputs, grads)

Computes gradients w.r.t. specified inputs given output gradients.

This method is used to compute one step of the backpropagation corresponding to the forward computation of this function node. Given the gradients w.r.t. output variables, this method computes the gradients w.r.t. specified input variables. Note that this method does not need to compute any input gradients not specified by target_input_indexes.

Unlike Function.backward(), gradients are given as Variable objects and this method itself has to return input gradients as Variable objects. It enables the function node to return the input gradients with the full computational history, in which case it supports differentiable backpropagation or higher-order differentiation.

The default implementation returns `None`s, which means the function is not differentiable.

Parameters:
    target_input_indexes (tuple of int) – Sorted indices of the input variables w.r.t. which the gradients are required. It is guaranteed that this tuple contains at least one element.
    grad_outputs (tuple of `Variable`s) – Gradients w.r.t. the output variables. If the gradient w.r.t. an output variable is not given, the corresponding element is `None`.
Returns:	Tuple of variables that represent the gradients w.r.t. specified input variables. The length of the tuple can equal either `len(target_input_indexes)` or the number of inputs. In the latter case, the elements not specified by `target_input_indexes` will be discarded.

Loss functions

researchutils.chainer.functions.loss.average_k_step_squared_error.average_k_step_squared_error(x1, x2, k_step)

Average k-step squared error introduced by Oh et al.

\[\frac{1}{2K}\sum_{i}\sum_{t}\sum_{k}\|\hat{\mathbf{x}}_{t+k}^{(i)} - \mathbf{x}_{t+k}^{(i)}\|^{2}\]

See: https://arxiv.org/abs/1507.08750

Parameters:
    x1 (array) – predicted image
    x2 (array) – expected image
    k_step (int) – maximum number of steps to predict from the given input
Returns:	error – k-step squared error
Return type:	Variable
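The formula above can be sketched in plain Python as the sum of squared differences divided by 2K. This is only an illustration of the arithmetic, not the library function: `k_step_squared_error_sketch` is a hypothetical name, and it takes flat sequences of floats (all samples, time steps, and prediction steps concatenated) rather than arrays and returns a float rather than a Variable.

```python
def k_step_squared_error_sketch(predicted, expected, k_step):
    """Sum of squared differences divided by 2 * k_step (hypothetical helper).

    `predicted` and `expected` are flat sequences of floats that already
    contain every value over all samples, time steps, and prediction steps.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    squared_error = sum((p - e) ** 2 for p, e in zip(predicted, expected))
    return squared_error / (2.0 * k_step)


# Squared differences are 0, 4, and 4, so the error is 8 / (2 * 2).
error = k_step_squared_error_sketch([1.0, 2.0, 3.0], [1.0, 0.0, 1.0], k_step=2)
```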

Iterators

class researchutils.chainer.iterators.decorable_multithread_iterator.DecorableMultithreadIterator(dataset, batch_size, repeat=True, shuffle=True, n_threads=1, decor_fun=None, end_index=None)

A MultithreadIterator that lets you configure the dataset’s end index and preprocess the data at a given index before it is added to the batch.

Preprocessing is done in parallel on multiple threads so that batches can be preloaded.
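The idea of preprocessing batch items on several threads can be sketched with the standard library alone. This is not the iterator's actual code: `load_batch` and `double_item` are hypothetical, and the assumption that `decor_fun` is called as `decor_fun(dataset, index)` is ours, not something the documentation states.

```python
from concurrent.futures import ThreadPoolExecutor


def load_batch(dataset, indices, decor_fun, n_threads=2):
    """Build one batch, applying decor_fun to each selected item in parallel.

    Assumes decor_fun(dataset, index) returns the preprocessed item.
    """
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        # executor.map returns results in the same order as `indices`.
        return list(executor.map(lambda i: decor_fun(dataset, i), indices))


def double_item(dataset, index):
    """Example preprocessing step: double the raw value."""
    return dataset[index] * 2


batch = load_batch([10, 20, 30, 40], [2, 0, 3], double_item)
```

Threads help mainly when the preprocessing step releases the GIL (file I/O, image decoding in native code); for CPU-bound pure-Python work, separate processes are usually the better fit.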

class researchutils.chainer.iterators.decorable_serial_iterator.DecorableSerialIterator(dataset, batch_size, repeat=True, shuffle=True, decor_fun=None, end_index=None)

A SerialIterator that lets you configure the dataset’s end index and preprocess the data at a given index before it is added to the batch.

Preprocessing is done on the caller’s thread.

class researchutils.chainer.iterators.decorable_multiprocess_iterator.DecorableMultiprocessIterator(dataset, batch_size, repeat=True, shuffle=None, n_processes=None, n_prefetch=1, shared_mem=None, decor_fun=None, end_index=None)

A MultiprocessIterator that lets you configure the dataset’s end index and preprocess the data at a given index before it is added to the batch.

Preprocessing is done in parallel in multiple processes so that batches can be preloaded.

NOTE: This is an experimental implementation

class researchutils.chainer.iterators.unmodifiable_decorable_list.UnmodifiableDecorableList(items, decor_fun=None, end_index=None)

An unmodifiable list that decorates its items with the provided decor_fun.

Parameters:
    items (iterable) – items of the list
    decor_fun (callable or None) – function to apply every time __getitem__ is called
    end_index (integer or None) – end index (exclusive) of the list, i.e. the length reported to users of this list. It can be equal to or smaller than the length of the given items. If end_index is None, the length of this list is the same as the length of the given items.
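The behavior described by these parameters can be sketched as a small read-only sequence. `DecoratedView` is a hypothetical stand-in written for illustration, not the library class, and it assumes that `decor_fun` is called as `decor_fun(items, index)`; the real calling convention may differ.

```python
from collections.abc import Sequence


class DecoratedView(Sequence):
    """Read-only view over items (hypothetical stand-in, not the library class)."""

    def __init__(self, items, decor_fun=None, end_index=None):
        self._items = list(items)
        self._decor_fun = decor_fun
        if end_index is None:
            end_index = len(self._items)
        if end_index > len(self._items):
            raise ValueError("end_index must not exceed the number of items")
        self._end_index = end_index

    def __len__(self):
        # The reported length is end_index, not the length of the raw items.
        return self._end_index

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("this sketch supports integer indices only")
        if index < 0:
            index += self._end_index
        if not 0 <= index < self._end_index:
            raise IndexError("index out of range")
        if self._decor_fun is None:
            return self._items[index]
        # Assumed calling convention: decor_fun(items, index).
        return self._decor_fun(self._items, index)


view = DecoratedView([1, 2, 3, 4],
                     decor_fun=lambda items, i: items[i] * 10,
                     end_index=3)
```

Deriving from `collections.abc.Sequence` gives iteration and membership tests for free while leaving out `__setitem__`, so an assignment such as `view[0] = 1` raises `TypeError`.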

Connection links

class researchutils.chainer.links.connection.grad_clip_lstm.GradClipLSTM(in_size, out_size=None, lateral_init=None, upward_init=None, bias_init=None, forget_bias_init=None, clip_min=None, clip_max=None)

Fully-connected LSTM layer with gradient clipping before each gate. See: https://arxiv.org/abs/1308.0850

For a detailed description of the LSTM layer itself, see Chainer’s original LSTM implementation.

Serializers

researchutils.chainer.serializers.npz.load_model(path, model)

Load a model from the npz file at the given path.

Parameters:	path (string) – path of the saved model
Returns:	model – model with parameters initialized from the loaded file. If the file does not exist, the given model is returned unchanged.
Return type:	chainer.Link

researchutils.chainer.serializers.npz.load_snapshot(path, trainer)

Load a snapshot from the npz file at the given path.

Parameters:	path (string) – path of the saved snapshot
Returns:	trainer – trainer with associated objects initialized from the loaded file. If the file does not exist, the given trainer is returned unchanged.
Return type:	chainer.Trainer

researchutils.chainer.serializers.npz.save_model(path, model)

Save a model as an npz file at the given path.

Parameters:
    path (string) – path of the model to be saved
    model (chainer.Link) – model whose parameters are saved
Raises:	`ValueError` – File already exists

Training extensions

class researchutils.chainer.training.extensions.slack_report.SlackReport(token_or_client, entries, channel='general', log_report='LogReport')

Periodically sends the training status to a Slack channel.

The basic behavior is the same as chainer.training.extensions.PrintReport. The main difference is that this extension sends the results to the specified Slack channel.