Initializers¶
Initializers provide initial values for network parameter blobs. In Caffe, they are called Fillers.
class NullInitializer¶
An initializer that does nothing. To initialize with zeros, use a ConstantInitializer instead.
class ConstantInitializer¶
Set everything to a constant.

    value¶
    The value used to initialize a parameter blob. Typically this is set to 0.
class XavierInitializer¶
An initializer based on [BengioGlorot2010], but without using the fan-out value. It fills the parameter blob by sampling uniformly from \([-S,S]\), where the scale \(S=\sqrt{3 / F_{\text{in}}}\) and \(F_{\text{in}}\) is the fan-in: the number of input nodes.

A heuristic is used to determine the fan-in: for an ND tensor parameter blob, the product of the first N-1 dimensions is taken as the fan-in, while the last dimension is taken as the fan-out.
[BengioGlorot2010] Y. Bengio and X. Glorot, Understanding the difficulty of training deep feedforward neural networks, in Proceedings of AISTATS 2010, pp. 249-256.
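The sampling rule and the fan-in heuristic above can be sketched in NumPy as follows (the function name and signature are illustrative, not the library's actual API):

```python
import numpy as np

def xavier_init(shape, rng=None):
    """Sample uniformly from [-S, S] with S = sqrt(3 / fan_in).

    Fan-in heuristic from the text: the product of the first N-1
    dimensions is the fan-in; the last dimension is the fan-out.
    """
    rng = np.random.default_rng() if rng is None else rng
    fan_in = int(np.prod(shape[:-1]))
    scale = np.sqrt(3.0 / fan_in)
    return rng.uniform(-scale, scale, size=shape)
```

With this scale, each element has variance \(S^2/3 = 1/F_{\text{in}}\), which keeps the variance of the layer's pre-activations roughly constant across layers.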
class GaussianInitializer¶
Initialize each element of the parameter blob as an independent and identically distributed Gaussian random variable.

    mean¶
    Default 0.

    std¶
    Default 1.
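As a minimal NumPy sketch of this initializer (again, an illustration rather than the library's implementation):

```python
import numpy as np

def gaussian_init(shape, mean=0.0, std=1.0, rng=None):
    # Each element is drawn i.i.d. from N(mean, std^2).
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(mean, std, size=shape)
```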
class OrthogonalInitializer¶
Initialize the parameter blob to be a random orthogonal matrix (i.e. \(W^TW=I\)), times a scalar gain factor. Based on [Saxe2013].

[Saxe2013] Andrew M. Saxe, James L. McClelland, Surya Ganguli, Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, http://arxiv.org/abs/1312.6120, with a presentation at https://www.youtube.com/watch?v=Ap7atx-Ki3Q

    gain¶
    Default 1. Use \(\sqrt{2}\) for layers with ReLU activations.
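One common way to construct such a matrix is to orthogonalize a Gaussian random matrix, e.g. via the SVD. A NumPy sketch (the ND-to-2D flattening follows the fan-in heuristic described above, but both the flattening and the SVD approach are assumptions here, not necessarily what the library does):

```python
import numpy as np

def orthogonal_init(shape, gain=1.0, rng=None):
    """Random (semi-)orthogonal matrix times a scalar gain.

    Assumption: an ND blob is flattened to 2D as
    (prod(shape[:-1]), shape[-1]) before orthogonalization.
    """
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = int(np.prod(shape[:-1])), shape[-1]
    a = rng.normal(0.0, 1.0, size=(rows, cols))
    # The SVD factors of a Gaussian matrix are orthonormal;
    # u @ vh is the orthogonal matrix closest to a.
    u, _, vh = np.linalg.svd(a, full_matrices=False)
    return (gain * (u @ vh)).reshape(shape)
```

When rows >= cols, the columns of the result are orthonormal, so \(W^TW = \text{gain}^2 \cdot I\); with the default gain of 1 this matches the \(W^TW=I\) property stated above.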