Initializers¶
Initializers provide initial values for network parameter blobs. In Caffe, they are called Fillers.
class NullInitializer¶
An initializer that does nothing. To initialize with zeros, use a ConstantInitializer instead.
class ConstantInitializer¶
Set everything to a constant.

    value¶
    The value used to initialize a parameter blob. Typically this is set to 0.
class XavierInitializer¶
An initializer based on [BengioGlorot2010], but without using the fan-out value. It fills the parameter blob by sampling uniformly from \([-S,S]\), where the scale \(S=\sqrt{3 / F_{\text{in}}}\) and \(F_{\text{in}}\) is the fan-in: the number of input nodes.

A heuristic is used to determine the fan-in: for an ND tensor parameter blob, the product of the first N-1 dimensions is taken as the fan-in, while the last dimension is taken as the fan-out.
[BengioGlorot2010] Y. Bengio and X. Glorot, Understanding the difficulty of training deep feedforward neural networks, in Proceedings of AISTATS 2010, pp. 249-256.
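The sampling rule and the fan-in heuristic above can be sketched in NumPy as follows (the function name and signature are illustrative, not the library's actual API):

```python
import numpy as np

def xavier_init(shape, rng=None):
    """Sample uniformly from [-S, S] with S = sqrt(3 / fan_in).

    Fan-in heuristic from the text: the product of the first N-1
    dimensions is the fan-in; the last dimension is the fan-out.
    """
    rng = np.random.default_rng() if rng is None else rng
    fan_in = int(np.prod(shape[:-1]))
    scale = np.sqrt(3.0 / fan_in)
    return rng.uniform(-scale, scale, size=shape)
```

With this scale, each element has variance \(S^2/3 = 1/F_{\text{in}}\), which keeps the variance of the layer's pre-activations roughly constant across layers.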
class GaussianInitializer¶
Initialize each element of the parameter blob as an independent and identically distributed Gaussian random variable.

    mean¶
    Default 0.

    std¶
    Default 1.
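As a minimal NumPy sketch of this initializer (again, an illustration rather than the library's implementation):

```python
import numpy as np

def gaussian_init(shape, mean=0.0, std=1.0, rng=None):
    # Each element is drawn i.i.d. from N(mean, std^2).
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(mean, std, size=shape)
```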
class OrthogonalInitializer¶
Initialize the parameter blob to be a random orthogonal matrix (i.e. \(W^TW=I\)), times a scalar gain factor. Based on [Saxe2013].

[Saxe2013] Andrew M. Saxe, James L. McClelland, Surya Ganguli, Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, http://arxiv.org/abs/1312.6120, with a presentation at https://www.youtube.com/watch?v=Ap7atx-Ki3Q

    gain¶
    Default 1. Use \(\sqrt{2}\) for layers with ReLU activations.
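One common way to construct such a matrix is to orthogonalize a Gaussian random matrix, e.g. via the SVD. A NumPy sketch (the ND-to-2D flattening follows the fan-in heuristic described above, but both the flattening and the SVD approach are assumptions here, not necessarily what the library does):

```python
import numpy as np

def orthogonal_init(shape, gain=1.0, rng=None):
    """Random (semi-)orthogonal matrix times a scalar gain.

    Assumption: an ND blob is flattened to 2D as
    (prod(shape[:-1]), shape[-1]) before orthogonalization.
    """
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = int(np.prod(shape[:-1])), shape[-1]
    a = rng.normal(0.0, 1.0, size=(rows, cols))
    # The SVD factors of a Gaussian matrix are orthonormal;
    # u @ vh is the orthogonal matrix closest to a.
    u, _, vh = np.linalg.svd(a, full_matrices=False)
    return (gain * (u @ vh)).reshape(shape)
```

When rows >= cols, the columns of the result are orthonormal, so \(W^TW = \text{gain}^2 \cdot I\); with the default gain of 1 this matches the \(W^TW=I\) property stated above.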