ift6266: comparison of deep/stacked_dae/stacked_dae.py @ 204:e1f5f66dd7dd
Changed the reconstruction cost for numerical stability by adding a small constant inside the log.
| author | fsavard |
|---|---|
| date | Thu, 04 Mar 2010 08:18:42 -0500 |
| parents | e656edaedb48 |
| children | acb942530923 |
Comparing 192:e656edaedb48 with 204:e1f5f66dd7dd:

```diff
@@ -136,11 +136,19 @@
     self.z = T.nnet.sigmoid(T.dot(self.y, self.W_prime) + self.b_prime)
     # Equation (4)
     # note : we sum over the size of a datapoint; if we are using minibatches,
     # L will be a vector, with one entry per example in minibatch
     #self.L = - T.sum( self.x*T.log(self.z) + (1-self.x)*T.log(1-self.z), axis=1 )
-    self.L = binary_cross_entropy(target=self.x, output=self.z, sum_axis=1)
+    #self.L = binary_cross_entropy(target=self.x, output=self.z, sum_axis=1)
+
+    # I added this epsilon to avoid getting log(0) and 1/0 in grad
+    # This means conceptually that there'd be no probability of 0, but that
+    # doesn't seem to me as important (maybe I'm wrong?).
+    eps = 0.00000001
+    eps_1 = 1-eps
+    self.L = - T.sum( self.x * T.log(eps + eps_1*self.z) \
+              + (1-self.x)*T.log(eps + eps_1*(1-self.z)), axis=1 )
     # note : L is now a vector, where each element is the cross-entropy cost
     # of the reconstruction of the corresponding example of the
     # minibatch. We need to compute the average of all these to get
     # the cost of the minibatch
     self.cost = T.mean(self.L)
```
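To see why this stabilization helps, here is a small standalone sketch of the same trick. It is not part of the repository: NumPy stands in for the Theano symbolic expressions, and the function name `stabilized_cross_entropy` is invented for illustration; the epsilon value and the `eps + eps_1*z` squashing mirror the patch.

```python
import numpy as np

def stabilized_cross_entropy(x, z, eps=1e-8):
    """Binary cross-entropy where the reconstruction z is squeezed away from
    0 and 1, as in the patch, so log() and its gradient stay finite."""
    eps_1 = 1 - eps
    # eps + eps_1*z lies in [eps, 1] for z in [0, 1], so log() never sees 0
    # and the gradient term eps_1 / (eps + eps_1*z) never divides by 0.
    return -np.sum(x * np.log(eps + eps_1 * z)
                   + (1 - x) * np.log(eps + eps_1 * (1 - z)), axis=1)

# A saturated sigmoid output (z exactly 0 or 1) on the wrong side of the
# target makes the unstabilized cost log(0) = -inf; the stabilized version
# returns a large but finite value instead.
x = np.array([[1.0, 0.0]])
z = np.array([[0.0, 1.0]])
print(stabilized_cross_entropy(x, z))   # ~[36.8] rather than inf / nan
```

As the in-code comment in the patch notes, the price is that reconstructions of exactly 0 or 1 are no longer representable, which has negligible effect on the cost but keeps the gradients well defined.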