In my neural network program, I refactored some code and introduced an error that I did not notice for some time. When I ran the program, out of 100 networks, only one or a few would eventually learn the pathing problem. There is no difference in how they are trained: they are all taught exactly the same way, bug included. Yet the results range from correct to odd, and the difference in training success is enormous. The only thing that distinguishes these networks is their initial random configuration: the starting state of each network’s weights and biases, which is produced by some random number generation sequence.
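To make that concrete, here is a minimal sketch of the setup (not my actual code; the layer sizes and the NumPy initialization are just illustrative): every network is built the same way, and only the seed used to draw the starting weights and biases differs.

import numpy as np

def init_network(seed, layer_sizes=(2, 8, 1)):
    # Identical architecture for every network; only the seed differs.
    rng = np.random.default_rng(seed)
    weights = [rng.standard_normal((n_in, n_out))
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
    biases = [rng.standard_normal(n_out) for n_out in layer_sizes[1:]]
    return weights, biases

# 100 networks: same architecture, same training, different random starting points.
population = [init_network(seed) for seed in range(100)]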
Okay, on to the bug. I had a function that, through a refactoring mistake, was reduced to merely:
LEARN_RATE = cos(trainNum)
This causes the learning rate to oscillate with the cosine function between -1 and 1. When the learning rate is above zero, a correction is made toward the correct solution; when it is below zero, the correction is made away from it. With all networks trained exactly this way, sometimes learning and sometimes unlearning, maybe 1% of them learn the problem in a reasonable amount of time.
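Here is a rough sketch of what that does to a single update step (illustrative only; the gradient and the plain gradient-descent update stand in for whatever the real training code computes):

import numpy as np

def train_step(weights, gradient, train_num):
    learn_rate = np.cos(train_num)  # the buggy line: varies from -1 to 1
    # A positive rate steps toward the solution ("learning");
    # a negative rate steps away from it ("unlearning").
    return weights - learn_rate * gradient

w = np.zeros(4)
g = np.ones(4)  # placeholder gradient
for train_num in range(6):
    w = train_step(w, g, train_num)
    # cos(0) = 1.00 -> learn, cos(2) ~ -0.42 -> unlearn, cos(3) ~ -0.99 -> unlearn, ...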
What all this learning and unlearning means, I don’t know! But I think it may show how profound the Lottery Ticket Hypothesis is. It seems to boil down to the “configuration of random variables” being an important part of learning capacity, and ultimately of problem-solving capacity.