My AI study so far

I have been studying neural networks for some time, and during a recent YSU hackathon I managed to make interesting progress. After about a year-long break, I returned to that code and made a large amount of progress, and a number of topics have presented themselves in the C++ software.

I’m going to describe some of my journey into AI in C++, in blog format, from the perspective of a modern C++ developer.

Neural networks are similar in concept to brains. At heart they are pattern-processing algorithms. A typical neural network is made of layers, each transforming the output of the previous one in sequence. Running this sequence forward is called Feed Forward: take a pattern, run it through the network, and produce a result deduced from that pattern. So a network could take images of animals as input and output whether each one is a dog or not. Feed Forward can be reversed, in a way, through something called Backpropagation. During Backpropagation, calculus (the chain rule) is used to trace the output error back through the network and adjust each weight according to its share of that error. The result is training: the network learns to recognize the pattern, making Feed Forward less error-prone.

This, at least, was my understanding after completing the hackathon code. During the hackathon I worked from a blog-paper that describes the calculus of Backpropagation, and from it I produced a seemingly functional algorithm in C++.

These sequenced neural networks are easily represented as a series of vectors and matrices, and Feed Forward is as simple as performing the maths across the series to its end, the output. Feed Forward can produce accurate results and the effect of pattern recognition. I call the vector that Feed Forward produces for any arbitrary input a thought. A thought's accuracy is measured with a distance function: the network is said to converge on an input when the thought falls within some radius of the desired output. Matrices may not be the best way to represent this data in light of modern data-oriented concepts.
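
To make the vectors-and-matrices picture concrete, here is a minimal sketch of Feed Forward and the distance check. It is not the hackathon code: the names `Layer`, `feedForward`, `distance` and `converged` are hypothetical, and the sigmoid activation and Euclidean distance are assumptions I picked for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical layer type: a weight matrix stored row-major in a flat vector.
struct Layer {
    std::size_t inputs;
    std::size_t outputs;
    std::vector<double> weights; // outputs x inputs, row-major
    std::vector<double> biases;  // one per output
};

// Assumed activation function; the original post does not name one.
inline double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Feed Forward: run one input pattern through every layer in sequence.
std::vector<double> feedForward(const std::vector<Layer>& net,
                                std::vector<double> signal) {
    for (const Layer& layer : net) {
        assert(signal.size() == layer.inputs);
        assert(layer.weights.size() == layer.inputs * layer.outputs);
        assert(layer.biases.size() == layer.outputs);
        std::vector<double> next(layer.outputs);
        for (std::size_t o = 0; o < layer.outputs; ++o) {
            double sum = layer.biases[o];
            for (std::size_t i = 0; i < layer.inputs; ++i)
                sum += layer.weights[o * layer.inputs + i] * signal[i];
            next[o] = sigmoid(sum);
        }
        signal = std::move(next);
    }
    return signal; // the "thought" for this input
}

// Euclidean distance between a thought and the desired output.
double distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sumSq += d * d;
    }
    return std::sqrt(sumSq);
}

// The thought counts as converged when it lies within `radius` of the target.
bool converged(const std::vector<double>& thought,
               const std::vector<double>& desired, double radius) {
    return distance(thought, desired) <= radius;
}
```

Storing each matrix as one flat `std::vector<double>` keeps a layer's weights contiguous in memory, which is one small step toward the data-oriented layout mentioned above.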

In my study of Backpropagation, it seems that the Backpropagation function produces a magnitude of error for each weight. That magnitude is then used to adjust the weights that produced the error. Adjusting by this magnitude does not train the network in a single iteration (for some reason not apparent to me), so Backpropagation is done iteratively.
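
A small sketch shows the iteration in action. It trains a single sigmoid neuron on one example with a squared-error measure; the names `Neuron`, `trainStep` and `trainUntil`, the learning rate, and the error measure are all assumptions of mine, not the hackathon code. One likely reason a single step is not enough is that the derivative only describes the error surface close to the current weights, so each adjustment is scaled down by a learning rate and repeated.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single neuron: a weighted sum followed by a sigmoid.
struct Neuron {
    std::vector<double> weights;
    double bias;
};

double activate(const Neuron& n, const std::vector<double>& x) {
    assert(x.size() == n.weights.size());
    double sum = n.bias;
    for (std::size_t i = 0; i < x.size(); ++i) sum += n.weights[i] * x[i];
    return 1.0 / (1.0 + std::exp(-sum));
}

// One Backpropagation step for the error E = 0.5 * (y - target)^2.
// Returns the error measured before the weights were adjusted.
double trainStep(Neuron& n, const std::vector<double>& x, double target,
                 double learningRate) {
    const double y = activate(n, x);
    // Chain rule: dE/dsum = (y - target) * sigmoid'(sum) = (y - target) * y * (1 - y)
    const double delta = (y - target) * y * (1.0 - y);
    for (std::size_t i = 0; i < x.size(); ++i)
        n.weights[i] -= learningRate * delta * x[i];
    n.bias -= learningRate * delta;
    return 0.5 * (y - target) * (y - target);
}

// Repeat the step until the error drops below `tolerance`.
// Returns the number of steps taken, or `maxSteps` if it never got there.
std::size_t trainUntil(Neuron& n, const std::vector<double>& x, double target,
                       double learningRate, double tolerance,
                       std::size_t maxSteps) {
    for (std::size_t step = 0; step < maxSteps; ++step)
        if (trainStep(n, x, target, learningRate) < tolerance) return step;
    return maxSteps;
}
```

Starting from all-zero weights, the first call to `trainStep` reduces the error only slightly; `trainUntil` needs many more steps before the output settles near the target.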

One reason it is done iteratively, which I have read about, has to do with the initial configuration of the network, its starting point on the solution surface. The initial configuration can have beneficial or detrimental effects on how the network converges, if it converges at all. This seems problematic, so we could move the problem from training networks toward searching the solution surface for good initial or final configurations. In searching a vast random solution space there is an immediate ally: genetic algorithms. Backpropagation supplies an error direction, and genetic algorithms supply the ability to move across the solution surface efficiently with averaging and bounded random movement, or at least I think so. This sort of algorithm also scales with additional parallel hardware. The question is whether there is a better way to search for ideal networks. Out of a potentially infinite solution space, what is the most efficient way to converge on an ideal network? How long should it take? What is the ideal network, and how could it emerge?
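
The averaging and bounded random movement could be sketched roughly like this. It covers only the genetic half: a Backpropagation step could be applied to each survivor between generations, but that is left out here. The names `Genome`, `Loss`, `average`, `mutate` and `evolve` are hypothetical, and keeping the best half of the population each generation is just one selection rule I assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using Genome = std::vector<double>;                 // a network's weights, flattened
using Loss = std::function<double(const Genome&)>;  // lower is better

// Averaging: the child sits halfway between two parents.
Genome average(const Genome& a, const Genome& b) {
    assert(a.size() == b.size());
    Genome child(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) child[i] = 0.5 * (a[i] + b[i]);
    return child;
}

// Bounded random movement: nudge every weight by at most `bound`.
Genome mutate(Genome g, double bound, std::mt19937& rng) {
    std::uniform_real_distribution<double> step(-bound, bound);
    for (double& w : g) w += step(rng);
    return g;
}

// One generation: keep the better half unchanged, then refill the rest with
// mutated averages of randomly chosen survivors. The loss is re-evaluated
// inside the sort, which is wasteful but keeps the sketch short.
void evolve(std::vector<Genome>& population, const Loss& loss, double bound,
            std::mt19937& rng) {
    assert(population.size() >= 2);
    std::sort(population.begin(), population.end(),
              [&](const Genome& a, const Genome& b) { return loss(a) < loss(b); });
    const std::size_t survivors = population.size() / 2;
    std::uniform_int_distribution<std::size_t> pick(0, survivors - 1);
    for (std::size_t i = survivors; i < population.size(); ++i) {
        const Genome& first = population[pick(rng)];
        const Genome& second = population[pick(rng)];
        population[i] = mutate(average(first, second), bound, rng);
    }
}
```

Because the survivors are never modified, the best loss in the population cannot get worse from one generation to the next. Each candidate's loss is independent of the others, which is where the parallel hardware would come in.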

This leads to a theory called the lottery ticket hypothesis. The idea goes that within a given network there is a smaller network that converges the same or similarly. Finding this network is winning the lottery. The question is how to find it, and in the end it could be easy to find, stumbled upon eventually, or entirely impossible. In the algorithm for this theory, a network is trained using Backpropagation. Next, its least contributing weights are pruned and it is trained more, in a repeated fashion. I think what must be happening is that when a least contributing factor is eliminated, the remaining factors must take up the slack or the network will not converge.
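
The pruning half of that loop can be sketched as follows. I am assuming that "least contributing" means smallest absolute weight, which is a common choice but still an assumption here, and the name `pruneSmallest` and the `active` mask are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Disable the given fraction of the still-active weights, smallest magnitude
// first. `active[i] == false` marks a weight that has been pruned.
// Returns how many weights were pruned in this round.
std::size_t pruneSmallest(std::vector<double>& weights,
                          std::vector<bool>& active, double fraction) {
    assert(weights.size() == active.size());
    assert(fraction >= 0.0 && fraction <= 1.0);

    // Collect the indices of weights that have survived earlier rounds.
    std::vector<std::size_t> alive;
    for (std::size_t i = 0; i < weights.size(); ++i)
        if (active[i]) alive.push_back(i);

    const std::size_t count =
        static_cast<std::size_t>(fraction * static_cast<double>(alive.size()));

    // Move the `count` smallest-magnitude survivors to the front.
    std::partial_sort(alive.begin(),
                      alive.begin() + static_cast<std::ptrdiff_t>(count),
                      alive.end(),
                      [&](std::size_t a, std::size_t b) {
                          return std::fabs(weights[a]) < std::fabs(weights[b]);
                      });

    for (std::size_t k = 0; k < count; ++k) {
        active[alive[k]] = false;
        weights[alive[k]] = 0.0;
    }
    return count;
}
```

A full loop would alternate training with calls to `pruneSmallest`, and the training step would have to skip every weight whose `active` flag is false so that pruned weights stay at zero while the rest take up the slack.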

The existence of a converging network, or its ability to take up slack, depends on the numbers that can be represented within it. Future computers are going to need ever better and more accurate floating-point number representations in order to make finding these networks more likely. It is entirely possible that whether a network works depends on its ultimate hardware representation, and such networks may not be very portable.
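
A tiny experiment illustrates how the number representation limits what a weight can hold. It assumes the usual IEEE 754 `float` and `double` formats, and `updateIsLost` is a hypothetical helper written for this post. It does not prove the portability worry above; it only shows one way precision can matter.

```cpp
#include <cassert>
#include <cmath>

// Returns true when adding `update` to `weight` changes nothing, that is,
// when the adjustment is too small for the type to represent at this scale.
template <typename Real>
bool updateIsLost(Real weight, Real update) {
    assert(std::isfinite(weight) && std::isfinite(update));
    const Real moved = weight + update;
    return moved == weight;
}
```

With a weight of 1 and an adjustment of 1e-8, `updateIsLost<float>(1.0f, 1e-8f)` is expected to report the update as lost, while `updateIsLost<double>(1.0, 1e-8)` keeps it. The same training run could therefore end up in different places depending on the floating-point type.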

Comments or concerns? Feel free to post replies.
