Over the last several years, deep learning, a subset of machine learning in which artificial neural networks loosely imitate the inner workings of the human brain to process data, form patterns and inform decisions, has been responsible for significant advances in artificial intelligence. Modeled on the brain, deep learning is now capable of unsupervised learning from data that is unstructured or unlabeled. This data, often referred to as big data, can be drawn from sources such as social media, internet history and e-commerce platforms, among others.
These data sources are so vast that it could take humans decades to comprehend them and extract relevant information, but interpreting this data through deep learning allows models to detect objects, recognize speech, translate language and make decisions at remarkable speeds. Many companies recognize the potential locked up in this wealth of information and are increasingly adopting AI systems driven by deep learning to gain a competitive advantage through data and automation.
However, real-world deployments of deep learning remain very limited. While the technology to process the data exists, a recent project led by MIT researchers argues that the computational and storage demands it imposes are incredibly costly from an economic, environmental and technical perspective, and that those demands grow far faster than incremental hardware advancements can offset. Additionally, I've found that the storage space required restricts deep learning almost entirely to the cloud, which creates latency, bandwidth and connectivity challenges.
To overcome these barriers, we should shrink the computational and storage requirements of deep learning. That is easier said than done, but luckily we already have a working framework: our own brain. Just as we looked to the human brain for inspiration in developing AI, we can look to it as a model for increasing efficiency, specifically by mirroring the brain's early developmental phase in deep learning.
Mirroring The Intricacies Of The Human Brain In Early Childhood
To continue to drive AI advancement in the decades to come, we need to reimagine deep learning at its core. A promising approach is to mirror how the human brain develops, particularly in early childhood.
During infancy, the brain undergoes synaptogenesis, an explosion of synapse formation as the brain begins to develop. In early childhood we have the greatest number of synapses we will ever have, with totals increasing until about age two. Over time, our synapses begin to "train," strengthening, weakening and evolving as the connections in our brains begin to sparsify.
From this stage through our late teenage years, when learning is most prevalent, synapses are used and pruned at a rapid rate. The brain continuously removes unneeded synapses and cells, sparsifying itself even further. The connections themselves learn over time, and the entire structure of the brain is modified to remain lean.
This is why a child's brain has a great deal of plasticity, while an adult's brain is thought to have lost much of it. Because of this plasticity, a child's brain can continuously reform and learn, and may better recover from damage.
Replicating Neurological Attributes In Deep Learning
To improve and achieve real-world AI deployments, we should reinvent the training process of deep learning models to emulate the “training process” of the human brain.
For deep learning, the model training stage closely resembles the initial learning stage of humans. Early in training, the model takes in a mass of data, which creates a significant amount of information to mine for each decision and demands substantial processing time and power to determine an action or answer.
But as training occurs, neural connections become stronger with each learned action and adapt to support continuous learning. As each connection becomes stronger, redundancies are created and overlapping connections can be removed.
This is why deep learning models should be continuously restructured and sparsified during training, not after training is complete. After the training stage, the model has lost most of its plasticity: the remaining connections cannot adapt to take over additional responsibility, so removing connections can decrease accuracy.
Current methods, such as one unveiled in 2020 by MIT researchers, attempt to make the deep learning model smaller after the training phase and have reportedly seen some success. However, pruning in the earlier stages of training, when the model is most receptive to restructuring and adapting, can drastically improve results.
When you conduct sparsification during the training phase, the connections are still in the rapid learning stage and can be trained to take over the functions of removed connections. You can thus continuously monitor the pruning progress and mitigate any damage to output accuracy while the model is at its greatest plasticity.
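As an illustration, in-training magnitude pruning can be sketched with a toy linear model: at regular intervals, the smallest-magnitude connections are zeroed out, and the surviving weights keep training and absorb the function of the removed ones. This is a minimal, hypothetical sketch in plain NumPy, not the method from the MIT work or any production pruning API; the schedule (prune 25% of surviving weights every 50 steps) is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 20
true_w = np.zeros(n_features)
true_w[:4] = [3.0, -2.0, 1.5, 0.5]           # only a few "connections" matter

X = rng.normal(size=(200, n_features))
y = X @ true_w

w = rng.normal(scale=0.1, size=n_features)    # dense initialization
mask = np.ones(n_features)                    # 1 = connection kept, 0 = pruned

for step in range(300):
    grad = X.T @ (X @ w - y) / len(y)         # mean-squared-error gradient
    w -= 0.05 * grad * mask                   # pruned weights stay frozen at 0
    if step % 50 == 49:                       # periodic pruning during training
        alive = np.flatnonzero(mask)
        k = max(1, len(alive) // 4)           # drop 25% of surviving weights
        weakest = alive[np.argsort(np.abs(w[alive]))[:k]]
        mask[weakest] = 0.0
        w *= mask                             # survivors continue to adapt

sparsity = 1 - mask.mean()
loss = np.mean((X @ w - y) ** 2)
print(f"sparsity={sparsity:.0%}, final loss={loss:.4f}")
```

Because pruning happens while the model still has "plasticity," the remaining weights retrain after each pruning step, so accuracy on the task is largely preserved even as most connections are removed.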
The resulting model can therefore be lightweight, with significant speed improvements and memory reductions, which could allow efficient deployment on intelligent edge devices (e.g., mobile devices, security cameras, drones, agricultural machines, preventive-maintenance systems and the like). I believe this will allow these devices to make truly autonomous decisions.
Can AI Reach Its Prophesied Heights?
Undoubtedly, to meet and exceed the enormous expectations placed on the future of AI, advancements still need to occur in deep learning research and execution, refining and building on the results we have seen so far. But the blueprint is there to advance deep learning from the lab to real-world deployment.
Just as our brains evolve early in our lives, AI should evolve as we increasingly apply it in real-world scenarios at scale. By replicating the intricacies of our own cognition, we can improve AI’s ability to quickly and effectively make decisions and ensure that the technology meets its full potential.