A few months ago, Generative Pre-trained Transformer 3, or GPT-3, the biggest artificial intelligence (AI) model in history and the most powerful language model yet, was launched with much fanfare by OpenAI, a San Francisco-based AI lab. Over the last few years, one of the biggest trends in natural language processing (NLP) has been the growing size of language models, measured both by the size of their training data and by their number of parameters. BERT, released in 2018 and then considered the best-in-class NLP model, was trained on a dataset of about 3 billion words. XLNet, which outperformed BERT, was trained on roughly 32 billion words. Shortly thereafter, GPT-2 was trained on a dataset of 40 billion words. Dwarfing all of these, GPT-3 was trained on a weighted dataset of roughly 500 billion words. GPT-2 had only 1.5 billion parameters; GPT-3 has 175 billion.
A 2018 analysis led by Dario Amodei and Danny Hernandez of OpenAI found that the amount of compute used in the largest AI training runs had been doubling every 3.4 months since 2012, a wild deviation from the roughly 24-month doubling period of Moore's Law, and amounting to a more than 300,000-fold increase overall. GPT-3 is just the latest embodiment of this exponential trajectory. In today's deep-learning-centric paradigm, institutions around the world seem to be competing to produce ever larger AI models with bigger datasets and greater computing power.
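For a rough sense of what that rate implies, here is a back-of-envelope sketch only, assuming the roughly five-year window between the smallest and largest training runs in the OpenAI analysis; the 62-month figure is an assumption, not a number from the original report:

```python
import math

# Back-of-envelope check of the growth rates quoted above.
# Assumption: the OpenAI analysis spans roughly 62 months (late 2012 to early 2018).
months = 62
ai_doubling_months = 3.4      # doubling period reported by OpenAI's analysis
moore_doubling_months = 24.0  # doubling period commonly attributed to Moore's Law

ai_growth = 2 ** (months / ai_doubling_months)        # ~3 x 10^5
moore_growth = 2 ** (months / moore_doubling_months)  # ~6x

print(f"Growth at a 3.4-month doubling: ~{ai_growth:,.0f}x")
print(f"Growth under Moore's Law over the same span: ~{moore_growth:.0f}x")
print(f"Doublings needed for a 300,000-fold increase: {math.log2(300_000):.1f}")
```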
The influential paper On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?, by Emily M. Bender, Timnit Gebru and others, brought wide attention to the environmental cost of these ballooning training runs. In a 2019 study, Energy and Policy Considerations for Deep Learning in NLP, Emma Strubell, Ananya Ganesh and Andrew McCallum of the University of Massachusetts Amherst estimated that while the average American generates 36,156 pounds of carbon dioxide emissions in a year, training a single large deep-learning model, including the architecture search needed to develop it, can generate up to 626,155 pounds of emissions, roughly equal to the carbon footprint of 125 round-trip flights between New York and Beijing.
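A quick arithmetic sketch puts those figures side by side, using only the numbers reported above and the assumption (implied by the flight comparison) of roughly 5,000 pounds of CO2 per passenger for a New York-Beijing round trip:

```python
# Side-by-side view of the Strubell et al. figures quoted above.
model_training_lbs = 626_155          # one large NLP model, incl. architecture search
avg_american_lbs_per_year = 36_156    # average American's annual CO2 emissions

years_equivalent = model_training_lbs / avg_american_lbs_per_year   # ~17 years
implied_lbs_per_flight = model_training_lbs / 125                   # ~5,000 lbs

print(f"Training one such model ~= {years_equivalent:.1f} years of an average American's emissions")
print(f"Implied CO2 per NY-Beijing round trip: ~{implied_lbs_per_flight:,.0f} lbs per passenger")
```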
Neural networks carry out a lengthy series of mathematical operations for each piece of training data, so larger datasets translate into soaring computing and energy requirements. Another factor driving AI's massive energy draw is the extensive experimentation and tuning required to develop a model; machine learning today remains largely an exercise in trial and error. And deploying AI models in real-world settings, a process known as 'inference', consumes even more energy than training does: it is estimated that 80-90% of the compute cost of a neural network is incurred at inference rather than during training.
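To see why inference can come to dominate, consider a rough sketch using two widely used approximations for transformer models: training takes about 6 × N × D floating-point operations for N parameters and D training tokens, and generating each token at inference takes about 2 × N operations. The GPT-3-scale numbers below are illustrative assumptions, not measurements of any deployed system:

```python
# Illustrative only: rough FLOP heuristics for a GPT-3-scale model.
# Assumptions: ~6*N*D FLOPs to train a transformer with N parameters on D tokens,
# and ~2*N FLOPs per generated token at inference.
n_params = 175e9        # GPT-3 parameter count
train_tokens = 300e9    # order of magnitude of GPT-3's training corpus (in tokens)

train_flops = 6 * n_params * train_tokens     # ~3e23 FLOPs, a one-off cost
infer_flops_per_token = 2 * n_params          # ~3.5e11 FLOPs per generated token

# Number of generated tokens after which lifetime inference compute
# exceeds the one-off training compute.
breakeven_tokens = train_flops / infer_flops_per_token
print(f"Training compute:            ~{train_flops:.1e} FLOPs")
print(f"Inference compute per token: ~{infer_flops_per_token:.1e} FLOPs")
print(f"Inference overtakes training after ~{breakeven_tokens:.1e} generated tokens")
```

Once a popular model serves billions of queries, the one-off training cost is quickly dwarfed, which is consistent with the 80-90% estimate above.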
Payal Dhar, in her Nature Machine Intelligence article The Carbon Impact of Artificial Intelligence, captures the irony of this situation. On the one hand, AI can surely help mitigate the climate crisis: through smart grid designs, for example, or by helping develop low-emission infrastructure and model climate change. On the other hand, AI is itself a significant emitter of carbon. How, then, can 'green AI', AI that yields novel results without increasing computational cost (and ideally while reducing it), be developed?
Industry and academia no doubt have to promote research into more computationally efficient algorithms, as well as hardware that requires less energy. Software authors should also report the training time and computational resources used to develop a model, which would enable direct comparisons across models; a minimal sketch of what such reporting might look like follows.
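In this sketch the field names, GPU count and power figure are purely illustrative assumptions, not a proposed standard:

```python
import json
import time

def train_with_report(train_fn, report_path="compute_report.json",
                      num_gpus=8, avg_gpu_power_watts=300):
    """Run a training function and write a simple compute report alongside the model."""
    start = time.time()
    train_fn()                                  # the actual training run
    hours = (time.time() - start) / 3600.0

    report = {
        "training_hours": round(hours, 2),
        "num_gpus": num_gpus,
        "gpu_hours": round(hours * num_gpus, 2),
        # Crude energy estimate: GPU count times an assumed average power draw.
        "estimated_kwh": round(hours * num_gpus * avg_gpu_power_watts / 1000.0, 2),
    }
    with open(report_path, "w") as f:
        json.dump(report, f, indent=2)
    return report
```

Publishing even such a coarse report alongside a model would let readers weigh accuracy gains against the compute spent to obtain them.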
But we need far more fundamental pointers to guide the future of AI, and a strong contender for this role is the human brain.

Neuromorphic computing is an emerging field that studies how the brain actually processes information and uses that knowledge to make computers 'think' and handle inputs more the way human minds do. The brain, for example, carries out its many parallel activities on just 20 watts of power, while a supercomputer that is nowhere near as versatile consumes more than 5 megawatts, about 250,000 times as much. Many of the challenges AI is attempting to solve today have already been solved by our minds over 300-plus millennia of human evolution. The brain is an outstanding few-shot learner, able to generalize from very small amounts of data. By understanding how the brain works, AI can use that knowledge as inspiration or as validation; it need not reinvent the wheel.
Computational neuroscience, a field in which mathematical tools and theories are used to investigate brain function down to the level of individual neurons, has yielded a great deal of new knowledge about the human brain. According to V. Srinivasa Chakravarthy, author of Demystifying the Brain: A Computational Approach, “This new field has helped unearth the fundamental principles of brain function. It has given us the right metaphor, a precise and appropriate mathematical language which can describe brain’s operation.” That mathematical language makes computational neuroscience readily accessible to AI practitioners.
AI has a significant role to play in building the world of tomorrow, but it cannot afford to falter on its environmental credentials. ‘Go back to nature’ is the oft-repeated mantra for eco-friendly solutions. In a similar vein, to build AI systems that leave a far smaller carbon footprint, we must go back to one of the most profound creations of nature: the human brain.