Google LLC today introduced a new infrastructure option for its cloud platform that will enable enterprises to provision instances with Tensor Processing Units, the search giant’s internally developed artificial intelligence chips.
Cloud TPU VMs, as the new instances are called, are available in preview. Early adopters are using them for tasks ranging from AI-powered healthcare analytics to quantum chemistry.
Customers of Google’s cloud platform could already provision instances with TPUs. Those instances, however, didn’t run in the same physical server enclosure as the TPUs; they connected to the chips remotely over a network link. That slowed processing because applications had to send data to a TPU across the network and then wait for the results to come back.
Cloud TPU VMs remove that delay. The instances attach directly to Google’s AI chips, which avoids the network hop and the latency it adds.
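As a rough illustration, and assuming the JAX framework with TPU support is installed on a Cloud TPU VM, the attached chips show up as local devices that computations can be dispatched to directly, with no separate network endpoint to address first:

```python
# A minimal sketch, assuming JAX with TPU support on a Cloud TPU VM.
import jax
import jax.numpy as jnp

# On a Cloud TPU VM the accelerators appear as local devices.
devices = jax.devices()
print(devices)  # e.g. a list of TpuDevice objects on a TPU-equipped VM

# A small computation is dispatched straight to an attached TPU core.
x = jnp.ones((128, 128))
y = jnp.dot(x, x)
print(y.shape, y.dtype)
```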
The search giant also believes that customers could see their cloud infrastructure costs decrease in some cases thanks to the new instances. That’s because, in large AI projects involving a lot of data, the task of sending the data from a cloud instance to a TPU can in itself require considerable computing resources. As a result, a company may have to buy more expensive instances with faster processors. Cloud TPU VMs remove the need to send data over a network link and thereby spare customers the additional expense.
Companies can provision the instances with either Cloud TPU v2 units, the second iteration of the chip, or the newer Cloud TPU v3. The key difference is performance. A single Cloud TPU v2 can provide up to 180 teraflops of performance, which amounts to 180 trillion computing operations per second, while the Cloud TPU v3 can manage up to 420 teraflops.
One of the use cases Google sees for the instances is developing algorithms to run on its Cloud TPU Pods. A Cloud TPU Pod is a large cluster of TPU-powered AI servers that enterprises can rent to run particularly complex machine learning models. The fastest cluster on offer exceeds 100 petaflops, or more than 100 quadrillion computing operations per second.
Developers can build their algorithms on a Cloud TPU VM for a small fraction of the cost of renting a Pod and then migrate the software to the more powerful hardware when it’s ready for production. Because Cloud TPU VMs and Cloud TPU Pods use the same chips, the task of moving workloads is considerably easier than when software has to be migrated across different processors.
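A minimal sketch of that path, again assuming JAX on a TPU VM, is a program written against however many TPU cores are visible locally; the same replicated code can later be launched on every host of a larger Pod slice:

```python
# A rough sketch of scaling from one device to many, assuming JAX on a TPU VM.
import jax
import jax.numpy as jnp

n = jax.local_device_count()  # 1 on a CPU dev box, typically 8 on a TPU VM

# Toy "step": square each element; pmap replicates it across all local cores.
step = jax.pmap(lambda x: x * x)

# One data shard per device; the leading axis must equal the device count.
batch = jnp.arange(n * 4, dtype=jnp.float32).reshape(n, 4)
out = step(batch)
print(out.shape)  # (n, 4): one result shard per TPU core
```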
For good measure, the new instances come with root access permissions. That gives developers fuller control over the software running inside the instances, which makes certain development tasks easier.
“Google Cloud TPU VMs enabled us to radically scale up our research with minimal implementation complexity,” said James Townsend, a researcher at the UCL Queen Square Institute of Neurology in London. “There is a low-friction pathway, from implementing a model and debugging on a single TPU device, up to multi-device and multi-host (pod scale) training.”
Google is also using the new instances internally to support its efforts to develop quantum computers. “Our team has built one of the most powerful classical simulators of quantum circuits,” said Shrestha Basu Mallick, a product manager at Google parent Alphabet Inc.’s Sandbox@Alphabet research team. “The simulator is capable of evolving a wavefunction of 40 qubits, which entails manipulating one trillion complex amplitudes. Also, TPU scalability has been key to enabling our team to perform quantum chemistry computations of huge molecules, with up to 500,000 orbitals.”
Each TPU in Google’s cloud comprises multiple matrix units, processing cores optimized for the specific types of mathematical operations that AI models use to process data. The third-generation Cloud TPU v3 chips are supported by a water cooling system that absorbs the heat produced by the cores as they run.
AI models represent the data they process as numbers known as floating-point values. Google’s TPUs store those numbers in a data format known as bfloat16, which the search giant developed to help speed up AI workloads. With bfloat16, a chip can store a number that would normally take up 32 bits of space in just 16 bits, which speeds up calculations because fewer bits have to be moved and processed.
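As an illustrative sketch (the on-chip behavior of the matrix units is not modeled here, only the halved element size), the storage difference is easy to see in JAX:

```python
# A sketch of the float32 vs. bfloat16 storage difference, assuming JAX.
import jax.numpy as jnp

a32 = jnp.ones((1024, 1024), dtype=jnp.float32)
a16 = a32.astype(jnp.bfloat16)

print(a32.dtype, a32.nbytes)  # float32: 4 bytes per element -> 4,194,304 bytes
print(a16.dtype, a16.nbytes)  # bfloat16: 2 bytes per element -> 2,097,152 bytes

# Matrix multiplies like this are the kind of operation the TPU's matrix
# units accelerate; in bfloat16 they move half as many bits.
c = jnp.dot(a16, a16)
print(c.shape, c.dtype)
```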