China Unicom, the leading Chinese telco, is going big on Alluxio, Apache Spark, and other top open source projects.
Driven in part by the advent of new 5G technologies, many of the world’s largest telecommunications providers, including AT&T, BT, CenturyLink, and Telefonica, have gone public with their plans to migrate to a microservices architecture running in a cloud to handle the massive jump in data they anticipate. Even at this pace, they still trail hyperscale companies such as Microsoft Azure, Google Cloud Platform, Amazon Web Services, and Facebook, which pioneered the adoption of container software in their data centers.
While these other telcos are lagging, China Unicom is not—in fact, the leading Chinese telco is already there. Even more interestingly, open source software is helping China Unicom lead the way to expand services and improve performance for its more than 320 million subscribers.
Thinking differently about software
The world’s fourth-largest telco, China Unicom is deep into a broad rebuild of its internal software stack, built largely around open source. Its booming business in 4G and emerging 5G networks was already putting a heavy load on its legacy network data processing system, which was based on IBM midrange computers, Oracle databases, and EMC storage systems.
According to Zhang Ce, a big data engineer at China Unicom, the company’s telco cloud aimed to bring the cloud computing model to telecommunications infrastructure by building software that can run on commercial off-the-shelf (COTS) hardware. A microservices architecture shortens the time needed to develop and deploy applications by composing them from independent, autonomous, modular pieces of code. Rather than being built as a single monolith, an application is built as a set of distributed software components that work together across the cloud.
“The architecture of our incumbent computing platform was too complicated and didn’t let us effectively use resources,” Ce said. “Open source was our path forward.”
Building on an open source power suite
Ce said the key to Unicom’s project success was the emergence of new big data frameworks like Apache Kafka, Spark, Hive, and Alluxio, which allowed the company to re-imagine its software stack to support batch and stream processing business requirements by using very similar open source-based architectures orchestrated by Kubernetes.
Both the batch and stream processing workloads run in Spark clusters, and both store data in Alluxio for fast retrieval and access, backed by HDFS, an object store, or even disk. Alluxio acts as a virtual pool for ingesting and accessing data from disparate storage devices across the network. Its unified namespace feature allows HDFS paths from different clusters to be mounted in one place, so all users get the performance benefits of data locality. Moving data, even for large jobs, is almost eight times faster under the new architecture, Ce said.
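To make the unified namespace idea concrete, here is a minimal sketch in Python of a mount table that maps one logical path tree onto several backing stores. It is a toy model for illustration only: it is not Alluxio's API, and the class name, method names, cluster hostnames, and paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedNamespace:
    """Toy mount table (hypothetical, not Alluxio's API).

    Maps paths in one logical namespace to URIs in backing stores.
    """
    mounts: dict = field(default_factory=dict)

    def mount(self, logical_prefix: str, backing_uri: str) -> None:
        # Normalize so "/billing/" and "/billing" name the same mount point.
        key = logical_prefix.rstrip("/") or "/"
        self.mounts[key] = backing_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        # Collect every mount point that covers the path, then use the longest.
        candidates = [
            p for p in self.mounts
            if p == "/" or logical_path == p or logical_path.startswith(p + "/")
        ]
        if not candidates:
            raise KeyError(f"no mount covers {logical_path}")
        prefix = max(candidates, key=len)
        remainder = logical_path if prefix == "/" else logical_path[len(prefix):]
        return self.mounts[prefix] + remainder


# Hypothetical clusters and paths, chosen only for the example.
ns = UnifiedNamespace()
ns.mount("/billing", "hdfs://cluster-a:8020/warehouse/billing")
ns.mount("/network", "hdfs://cluster-b:8020/logs/network")

billing_uri = ns.resolve("/billing/2019/part-0.parquet")
network_uri = ns.resolve("/network")
```

A job written against the logical path `/billing/...` does not need to know which cluster holds the data. That is the property the unified namespace is described as providing, although a real deployment also handles caching, permissions, and data locality, which this sketch leaves out.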
The new open source-based architecture his team built now supports seven different lines of business services at China Unicom, Ce said. He estimates that about 3 terabytes of data pass through Alluxio memory each day to support the batch and streaming workloads, which amount to more than 200,000 Spark jobs daily.
“We’re absolutely planning to expand the use cases and deployments on this new platform,” Ce said. “Open source is the foundation.”