Meta, formerly known as Facebook, is making significant strides in the field of artificial intelligence (AI), unveiling its plans to develop proprietary hardware infrastructure for AI workloads. In an effort to keep up with competitors like Google and Microsoft, Meta aims to gain control over every layer of the AI stack, from datacenter design to training frameworks. This move reflects Meta’s commitment to pushing the boundaries of AI research at scale.
Despite investing billions of dollars in recruiting top data scientists and developing AI capabilities, Meta has struggled to transform its ambitious AI research into tangible products, particularly in generative AI. In the past, Meta relied on CPUs, which are less efficient for AI tasks compared to GPUs, and a custom chip designed for accelerating AI algorithms. However, Meta shifted its focus and decided to develop an in-house chip capable of both training and running AI models, known as the Meta Training and Inference Accelerator (MTIA). The MTIA, part of a family of chips dedicated to accelerating AI workloads, is an ASIC chip that can perform multiple tasks in parallel.
The trend of creating custom AI chips is prevalent among major tech companies. Google introduced the tensor processing unit (TPU) for training large generative AI systems like PaLM-2 and Imagen, while Amazon offers proprietary chips for both training (Trainium) and inferencing (Inferentia) in its AWS services. Microsoft is reportedly collaborating with AMD to develop its in-house AI chip, Athena. Meta’s MTIA, built on a 7-nanometer process, has demonstrated superior efficiency in handling low-complexity and medium-complexity AI models compared to GPUs. However, Meta acknowledges the need for further improvements in memory and networking to accommodate the growing size of AI models.
In addition to custom chips, Meta has invested in a research-focused supercomputer called the Research SuperCluster (RSC). Comprising 2,000 Nvidia DGX A100 systems with a total of 16,000 Nvidia A100 GPUs, the RSC enables Meta’s researchers to train AI models using real-world examples from the company’s production systems. While Meta’s supercomputer may not match the scale of its counterparts at Microsoft and Google, it empowers researchers to develop state-of-the-art AI models capable of analyzing text, images, and video across multiple languages.
Meta’s dedication to AI innovation extends beyond chip development. The company is also creating the Meta Scalable Video Processor (MSVP), an ASIC chip specifically designed for video processing in on-demand and live streaming applications. By offloading stable and mature video processing workloads to the MSVP, Meta aims to improve efficiency and deliver enhanced generative AI, augmented reality (AR), virtual reality (VR), and metaverse content.
Meta’s strategic focus on AI, particularly generative AI, aligns with its mission to unlock the vast potential of this technology. As the market for generative AI software is projected to reach $150 billion, Meta is under increasing pressure to capture a significant share of this market. The company’s recent hardware advancements, coupled with ongoing research and development efforts, demonstrate its determination to accelerate innovation in the AI space and address the demands of a rapidly evolving digital landscape.