When Hongzhi Gao was young, he lived with his family in Gansu, a province in north-central China near the Tengger Desert. Thinking back to his childhood, he recalls the constant swirl of wind-blown dust outside their home. For most months of the year, it took less than a minute after he stepped outside for sand to fill every empty space, creeping into his pockets, his boots and his mouth. The monotony of the desert stayed with him for years, and in college he turned that memory into an idea: a machine that could bring plant life to the desert landscape.
Efforts to stop desertification (the process by which fertile land becomes desert) have relied primarily on costly manual solutions. Hongzhi designed a robot that uses deep learning to automate the process of planting trees, from identifying optimal locations to planting seedlings to watering them. Although he had no prior experience with AI, as an undergraduate student Hongzhi used Baidu’s deep learning platform PaddlePaddle to combine different modules into a robot that detects objects better than similar machines already on the market. It took Hongzhi and his friends less than a year to finish the product and put it into operation.
Hongzhi’s desert robot is an example of the growing accessibility of artificial intelligence.
Today, more than four million developers use Baidu’s open source artificial intelligence technology to build solutions that can improve the lives of people in their communities, and many of them have little or no technical expertise in the field. “In the next decade, artificial intelligence will be the source of changes happening throughout the fabric of our society, transforming the way industries and companies function. Technology will expand the human experience by taking us deeper into the digital world,” said Baidu CEO Robin Li at Baidu Create 2021, an artificial intelligence developer conference.
As we enter a new chapter in the evolution of AI, Haifeng Wang, CTO of Baidu, has identified two key trends that underpin the industry’s progress. First, AI will continue to mature and grow in technical complexity. Second, the cost of deployment and the barriers to entry will fall, which will benefit both companies building large-scale AI solutions and software developers exploring the world of AI.
Combining knowledge and data with deep learning
The integration of knowledge and data with deep learning has significantly improved the efficiency and accuracy of AI models. Since 2011, Baidu’s AI infrastructure has been collecting new information and integrating it into a large knowledge graph. This knowledge graph currently holds more than 550 billion facts covering all aspects of everyday life, as well as industry-specific topics including manufacturing, pharmaceuticals, law, financial services, technology, and media and entertainment.
Together, this knowledge graph and massive volumes of data form the building blocks of Baidu’s recently released pre-trained language model, PCL-BAIDU Wenxin (ERNIE version 3.0 Titan). The model outperforms other language models that lack a knowledge graph on 60 natural language processing (NLP) tasks, including reading comprehension, text classification, and semantic similarity.
Cross-modal learning
Cross-modal learning is a new area of AI research that seeks to improve machines’ cognitive understanding and to better mimic adaptive human behavior. Examples of research in this area include automatic text-to-image synthesis, where a model is trained to generate images from textual descriptions alone, as well as algorithms designed to understand visual content and express that understanding in words. The challenge in these tasks is for machines to build semantic connections between different types of data (e.g. images and text) and to understand the interdependencies between them.
The next step for AI is to combine AI technologies such as computer vision, speech recognition and natural language processing to create a multimodal system.
In this regard, Baidu has presented a variant of its NLP models that connects language and visual semantic understanding. Examples of real-world applications for this type of model include digital avatars that can perceive their environment as humans do and handle customer support for businesses, and algorithms that can “draw” artwork and compose songs based on their understanding of the generated artwork.
The technology has even more creative and striking potential outcomes. Because the PaddlePaddle platform can build semantic connections between vision and language, a group of master’s students in China used it to create a dictionary for preserving endangered languages in regions such as Yunnan and Guangxi, making it easier to translate them into Simplified Chinese.
Integration of AI across software and hardware, and in industry-specific use cases
As AI systems are applied to solve increasingly complex industry-specific problems, more emphasis is placed on optimizing software (deep learning framework) and hardware (AI chip) as a whole, rather than optimizing each one individually, taking into account factors such as computing power, energy consumption and latency.
Furthermore, a great deal of innovation is taking place at the platform layer of Baidu’s AI infrastructure, where third-party developers use deep learning capabilities to build new applications tailored to specific use cases. The PaddlePaddle platform offers a number of APIs to support AI applications in emerging fields such as quantum computing, life sciences, computational fluid mechanics, and molecular dynamics.
AI also has practical uses. For example, in Shouguang, a small town in Shandong Province, AI is used to streamline the fruit and vegetable industry. Only two people and one application are needed to manage dozens of vegetable sheds.
And that’s significant, says Wang: “Despite the increasing complexity of AI technology, the open source deep learning platform connects processors and applications like an operating system, reducing barriers to entry for companies and individuals who want to incorporate artificial intelligence into their business.”
Reduced barriers to entry for developers and end users
In terms of technology, the pre-training of large models such as PCL-BAIDU Wenxin (ERNIE version 3.0 Titan) has resolved many of the common bottlenecks faced by traditional models. For example, these general-purpose models have laid the groundwork for performing different types of downstream NLP tasks, such as text classification and question answering, in one consolidated place, whereas in the past each type of task had to be solved by a separate, specialized model.
PaddlePaddle also offers a number of tools tailored to developers, such as model compression technologies for adapting general-purpose models to more specific uses. The platform provides an officially supported industrial model library with more than 400 models, ranging from large to small, that are only a fraction of the size of general-purpose models yet can achieve comparable performance, reducing the cost of model development and deployment.
Today, Baidu’s open source deep learning technology supports a community of more than four million AI developers who have jointly created 476,000 models, contributing to the AI-driven transformation of 157,000 companies and institutions. The examples above are the result of innovation across all layers of Baidu’s AI infrastructure, which integrates technologies such as voice recognition, computer vision, AR/VR, knowledge graphs and pre-training of large models, bringing AI one step closer to perceiving the world as humans do.
In its current state, AI has reached a level of maturity that allows it to perform amazing tasks. For example, the recent launch of the XiRang metaverse would not have been possible without the PaddlePaddle platform, which was used to create digital avatars that let participants around the world connect from their devices. Furthermore, future discoveries in areas such as quantum computing could significantly improve the performance of the metaverse. This shows how intertwined and interdependent Baidu’s different offerings are.
In a few years, AI will be close to the core of our human experience. It will be to our society what steam power, electricity and the Internet were to previous generations. As AI becomes more complex, developers like Hongzhi will work more like artists and designers, given the creative freedom to explore use cases that were previously thought to be possible only in theory. The sky is the limit.
This content was produced by Baidu. It was not written by the editorial board of the MIT Technology Review.