👷🏻Transforming Rural China with AI, ERNIE Bot Hits 100M Users, AI Taylor Swift's Mandopop, and Huawei Proposes Transformers' Challenger
Weekly China AI News from December 18 to December 31
Happy New Year, dear readers! May all the joys of the holiday season fill your heart and follow you throughout the coming year.
In this issue, I will highlight a few interesting AI applications in rural China. Baidu’s ERNIE Bot has reached a milestone of 100 million users, its CTO announced. Also, be sure to check out the research section: an open-source toolkit was used to create a deepfake of Taylor Swift singing Mandopop, Huawei introduced a new model architecture that may challenge Transformers, and Tencent proposed an AI agent that can use smartphone apps just like humans.
AI's Opportunities in Rural Villages, Forests, and Deserts
What’s New: Much of my newsletter focuses on breaking news, advancements in LLMs, and the ever-evolving state regulations regarding AI. However, it’s crucial to recognize that a wealth of AI innovation is happening every day in the rural and developing regions of China, addressing small yet essential challenges.
I recently came across a series of stories titled “Smart China (打卡智能中国)” that shed light on these innovations, and wanted to share some excerpts from this series with you.
‘Sky Farmers’: Drone Pilots in Chinese Villages
In provinces like Zhejiang, Fujian, and Qinghai, a growing number of young people in villages have become “sky farmers,” that is, operators of agricultural drones. Traditional crop spraying was labor-intensive and hazardous. UAVs offer precision spraying, reduce pesticide use, and improve safety for workers. One drone can cover 300-500 acres in a day, thirty times more than manual spraying.
Previously, drone-based plant protection was only outsourced to professional teams. This model faced trust issues among local farmers who preferred “one of their own” to manage their crops. The transition to individual farmers using drones was challenging due to high costs and technical barriers. However, advances in AI and telecommunications, alongside affordable training programs, have lowered these barriers and allowed anyone to operate these drones.
The story also reflected a cultural shift in rural China. Young people returning to their villages are finding new ways to contribute to their hometowns, and redefining agriculture for the next generation. In their hands, farming is no longer just a traditional livelihood but a frontier of modern technology and innovation. (You can find the original Chinese story here).
A Water Plant Clerk’s AI Role
Xia is a young man living in a small city in Fujian, hired as a clerk for government work at a local water treatment plant. Thanks to AI, however, he now juggles multiple roles: office clerk, digital technician, and communications specialist.
The water plant he works for, just an ordinary facility, has embraced AI to transform its operations. With AI, tasks like adding filtration chemicals, which once relied on manual labor and experience, are now automated. AI has also shifted on-site monitoring to digital oversight, allowing workers to address issues remotely via a mobile app.
The transition to AI isn’t without its challenges. Implementing AI in complex systems like water supply networks demands significant investment in data infrastructure, collaborative efforts with cloud service providers, and a focus on key application scenarios. (You can find the original Chinese story here).
Smart Grids Among Rugged Trails
The forest-covered Diqing Tibetan Autonomous Prefecture in Yunnan is a great tourist attraction, yet a nightmare for power line inspectors. The rough terrain and lack of 4G coverage used to make manual inspection the only option, a task that comes with dangers like wildlife attacks and tough environmental conditions.
In early 2021, a fire on the Jinge high-voltage power line led to a power outage for several days. Inspecting the 39-kilometer Jinge Line can take workers a week.
AI techniques such as image recognition and natural language processing are being used in power grid management, especially in line inspection. In Diqing, after that fire, the power lines were upgraded with telecommunications coverage. Smart cameras were then mounted along the line, edge computing and storage were deployed for immediate data processing, and AI algorithms were used for analysis.
The real-time video captured by cameras along the power line is transmitted back and analyzed by AI models running on the cameras and at the main station. This remote system has cut the inspection time from 7 days to just 2 hours. In addition, AI’s real-time alert system reduces unexpected power outages caused by external damage. (You can find the original Chinese story here).
AI Turns on the Lights of Tunnels
Deep in the mountains of Guizhou Province, Li, a retired man, is employed to guard tunnels on newly built expressways. These dozens of tunnels, part of a transport network that has significantly improved connectivity in the region, face a unique challenge: keeping lighting and ventilation systems operational is expensive.
To tackle this, local elders like Li were hired to manually switch the tunnel lights on and off. This task is time-consuming and risky. It takes Li 15 minutes to navigate the whole tunnel with flashlights and adjust the lighting.
OpenHarmony, a local open-source operating system, is here to help. With its distributed capabilities, OpenHarmony can integrate tunnel systems like sensors, emergency response, and energy management. What once took Mr. Li 15 minutes of manual labor can now be remotely managed in 30 seconds, also enabling smarter lighting solutions that go beyond manual switching. (You can find the original Chinese story here).
Tree-Planting Robots in Deserts
Growing up in Gansu, a province in northern China that borders the Tengger Desert, Gao was familiar with relentless sandstorms. That childhood experience of ever-present, invasive sand later inspired him, during his university years, to create a robot capable of transforming the desolate desert landscape.
Gao set out to create a robot that uses deep learning to streamline the tree-planting process. This robot would not only identify the best locations for planting but also handle the planting and watering of tree seedlings. Despite his lack of prior experience in AI, Gao leveraged a local open-source deep learning platform. He combined various modules to enhance the robot's object detection capabilities, surpassing existing models in efficiency.
In under a year, with the help of his friends, Gao’s concept was built into a fully functional robot, ready to combat desertification. (You can find the original Chinese story here).
Baidu Says ERNIE Bot Hits 100 Million Users
What’s New: Baidu’s chatbot ERNIE Bot has reached more than 100 million users since its public release on August 31, said Baidu CTO Wang Haifeng at the company’s deep learning conference on December 28.
For comparison, OpenAI’s ChatGPT reportedly reached 100 million monthly active users in two months, according to a UBS study.
Baidu executives also revealed that ERNIE Bot had generated 3.7 billion words of text for workplace-related queries, equivalent to 10,000 copies of The Three-Body Problem, a famous sci-fi novel, and had written 300 million lines of code.
ERNIE for Hearing Impaired: One developer story using ERNIE that caught my attention is an application named Talkbot (声桥AI), which helps people with hearing loss improve their speech.
Those with impaired hearing often struggle with speech because they lack the auditory feedback needed to gauge their pronunciation. Talkbot addresses this issue by offering a dialogue interface where users can record their speech. The application then analyzes these recordings and provides text feedback on pronunciation and speech quality. This presents a cost-effective solution for the 27 million people with hearing loss, especially given that only 10,000 rehabilitation therapists are available.
Weekly News Roundup
🔄 Google’s Gemini Pro model reportedly hallucinates in Chinese dialogue, mistakenly identifying itself as Baidu’s ERNIE Bot. The issue seems to have been fixed.
🏅 Alibaba Cloud’s Tongyi Qianwen, Baidu’s ERNIE Bot, Tencent’s Hunyuan, and 360’s Smart Brain have become the first models to pass the official “Large Model Standard Compliance Evaluation”, launched by the China Electronics Standardization Institute.
📅 WeChat announces the 2024 WeChat Open Class PRO in Guangzhou on January 11, which will feature WeChat AI and mini-games.
🌐 Huawei Cloud launches a public beta of CodeArts Snap, an AI-powered coding assistant based on its Pangu LLM. CodeArts Snap can assist developers with smart code generation and Q&A across various software development scenarios.
🔍 Youku introduces an AI-powered search feature in its app, allowing users to engage in dialogues for search, movie queries, and plot indexing.
📱 OPPO says its 7-billion parameter AndesGPT model will be used in its Find X7 smartphones, supporting voice-to-text, natural language understanding, and content summarization.
🏥 Shanghai AI Lab releases OpenMEDLab2.0, a group of multimodal medical AI models aiming to bolster AI medical applications across various domains, diseases, and modalities.
🖥️ Beijing’s AI Public Computing Platform (Shangzhuang) has been launched, offering 500 petaflops of FP16 computing power for AI applications.
🔍 Douyin is testing an AI search feature within its app, offering AI-generated responses to user queries.
🧠 Xiaohongshu is testing an AI feature named “Davinci”, providing intelligent Q&A and AI chat functionalities for its users.
🌐 Baidu AI Cloud announces its AI-native application development platform, Qianfan AppBuilder.
🔍 Baichuan AI releases its Baichuan2-Turbo series API, including Baichuan2-Turbo-192K, featuring enhanced search and a 192K context window.
Trending Research
Have you heard Taylor Swift singing Jay Chou’s Dao Xiang? Researchers from CUHK-Shenzhen, Shanghai AI Lab, and Shenzhen Research Institute of Big Data open-sourced Amphion (/æmˈfaɪən/), a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in audio, music, and speech generation research and development. Amphion offers a unique feature: visualizations of classic models or architectures. Check out the project here.
Researchers from Huawei and Peking University introduced a new architecture for LLMs. This architecture, named PanGu-π, addresses the feature collapse problem in LLMs by enhancing model nonlinearity. It employs two key techniques: series-based activation functions in the feed-forward networks (FFN) and augmented shortcut connections in the multi-head self-attention (MSA) modules. The paper demonstrates that PanGu-π models, specifically PanGu-π-7B and PanGu-π-1B, achieve comparable or superior performance to existing state-of-the-art models in terms of accuracy and efficiency. Read the paper PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation.
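For readers who want a concrete picture of what a “series-based activation function” might look like, here is a minimal sketch in plain Python. It assumes one common formulation, a weighted sum of shifted copies of a single base activation, which may differ in detail from the paper's exact definition. The function name `series_activation`, the parameter names `scales` and `shifts`, the choice of GELU as the base, and all numeric values are my own illustrative assumptions, not code from PanGu-π.

```python
import math
from typing import Callable, Sequence


def gelu(x: float) -> float:
    """Exact GELU, used here only as an example base activation."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def series_activation(
    x: float,
    scales: Sequence[float],
    shifts: Sequence[float],
    base: Callable[[float], float] = gelu,
) -> float:
    """Return sum_i scales[i] * base(x + shifts[i]).

    `scales` and `shifts` are hypothetical names; in a trained model
    they would be learned parameters rather than hand-picked values.
    """
    if len(scales) != len(shifts):
        raise ValueError("scales and shifts must have the same length")
    return sum(a * base(x + b) for a, b in zip(scales, shifts))


# With a single term (scale 1, shift 0) the series is just the base activation.
single = series_activation(0.5, scales=[1.0], shifts=[0.0])

# Three terms combine shifted copies of the same function (made-up values).
triple = series_activation(0.5, scales=[0.5, 1.0, 0.5], shifts=[-1.0, 0.0, 1.0])
```

The idea in this sketch is that stacking several shifted copies adds nonlinearity to a layer without adding depth, which is the intuition behind “enhancing model nonlinearity” described above.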
Researchers from Tencent introduced a novel LLM-based multimodal agent framework designed to operate smartphone applications. The framework is built on a simplified action space, mimicking human-like interactions such as tapping and swiping. The agent learns to navigate and use new apps either through autonomous exploration or by observing human demonstrations. The results show the agent’s proficiency in handling diverse high-level tasks. Read the paper AppAgent: Multimodal Agents as Smartphone Users.
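To make the idea of a “simplified action space” more tangible, here is a small Python sketch of how tap and swipe actions could be represented as data. This is an illustration only: the class names `Tap` and `Swipe`, the `element_id` field, and the `describe` helper are hypothetical and are not taken from the AppAgent code.

```python
from dataclasses import dataclass
from typing import List, Union

VALID_DIRECTIONS = ("up", "down", "left", "right")


@dataclass(frozen=True)
class Tap:
    element_id: int  # hypothetical index of an on-screen UI element


@dataclass(frozen=True)
class Swipe:
    element_id: int  # hypothetical index of the element to swipe on
    direction: str   # one of VALID_DIRECTIONS

    def __post_init__(self) -> None:
        # Reject directions outside the small, fixed action space.
        if self.direction not in VALID_DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction}")


Action = Union[Tap, Swipe]


def describe(action: Action) -> str:
    """Render an action as a short human-readable instruction."""
    if isinstance(action, Tap):
        return f"tap element {action.element_id}"
    if isinstance(action, Swipe):
        return f"swipe {action.direction} on element {action.element_id}"
    raise TypeError(f"unsupported action: {action!r}")


# A made-up three-step plan an agent might produce for some task.
plan: List[Action] = [Tap(3), Swipe(7, "up"), Tap(1)]
steps = [describe(a) for a in plan]
```

Keeping the set of actions this small means the agent only has to choose among a handful of human-like gestures, rather than calling app-specific APIs.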
If you’re enjoying Recode China AI, chances are your friends and colleagues will, too! Feel free to share the link to this issue with others.