China’s Top Leaders Stress AGI Importance; Alibaba Controls Robot Using LLM; ByteDance Denies Rumors of Poaching OpenAI Staff
Weekly China AI News from April 24 to April 30
Dear readers, this week we cover AGI being discussed for the first time in a regular meeting of China’s top leaders. Alibaba is experimenting with controlling robots using LLMs. ByteDance denies offering $1.4 million to poach an OpenAI engineer. Additionally, Baidu’s RT-DETR outshines YOLO as a leading object detector.
China Politburo Emphasizes General Artificial Intelligence
What’s new: China’s top leaders drew attention to “General Artificial Intelligence” (AGI) for the first time at a regular meeting. On April 28, the Politburo, the Communist Party’s top decision-making body, said, “(We) must pay attention to the development of AGI, foster an innovative ecosystem, and attach importance to risk prevention.”
Why it matters: This meeting serves as a key indicator of the political and economic direction for the year and offers insight into the policy guidance the industry can expect. It’s clear that AGI has become a high priority for the country’s leadership.
The emphasis on AGI sets this meeting apart from previous important gatherings, such as the Central Economic Work Conference held in 2022, which referred only broadly to “AI.” The change was likely prompted by the rise of ChatGPT since November 2022.
Risk prevention: The emphasis on addressing potential risks also shows the country’s awareness of the ethical and practical implications of AI technology. The Cyberspace Administration of China (CAC), the nation’s top Internet regulator, unveiled a draft regulation on generative AI two weeks ago.
Alibaba Explores Using Large Language Models to Control Industrial Robots
What’s new: Alibaba Cloud engineers are experimenting with integrating large language models (LLMs) into industrial robots, said Daniel Zhang, CEO of Alibaba Cloud and Chairman of Alibaba Group, at a recent event. By entering prompts in DingTalk, Alibaba’s enterprise collaboration platform, a user can remotely command a connected robot to perform tasks.
In a demo video, an Alibaba engineer sent the prompt “I’m thirsty, find something to drink” to the Qianwen LLM (probably Alibaba’s Tongyi Qianwen). The LLM responded, “Alright, let me find something to drink,” and then sent code to a robot. The robot recognized its environment, located a bottle of water on a nearby table, grasped the bottle, and delivered the water to the engineer. The demo looks similar to Google’s PaLM-E, a multimodal model that controls robots with natural language.
Why it matters: Developing and deploying industrial robots has long been challenging, because each task-specific requirement calls for engineers to write and debug code by hand.
Alibaba Cloud said LLMs open a new window for industrial robots.
Engineers can use these models to generate code instructions to develop robots, creating and debugging functions and even designing new capabilities.
LLMs can also help perform complex tasks by combining basic abilities like grabbing and moving.
Imagine frontline workers sending text commands, which an LLM then translates into machine-understandable code.
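To make the idea concrete, here is a minimal Python sketch of how an LLM-generated plan might chain basic abilities such as locating, moving, and grasping. Everything in it is an assumption for illustration: the skill names (`locate`, `move_to`, `grasp`, `release`), the toy `WORLD` map, and `stub_llm_plan`, which stands in for a real model call and simply returns a fixed plan. It is not Alibaba’s implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Robot:
    """Toy robot state; every field here is hypothetical."""
    position: str = "dock"
    holding: str = ""
    log: List[str] = field(default_factory=list)


# Toy world model: where each object or person is.
WORLD: Dict[str, str] = {"water bottle": "table", "engineer": "desk"}


def locate(robot: Robot, target: str) -> None:
    robot.log.append(f"located {target} at {WORLD[target]}")


def move_to(robot: Robot, target: str) -> None:
    robot.position = WORLD[target]
    robot.log.append(f"moved to {robot.position}")


def grasp(robot: Robot, target: str) -> None:
    # A grasp only succeeds if the robot is already next to the object.
    if robot.position != WORLD[target]:
        raise RuntimeError(f"cannot grasp {target} from {robot.position}")
    robot.holding = target
    robot.log.append(f"grasped {target}")


def release(robot: Robot, target: str) -> None:
    if robot.holding != target:
        raise RuntimeError(f"not holding {target}")
    robot.holding = ""
    robot.log.append(f"released {target}")


# Registry of basic abilities the plan is allowed to call.
SKILLS: Dict[str, Callable[[Robot, str], None]] = {
    "locate": locate,
    "move_to": move_to,
    "grasp": grasp,
    "release": release,
}


def stub_llm_plan(prompt: str) -> List[Tuple[str, str]]:
    """Stand-in for an LLM call: returns a fixed plan for the demo prompt."""
    if "thirsty" in prompt.lower():
        return [
            ("locate", "water bottle"),
            ("move_to", "water bottle"),
            ("grasp", "water bottle"),
            ("move_to", "engineer"),
            ("release", "water bottle"),
        ]
    return []


def execute(plan: List[Tuple[str, str]], robot: Robot) -> Robot:
    """Run each (skill, argument) step, rejecting skills outside the registry."""
    for skill, arg in plan:
        if skill not in SKILLS:
            raise ValueError(f"unknown skill: {skill}")
        SKILLS[skill](robot, arg)
    return robot


robot = execute(stub_llm_plan("I'm thirsty, find something to drink"), Robot())
print("\n".join(robot.log))
```

Restricting the plan to a fixed registry of skills is one plausible design choice: the model composes known abilities instead of emitting arbitrary code, and an unknown step fails loudly before the robot moves.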
Manufacturing is an important battleground for AI models. The biggest opportunities in the next decade lie in the fusion of cloud, AI, and physical machines. Robots bringing water is just the first step; smart robots capable of conversing directly with humans will change the entire factory landscape, said Zhang.
What else: Alibaba Cloud announced that over 200,000 enterprise users have applied to access the AI model for testing.
ByteDance Refutes Rumor of Poaching OpenAI Employee with $1.4M Annual Salary
What’s new: ByteDance has denied a rumor that the TikTok parent company attempted to poach an employee from ChatGPT developer OpenAI by offering a $1.4 million annual salary.
The rumor first circulated on social media and in Chinese media outlets about two weeks ago, claiming that ByteDance approached one of the five Chinese OpenAI engineers involved in ChatGPT’s development. According to the rumor, the engineer did not accept the offer.
Talent acquisition: Whether or not the rumor is true, talent acquisition in the LLM field has become highly competitive, with ByteDance and other Chinese tech companies seeking top talent. Engineers working on Baidu’s ERNIE Bot project are reportedly in high demand, with three-year employees potentially doubling their salaries. The South China Morning Post reported that China’s demand for AI talent has tripled over the past five years, with roles involving pre-training models, conversational bots, and AI-generated content (AIGC) being especially sought after.
Weekly News Roundup
Microsoft President Brad Smith told Nikkei Asia that Chinese research organizations and companies will become major rivals to ChatGPT, developed by OpenAI. He sees three players at the absolute forefront: OpenAI with Microsoft, Google, and the Beijing Academy of Artificial Intelligence.
🎮 miHoYo, the developer of Genshin Impact and the newly released Honkai: Star Rail, has started exploring AI to assist in game development. In Star Rail, the company has incorporated AI technology into NPC behavior patterns as well as 3D modeling.
🤖 Baidu AI Cloud unveiled the enterprise version of ERNIE Bot through Baidu AI Cloud LLM Platform, claiming it is the world’s first one-stop enterprise-level large model platform. It aims to provide public cloud services and private deployment.
iFlyTek started limited trials of Spark Desk, its ChatGPT-like LLM.
4️⃣ Chinese AI company 4Paradigm unveiled SageGPT, its response to ChatGPT, which aims to rebuild enterprise software using generative AI. Following three unsuccessful attempts, the firm filed for a Hong Kong IPO last week, despite being added to the US Entity List in March.
Trending Research
DETRs Beat YOLOs on Real-time Object Detection
Affiliations: Baidu
End-to-end transformer-based detectors (DETRs) have shown impressive performance but suffer from high computational costs. This paper introduces Real-Time DEtection TRansformer (RT-DETR), the first real-time end-to-end object detector. It features an efficient hybrid encoder and IoU-aware query selection to improve object query initialization. RT-DETR supports flexible inference speed adjustment without retraining, outperforming YOLO detectors in both speed and accuracy. Source code and pretrained models will be accessible at PaddleDetection.
Evaluating ChatGPT’s Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness
Affiliations: Peking University, Boston University
This paper evaluates ChatGPT’s capabilities using 7 fine-grained information extraction (IE) tasks, assessing performance, explainability, calibration, and faithfulness. Findings show that ChatGPT performs poorly in Standard-IE but excels in OpenIE. The model provides high-quality, trustworthy explanations but tends to be overconfident in its predictions, resulting in low calibration. ChatGPT is mostly faithful to the original text. The study releases annotated test sets of 7 IE tasks (14 datasets) to further research, with datasets and code available at the provided URL.
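For readers unfamiliar with calibration, the sketch below shows one common way to quantify overconfidence: expected calibration error (ECE), which compares a model’s stated confidence with its actual accuracy inside confidence bins. This is an illustration only. The summary above does not say which metric the authors used, and the function name, the bin count, and the sample numbers are all assumptions.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between confidence and accuracy across bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width confidence bins over [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Made-up example: a model that claims 95% confidence but is right half the time.
overconfident = expected_calibration_error(
    [0.95, 0.95, 0.95, 0.95], [True, False, False, True]
)
print(round(overconfident, 2))
```

A well-calibrated model scores close to 0; a large value with confidence above accuracy is what “overconfident” means in this context.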
MVImgNet: A Large-scale Dataset of Multi-view Images
Affiliations: CUHKSZ
MVImgNet, a large-scale dataset of multi-view images, is introduced as a counterpart to ImageNet for 3D vision. Containing 6.5 million frames from 219,188 videos across 238 classes, it offers annotations for object masks, camera parameters, and point clouds. MVImgNet demonstrates promising performance in various 3D and 2D visual tasks. Additionally, a 3D object point cloud dataset, MVPNet, is derived from MVImgNet, benefiting real-world 3D object classification and posing challenges to point cloud understanding. Both MVImgNet and MVPNet will be publicly available to inspire the broader vision community.