Baidu’s ERNIE Bot, Kai-Fu Lee’s AI Venture, ChatGLM, and Ren Zhengfei's Opinion on ChatGPT
Weekly China AI News from Mar 13 to Mar 19
Hey readers, buckle up for a wild ride in AI land. This week, we’ll talk about Baidu’s ERNIE Bot, Kai-Fu Lee’s new AI venture, and ChatGLM powered by one of the best Chinese LLMs out there. Plus, I’ve added a new section to keep you updated on all the AI news from last week! 🚀🚀🚀
Weekly News Roundup
Baidu Unveils ERNIE Bot With Multimodal Generation and Chinese Comprehension
What’s new: Last week, Chinese search giant Baidu unveiled ERNIE Bot, its highly anticipated AI product, making Baidu the first Chinese tech giant to deliver on its promise of a ChatGPT-style product.
Capabilities: Similar to ChatGPT, ERNIE Bot is a generative AI product and a large language model (LLM) capable of understanding human intentions and providing human-like responses in a conversational interface.
Major differences: At a press conference, Baidu CEO Robin Li highlighted ERNIE Bot’s multimodal generation capabilities, including image, voice, and video, which were not present in ChatGPT. ERNIE Bot also demonstrated superior Chinese comprehension and writing skills.
Tech Deep Dive: ERNIE Bot is built upon Baidu's proprietary models ERNIE (Enhanced Representation through Knowledge Integration) and PLATO (Pre-trained Dialogue Generation Model).
Mixed Reviews: Baidu’s stock dipped during the ERNIE Bot press conference due to an underwhelming pre-recorded demo. However, the stock rebounded by 15% on the second day after analysts provided initial positive feedback on ERNIE Bot.
Bad timing: The ERNIE Bot debut came shortly after the unveiling of GPT-4, OpenAI's most powerful language model, which significantly improves upon GPT-3.5, understands visual inputs, and performs strongly on a range of exams designed for humans.
Can I use it? Since March 16, 2023, ERNIE Bot has been accessible to users with invitation codes, and it is expected to open to a wider audience soon. Baidu also offers access to the ERNIE Bot API via Baidu AI Cloud.
Within two days of the ERNIE Bot announcement, over 90,000 enterprises requested API access, and more than 850,000 individuals signed up for testing.
Early user feedback: This Zhihu review of ERNIE Bot offers a comprehensive and objective perspective:
“Relatively objectively, I would give ERNIE Bot a passing score of 65 points. It’s very courageous of Baidu to release the product, distribute invitation codes for testing, and let everyone have a try. I hope they can quickly expand the testing scope, iterate the product, and integrate it into domestic commercial scenarios as soon as possible.”
One more thing: Baidu AI Cloud will hold an event on ERNIE Bot cloud services and applications on March 27.
Kai-Fu Lee Launches Project AI 2.0 to Build China’s ChatGPT
What’s new: Kai-Fu Lee, Sinovation Ventures co-founder and former head of Google China, unveiled a new company that aims to capitalize on recent AI megatrends.
The company, named Project AI 2.0, is described as “a global company building an AI 2.0 platform and productivity applications.” Lee announced on his WeChat that the new company seeks talent in Foundation Models, Multi-modality, NLP, distributed computing, and infrastructure.
AI 1.0 vs AI 2.0: Lee described AI 1.0 as a period with Convolutional Neural Networks (CNNs) at its core, in which machines began to outperform humans in areas like computer vision (CV) and natural language processing (NLP). However, AI generalization remained distant.
According to Lee, AI 2.0 has three key features:
Massive data without manual annotation, utilizing self-supervised learning.
A large Foundation Model requiring thousands of GPUs for training.
Cross-domain knowledge obtained from the trained Foundation Model.
Lee envisions AI 2.0 as a platform shift opportunity that is “10 times bigger than the mobile Internet.”
China AI: In an interview with Bloomberg, Lee said China would catch up with the US in cutting-edge AI technologies like ChatGPT.
What else do you need to know?
🚙 Baidu and Pony.ai received the first-ever permits to provide fully driverless robotaxi services in Beijing. Each is allowed to deploy 10 vehicles in an area of 60 square kilometers. Baidu now operates robotaxis, with no human drivers, in Beijing, Wuhan and Chongqing.
🤖 Zhipu AI, a tech startup originating from Tsinghua University, unveiled ChatGLM, a dialogue AI similar to ChatGPT and based on its GLM-130B model. ChatGLM was initially available through an invite-only beta testing phase, but the first round has now concluded.
💡 Huawei Founder and CEO Ren Zhengfei shared his first take on ChatGPT. “What is the opportunity for us with ChatGPT? It will increase computing and traffic, so our products will have market demand. Huawei will focus on the underlying computing power platform of AI that adapts to social needs.”
🎇 SenseTime open-sourced a multimodal, multitask large model called INTERN 2.5, which has 3 billion parameters. INTERN 2.5 achieved 90.1% top-1 accuracy on ImageNet and 65.5 mAP on the COCO object detection benchmark.
🚙 WeRide, a Chinese self-driving tech startup, has confidentially filed for an initial public offering in the US and is seeking to raise up to $500 million, Bloomberg reported.
Trending Research
Composer: Creative and Controllable Image Synthesis with Composable Conditions
Affiliations: Alibaba Group, Ant Group
This work introduces a new generation paradigm called Composer, which allows flexible control of output images while maintaining synthesis quality and creativity. It decomposes images into representative factors and trains a diffusion model with these factors as conditions, which opens up a huge design space for customizable content creation and supports various levels of conditions without retraining. The authors say code and models will be made available.
Masked Images Are Counterfactual Samples for Robust Fine-tuning (CVPR’23)
Affiliations: Sun Yat-sen University
This paper proposes a novel fine-tuning method that uses masked images as counterfactual samples to improve the robustness of the fine-tuned model. Masking breaks the spurious correlation, and the masked patches are refilled with patches from other images. The method achieves a better trade-off between in-distribution and out-of-distribution performance and surpasses previous methods on out-of-distribution performance.
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
Affiliations: Tsinghua University, Renmin University of China, Beijing Academy of Artificial Intelligence
The paper introduces UniDiffuser, a unified diffusion framework that learns diffusion models for marginal, conditional, and joint distributions in a single model by perturbing different modalities at different perturbation levels. It can perform various image and text generation tasks, with quantitative results superior to existing general-purpose models and comparable to bespoke models.
Noteworthy Stories
Search engine giant Baidu is behind a number of digital employees in China, from McDonald's ambassadors to financial advisers. Others in the country are also experimenting with a more virtual workforce, including in the C-suite. —CNN
Last August, NetDragon appointed an “AI-powered virtual humanoid robot” called Tang Yu as CEO of Fujian NetDragon Websoft. The company’s stock has increased dramatically since then. —The Gamer
AI will bring social changes in China, just like every other country. How might the Chinese government adapt to these changes? — The Diplomat