Huawei Releases Large Autonomous Driving Dataset; US-listed Chinese AI Firm To Be Private; Introducing Prompt Learning
China’s AI news in the week of August 8, 2021
Huawei Releases Large-Scale Autonomous Driving Dataset With 10M Unlabeled Images
What’s new: Huawei has open-sourced a large-scale object detection dataset for autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories: car, truck, pedestrian, tram, cyclist, and tricycle. The dataset will serve as the benchmark for the ICCV 2021 SSLAD challenge.
Huawei vs. Waymo: Before SODA10M, large-scale open-source datasets such as KITTI, BDD100K, and Waymo Open Dataset spurred research interest and helped push the boundaries of autonomous driving technology. Huawei argues, however, that these datasets either provide “only a small amount of data or covers limited domains with full annotation.” SODA10M is 10 times larger than Waymo Open Dataset, the researchers claim.
Images in SODA10M were collected at a rate of one frame every ten seconds across 32 cities, under varying weather conditions, times of day, and location scenes, according to the paper SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving. In comparison, Waymo Open Dataset covers three cities, while BDD100K from the University of California at Berkeley covers two.
Better nighttime performance: Experiments showed that when diverse unlabeled data is added to training, self- and semi-supervised methods improve more in the nighttime domain than supervised methods do.
In another paper, One Million Scenes for Autonomous Driving: ONCE Dataset, Huawei introduced the ONCE (One millioN sCenEs) dataset for 3D object detection, encompassing 1 million LiDAR scenes and 7 million corresponding camera images. The data is selected from 144 driving hours.
Bosch of China: Under pressure from U.S. sanctions, Huawei is doubling down on smart vehicles and autonomous driving in a bid to become the Bosch of China. At the Shanghai Auto Show this April, Huawei joined hands with state-owned BAIC and debuted the Arcfox Alpha S model, equipped with Huawei’s operating system HarmonyOS and what the company claims are L4 autonomous driving capabilities.
Chinese AI-Driven English Tutoring Firm Is Set to Be Privatized
What’s new: LAIX, also known as Liulishuo, is a Chinese education technology company that uses AI to personalize English tutoring. The company went public on the New York Stock Exchange at a debut price of $12.50 per share and raised $72 million. It has since taken a sharp turn for the worse, losing 90 percent of its value. The company’s founders now plan to take it private.
On August 4, LAIX announced that its board of directors had received a preliminary non-binding proposal letter from a buyer group comprising its founders, offering US$1.13 per share in cash, 15 percent above the 7-day average share price.
What happened to Liulishuo? The steep drop in its share price directly reflects Liulishuo’s embattled business operations. Earlier this year, Liulishuo reported Q1 2021 revenue of RMB198.5 million, down 15% QoQ and 13% YoY. In addition, R&D expenses were RMB33.7 million (US$5.1 million), a 43.1% decrease from RMB59.2 million for the same quarter last year.
The number of paying users who purchased the company’s courses and services in the first quarter of 2021 was 0.3 million, one-third of the figure for the same quarter last year. AI remains a selling point for Liulishuo, which leverages deep learning to create customized English courses for users, but that advantage could be offset by unappealing courses, dwindling interest in English learning among Chinese adults, and poor management. Liulishuo had also been zeroing in on China’s K-12 market but was hammered by China’s recent crackdown on the education industry.
A Survey of Prompt-Based Learning in NLP
Tell me the background: Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. This approach has long been a major training method in natural language processing (NLP). Since the creation of the Transformer, however, a new “pre-train, fine-tune” paradigm has become mainstream. A language model with a fixed architecture is first pre-trained on massive text corpora, then adapted to different downstream tasks by introducing additional parameters and fine-tuning.
What’s new: A team of Chinese researchers at CMU argues that this paradigm is in turn shifting to a new “pre-train, prompt, predict” pipeline, which they call prompt learning. Instead of designing a new training objective for each task, researchers devise appropriate “prompts” that recast the task as something the pre-trained model already does, such as a cloze test. For example:
“When recognizing the emotion of a social media post, “I missed the bus today.”, we may continue with a prompt “I felt so ___” and ask the LM to fill the blank with an emotion-bearing word. Or if we choose the prompt “English: I missed the bus today. French: ___”, an LM may be able to fill in the blank with a French translation.”
The advantage of such a method is that, given a suite of appropriate prompts, a single LM trained in an entirely unsupervised fashion can be used to solve a great number of tasks.
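To make the idea concrete, here is a minimal Python sketch of how one input can be wrapped in different prompt templates, and how the word an LM puts in the blank might be mapped back to a task label. It is an illustration under stated assumptions, not the survey authors’ code: the template strings, the function names `build_prompt` and `map_answer`, and the word-to-label table are all hypothetical, and no actual language model is called.

```python
# Illustrative sketch only: the templates, function names, and the
# word-to-label mapping below are assumptions, not taken from the survey.

# [X] marks where the input text goes; [Z] marks the blank the LM should fill.
TEMPLATES = {
    "sentiment": "[X] I felt so [Z]",
    "translation": "English: [X] French: [Z]",
}

# Hypothetical mapping from words an LM might produce to task labels.
ANSWER_TO_LABEL = {
    "happy": "positive",
    "great": "positive",
    "sad": "negative",
    "terrible": "negative",
}


def build_prompt(task: str, text: str, blank: str = "___") -> str:
    """Insert the input into the task template and leave a visible blank."""
    template = TEMPLATES[task]
    return template.replace("[X]", text).replace("[Z]", blank)


def map_answer(filled_word: str) -> str:
    """Map the word chosen for the blank to a label; 'unknown' if unmapped."""
    return ANSWER_TO_LABEL.get(filled_word.lower().strip(), "unknown")


if __name__ == "__main__":
    post = "I missed the bus today."
    print(build_prompt("sentiment", post))
    print(build_prompt("translation", post))
    # Pretend an LM filled the sentiment blank with "sad".
    print(map_answer("sad"))
```

The same input string feeds both tasks; only the template changes, which is the sense in which a single pre-trained model can serve many tasks. In a real system the blank would be filled by a language model rather than supplied by hand.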
In the paper “Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing,” researchers reviewed existing works and created a structured typology of prompt-based concepts. Researchers said prompt learning is effective in knowledge probing, classification-based tasks, information extraction, reasoning in NLP, question answering, text generation, automatic evaluation of text generation, multi-modal learning, and meta applications.
Investment News
Chinese EV upstart Li Auto has announced the pricing of its global offering for its secondary listing in Hong Kong. The company will raise $1.5 billion, with the final price set at HK$118.00 per offer share, equivalent to US$30.36 per ADS. The company is expected to begin trading on the Main Board of the Hong Kong Stock Exchange on August 12, 2021.
Inceptio Technology, a Shanghai-based autonomous driving truck technology company, announced the closing of a US$270 million Series B equity financing. The round was jointly led by JD Logistics, Meituan, and PAG, with Deppon Express also participating. Founded in 2018, Inceptio aims to build a nationwide freight network using autonomous truck technology. Key shareholders include Chinese battery giant CATL, NIO Capital, G7, and GLP.
Axera Tech, a Beijing-based semiconductor company developing AI SoCs for computer vision applications, has raised hundreds of millions of yuan in its Series A+ funding round, led by Weihao Chuangxin, Meituan, and GGV Capital. Founded in 2019, Axera claims its first AI SoC was successfully taped out at Taiwan Semiconductor Manufacturing Company (TSMC) barely nine months after its conception.