💰ByteDance Invests $1B in GPUs & Tests AI Bot; How AI Content is Poisoning Chinese Internet; Highlights from Altman, Hinton, and LeCun at BAAI Event
Weekly China AI News from June 11 to June 18
Dear readers, my apologies for the missed issues over the past two weeks. I recently returned to China for the first time in three years and have been spending quality time with my family, friends, and colleagues! Shanghai and Beijing are soooo hot! This week's issue is packed with great content. Enjoy!
ByteDance Beefs Up AI Efforts with $1B GPU Purchase
What’s new: Chinese tech companies, ByteDance among them, are scrambling to buy GPUs from Nvidia. This year, ByteDance has ordered over $1 billion worth of GPUs, including 100,000 A100 and H800 cards, according to Chinese media outlet LatePost. The H800 only entered production in March of this year.
ByteDance's purchase alone is close to Nvidia's total commercial GPU sales revenue in China last year, LatePost reported. The estimate works like this: in September last year, when the US government imposed export restrictions on the A100 and H100, Nvidia said the restrictions could affect up to $400 million of its potential sales in the Chinese market in Q4 2022. Extrapolating from that quarterly figure, Nvidia's data center GPU sales in China for the whole of 2022 would be roughly $1-1.6 billion.
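LatePost's back-of-the-envelope estimate can be reproduced with simple arithmetic. The $400 million Q4 figure comes from Nvidia's own statement; annualizing it as four equal quarters (an assumption, giving the upper bound) is my illustration of the reasoning, not LatePost's published method:

```python
# Nvidia said export curbs could affect ~$400M of potential
# Chinese-market sales in Q4 2022.
q4_sales_usd = 400e6

# Treating Q4 as a run rate for all four quarters gives the upper bound.
annual_upper = q4_sales_usd * 4  # $1.6B

# The $1B lower bound presumably assumes weaker quarters earlier in 2022.
annual_lower = 1.0e9

print(f"Estimated 2022 China data-center GPU sales: "
      f"${annual_lower / 1e9:.1f}B-${annual_upper / 1e9:.1f}B")
```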
Another Chinese company has also ordered over RMB 1 billion worth of GPUs from Nvidia.
More details: The exclusive report from LatePost revealed more juicy details:
After OpenAI released GPT-3 in June 2020, ByteDance trained a generative large language model with billions of parameters, primarily on V100 GPUs. Because of the limited parameter scale, the model's generation capability was unimpressive. ByteDance saw no commercial potential and shelved the project.
Before the large-model boom, the primary issue for China’s cloud providers was not a scarcity of GPU computing power but the difficulty of selling it. Last year, Alibaba Cloud cut its prices six times, with GPU rental prices dropping by more than 20%. The situation reversed after ChatGPT launched in November 2022: most Chinese tech giants now believe LLMs, or foundation models, are the future.
Dr. Yaqin Zhang, Dean of the Institute for AI Industry Research (AIR) at Tsinghua University, said at a Tsinghua event this April, “If you add up all of China’s computing power, it's equivalent to 500,000 A100s.”
“Although the A100 is expensive, it is actually the cheapest to use,” one engineer commented.
One more thing: ByteDance is reportedly training its chatbot product, named Grace.
How AI Content is Poisoning the Chinese Internet
What’s new: Large language models (LLMs) like ChatGPT and Bing Chat are now connected to the Internet and can provide instant responses, often supplemented by third-party links for fact-checking. Caution is warranted, however: some of that Internet content may itself be AI-generated. LLMs can produce seemingly flawless content that turns out to be wrong.
What happened: A Chinese user recently shared his experience: he asked the new Bing, powered by GPT-4, whether a cable car was available at Xiangbi Mountain, a popular tourist spot in China. Bing provided a seemingly accurate answer, complete with ticket prices and operating hours. But the user felt something was off and decided to dig deeper.
Upon further research, the user discovered that the source of the response was a profile named “Life of Change” on Zhihu, a Quora-like Q&A platform in China. To his surprise, he found out that “Life of Change” was actually an AI bot capable of responding to multiple queries within minutes. However, the bot’s answers, while fast, were not verified and largely incorrect.
What now: Zhihu has suspended “Life of Change.” Investigations have since uncovered several similar AI bots operating on the platform.
Why It Matters: These AI bots, according to the user, are polluting the Chinese internet with unverified information. This case serves as a reminder of the potential risks and challenges of integrating AI into platforms where reliable information is paramount. It's a wake-up call for AI developers and platform owners to monitor and ensure the credibility of the information provided by AI bots.
What Altman, Hinton, and LeCun Say at BAAI Event
What’s new: The Beijing Academy of Artificial Intelligence (BAAI) stands as one of the few Chinese institutions with the capacity to gather such distinguished guests as Sam Altman, OpenAI CEO, and Turing Award laureates Geoffrey Hinton and Yann LeCun under a single roof. At the BAAI Conference, these esteemed figures in the AI field, together with notable scientists like Stuart Russell from UC Berkeley, shared their insights on AI. But what did they say during their discussions?
Sam Altman: AGI needs governance and global cooperation.
As in many of his previous addresses, Sam Altman, whose company created ChatGPT, remained optimistic about the potential benefits AGI could bring to humanity. He foresees AGI “surpassing human expertise in nearly every domain” within the next decade. However, he notes that “we must manage the risks together to reach that point,” and that “the stakes for global cooperation have never been higher.”
Altman proposes the establishment of “international norms and standards through an inclusive process” and encourages “international cooperation to build global trust in the safe deployment of AGI.” You can watch the full speech here.
Altman further expressed his respect for the Chinese AI community, stating, “China has some of the best AI talent in the world. So I really hope Chinese AI researchers will make great contributions here.”
Yann LeCun: Nobody in their right minds will be using auto-regressive models within 5 years.
LeCun, Chief AI Scientist at Meta, is not a big fan of large language models (LLMs) like OpenAI’s GPT series. His argument is that these models make factual errors, logical errors, and inconsistencies because they have limited reasoning ability and no knowledge of the underlying reality.
LeCun has long championed the concept of a “world model,” which he again emphasized in his speech at the BAAI event. A world model allows a system to imagine scenarios and predict what will happen as a consequence of its actions.
Central to the world model is what LeCun describes as “Model Predictive Control (MPC),” which uses the world model to predict the state of the world at time t+1 given an imagined, proposed action. You can watch the video for the full speech.
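The MPC idea LeCun describes can be illustrated with a toy sketch (my own illustration, not code from the talk): a world model predicts the next state from a proposed action, and the controller picks the action whose imagined rollout gets closest to a goal.

```python
# Toy model predictive control: a 1-D agent picks the action whose
# imagined rollout (simulated with a world model) lands closest to a goal.

def world_model(state, action):
    """Predict the state at time t+1 given an imagined action.

    Trivially simple dynamics, purely for illustration.
    """
    return state + action

def mpc_choose_action(state, goal, actions, horizon=3):
    """Score each candidate first action by rolling the model forward."""
    def rollout_cost(first_action):
        s = world_model(state, first_action)
        for _ in range(horizon - 1):
            # Greedily imagine subsequent actions that move toward the goal.
            best_next = min(actions, key=lambda a: abs(goal - world_model(s, a)))
            s = world_model(s, best_next)
        return abs(goal - s)  # distance from goal after the imagined rollout
    return min(actions, key=rollout_cost)

best = mpc_choose_action(state=0.0, goal=5.0, actions=[-1.0, 0.0, 1.0])
print(best)  # prints 1.0: the controller chooses to move toward the goal
```

Real MPC systems replan at every step, executing only the first action of each rollout; the sketch shows just one planning step.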
During the Q&A session, LeCun predicted that “within 5 years, nobody in their right minds will be using auto-regressive models.”
Geoffrey Hinton: Will artificial neural networks soon be smarter than real neural networks?
Ever since AI pioneer Geoffrey Hinton left Google in April, he has been increasingly vocal about the risks associated with AI. Hinton believes that once AIs start learning directly from the real world, they will be able to learn vastly more than people.
Hinton hopes that young researchers will figure out how to create superintelligences that make life better for humans without taking control, but he is nervous because he knows of no examples of something more intelligent being controlled by something less intelligent when the intelligence gap is large.
“If frogs had invented people, who do you think would be in charge now? The frogs or the people?” You can watch the full speech here.
Weekly News Roundup
🎨 Visual China has joined the generative AI bandwagon with the unveiling of its AI-powered drawing feature. The tool lets users convert text into images in different styles, including photography, cartoons, 3D renderings, and illustrations, as a novel way of enriching communication and visual experiences.
🚙 Baidu’s Apollo Go secures licenses for commercial operation of fully driverless ride-hailing in Shenzhen, joining Beijing, Chongqing, and Wuhan. Robotaxis are allowed to operate within a 188 sq km area from 7 am to 10 pm.
🧮 iFlytek's ChatGPT-like model SparkDesk has been upgraded to version 1.5, supporting advanced functions such as voice and multimodal input, open knowledge Q&A, logic reasoning and mathematics, and multi-round dialogues.
🤗 BAAI has released the open-source Wudao 3.0. This project incorporates the "Vision" and "Skyhawk" series for visual and language models, respectively, alongside an innovative evaluation system, "Balance".
🤖 A digital human created by a Taobao store named “卢咪微Lumiwink” drew more than 160,000 viewers in just two hours during the recent 618 Taobao Bargain Festival.
🎥 Alibaba's DAMO Academy is making waves with the introduction of Video-LLaMA. This AI model, with its ability to perceive and understand audio-visual content, is paving the way for sophisticated video-content interaction.
💸 JD.com is using Baidu's text-to-image AI, Wenxin Yige, to generate offline ads for the 618 Shopping Festival, cutting poster production time by roughly 70% and costs by roughly 80%.
Trending Research
Disentangling Writer and Character Styles for Handwriting Generation
Affiliations: South China University of Technology, National University of Singapore, The Hong Kong Polytechnic University, Pazhou Laboratory
RNN-based methods for generating stylized online Chinese characters often overlook subtle style inconsistencies. Therefore, we propose the style-disentangled Transformer (SDT) that extracts style at writer and character levels to produce realistic handwriting. SDT uses two contrastive objectives to capture commonalities and individual style details. Tests on different language scripts demonstrate its effectiveness. Results show that distinct style extractions provide information at varying frequencies, underlining the significance of this approach. Source code is publicly available.
Learning Imbalanced Data with Vision Transformers
Affiliations: Shenzhen International Graduate School, Tsinghua University, China
Existing Long-Tailed Recognition (LTR) methods rarely train Vision Transformers (ViTs) with Long-Tailed (LT) data, resulting in skewed comparisons. Our paper introduces LiVT, which trains ViTs from scratch using only LT data. Masked Generative Pretraining (MGP) is implemented for more robust feature learning. To overcome challenges with Binary Cross Entropy (BCE) loss, we propose a balanced BCE, which fast-tracks ViT convergence. LiVT, combining MGP and Bal-BCE, outperforms competing methods significantly without additional data.
baichuan-7B is an open-source, large-scale pre-trained language model developed by Baichuan Intelligent Technology. Based on the Transformer architecture, it contains 7 billion parameters and was trained on approximately 1.2 trillion tokens. It supports both Chinese and English with a context window of 4,096 tokens, and it has achieved the best performance among models of the same size on authoritative Chinese and English benchmarks (C-EVAL, MMLU, etc.).