🤖 Inside Tencent Hunyuan, Ant's Financial LLM, and Zhipu AI's Rising Valuation
Weekly China AI News from September 4 to September 10
Dear readers, in this issue I will delve into the latest LLMs introduced by Tencent and Ant Group. I also examine how Zhipu AI, backed by Meituan, has reportedly seen its valuation surge past RMB 10 billion. And I just discovered an interesting study by Chinese researchers that trains a GPT-3-level LLM on a budget of just $100,000!
Tencent’s Hunyuan LLM Cuts Hallucination by 30-50% Over Llama 2, Now Available for Enterprise Use
What’s new: Tencent has unveiled its foundation model, Hunyuan, which has over 100 billion parameters and was pre-trained on 2 trillion tokens. Hunyuan is now accessible to Chinese enterprises through APIs on Tencent Cloud. The company has yet to launch a widely anticipated consumer-facing chatbot akin to Baidu’s ERNIE Bot or ByteDance’s Doubao; interested users can only join a WeChat waitlist for Tencent Hunyuan Aide.
How it works: Hunyuan boasts three standout capabilities: robust Chinese-language writing, complex logical reasoning in intricate scenarios, and reliable task execution.
Hunyuan has been woven into 50 of Tencent’s core products, including but not limited to Tencent Cloud, Tencent Games, and Tencent Meeting, which now features a Hunyuan-fueled AI assistant.
Tencent also claims that Hunyuan’s algorithms reduce hallucinations, the tendency of LLMs to produce nonsensical or fabricated outputs, by 30-50% compared with mainstream open-source LLMs such as Llama 2.
My two cents: Highlighting a reduction in hallucinations is great. Few other Chinese firms have publicized breakthroughs in addressing this issue, which is, in my view, the most significant hurdle in commercializing LLMs. However, comparing the performance to Llama 2 may not offer compelling evidence, given that Llama 2 has known limitations in handling the Chinese language.
The Hunyuan API, offering 8K context lengths, charges RMB 0.14 per 1,000 tokens—the same as the rate for Alibaba’s qwen-plus-v1. In contrast, the GPT-4 API with the same 8K context length comes at a higher cost of RMB 0.22 per 1,000 tokens.
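For a rough sense of what these rates mean in practice, here is a quick back-of-the-envelope comparison in Python. The 50-million-token monthly workload is a made-up figure; only the per-1,000-token rates quoted above come from the vendors.

```python
# Back-of-the-envelope cost comparison based only on the per-1,000-token
# rates quoted above; the monthly workload size is a hypothetical example.
RATES_RMB_PER_1K_TOKENS = {
    "Tencent Hunyuan (8K)": 0.14,
    "Alibaba qwen-plus-v1": 0.14,
    "GPT-4 (8K)": 0.22,
}

def monthly_cost_rmb(tokens_per_month: int, rate_per_1k: float) -> float:
    """Cost in RMB for a given monthly token volume at a per-1,000-token rate."""
    return tokens_per_month / 1_000 * rate_per_1k

tokens_per_month = 50_000_000  # hypothetical 50M-token monthly workload
for name, rate in RATES_RMB_PER_1K_TOKENS.items():
    print(f"{name}: RMB {monthly_cost_rmb(tokens_per_month, rate):,.0f} per month")
```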
Ant Group Introduces Specialized Financial Language Model
What’s new: Ant Group has unveiled its specialized Financial LLM, along with two notable applications: Zhixiaobao 2.0, a consumer-focused intelligent financial assistant, and Zhixiaozhu 1.0, an assistant aimed at financial industry professionals.
How it works: Ant Group’s Financial LLM is designed with industry-specific needs in mind. The company said it’s fine-tuned on a rich dataset comprising hundreds of billions of tokens from Chinese financial documents, supplemented by more than a trillion tokens from a general corpus. It also incorporates instructions from over 300 real-world financial scenarios, boosting its efficacy in specific tasks.
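Ant hasn’t published its data format, but for illustration, a single instruction-tuning record for one financial scenario might look roughly like this. The scenario name, fields, and wording below are my own assumptions, not Ant Group’s actual schema.

```python
import json

# Purely illustrative: the scenario name, fields, and wording are assumptions,
# not Ant Group's actual instruction-tuning schema.
example_record = {
    "scenario": "fund_qa",  # hypothetical label for one financial scenario
    "instruction": "Explain the difference between a money-market fund and a bond fund for a first-time investor.",
    "input": "",            # optional extra context, empty here
    "output": "A money-market fund invests in short-term, highly liquid instruments, while a bond fund holds longer-dated debt and carries more interest-rate risk...",
}

# Records like this are typically serialized as JSON Lines for supervised fine-tuning.
print(json.dumps(example_record, ensure_ascii=False, indent=2))
```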
According to its in-house Fin-Eval benchmark, the Financial LLM outperforms general-purpose LLMs. Zhixiaobao 2.0, one of its new applications, offers services like market analysis and portfolio suggestions with a 95% accuracy rate in recognizing financial intentions. Meanwhile, Zhixiaozhu 1.0 caters to financial professionals, aiding in tasks such as investment analysis and content creation.
Both Zhixiaobao 2.0 and Zhixiaozhu 1.0 are currently in closed tests on Ant Group’s wealth management and insurance platforms. They will be publicly available pending regulatory approval.
Why it matters: Competition between general-purpose and industry-specific LLMs in the enterprise market is heating up as companies like Ant Group enter the fray. An Ant executive said general-purpose LLMs will find it challenging to bring value to the finance industry.
ChatGLM Creator Reportedly Valued at Over RMB 10 Billion
What’s new: Zhipu AI, the Chinese AI startup behind ChatGLM, has seen its valuation soar past the RMB 10 billion ($1.38 billion) mark, according to Chinese media outlet Leiphone. The company is said to be raising funding at a post-money valuation of RMB 14 billion ($1.93 billion).
Fundraising history: Founded in June 2019, Zhipu AI has completed multiple rounds of funding. It started with a Series A round in 2021, and its most recent funding was a Series B-2 round led by Meituan, raising over $100 million. An AI spinoff of Tsinghua University, the company has expanded from 200 to over 500 people in just the past few months, with plans to reach 1,000 employees by year’s end.
Zhipu AI’s model portfolio includes the GLM-130B LLM, the open-source ChatGLM-6B model, the CodeGeeX coding model, and the CogView text-to-image model. The company’s chatbot, ChatGLM, has received the government’s green light and has been open to the public since August 31.
Business model: As reported by Chinese media, Zhipu AI strategically focuses on business-to-business services, particularly within the information technology sector.
Weekly News Roundup
📣 Alibaba announced the launch of its AI purchasing assistant, “Smart Assistant,” at the Co-Create2023 conference in the U.S. The tool aims to enhance procurement efficiency for global SMEs, particularly for sourcing Chinese goods.
🚀 On September 7, Baidu Marketing released Qingge, claiming it is the world’s first AI-native marketing platform. Through generative AI, Qingge is said to improve ad conversion rates by over 20%.
🛡️ The Deputy Director of China’s Ministry of Industry and Information Technology, Du Guangda, emphasized a balanced approach between innovation and law-based governance, implementing “inclusive and prudent regulation” and “graded and categorized supervision” for the rapidly evolving generative AI technologies.
🔓 On September 6, Baichuan AI announced the open-source release of its fine-tuned Baichuan-2 LLMs, which are free to use commercially.
🌟 Baidu AI Cloud revealed its Wenxin Qianfan Foundation Model Platform now serves 400+ enterprise scenarios and is upgrading to V2.0. It also introduced 11 AI-native enterprise-facing applications.
📚 Kingsoft Office has opened its WPS AI to the public, initially incorporating its AI capabilities into WPS smart documents to solve daily office challenges across various WPS product lines.
🔑 iFlytek’s SparkDesk is now open for public registration through its app or official website.
🌐 360 Smart Brain is now available to the public across five platforms, including its app, and integrates with existing 360 products to provide users with enhanced services.
Trending Research
This study introduces a cost-effective approach to training large language models (LLMs), achieving comparable performance to well-known models like GPT-3 for just $100K. The paper also introduces a new IQ evaluation paradigm, focusing on aspects like rule understanding and pattern recognition, thus offering a more rounded assessment of LLM capabilities.
Paper: FLM-101B: An Open LLM and How to Train It with $100K Budget
Affiliations: Beijing Academy of Artificial Intelligence; Institute of Computing Technology, Chinese Academy of Sciences; University of Electronic Science and Technology of China; Harbin Institute of Technology; School of Computer Science and Engineering, Nanyang Technological University
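To put the $100K figure in perspective, here is a generic back-of-the-envelope estimate of dense LLM training cost using the common ~6·N·D FLOPs rule of thumb. Every number below is a placeholder assumption, not a figure from the FLM-101B paper.

```python
# Rough, generic training-cost estimate from the ~6 * N * D FLOPs rule of thumb.
# The parameter count, token count, GPU throughput, and hourly price are all
# placeholder assumptions, NOT figures taken from the FLM-101B paper.
def training_cost_usd(params: float, tokens: float,
                      sustained_tflops_per_gpu: float,
                      usd_per_gpu_hour: float) -> float:
    total_flops = 6 * params * tokens  # ~6 FLOPs per parameter per token
    gpu_hours = total_flops / (sustained_tflops_per_gpu * 1e12) / 3600
    return gpu_hours * usd_per_gpu_hour

# Example: a 100B-parameter model, 300B tokens, 150 TFLOP/s sustained per GPU,
# $2 per GPU-hour rental (all hypothetical numbers).
print(f"${training_cost_usd(100e9, 300e9, 150, 2.0):,.0f}")
```

With these placeholder numbers the naive estimate lands in the high six figures, which is what makes a $100K training budget at this scale noteworthy.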
ModelScope-Agent is a new, customizable agent framework designed to expand the capabilities of LLMs like ChatGPT. It allows seamless integration with various APIs and supports model training on multiple open-source LLMs. The framework enhances LLMs with tool-use abilities, covering everything from data collection to real-world application evaluation. Both the library and online demo are now publicly available.
Paper: ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
Affiliations: DAMO Academy, Alibaba Group
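As a conceptual illustration of the tool-use loop that agent frameworks like ModelScope-Agent implement, here is a minimal Python sketch. The function names and the toy “LLM” below are invented for this example and are not the library’s actual API.

```python
# Minimal, generic sketch of an LLM tool-use loop; names are invented for
# illustration and are NOT ModelScope-Agent's actual API.
from typing import Callable, Dict

def get_weather(city: str) -> str:
    """Toy tool: a stand-in for a real API call."""
    return f"Sunny, 25°C in {city}"

TOOLS: Dict[str, Callable[[str], str]] = {"get_weather": get_weather}

def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM; a real agent would call an open-source model here."""
    return 'CALL get_weather "Hangzhou"' if "weather" in prompt else "FINAL I don't know."

def run_agent(user_query: str) -> str:
    """One planning step: ask the LLM, dispatch a tool call if one is requested."""
    decision = fake_llm(user_query)
    if decision.startswith("CALL"):
        _, tool_name, arg = decision.split(" ", 2)
        observation = TOOLS[tool_name](arg.strip('"'))
        return f"Tool {tool_name} returned: {observation}"
    return decision.removeprefix("FINAL ").strip()

print(run_agent("What's the weather in Hangzhou?"))
```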
CodeApex is a new bilingual benchmark dataset aimed at evaluating the coding comprehension and generation skills of LLMs. Comprising multiple-choice questions and algorithmic tasks, it provides a comprehensive assessment of LLMs’ capabilities in both conceptual and practical programming aspects. Initial results show room for improvement, positioning CodeApex as a key tool for advancing LLM development in programming tasks.
Paper: CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models
Affiliations: Apex Data & Knowledge Management Lab, Shanghai Jiao Tong University
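For intuition, the multiple-choice half of such a benchmark can be scored with a very simple harness. The question format and the stand-in model call below are hypothetical, not CodeApex’s actual evaluation code.

```python
# Hypothetical sketch of scoring multiple-choice programming questions;
# the question format and scoring rule are assumptions, not the paper's
# actual evaluation harness.
questions = [
    {"prompt": "What is the time complexity of binary search?",
     "choices": {"A": "O(n)", "B": "O(log n)", "C": "O(n log n)", "D": "O(1)"},
     "answer": "B"},
]

def model_predict(prompt: str, choices: dict) -> str:
    """Stand-in for an LLM call that returns one of the choice labels."""
    return "B"  # a real harness would parse the model's generated answer

correct = sum(model_predict(q["prompt"], q["choices"]) == q["answer"] for q in questions)
print(f"Accuracy: {correct / len(questions):.0%}")
```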