The second half of the "100-model war" is about to begin, and platforms will be the key

Original source: Titanium Media

Image source: Generated by Unbounded AI

It has been nearly a year since large models entered public view. Riding the wave of AI large models, major technology companies have rushed to launch their own large model products, and enterprises across industries are watching large models closely.

If the "100-model war," in which major vendors raced to launch large model products, was the first half of the large model "battle," then the second half will focus more on the ability to integrate those products and on the move toward platforms and industrialization.

In the second half, platformization and industrialization will become the key tracks

Take ChatGPT, the "originator" of the large model boom, as an example. ChatGPT is an application and can be regarded as an app, while GPT-4 is the large model itself, around which an ecosystem is built so that enterprises can build their own large models on top of it.

As this example shows, over the past year or so companies have focused on polishing ChatGPT-like products, which land on the application side. On the enterprise side, the industry still lacks a platform that lets enterprises flexibly call different large model products, or develop a large model for their own needs on top of a given product. According to Li Gang, vice president and CTO of Digital China, for large models to achieve an explosion of applications on the enterprise side, one or even several open-source, open large model platforms are needed.
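To make the idea of "flexibly calling each large model product" concrete, here is a minimal Python sketch of a registry that exposes several models behind one call. It is an illustration only: `ModelPlatform`, `register`, `call`, and the two stand-in backends are hypothetical names invented for this example, not the interface of any platform mentioned in the article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A backend is any callable that turns a prompt into a reply.
# A real backend would wrap a vendor SDK or HTTP API; these are stand-ins.
ModelBackend = Callable[[str], str]


@dataclass
class ModelPlatform:
    """Hypothetical registry that puts many models behind one entry point."""

    backends: Dict[str, ModelBackend] = field(default_factory=dict)

    def register(self, name: str, backend: ModelBackend) -> None:
        # Add or replace the backend stored under this name.
        self.backends[name] = backend

    def call(self, name: str, prompt: str) -> str:
        # Route the prompt to the chosen model, failing loudly if unknown.
        if name not in self.backends:
            raise KeyError(f"no model registered under {name!r}")
        return self.backends[name](prompt)


platform = ModelPlatform()
platform.register("general", lambda p: f"[general] {p}")
platform.register("finance", lambda p: f"[finance] {p}")

print(platform.call("finance", "Summarize this quarterly report."))
```

The point of the design is that application code depends only on `call`, so swapping or adding a model is a registration change rather than a rewrite.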

Any discussion of enterprise-level large model applications has to include industry large models. Titanium Media has observed that industry-level large models are still at an early stage of development: many companies have launched them, but their application so far has not gone well.

Take the fast-growing financial industry as an example. In March this year, Bloomberg launched BloombergGPT, a large language model for the financial industry, which drew the market's attention to large models in financial verticals. In June, Columbia University and NYU Shanghai launched FinGPT.

In China, Huawei released the Pangu model in July, which includes several industry models. In September, Ant Group officially released its self-developed "Ant Basic Model" and, built on it, the customized "Ant Financial Model."

Li Gang told Titanium Media that the large models on the market fall into several main categories. One is the general basic model. Generally speaking, such a model is built from a natural language corpus that is assembled into a dataset, cleaned, and then used for training. "For this kind of model, the larger the corpus and the larger the number of parameters, the stronger the ability," Li Gang said.

The other type is the industry model, which is highly specialized and requires a large industry knowledge base. "At present, the corpus from this industry knowledge base needs to be kept at 20%, no more and no less," Li Gang emphasized. "If it exceeds 20%, the trained large model may 'not be able to speak,' causing communication barriers; if it is below 20%, the model may lack the professionalism of the industry."
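As a rough illustration of the 20% figure, the Python sketch below works out how much industry text would be needed for a given amount of general text. It assumes the 20% refers to the industry share of the total training corpus and that the corpus is measured in tokens; the article specifies neither, and the function names and token counts are hypothetical.

```python
def industry_tokens_needed(general_tokens: int, industry_pct: int = 20) -> int:
    """Industry tokens required so they make up industry_pct% of the total mix.

    Solves industry / (industry + general) = industry_pct / 100 for industry.
    """
    if not 0 < industry_pct < 100:
        raise ValueError("industry_pct must be strictly between 0 and 100")
    return general_tokens * industry_pct // (100 - industry_pct)


def industry_share(industry_tokens: int, general_tokens: int) -> float:
    """Fraction of the combined corpus that comes from industry text."""
    return industry_tokens / (industry_tokens + general_tokens)


general = 8_000_000  # hypothetical size of the general corpus, in tokens
industry = industry_tokens_needed(general)
print(industry, industry_share(industry, general))
```

Under this reading, a 20% industry share means one industry token for every four general tokens, so the industry corpus is a quarter the size of the general one, not a fifth.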

"PaaS" layer for building large models

Just as cloud computing is divided into IaaS, PaaS, and SaaS, in the view of Huangfu Ziqiao, general manager of Digital China's strategic marketing department, enterprises in the era of large models also need a PaaS platform like the one they had in the cloud era.

To build a platform that helps enterprises make better use of large models, Digital China recently released the Shenzhou Wenxue platform. Speaking to Titanium Media about the significance of the release, Li Gang said: "With the Shenzhou Wenxue platform as the core, we do not build basic large models; we build a platform for integrating large models and for developing and delivering applications, so as to accelerate enterprise AI innovation. We act as a big data service partner, so as to accelerate the upgrading of enterprise data governance. We serve as an ecosystem link, with a model market, data marts, and an app store, so as to accelerate industrial innovation and ecosystem breakthroughs."

At the beginning of this year, HUAWEI CLOUD released the Pangu large model and divided it into three tiers: L0, L1, and L2. According to HUAWEI CLOUD, L0 refers to the basic model, L1 refers to the industry model, and L2 refers to the inference model for more finely subdivided scenarios.

In terms of basic large models, taking the graph network large model as an example, a single large model can be adapted to multiple scenarios such as process optimization, time series prediction, and intelligent analysis, and can be applied in multiple industries such as finance, coal mining, and manufacturing.

In terms of industry models, HUAWEI CLOUD has launched industry models such as the Pangu Financial Model, Pangu Mine Model, Pangu Electric Power Model, Pangu Manufacturing Quality Inspection Model, and Pangu Pharmaceutical Molecule Model.

In terms of inference models, for example, HUAWEI CLOUD built on the Pangu power model to launch the Pangu power inspection model for the subdivided scenario of UAV power inspection, using pre-training plus fine-tuning on downstream tasks. The model addresses small-sample learning, active learning, and incremental learning in the UAV intelligent inspection system (defect detection), as well as the heavy workload of annotating massive data and the wide variety of defects.

The above is HUAWEI CLOUD's understanding of large models and part of its industry layout. On this basis, Huangfu Ziqiao told Titanium Media that Digital China's Shenzhou Wenxue platform will act as a "converter" that helps enterprises get from L0 to L2 industry application scenarios, "providing enterprises with a PaaS platform similar to that of the cloud computing era," Huangfu Ziqiao said.

Coincidentally, Baidu CTO Wang Haifeng has also said publicly that, to meet the challenge of industrializing large models, the industry needs an approach similar to the chip foundry model: "intensive production and platform-based application." That is, enterprises with comprehensive advantages in algorithms, computing power, and data encapsulate the complex process of model production and provide large model services to thousands of industries through a low-threshold, high-efficiency production platform.

According to Titanium Media, this industrialization path has already been validated in practice with the Wenxin industry large models: Baidu has worked with leading enterprises and institutions to build large models for industries including energy, finance, aerospace, manufacturing, media, cities, social sciences, and film and television.

Lower cost and a lower threshold are the goal

Although large models have gradually penetrated all walks of life, at the current stage of development the cost of using them is still prohibitive for many enterprise-level users.

Take GPT-3 as an example. Nvidia has disclosed that training GPT-3, with its 175 billion parameters, takes 34 days on 1,024 A100 GPUs, and that a single training run costs as much as $12 million. To train ultra-large-scale AI models, Microsoft has even built one of the world's top five supercomputers for OpenAI.
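The figures quoted above can be put on a common scale with a little arithmetic. The Python sketch below converts 1,024 GPUs running for 34 days into GPU-hours (835,584) and divides the quoted $12 million by that number. The resulting per-GPU-hour figure is only an implied average derived from the article's own numbers, not a price stated by Nvidia, and it assumes every GPU runs around the clock for the full 34 days.

```python
GPUS = 1_024              # A100 GPUs, as quoted in the article
DAYS = 34                 # training duration, as quoted in the article
TOTAL_COST_USD = 12_000_000  # cost of one training run, as quoted


def gpu_hours(gpus: int, days: int) -> int:
    """Total GPU-hours, assuming all GPUs run 24 hours a day."""
    return gpus * days * 24


def implied_cost_per_gpu_hour(total_cost_usd: float, gpus: int, days: int) -> float:
    """Average cost per GPU-hour implied by the total cost (all expenses included)."""
    return total_cost_usd / gpu_hours(gpus, days)


hours = gpu_hours(GPUS, DAYS)
rate = implied_cost_per_gpu_hour(TOTAL_COST_USD, GPUS, DAYS)
print(f"{hours:,} GPU-hours, about ${rate:.2f} per GPU-hour implied")
```

Expressing the cost per GPU-hour makes it easier to see how the total scales: doubling either the GPU count or the training time doubles the GPU-hours, and with them the bill.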

At the same time, according to Guosheng Securities' report "How Much Computing Power Does ChatGPT Need," the pre-training cost of large models is very high, with a single training run costing more than one million US dollars. This cost covers not only model architecture, algorithm selection, and training data selection, but also the large amount of computing resources and time that model training requires. And as large models are upgraded from version to version, their training cost also increases exponentially.

Robin Li, the founder, chairman, and CEO of Baidu, has also pointed out: "No company can make such a large language model in a few months. Deep learning and natural language processing require years of persistence and accumulation, and cannot be achieved quickly."

Costs and thresholds this high are beyond what ordinary enterprises can afford, and this is precisely why no industry large model product on the market has yet achieved a truly successful rollout. Huangfu Ziqiao said that the cost of using large models is the biggest obstacle for many enterprises that want to use them to empower their businesses, and that the Shenzhou Wenxue platform is positioned to let enterprises use large model products at a lower selection cost through open source. "There are two main parts: one is the platform, and the other is out-of-the-box scenario applications," Huangfu Ziqiao told Titanium Media. "On the one hand, we hope these two parts will bring together more ecosystem partners to jointly empower users; on the other hand, we hope enterprises can use large model products faster and more conveniently."

Reducing the cost and threshold of large models is an industry consensus. Hard-to-find GPUs and high electricity bills are both barriers for enterprises applying large models at this stage. Platforms such as Shenzhou Wenxue, Baidu Qianfan, and Kunlun Wanwei differ in style but share the same goal. As platform-level products that "help large models land" emerge, and as the number of partners in the large model ecosystem grows, the threshold and cost for enterprises to apply large models will fall further, and industry large models will move closer and closer to being broadly accessible.
