Huawei's large model is finally here. My verdict: quite stunning

Original source: Bad review

Image source: Generated by Unbounded AI‌

Huawei, long said to be lagging in the large-model race, has finally shown up with the real goods.

At yesterday's Huawei Developer Conference 2023, Huawei put on quite a show.

The nearly three-hour event kept Huawei's usual everything-at-once style, and it left Shichao dizzy.

Boiled down, though, it all came back to one theme: Pangu Large Model 3.0.

In fact, just a few days earlier, while other large models were still comparing benchmark scores, Pangu caught everyone's attention in its own way: with the stamp of approval of a paper published in the top journal Nature.

It is said that with the Pangu large model, weather prediction is more than 10,000 times faster, and results arrive in a few seconds. Where a typhoon will go, when it will arrive, and when it will leave: the model can give a clear forecast for all of it.

Most importantly, its forecast accuracy even surpasses the IFS system of the European Centre for Medium-Range Weather Forecasts (ECMWF), which is known as the world's strongest. It is the first AI forecasting product to beat traditional numerical prediction.

Most earlier AI weather forecasts were built on 2D neural networks, but the weather is too complicated for 2D to cope with.

Moreover, earlier AI models keep accumulating iteration error as the forecast proceeds, which easily hurts the accuracy of the results.

As a result, AI forecasting methods never caught on.

The Pangu weather model is impressive. It uses a three-dimensional neural network called 3DEST (3D Earth-Specific Transformer) to process meteorological data. Where 2D falls short, 3D steps in.

(Figure: 3DEST's network training and inference strategy)

To tackle iteration error, the model also uses a "hierarchical temporal aggregation strategy" to reduce iteration error and improve forecast accuracy.

The term sounds intimidating, but it is actually very easy to understand.

Take the earlier AI weather forecasting model FourCastNet. Before a typhoon arrives it forecasts 6 hours ahead, and during those 6 hours the model recalculates again and again when the typhoon will come.

One pass might say 5 hours, another four and a half, and the errors from all these passes add up to a large one.

The Pangu weather model came up with a fix: train 4 models with different forecast intervals, one each that steps 1 hour, 3 hours, 6 hours, and 24 hours per iteration.

Then, depending on the forecast required, it selects the matching models to iterate.

For example, to forecast the weather 7 days ahead, run the 24-hour model 7 times; to forecast 20 hours ahead, run the 6-hour model 3 times and the 1-hour model twice.

**The fewer the iterations, the smaller the error.**
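
The selection rule described above can be sketched in a few lines. This is only an illustration, assuming a simple greedy choice (largest interval first); the function names `plan_iterations` and `total_iterations` are hypothetical, and nothing here reflects Huawei's actual code.

```python
# Forecast intervals (in hours) of the four models described in the article.
MODEL_STEPS_HOURS = (24, 6, 3, 1)


def plan_iterations(lead_time_hours: int) -> dict:
    """Greedily cover the lead time, using the largest interval first.

    Returns {interval_hours: number_of_iterations}.
    Assumption: a greedy choice is how the models are combined.
    """
    if lead_time_hours < 0:
        raise ValueError("lead time must be non-negative")
    plan = {}
    remaining = lead_time_hours
    for step in MODEL_STEPS_HOURS:
        count, remaining = divmod(remaining, step)
        if count:
            plan[step] = count
    return plan


def total_iterations(plan: dict) -> int:
    """Total number of model runs in a plan."""
    return sum(plan.values())


print(plan_iterations(7 * 24))  # {24: 7}
print(plan_iterations(20))      # {6: 3, 1: 2}
```

A 20-hour forecast therefore needs 5 iterations, against 20 if only the 1-hour model were used.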

This move has taken weather forecasting to a new level.

Some readers may be muttering by now: everyone else's large models generate images and text, so how did Huawei end up doing weather forecasts?

To be fair, the Pangu model really is different from the ChatGPT and Midjourney we have used before. It is in the business of serving industries.

Put simply, individuals like us will not be using the Pangu model.

**It is not the ChatGPT "nemesis" everyone was expecting; it targets the B2B market that ordinary users rarely come into contact with.**

Whether or not that is hard, the enterprise customer base Huawei has built up over the years is at least easy to monetize.

And the weather forecasting model was not the only heavy hitter Huawei brought to this event.

No new antibiotics had been discovered for more than 40 years, yet the super antibacterial drug Drug X was found as soon as the Pangu drug molecule model arrived. The drug development cycle was shortened from several years to several months, and R&D costs were cut by 70%.

The Pangu mining model can also reach into more than 1,000 coal-mining processes. In clean coal selection alone, it can raise the clean coal recovery rate by 0.1% to 0.2%.

For a coal preparation plant with an annual output of 10 million tons of coking coal, every 0.1% increase in the clean coal recovery rate adds 10 million to annual profit.
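
The arithmetic behind that figure can be checked quickly. The sketch below assumes the 0.1% gain applies to the plant's full 10-million-ton annual output, which the article does not state explicitly, and it leaves the currency unit unspecified, as the article does. The variable names are illustrative only.

```python
annual_output_tons = 10_000_000   # coking coal per year, from the article
recovery_gain = 0.001             # a 0.1% increase in clean coal recovery
extra_annual_profit = 10_000_000  # profit gain quoted in the article

# Assumption: the recovery gain applies to the full annual output.
extra_clean_coal_tons = annual_output_tons * recovery_gain

# Profit implied for each additional ton of clean coal recovered.
implied_profit_per_ton = extra_annual_profit / extra_clean_coal_tons

print(round(extra_clean_coal_tons))
print(round(implied_profit_per_ton))
```

Under that assumption, the claim works out to roughly 10,000 extra tons of clean coal a year, at an implied profit of about 1,000 per ton.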

**That is all cold, hard cash...**

In fact, beyond the weather forecasting, drug development, and coal preparation mentioned above, the Pangu model is already in use in many industries.

At the event, Tian Qi, Chief Scientist of HUAWEI CLOUD AI, said that HUAWEI CLOUD's AI has been applied in more than 1,000 projects, 30% of them in customers' core production systems, boosting customer profitability by an average of 18%.

Huawei can mass-produce large models for so many industries thanks to the 5+N+X three-layer architecture of Huawei Pangu Model 3.0.

It is this structure that lets Pangu be deployed quickly in one industry after another.

Why is that?

Because when AI is deployed in industry, data is a major obstacle.

Zhang Pingan said at the press conference, "Due to the difficulty in obtaining industry data and the difficulty in combining technology with the industry, the implementation of large models in the industry has been slow."

**Pangu's approach is ingenious: the 5+N+X three-layer structure splits this one big problem into three smaller ones.**

First, the five large models in Pangu's L0 layer learned from hundreds of terabytes of text data, such as encyclopedic knowledge, literary works, and program code, as well as billions of Internet images with text labels.

You can think of it as first letting the L0 foundation models (five of them: natural language, vision, multimodal, prediction, and scientific computing) build up basic cognition, a bit like the general education we receive before university.

Then the models in the second layer, L1, are formed by taking one of the L0 foundation models and training it on data from N related industries. This is like the undergraduate stage, where you choose a major to study.

For example, CT image inspection in a hospital and image-based quality inspection in a factory both use the vision model.

But one is a hospital and the other a factory, and the usage scenarios are completely different. Relying on the foundation model alone will not work, but add industry data and you may be pleasantly surprised.

The final layer, L2, is like graduate school: it refines an industry model down to a specific scenario. For example, in the warehousing and logistics industry, the transport, storage, and outbound shipping of goods may each need a differently deployed model.
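
The three layers can be pictured as a simple parent-child hierarchy. The sketch below is purely illustrative: the `Model` class, its methods, and the industry and scenario names are all hypothetical, chosen only to mirror the hospital example in the text, and they say nothing about how Pangu is actually implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    layer: str                       # "L0", "L1" or "L2"
    parent: Optional["Model"] = None

    def derive(self, name: str) -> "Model":
        """Create a more specialised model one layer below this one."""
        next_layer = {"L0": "L1", "L1": "L2"}.get(self.layer)
        if next_layer is None:
            raise ValueError("L2 is the most specific layer in this sketch")
        return Model(name=name, layer=next_layer, parent=self)

    def lineage(self) -> List[str]:
        """Names from the L0 foundation model down to this model."""
        chain, node = [], self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain[::-1]


# L0: one of the five foundation models named in the article.
vision = Model("vision", "L0")

# L1: industry models built on the same foundation (hypothetical names).
medical_imaging = vision.derive("medical-imaging")
factory_inspection = vision.derive("factory-inspection")

# L2: a scenario-specific model within one industry (hypothetical name).
ct_screening = medical_imaging.derive("ct-screening")

print(ct_screening.lineage())  # ['vision', 'medical-imaging', 'ct-screening']
```

The point of the structure is reuse: both industry models share one foundation model, and only the lower layers need industry or scenario data.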

Huawei has also added a feedback loop, which is a bit like doing an internship at a company.

According to Huawei, developing a GPT-3-scale industry model used to take about 5 months; with this toolset, the development cycle can be cut to one fifth of that.

The limits imposed by small data sets in many industries can also be overcome. For example, even a very specialized industry such as large aircraft manufacturing can have its own large model.

Beyond this family of large models, Huawei also put forward something very interesting this time: domestically produced computing power.

As we all know, we are in a really awkward position when it comes to AI computing power.

First, we cannot buy Nvidia's H100/A100, the core equipment of the AI industry. Second, even though Nvidia "thoughtfully" released the H800 as a substitute, it is a held-back product. For example, its data transfer rate has been cut substantially.

When a large model already takes several months to train, it is easy to be overtaken by overseas competitors with stronger computing power.

To solve this problem, Huawei once again brought out some real hardware.

For example, on paper, Huawei's Ascend 910 processor already outperforms Nvidia's A100.

In practice, though, there is still a gap. And the A100 is not Nvidia's ultimate weapon either.

Still, Ascend has won over many users. Huawei even stated outright at the event that the computing power for half of China's large models is provided by Ascend.

Of course, Huawei's current strengths in computing power come more from its software ecosystem as a whole.

For example, according to the event, once you count the Ascend AI cloud computing base, the computing framework CANN, and other pieces, Huawei's efficiency in training large models is 1.1 times that of mainstream GPUs in the industry.

Huawei has also developed a full set of application kits for users.

For example, Meitu migrated 70 models to the Huawei ecosystem in just 30 days. Huawei also stated that **through the joint efforts of both parties, AI performance improved by 30% compared with the original solution.**

Still quite impressive.

Huawei also said it now has nearly 4 million developers, a number on par with the NVIDIA CUDA ecosystem.

**This series of moves makes up for at least part of the shortfall.**

Overall, after watching this Huawei event, we at Bad review feel that Huawei's AI strategy runs deep, and that the company has already begun to think about the question of "what AI can really bring us".

Over the past six months the AI industry has drawn thunderous applause, yet things get somewhat awkward when it comes to actual deployment in industry.

And this move by Huawei bears out what Ren Zhengfei said:

*"In the future, there will be a surge in AI large models, not just Microsoft. The direct contribution of artificial intelligence software platform companies to human society may be less than 2%, and 98% is the promotion of industrial society and agricultural society."*

In the field of AI, the real big era is yet to come.
