I am the only one who practices magic: I practice magic in the city
Chapter 523 Tower of Babel
Breaking the language matrix!?
Sergey Brin shuddered and immediately returned his attention to the densely packed graphs on the screen.
Snatching the mouse from Demis Hassabis, Sergey Brin kept scrolling down.
Dazhou language, Prussian language, Gaulish language, Anglo language, Japanese language, Mao language, Xibai language, Portuguese language, Italian language, Barat language...
More than 300 curve comparison pictures all illustrate this fact:
The various performance curves of Juzi2.5 in fifteen languages are almost all on the same level!
Whether it is understanding and memory, reasoning and cognition, autonomous planning and decision-making, self-optimization and learning, emotion and social simulation, tool calling...
Except for the large fluctuations in generation and expression, the performance of almost all other abilities, especially reasoning and cognition, is almost exactly the same in various language environments!
There is not even one percent difference!
How can this be!?
This completely goes against the principles of large models!
"Did they adjust the parameters to bring the performance of all languages into line? Or did they translate into English first, think in English, and then translate back into the thought chain?"
Sergey Brin felt sweat suddenly appear on his head and under his armpits.
The "intelligence" in artificial intelligence means thinking, and thinking requires language. Whether human or computer, nothing can think without language.
This is especially true for large-model AI.
A large model may master every language in the world, but when it thinks in different languages, its performance differs from one ability to the next.
Firstly, it is because of the differences in the amount and quality of training materials for different languages.
In today's Internet age, the Anglo-language corpus is of course the most abundant, accounting for more than 80% of all Internet data.
When AlphaZero uses English for understanding and reasoning, its accuracy rate is more than 5% higher than that of other languages.
Secondly, different languages themselves have different "expression preferences" and "performance differences".
For example, Prussian is faster than English in structural reasoning, while Spanish has a clear advantage in sensory corpus.
Large models usually use a single language to build their reasoning path in a single thinking loop.
Although it can recognize multiple languages at the input stage and translate at the output stage, its inherent cognitive tensor structure still tends to use the token space constructed by the input language for semantic calculations.
Put simply, within a single thinking loop a large model basically thinks in only one language. If you use English, it thinks in English. If you use Zhouwen, it thinks in Zhouwen.
Even if it mixes other languages into its replies, it is just a citation of materials or an imitation of human writing style, rather than true cross-lingual thinking.
How could the Juzi model possibly perform almost identically in every respect when thinking in different languages?
This is totally against common sense!
The only possibility is that they leveled the thinking performance across all languages.
Put simply, it is like a wooden barrel: take the shortest stave as the baseline and saw every taller stave down to match.
But what’s the point of doing this?
Demis Hassabis shook his head hesitantly: "Probably not. A leveling operation like that would waste too many resources."
"As for whether to translate thoughts into English or other languages first..."
Demis Hassabis paused.
"I thought so at first, but after testing, it turned out not to be the case."
After saying that, Hassabis turned to the middle of the lab report.
"Look at 'abstract induction' and 'formal reasoning'. Even when using Malay, Juzi2.5 can still perform abstract induction and formal reasoning accurately."
"Take this example on emotion understanding: our experimenters asked Juzi to think in Malay and output the results in English. In its response to the experimenters, the Juzi model did not simply translate the Malay word 'manja' as 'pampered' or 'affectionate'."
"Instead, it uses different expressions that carry the actual meaning."
"For example, in the first paragraph, its translation of 'manja' is 'cute and clingy'."
"In the fifth paragraph, the thought chain still says 'manja', but because the subject has changed, the meaning of 'manja' in Malay has shifted subtly as well. Here it renders 'manja' as 'being spoiled'."
Demis Hassabis took off his glasses, wiped them, and squinted. "Originally, the Anglo language had no accurate translation for the word manja. But rendered this way, even a Malay speaker who has never studied Anglo would cause no misunderstanding when using it."
Sergey Brin looked at the graph that Demis Hassabis pointed to, and his hair stood on end.
As a tech geek and one of Google's bosses, he was certainly no longer the best at the technology itself, but his understanding and knowledge of artificial intelligence were definitely among the best in the world.
How is this possible?
Malay is an isolating language with a relatively flat grammatical structure, and its cultural context is colloquial and situation-driven, which leaves it inherently limited in abstract, philosophical, and technical expression.
As a result, the language itself lacks some high-level conceptual vocabulary, and speakers can often only rely on descriptive translation or the direct introduction of foreign words.
Philosophical terms such as "consciousness", "existence", "subjectivity" and "objectivity" do not exist in Malay and can only be borrowed directly from foreign languages or paraphrased descriptively.
But at the same time, Malay also has a considerable number of "flexible words" that are not found in Anglo and other Western languages.
The meanings of these words are often very subtle, and the corresponding words in the dictionary are more or less different.
This results in subtle differences in AI's understanding of the world and relationships when different corpora are used to train large models, or when large models are used in different languages.
This "subtle difference" may seem insignificant, but in fact it is often one of the important causes of cultural misunderstanding and conflict.
"Sergey, here is a more representative one, which is its understanding of 'sin'."
Demis Hassabis tapped the touchpad and swiped upwards: "Look here, this is Juzi2.5G's comparison of the Zhouwen and Angwen thought chains on the same topic."
"Oh, maybe you don't know this: in Zhouwen, 'sin' is generally translated as '罪', but the meaning of '罪' in Zhouwen is not the same as 'sin'."
Demis Hassabis was of mixed Zhou descent and knew a little of the Zhou language, but originally he could not tell the subtle semantic differences here.
But for a genius like him, once he had realized the problem and started studying it, it did not take long to understand the conceptual difference clearly.
"But all along, virtually every translator has ignored this point and mechanically translated the Zhou word '罪' as 'sin' and the Ang word 'sin' as '罪'."
"Juzi 2.5 is different. When explaining legal issues, it translates 'sin' into Zhouwen as '罪' in the normal way."
"In the later question of faith, it uses at least six different expressions of sin in Zhouwen, depending on the context, namely 'disobedience', 'offense', 'fault', 'ungratefulness', 'evil way' and 'suffering'."
"These different expressions, in their respective contexts, capture the closest expression to the original meaning of the word 'sin' in that context, and will not cause the other party to make an erroneous subjective judgment due to subtle differences in wording."
"Oh, yes, even the two common words 'subjective' and 'judgment' show subtle semantic shifts in Zhouwen and Angwen."
After a moment's thought, Demis Hassabis's eyes filled with shock.
Sergey Brin frowned.
He certainly understood every term Demis Hassabis had used.
But he had not yet established a complete causal mapping between "Hassabis's linguistic introduction" and "the convergence of Juzi2.5's thinking performance under multilingual conditions."
This seems like just... better translation software?
What does it have to do with the performance of the Juzi large model?
No, this performance is beyond the capabilities of the existing multilingual large model - there must be some mechanism behind it that we have not yet mastered.
This mechanism enables the Orange model to deeply understand the precise meaning of different languages in different contexts, and even uses "explanation substitution" and "tone fitting" in translation to translate the original text more accurately.
Wait, what did Demi say before looking at this lab report?
Breaking through the language matrix?
I was a little confused earlier about what this language matrix was.
In other words...
"Demi, what you mean is... Juzi2.5's thinking is not done in any particular language, but... but..." Sergey scratched his head in frustration, but he just couldn't find the right word for what he had grasped.
"It breaks down language boundaries completely and uses all the languages in the world to form a 'high semantic mapping graph'."
Demis Hassabis took a deep breath and finished the thought for Sergey.
“High semantic mapping graph! That’s right! This is it!”
Sergey Brin slapped his thigh hard!
"However, this term is still too professional. To put it in a more figurative way... it combines languages from all over the world to create an unambiguous language that only it can use and understand!"
"A language that transcends language families and semantic differences... this language can be called an 'omni-language'."
After saying this, Sergey Brin's face turned pale. He wiped the sweat from his head, his eyes conflicted, as if he still refused to believe it.
"Omni-language, OMG, does such a language really exist?"
"And a language that only AI can use, at that?"
"But the problem is that Juzi1.99DEC is open source and does not have this function at all."
Demis Hassabis nodded and said, "Yes, not only 1.99DEC; even the earliest version of 2.5 did not achieve this. At that time, the performance differences under different language inputs were still quite obvious."
"This feature first appeared in the first update after release, version 2.5N, half a year ago."
"We actually noticed something at the time, but we didn't pay much attention to it. We just guessed that they had used the leveling method, or that their Anglo training corpus was not rich enough."
"But then they had several version updates, and each time there was a performance improvement."
"A month and a half ago, while testing smaller languages such as Annamese, Li from the Google Brain team realized for the first time that the new version of Juzi 2.5 thinks in small languages beyond the limits of Annamese itself, with reasoning ability almost the same as in English."
“After that, we started to analyze it in depth.”
“It’s so shocking, so shocking.”
Demis Hassabis shook his head, also looking pale.
"Sergey, you should know what this means."
"The Juzi large model has broken through the language barrier and can even optimize and create languages. How could it not have the ability to optimize functions!?"
“Optimizing and creating languages is much more difficult than optimizing functions that only involve logical reasoning!”
"It's just that for some unknown reason, Yuzu Technology has not released this function!"
"More importantly, they use the 'omni-language' for reasoning and cognition, which will be far more efficient, more accurate, and even faster than any large model that reasons in a single language."
"This of course includes our AlphaZero."
"Sergey, we have no chance of winning. Not at all."
Demis Hassabis took off his glasses again and closed his eyes in pain.
“Even if AlphaZero can continue to evolve and really achieve the transformation from 0 to 1, it is impossible for us to catch up with Yuzu Technology in terms of innate ability.”
"They are building the Tower of Babel, Sergey."
Tower of Babel?
Sergey Brin's face turned pale at first, then red when he thought of Ysou, and then darkened again when he thought of Youmi OS.
It was as if he had been drinking imperial wine.
Whether he had paid 180 yuan a cup for it, who could say.
Although Ysou's current market share is not enough to undermine Google's monopoly in the global search engine market, everyone inside Google is clear that its current lead is only due to the overwhelming mobile search volume brought by Android.
On the desktop side, Ysou has already eaten away nearly 40% of Google's market share!
Fortunately, on the mobile side, Google is the only default search engine on every Android phone outside Dazhou, and a considerable proportion of mobile users never change the default search engine, so Google's market share is still as stable as a rock.
But just recently, Dami actually teamed up with Yuzu Technology to come up with Youmi OS!
When asked to provide the source code, Dami refused on the grounds that it "did not violate the MADA default agreement."
The overseas version of Mix released by Dami does not change the default search engine, and the browser engine is still Google, but does this system still need a default browser search engine?
Three days after the release of Dami Mix, Google held a top-level meeting to discuss whether to initiate direct prosecution procedures against Dami.
But this was a case of hitting a wolf with a hemp stalk: both sides were afraid.
Dami was afraid of having its GMS certification revoked, while Google was afraid of a public backlash, and afraid that a ban would let Youmi OS openly partner with the major phone manufacturers.
Of course, with support and leadership from the US government, the risk of direct prosecution would be much smaller.
However, due to the Alcatraz incident, anti-corruption sentiment in Miami was on the rise, and with the election approaching, it was basically impossible to reach a consensus to initiate direct prosecution at this time.
With Jude's background, Google is now really afraid to act rashly and can only put it on hold temporarily and resume lobbying with all its strength after the election.
But if this happens, it will take at least half a year.
No, it may take even longer, and there may be more variables, and direct judgment may never be initiated.
For example, after Warren is elected, it is highly likely that the Glass-Steagall Act will be reinstated to limit the reach of Jude Capital.
The most important thing is that the discovery of the Tower of Babel is so amazing that Sergey Brin can no longer sit still. He feels that every minute of waiting is a slow suicide!
Therefore, the elected leader must be able to initiate a direct investigation as soon as possible after the election!
"Larry, what are the search data like now? Who do you think has a better chance of winning the election?"
Sergey Brin's face was gloomy. He took out his Pixel phone and sent Larry Page a message on Google Duo.
"There's no doubt that Warren is seven points ahead of that idiot Thomas Cotton in search volume."
Not long after, Larry Page, who was in Fiji, sent a message back to Sergey Brin.
Warren...
Sergey Brin's face was gloomy.
The ideology of the Minzhu faction suits Internet companies very well, but since the end of the Great Purge only one tenth of its internal Jude forces remain, which has increased cohesion. Now anti-Jude consciousness within the faction is on the rise.
The congratulations faction is still friendly to the Judeans and supports Israel without any bottom line, while at the same time clamoring for retaliation against anti-Jude speech and forces in the country.
But their ideas are not suitable for Internet companies!
The most important thing is that they can't win!
To ordinary people, the two factions in the election seem to be extremely tense, but to Google, which has the most powerful big data capabilities, it is as clear as day.
The search share differs by seven percentage points. Search data and election results are not perfectly consistent, but a seven-point gap is enough to cover any error!
Damn ASF!
If it weren't for what he did over the UN, Google wouldn't be so passive now!
Now it's like choosing the less smelly piece of shit between two pieces.
"How's the lobbying going? Does she agree to investigate Yuzu Technology immediately after being elected?"
"We can't wait any longer. Since we are sure that Warren will be elected, let's lobby now."
"Make sure she launches an investigation into Yuzu Technology the day after she is elected!"
Competition is never just a business contest.
(End of this chapter)