Technology invades the modern world
Chapter 348 The Meta That Was Desperate
There is an ancient Chinese proverb that says, "Seven days in the mountains are like thousands of years in the world." Nilanjan's current feeling is similar to that proverb.
After a quick wash, he was taken to META's headquarters in New York. Zuckerberg looked at him with a reverent expression: "Professor Balasubramaniam, I know you have extraordinary expertise in the field of artificial intelligence."
Nilanjan thought to himself, "It's really impressive that Zuckerberg can pronounce my surname so clearly."
Then, Zuckerberg's words began to surprise him.
"Do I have extraordinary expertise in the field of artificial intelligence?" Nilanjan pondered this question, wondering if it was another trap. But then he thought, "Someone like Zuckerberg, a top billionaire in the world, wouldn't stoop to such a stunt."
Moreover, as a professor of artificial intelligence at Stony Brook University, it's not unreasonable to say that his expertise is extraordinary.
“I do have some insights into artificial intelligence,” Nilanjan said with a smile. The past year of torment in prison was finally over, and he was about to be reborn. The confident smile, the composed posture, and the sharp mind had finally returned to him.
Zuckerberg laughed even harder after hearing this, "As expected of Randolph's professor, I knew you were extraordinary!"
Zuckerberg managed to get Nilanjan out of prison without much trouble, considering he's a long-time staunch supporter of the Democratic Party and has donated a considerable amount of money to them.
Nilanjan hadn't actually committed any crime. The FBI investigated again and again but couldn't find any connection between Nilanjan and the Apollo moon landing, nor any decisive evidence.
He had been kept in custody before, but only as a scapegoat. It seemed quite fitting to have an Indian professor with no background take the blame for China's first moon landing in the 21st century.
But once Zuckerberg intervened, Nilanjan became an insignificant figure again, and getting him out was easy.
Moreover, the fact that he had been held for more than a year suggested that he really did have some skill.
"Professor Balasubramaniam, what are your thoughts on large language models?" Zuckerberg asked.
Nilanjan's brain started racing. After all, this concerned his own safety! He had to demonstrate his value in order to remain on bail, or even be acquitted.
He gave a bitter laugh to himself: What kind of situation is this? I am clearly innocent, but now I have to prove my worth in order to be innocent. What's wrong with this country?
"I think this is a very promising direction. My paper 'DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering', which I published at the ACL conference a few years ago, addressed the pain points of Transformer-based QA models: the slow computation and high memory usage caused by input-wide self-attention at every layer. I proposed DeFormer, a decomposed variant of the Transformer.
"In the lower layers, DeFormer replaces full self-attention with question-wide and passage-wide self-attention, so the question and the passage do not attend to each other there.
"This allows the input texts to be processed independently and the passage representations to be pre-computed, which significantly reduces runtime computation.
"DeFormer's structure is similar to the original Transformer's, so it can be initialized directly with pre-trained weights and fine-tuned on the QA dataset.
"Our experiments show that the DeFormer versions of BERT and XLNet are more than 4.3 times faster on QA tasks, and with a simple distillation-based loss they give up only about 1% in accuracy."
Nilanjan was referring to his paper published at the ACL conference in 2020, a classic work on LLM optimization at the time, when the most popular language model was BERT. The paper built directly on pre-trained Transformers. Computational cost, the bottleneck of such models, became prominent in downstream tasks, and the paper offered a partial solution.
"Including another work I did in 2020, which actually shares a similar core logic with the core of LLM, namely multi-layered attention."
Nilanjan is certainly not a mediocre talent. He has indeed been immersed in the field of artificial intelligence for many years and has achieved remarkable results. He has several top conference papers, all of which are related to LLM.
That was back in 2020. At that time, large-scale models were still relatively unknown and considered a marginal area in the field of artificial intelligence.
Zuckerberg wasted a lot of money renaming Facebook to META and misjudged the arrival time of the metaverse, but that doesn't mean he's brainless. He didn't hire Nilanjan simply because he was Lin Ran's professor.
Nilanjan himself is truly skilled, which is also a very important reason.
Nilanjan has conducted in-depth research on key aspects of large models, including self-attention mechanisms, multi-head attention, and positional encoding, since one of his important research areas is NLP.
Zuckerberg was overjoyed, feeling he had found the right person.
"Professor Balasubramaniam, how do you handle overfitting or underfitting problems when training LLM?"
"Large-model training comes in stages. Pre-training learns a general representation from massive amounts of unlabeled data, using objectives such as masked language modeling or next-sentence prediction; fine-tuning adjusts the weights on a task-specific dataset to achieve transfer learning.
"For overfitting, I suggest regularization and dropout, for example a dropout rate of 0.1 in BERT variants, together with early stopping; for underfitting, increase the model depth or use data augmentation.
"In previous projects, I addressed training instability through gradient clipping and reduced the overfitting rate from 15% to 5% on the GLUE benchmark. This makes large-model training more efficient in multi-task adaptation." Nilanjan was confident.
Questions like this were child's play for him.
Zuckerberg then asked some further questions about parameter-efficient fine-tuning, the main challenges of multimodal models, and the causes of hallucinations and strategies for mitigating them, all of which Nilanjan answered fluently.
After listening to him, Zuckerberg realized he had found the right person.
He had been imprisoned for more than a year, but when he came out he could still talk fluently and keep up with the latest developments. He was clearly a pioneer in the field of large models.
Furthermore, he had trained a top genius like Randolph Lin. If Lin could create Crimson, it wasn't unreasonable for META, under Professor Balasubramaniam's leadership, to create Deep Blue, right?
Zuckerberg's already smiling face beamed even brighter: "Professor Balasubramaniam, welcome to META. You will serve as META's Chief Scientist in the future, leading us forward."
He pressed a button on the table, and a META staff member came in with a contract. Zuckerberg handed it to Nilanjan: "Professor Balasubramaniam, congratulations, you will become a billionaire."
Nilanjan picked it up and was stunned; the annual salary was one hundred million US dollars.
This number made him hesitant to sign.
If Zuckerberg could get him out, he could also put him back in.
With an annual salary of one hundred million US dollars, if he couldn't produce anything, wouldn't he be locked up until he died?
"Boss, isn't this number a bit too much?" Nilanjan asked cautiously.
Zuckerberg was also shocked. There were actually Indians who would complain about their salaries being too high? There are plenty of Indian executives at META, and quite a few Indian scientists as well. They would just tell him how much they had contributed and hint at whether they could get a raise.
Nilanjan was the first Indian he had ever met who thought he was being paid too much.
"No, Professor Balasubramaniam, don't worry, this price is not high at all.
"You've just been released from prison and have no idea what has happened in the world over the past year.
"If you knew what had happened, you would know that this number is reasonable."
After Zuckerberg finished speaking, Nilanjan glanced at the contract again before signing his name.
Zuckerberg smiled and shook hands with him for a photo.
The following day, META officially released a public announcement:
"The company will appoint renowned artificial intelligence expert Professor Nilanjan Balasubramaniam as its Chief Scientist, responsible for leading research on large-scale artificial intelligence models.
"This appointment comes at a time when the emergence of generative pre-trained Transformer models is sparking a global AI revolution. Meta is committed to accelerating open-source AI innovation and promoting the development of safer and more efficient AI technologies.
"Professor Nilanjan Balasubramaniam is currently affiliated with the Department of Computer Science at Stony Brook University, State University of New York, and has over 15 years of research experience in Natural Language Processing (NLP) and Machine Learning.
"His pioneering work includes developing the DeFormer framework to make pre-trained Transformer models more efficient in question-answering tasks, and exploring the application of event representation and attention mechanisms in user personality prediction. These achievements have been published at top conferences such as ACL and AAAI and have been widely cited.
"The professor's expertise will help Meta continue to innovate on the Llama series of large models, ensuring that the application of AI technology in social networking, the metaverse, and global connectivity is more inclusive and reliable.
"As Chief Scientist, Professor Balasubramaniam will lead the Meta AI research team, focusing on key areas such as multimodal model optimization, hallucination mitigation, and sustainable computing. His joining marks a further strengthening of Meta's strategic investment in AI, aimed at providing smarter and safer digital experiences for users worldwide.
"About Meta: Meta builds technologies that help people connect, find communities, and grow businesses. Through our apps and services, we are committed to making the world more connected."
Zuckerberg then posted on Facebook: "As the GPT model ushers in a new era of AI, we need top talent to lead the future of open source AI.
"Professor Nilanjan's profound academic background and practical innovative capabilities will help us build more efficient and responsible large-scale models, promoting the harmonious progress of humanity and technology. I can't wait to collaborate with Professor Nilanjan Balasubramaniam!"
The market is paying more attention to Nilanjan's other identity: Randolph Lin's doctoral advisor.
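Meta had only just announced in November 2022 that it would lay off 11,000 employees, and in mid-March 2023 it announced that it would lay off another 10,000.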
With successive layoffs, a focus on AI, and efforts to reduce costs and increase efficiency, META's message to the outside world is very clear.
That day, META's stock price surged by more than 7%. A company with a market value of $180 billion had gained over $10 billion in market value just by recruiting Nilanjan.
Nilanjan had already earned the company a hundred years' worth of his salary.
Nilanjan only learned what had happened outside the prison when he returned home.
"What's with this 'big model' theory? Tech giants are always talking about it these days?"
"GPT is too strong. I always thought that the LLM path had great potential, and it turns out to be true."
"Why do I feel like Crimson is even easier to use than GPT?" Crimson only allows registration with Chinese mobile phone numbers and does not allow registration on external networks. It adopts a similar strategy to GPT, only opening up to specific regions.
Therefore, Crimson is not usable abroad, but there are videos of Crimson being used all over YouTube and TikTok.
After all, many foreigners in China and Chinese students studying abroad share these things on the internet.
There is indeed a certain degree of isolation between the Simplified Chinese internet and the external internet, but this isolation is very thin, like a thin layer of paper.
AI enthusiasts on the internet are drooling over Crimson, as it's free and seems even better than GPT.
GPT-4 is a paid service.
Nilanjan felt immense pressure. He wasn't even sure he could handle GPT-4, let alone Crimson.
He didn't dare call Lin Ran. His thick skin would allow him to do so, but his experience of being taken away by the FBI made him afraid to.
What if they later accused him of some fabricated crime? Wouldn't that be the end of him?
Now he was a top-tier employee with an annual salary of a hundred million dollars, the chosen one of Indian descent. Indian newspapers called him the Father of Crimson, believing that Crimson was a technology Lin Ran had stolen from Nilanjan.
Indian media are just that confident.
Some Indian newspapers even called on Nilanjan to transfer his technology to Indian companies, saying that just as China has its Crimson, India needs its White Elephant!
This is because of the competition between India and China, which is known as the rivalry between the dragon and the elephant.
Nilanjan started calling his friends and acquaintances, bringing all his Indian friends to META so that the elephant herd could unleash its full power!
A single elephant in the jungle poses little threat, but a herd of elephants in the jungle would make even the king of beasts retreat.
As the chief scientist of META, he has the right to hire people, in addition to his salary.
The META LLM group would gather the strength of the elephant herd. Then Nilanjan thought to himself, "Forget it, I still need to recruit some Chinese people to do the work. Let me think about which Chinese professors I have good relationships with, and ask them to recommend some students. They have to be the best of the best, only those from Tsinghua University and Yenching University. No, SJTU is fine too."
Once Nilanjan regained his senses, he realized that a group made up entirely of people of Indian descent would have a problem: everyone would have strategic thinking but no one would handle tactical execution, which was clearly not going to work.
Nilanjan moved his family to California, and Zuckerberg changed the rule that he couldn't leave New York during his bail period to that he couldn't leave America during his bail period.
The AI lab was the most watched and best-resourced department in all of META, and Zuckerberg visited it almost every day. He noticed that there were more and more Indian and Chinese people there, and fewer and fewer white people.
A month later, in the entire AI research and development office, there were about two hundred people, only of Indian and Chinese descent, no white people left.
“Great, we’re on the right track,” Zuckerberg thought.
While Zuckerberg felt he had discovered a new continent, having found in Nilanjan both a chief scientist and a kindred spirit, Baidu had fallen into an unprecedented slump.
The sudden appearance of Crimson turned Wenxin Yiyan into a roadside nobody before it was even born.
Wenxin Yiyan only held a press conference and opened applications; Baidu then opened up beta testing slots based on the applications.
Then an uploader on Bilibili posted a comparison video of Wenxin Yiyan and Crimson. The contrast was too stark, like comparing an elementary school student to a doctoral student. Baidu couldn't take it and shut down the beta test completely the next day.
Had it been released? Yes. Could any users actually use it? No. It was released in name only.
The closed beta had become truly closed: internal only, with no outside testers.
Baidu is facing an unprecedented crisis, and even those within Baidu themselves don't know what to do.
Because their original plan was to charge a fee, just like GPT. Wenxin Yiyan would charge 50 yuan per month, becoming the first AI big model, the first to charge a fee, the first to achieve break-even, and then expand investment, improve user experience, kill competitors, and dominate the AI big model simplified Chinese internet market.
Baidu's dream is to create a positive cycle that keeps snowballing.
As a result, it got stuck in the first step.
You're charging for a model meant for elementary school students?
Four days after its launch, Crimson surpassed ten million users. The most liked comment under the Weibo post celebrating this milestone was an image: a quote from Wenxin Yiyan on the left, Crimson on the right, and two numbers below:
"10000000:0"
This meant that Crimson already had ten million users, while Wenxin Yiyan had zero. The latter had even held a grand press conference, while the former had only an eight-second short video.
The contrast is so stark that the sheer number itself has driven the first nail into the coffin of Baidu in the minds of netizens.
The internal pressure at Baidu is immense, to the point of being explosive.
From the day after Wenxin Yiyan was released, the entire Baidu building was under a cloud of low pressure. No one dared to speak loudly, for fear of angering the big boss and being laid off.
In the conference room of Baidu's building, executives from all walks of life gathered together. In addition to the executives, experts in the field of artificial intelligence from within Baidu were also present.
“Boss, Tencent isn’t just slapping Wenxin Yiyan in the face, it’s slapping us in the face! This is blatant provocation!” the secretary said. “I think someone has to take responsibility this time. Clearly, someone’s technical assessment was inadequate, and someone misjudged the situation. Before Wenxin Yiyan was even mature, they were trying to seize the market in order to claim credit.”
An executive in charge of administrative work pointed the finger at CTO Wang Haifeng.
Wang Haifeng sincerely apologized: "I did misjudge the situation. I didn't expect Tencent to make such rapid progress, catching up with or even surpassing OpenAI's progress in just two months."
Before Wang Haifeng could finish speaking, Robin interrupted, "Hey, Haifeng, that's enough. I don't think anyone in this world could have predicted this beforehand.
"It's like Professor Lin Ran landing on the moon in just over a year; they accomplished the impossible. I won't blame my subordinates for that. We didn't lose because anyone here fought badly."
Robin's words made everyone realize that this meeting wasn't about finding problems or engaging in infighting; to put it more bluntly, it wasn't about finding someone to take the blame, but rather about hoping to solve the problems.
"Haifeng, how long will it take us to catch up?" Robin asked.
Wang Haifeng said frankly, "I expect that by July we can reach 80 percent of GPT-3's performance, match GPT-3 by December, and match Crimson by July of next year."
After a pause, Wang Haifeng said, "The worst thing about Crimson is that it's not open source, so we can't learn from it."
It's not plagiarism, it's borrowing!
“GPT is closed source too, but only from GPT-3 onward; GPT-1 and GPT-2 were open source.
“Open source means that we at least know how it works and what technologies it uses. Even if we don't know the specific technical details or engineering implementation methods, we can figure them out over time.
“But Crimson is different. Crimson is a complete black box, and we don't even know if it follows the same technological route as GPT.
“Why does Crimson perform so well in Chinese contexts, and why is its handling of Chinese texts so outstanding? We don't know whether they use existing technologies or Professor Lin Ran's original techniques.
“I believe there's something we absolutely must do right now.”
"What is it?" Robin asked.
"We need to recruit people from Crimson to find out exactly how they do it. The more information we have, the faster we can catch up. Even knowing just one direction is much better than blindly wandering in the dark like we are now.
"Moreover, Robin, we are a big company, and the biggest advantage of a big company is its resources and cash.
"We need to figure out how Crimson does it. We need to recruit people from both OpenAI and Crimson, and internally have two teams working on the same project, one following GPT's technical roadmap and the other following Crimson's.
"When it comes to allocating resources, priority will be determined by performance."
This is standard practice for large companies; it's a blatant, overt strategy.
I just copy the products from small factories. You can't fight back anyway. I can't solve some core engineering problems in a short time, and I can't afford to waste time seizing the market, so I'll poach people from you.
“I know this is definitely the solution, but the problem is speed. Crimson is progressing too quickly.” Robin looked worried, all his handsome and gentlemanly demeanor gone. “It's unlike our past competition, where user growth was linear and would slow down after reaching a certain point.
“Back then, as a latecomer, we could catch up because the other side's growth was slowing while ours was still fast.
“But the growth of large models is not logarithmic, nor linear; it is exponential.
“Even if we successfully poach key engineers, it will still take us at least six months to launch our own Crimson, right?”
Wang Haifeng smiled bitterly to himself. Six months? Six months would only be possible if they poached Professor Lin Ran.
(End of this chapter)