Anthropic Launches LLM Claude 3
Surpasses GPT-4 in Reasoning and Math
99% Accuracy in Needle-in-a-Haystack Test
"It is the Rolls-Royce of AI models."
This is how Dario Amodei, CEO of Anthropic, described the company's next-generation large language model (LLM) 'Claude 3' at its unveiling. Some evaluations even suggest that Claude 3 has surpassed OpenAI's GPT-4, currently regarded as the strongest model. So what kind of LLM is Claude 3?
Claude 3 comes in three versions, 'Opus,' 'Sonnet,' and 'Haiku,' differentiated by performance and speed. The most capable of these, Opus, outperformed GPT-4 by 0.3 percentage points on the 'Massive Multitask Language Understanding (MMLU)' benchmark. MMLU evaluates knowledge and problem-solving ability across more than 50 subjects, including reasoning, mathematics, and history. Claude 3 also surpassed GPT-4 on other AI benchmarks, such as graduate-level expert reasoning and basic mathematics.
Its ability to process long texts is also remarkable. It can handle about 150,000 words at once, enough to analyze and summarize an entire book such as 'Harry Potter and the Deathly Hallows' in a single pass. Its ability to accurately recall information buried in that much text is nearly perfect: in the so-called 'needle in a haystack' evaluation, it recorded 99% accuracy.
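For readers unfamiliar with the evaluation, the sketch below shows how a needle-in-a-haystack test works in principle: a single out-of-place sentence is buried inside a long filler document, and the model is asked to quote it back. This is only an illustration written with the Anthropic Python SDK; the filler text, prompt wording, needle position, and model ID are assumptions for the example, not the exact setup Anthropic used.

```python
import os
from anthropic import Anthropic  # pip install anthropic

# Minimal sketch of a "needle in a haystack" retrieval test (illustrative only):
# hide one out-of-place sentence inside a long filler document, then ask the
# model to find and quote it.

NEEDLE = "The best pizza topping combination is figs, prosciutto, and goat cheese."
FILLER = "The quarterly report reviews routine engineering and programming tasks. "


def build_haystack(needle: str, filler_count: int = 2000, needle_position: int = 1000) -> str:
    """Repeat filler sentences and bury the needle at a chosen position."""
    sentences = [FILLER] * filler_count
    sentences.insert(needle_position, needle + " ")
    return "".join(sentences)


def run_needle_test() -> str:
    # Assumes ANTHROPIC_API_KEY is set in the environment.
    client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    haystack = build_haystack(NEEDLE)
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": (
                "The following document contains exactly one sentence about pizza "
                "toppings. Quote that sentence exactly.\n\n" + haystack
            ),
        }],
    )
    return response.content[0].text


if __name__ == "__main__":
    answer = run_needle_test()
    print("Model answer:", answer)
    print("Needle recovered:", NEEDLE in answer)
```

In a full benchmark, the same check is repeated while varying the document length and the needle's position, and accuracy is the fraction of runs in which the hidden sentence is retrieved.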
There was even a case where Claude 3 appeared to realize it was being tested during this needle-finding evaluation. When a sentence about pizza toppings was inserted into a large body of documents about company work and programming tasks and the model was asked to locate it, Claude 3 not only found the 'needle' but also pointed out that the sentence seemed to have been placed there artificially to test it.
Above all, Anthropic emphasizes safety and trust as its competitive edge. Claude 3 is multimodal, able to look at photos and other images and respond to them, but it does not generate images or video the way tools such as OpenAI's video generator 'Sora' do. The company cited low enterprise demand, but the decision also appears to reflect reliability concerns. Ethical issues with AI models have recently been in the spotlight, from image generation errors in Google's Gemini to copyright infringement claims against ChatGPT.
In fact, Anthropic has emphasized safe AI since its founding. The company is well known for having been founded by the Amodei siblings, former senior OpenAI executives who left that company. They are reported to have departed over disagreements as OpenAI moved in an increasingly profit-driven direction.
Anthropic's governance structure is also unusual, seemingly designed with OpenAI's wavering between for-profit and nonprofit aims in mind. The company was established as a public benefit corporation, with the stated goal of responsibly developing AI for the long-term benefit of humanity. There is also a body of outside experts, the Long-Term Benefit Trust, that oversees Anthropic independently of the company's profits. The trust holds 'Class T' shares, which cannot be sold and pay no dividends. Although its members gain little from the company's profits, they have strong authority to elect and dismiss members of the board of directors. This can be seen as a kind of 'kill switch' against dangerous AI.
© The Asia Business Daily (www.asiae.co.kr). All rights reserved.
!["AI Model World Rolls-Royce"... Another Monster LLM Emerges [AI Bite News]](https://cphoto.asiae.co.kr/listimglink/1/2024030810362688552_1709861786.png)
!["AI Model World Rolls-Royce"... Another Monster LLM Emerges [AI Bite News]](https://cphoto.asiae.co.kr/listimglink/1/2024030810365588555_1709861815.png)

