DeepSeek Excels at CSAT Questions... Put to the Test on Difficult Korean Language Problems

Solved Relatively Easily but Weak in Complex Reasoning
Becomes Unresponsive Every Five Minutes... "Needs Improved Stability"

DeepSeek, the Chinese artificial intelligence (AI) that surprised the world as an innovative chatbot developed at low cost, has shown it can solve questions from the College Scholastic Ability Test (CSAT) with relative ease.


DeepSeek application screen. Photo by Reuters and Yonhap News.

On the 31st, Yonhap News published the results of feeding DeepSeek CSAT passages and having it solve the accompanying questions. On the common subject of the 2024 CSAT Korean section, which was considered relatively difficult, it got 5 of 34 questions wrong, for a total deduction of 12 points. The current CSAT Korean section is divided into a common subject (questions 1-34) and an elective subject (questions 35-45), for which students choose between Language and Media or Speech and Writing. The Grade 1 (top grade band) cutoff scores for 2024 CSAT Korean were 84 points for Language and Media and 88 points for Speech and Writing.

DeepSeek quickly answered questions on modern literature passages and on spelling and vocabulary. In particular, when the 'DeepSeek R1' feature was activated, it showed its problem-solving process in detail. However, it was weaker on non-literary passages: it missed a question regarded as highly difficult that required analyzing hypothetical opinion-poll statistics based on the passage (question 7), and one that required comparing different data processing techniques (question 10).

Additionally, it revealed weaknesses in questions asking about the speaker’s intent expressed in specific phrases (questions 25 and 31) and questions about expressive techniques in classical poetry (question 34), where it either considered all answer choices as correct descriptions or provided incorrect interpretations.

In math, it correctly solved 2-point calculation questions but struggled somewhat with high-difficulty problems requiring complex reasoning. It failed to recognize the screenshots supplied for geometry problems, and on short-answer sequence problems it got stuck in an endless loop, trying every possible value that satisfied the initial conditions.

Furthermore, DeepSeek displayed a "server is busy" message about once every five minutes, and the resulting outages were prolonged. Even allowing for the global surge in users, the outages were reported to be more frequent and more severe than those seen in the early days of OpenAI’s ChatGPT, which sparked the previous generative AI craze.