A recent graduate’s claim of authoring over 100 papers in a single year has exposed deep problems in artificial intelligence research standards


A troubling case has emerged that highlights what many academics say is a crisis in artificial intelligence research. Kevin Zhu, who recently completed his bachelor’s degree in computer science at the University of California, Berkeley, claims to have authored 113 academic papers on artificial intelligence this year alone. Of these, 89 are scheduled for presentation this week at NeurIPS, one of the world’s premier conferences on AI and machine learning.

Zhu, who graduated from high school in 2018, now runs Algoverse, an AI research and mentoring company targeting high school students. Many of these students appear as co-authors on his papers. The research spans diverse subjects, from locating nomadic pastoralists in sub-Saharan Africa to evaluating skin lesions and translating Indonesian dialects. On his LinkedIn profile, Zhu boasts of publishing over 100 top conference papers in the past year, claiming citations by OpenAI, Microsoft, Google, Stanford, MIT, and Oxford.

Hany Farid, a professor of computer science at Berkeley, has called Zhu’s papers a disaster. In an interview, Farid said he is fairly convinced that the entire operation, from top to bottom, is just vibe coding, referring to the practice of using AI to create software. Farid drew attention to Zhu’s prolific output in a recent LinkedIn post, which sparked broader discussions among AI researchers about similar cases and the flood of low-quality research papers plaguing their newly popular discipline.

In response to inquiries from the Guardian, Zhu defended his work, stating that he had supervised the 113 papers as team endeavors run by his company. Algoverse charges high school students and undergraduates $3,325 for a selective 12-week online mentoring experience that includes help submitting work to conferences. Zhu explained that at minimum, he reviews methodology and experimental design in proposals and reads and comments on full paper drafts before submission. He added that projects on specialized subjects involve principal investigators or mentors with relevant expertise, and that teams use standard productivity tools such as reference managers, spellcheck, and sometimes language models for copy-editing or improving clarity.

The case points to a larger structural problem in AI research. Unlike most other scientific fields, work in AI and machine learning typically does not undergo the stringent peer-review processes common in chemistry and biology. Instead, papers are often presented less formally at major conferences. NeurIPS and similar venues are being overwhelmed with increasing numbers of submissions. NeurIPS fielded 21,575 papers this year, up from under 10,000 in 2020. Another top AI conference, the International Conference on Learning Representations, reported a 70% increase in yearly submissions for 2026’s conference, reaching nearly 20,000 papers, up from just over 11,000 for the 2025 conference.

Reviewers are complaining about the poor quality of papers, with some even suspecting that submissions are AI-generated. The Chinese tech blog 36Kr noted in a November post about ICLR that the average score reviewers awarded papers had declined year over year, asking why this academic feast has lost its flavor.

Meanwhile, students and academics face mounting pressure to accumulate publications and keep pace with their peers. It is uncommon to produce a double-digit number of high-quality academic computer science papers in a year, much less triple digits, according to academics interviewed. Farid admits that at times his students have vibe-coded papers to increase their publication counts, driven by the current frenzy around AI.

Jeffrey Walling, an associate professor at Virginia Tech, explained that NeurIPS reviews papers submitted to it, but its process is far quicker and less thorough than standard scientific peer review. This year, the conference used large numbers of PhD students to vet papers, which a NeurIPS area chair said compromised the process. Conference referees must often review dozens of papers in a short period with little to no revision, according to Walling.

Walling agreed with Farid that too many papers are being published, saying he has encountered other authors with over 100 publications in a year. He noted that academics are rewarded for publication volume more than quality, and everyone loves the myth of super productivity.

On Algoverse’s FAQ page, the company discusses how its program can help applicants’ future college or career prospects. It states that the skills, accomplishments, and publications achieved through the program are highly regarded in academic circles and can strengthen college applications or résumés, especially if research is accepted at a top conference, which it describes as a prestigious feat even for professional researchers.

Farid now counsels students not to go into AI research because of the frenzy in the field and the large volume of low-quality work being produced by people hoping to better their career prospects. He describes the situation as a mess where researchers cannot keep up, cannot publish, cannot do good work, and cannot be thoughtful.

Despite these problems, much excellent work has still emerged from this process. Google’s paper on transformers, Attention Is All You Need, which provided the theoretical basis for the advances in AI that led to ChatGPT, was presented at NeurIPS in 2017.

NeurIPS organizers acknowledge the conference is under pressure. In a comment to the Guardian, a spokesperson said that the growth of AI as a field has brought a significant increase in paper submissions and heightened value placed on peer-reviewed acceptance at NeurIPS, putting considerable strain on the review system. The spokesperson noted that Zhu’s submissions were largely to workshops within NeurIPS, which have a different selection process than the main conference and are often where early-career work gets presented.

Farid did not find this a substantive explanation, saying it is not a compelling argument for putting your name on more than 100 papers that you could not possibly have meaningfully contributed to.

The problem extends beyond NeurIPS. ICLR used AI to review a large volume of submissions, resulting in apparently hallucinated citations and feedback that was very verbose and heavy on bullet points, according to a recent article in Nature. The sense of decline is so widespread that fixing the crisis has itself become the subject of academic papers. A May 2025 position paper by three South Korean computer scientists, which proposed solutions to the unprecedented surge in paper submissions, won an award for outstanding work at the 2025 International Conference on Machine Learning.

Meanwhile, major tech companies and small AI safety organizations now dump their work on arXiv, a site once reserved for little-viewed preprints of math and physics papers, flooding the internet with work that is presented as science but is not subject to review standards.

The cost of this, according to Farid, is that it is almost impossible to know what is actually going on in AI, whether for journalists, the public, or even experts in the field. He argues that average readers have no chance of understanding the scientific literature, with the signal-to-noise ratio basically at one, and that even he can barely attend these conferences and figure out what is happening.

Farid tells students that if they are trying to optimize for publishing papers, it is honestly not that hard to do by producing low-quality work and bombarding conferences with it. But if they want to do really thoughtful, careful work, they are at a disadvantage because they have effectively unilaterally disarmed.
