Communications of the ACM

ACM Opinion

Why We Must Rethink AI Benchmarks

A grouping of organizational logos related to the GLUE AI benchmark.

The focus on benchmark performance has also brought a lot of attention to machine learning at the expense of other promising directions of research.

For decades, researchers have used benchmarks to measure progress in different areas of artificial intelligence (AI), such as vision and language. However, while benchmarks can help compare the performance of AI systems on specific problems, their results are often taken out of context, sometimes with harmful consequences.

In a recent paper referenced in the article, scientists at the University of California, Berkeley; the University of Washington; and Google outline the limits of popular AI benchmarks. According to the paper, "Progress on benchmarks is often used to make claims of progress toward general areas of intelligence, which is far beyond the tasks these benchmarks are designed for."

From TechTalks