Artificial intelligence (AI) today is highly successful at standard pattern-recognition tasks, thanks to the availability of large amounts of data and advances in statistical, data-driven machine learning. However, there is still a large gap between AI pattern recognition and human-level concept learning. Humans can learn remarkably well from just a few examples, even under uncertainty, and can generalize the concepts they acquire to solve new conceptual problems. The growing interest in explainable machine intelligence requires experimental environments and diagnostic/benchmark datasets to analyze existing approaches and drive progress in pattern analysis and machine intelligence. In this paper, we provide an overview of current AI solutions for benchmarking concept learning, reasoning, and generalization; discuss the state of the art of existing diagnostic/benchmark datasets (such as CLEVR, CLEVRER, CLOSURE, CURI, Bongard-LOGO, V-PROM, RAVEN, and Kandinsky Patterns, as well as CLEVR-Humans and CLEVRER-Humans, their extensions containing human language); and provide an outlook on future research directions in this exciting research domain.
Keywords: artificial intelligence; benchmarks; concept learning; diagnostic datasets; pattern analysis.