Crowdsourcing Adverse Test Sets to Help Surface AI Blindspots

Welcome to the CATS4ML Challenge!

This challenge contributes evaluation data for AI models.

It serves as v.0, a proof of concept with only one benchmark and a limited set of target labels, for a series of future data challenges intended to provide a continuous source of adverse examples for various AI models.

By participating in this challenge, you will help gather experience on how to proactively discover adverse examples in existing AI benchmark datasets through crowdsourcing. To do so, you will explore a subset of target images from the Open Images Dataset (OID) and look for adverse examples: images that you think will be difficult for machines to get right. We will provide you with a set of target labels.