Belebele is a machine reading comprehension dataset covering 122 language variants. It can be used to evaluate monolingual and multilingual models alike, on widely spoken and less widely spoken languages.
Each question is paired with a short passage from the FLORES-200 dataset and has four candidate answers, exactly one of which is correct.
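The item format described above (a passage, a question, and four candidate answers) can be sketched as a small data structure with an accuracy metric. This is a minimal illustration rather than the dataset's actual schema or an official evaluation script: the names `MCQuestion`, `accuracy`, and `first_answer_baseline`, the field names, and the two toy items are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class MCQuestion:
    """One multiple-choice item. Field names are hypothetical, not the dataset's schema."""
    passage: str
    question: str
    answers: Tuple[str, str, str, str]  # exactly four candidate answers
    correct_index: int                  # 0-based index of the correct answer


def accuracy(items: Sequence[MCQuestion],
             predict: Callable[[MCQuestion], int]) -> float:
    """Return the fraction of items for which `predict` picks the correct answer."""
    if not items:
        raise ValueError("cannot compute accuracy on an empty list of items")
    correct = sum(1 for item in items if predict(item) == item.correct_index)
    return correct / len(items)


def first_answer_baseline(item: MCQuestion) -> int:
    """Trivial baseline that always chooses the first answer."""
    return 0


# Toy items written for this sketch; they are NOT taken from the dataset.
sample = [
    MCQuestion(
        passage="The bridge opened in 1932 and links the two halves of the city.",
        question="When did the bridge open?",
        answers=("1932", "1923", "1952", "1902"),
        correct_index=0,
    ),
    MCQuestion(
        passage="The festival is held every spring in the main square.",
        question="Where is the festival held?",
        answers=("On the beach", "In a stadium", "In the main square", "At the harbour"),
        correct_index=2,
    ),
]

baseline_score = accuracy(sample, first_answer_baseline)
```

With four answers per question, a model that guesses uniformly at random is expected to score about 25%, which is the natural floor against which to read reported accuracies.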