🆕Free Working Datasets
For OpenThaiGPT
Pretraining
Finetuning
iapp_wiki_qa_squad
is an extractive question answering dataset built from Thai Wikipedia articles. It is adapted from the original iapp-wiki-qa-dataset by iApp Technology to SQuAD format, resulting in 5761/742/739 questions from 1529/191/192 articles.
thaiqa_squad
is an open-domain, extractive question answering dataset (4,000 questions in train and 74 questions in dev) in SQuAD format, originally created by NECTEC from Wikipedia articles and adapted to SQuAD format by PyThaiNLP.
Unhealthy Comments Corpus
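As a rough illustration of the SQuAD format that the two question answering datasets above use, the sketch below builds one hypothetical record and recovers the answer span from its character offset. The field names follow the common SQuAD layout (context, question, answers with text and answer_start); the record contents and the helper names extract_answers and is_consistent are made up for this example and are not taken from either dataset.

```python
# A hypothetical SQuAD-style record. The field names follow the common SQuAD
# layout; the text is invented for illustration and is not from either dataset.
record = {
    "id": "example-0001",
    "title": "Bangkok",
    "context": "Bangkok is the capital of Thailand.",
    "question": "What is the capital of Thailand?",
    "answers": {"text": ["Bangkok"], "answer_start": [0]},
}


def extract_answers(rec):
    """Slice each answer span out of the context using its character offset."""
    answers = rec["answers"]
    spans = []
    for text, start in zip(answers["text"], answers["answer_start"]):
        spans.append(rec["context"][start:start + len(text)])
    return spans


def is_consistent(rec):
    """Return True when every stored answer matches the span at its offset."""
    return extract_answers(rec) == rec["answers"]["text"]


if __name__ == "__main__":
    print(extract_answers(record))
    print(is_consistent(record))
```

Because the task is extractive, every answer should be an exact substring of the context, so a check like is_consistent can serve as a quick sanity test before finetuning on records in this shape.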