Academic Datasets and Real-World Applications!

17.01.2022
Michael Wechner is developing Katie, a duplicate question detection system for Slack chats. Alongside academic datasets such as SQuAD for question answering and FEVER for fact verification, QQP (Quora Question Pairs) is one of the best academic datasets out there for duplicate question detection. Quora has published over 400K question pairs with duplicate annotations, and even hosted a Kaggle competition on the task! I think this is an extremely interesting case for understanding how well these academic benchmarks generalize to startup ideas and real-world applications!

HuggingFace Datasets: https://huggingface.co/datasets
