Data Creation

When you create a dataset, the following need to be clearly described:

  • Data collection (e.g., sources of the data).

  • Preprocessing, if performed (e.g., scripts that you wrote, existing tools that you used).

  • Annotation scheme and guidelines, with justification, if annotation was conducted.

  • People involved in this process (e.g., annotators, survey subjects).

  • Quality of the created data (e.g., inter-annotator agreement).

  • Statistics and analysis of the original, preprocessed, and annotated data.
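
As an illustration of the data-quality item above, inter-annotator agreement between two annotators is often reported as Cohen's kappa, which corrects the observed agreement for agreement expected by chance. The following is a minimal sketch, assuming exactly two annotators who labeled the same items with categorical labels; the function name `cohen_kappa` and the toy labels are hypothetical and do not come from the papers listed on this page.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement, from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used one identical label for every item.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy example (made-up labels, for illustration only).
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

With more than two annotators, or with missing annotations, a different measure such as Fleiss' kappa or Krippendorff's alpha may be more appropriate; whichever measure you use, name it explicitly when you report it.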

Here are a few papers presenting new datasets:

  • Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models, Li et al., EMNLP 2020 (see Section 3).

  • FriendsQA: Open-Domain Question Answering on TV Show Transcripts, Yang and Choi, SIGDIAL 2019 (see Section 3).