Research Practicum in Artificial Intelligence
Jinho D. Choi
Data Creation

When you create a dataset, the following need to be clearly described:

  • Data collection (e.g., sources of the data).

  • Preprocessing, if performed (e.g., scripts that you wrote, existing tools that you used).

  • Annotation scheme and guidelines, with justification, if annotation was conducted.

  • People involved in this process (e.g., annotators, survey subjects).

  • Quality of the created data (e.g., inter-annotator agreement).

  • Statistics and analysis of the original, preprocessed, and annotated data.
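
As an illustration of the data quality item, inter-annotator agreement between two annotators is often reported as Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is a minimal example, not a prescribed method: the function name `cohen_kappa` and the toy labels are assumptions made for illustration, and other measures (e.g., Fleiss' kappa for more than two annotators) may suit your data better.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    Hypothetical helper for illustration; labels_a[i] and labels_b[i]
    are the labels the two annotators assigned to item i.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    # Degenerate case: both annotators used one identical label throughout.
    if p_e == 1.0:
        return 1.0

    return (p_o - p_e) / (1 - p_e)


# Toy labels (made up for this example, not from any real dataset).
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

Here the annotators agree on 3 of 4 items (observed agreement 0.75), while chance agreement is 0.5, giving a kappa of 0.5. When reporting such a score, also state how many items were doubly annotated and how disagreements were resolved.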

Here are a few papers presenting new datasets:

  • Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models, Li et al., EMNLP 2020 (see Section 3).

  • FriendsQA: Open-Domain Question Answering on TV Show Transcripts, Yang and Choi, SIGDIAL 2019 (see Section 3).