Towards a career in language technology: linguistic annotation

Akhilesh Kakolu Ramarao, Summer 2022

Course Description

You do not have to be a computer wizard to break into the language technology industry. Computational models of human language (Natural Language Processing) are only as good as the annotations of linguistic structures they rely on. Given the richness of human language, high-quality, complex annotations can only be produced by humans equipped with formal linguistic training, a native speaker’s intuition, and knowledge of the world. Linguistic data annotation is therefore a sought-after skill in the language technology industry.

“Data annotator” is an entry-level position at technology companies that requires training in linguistics but not programming experience. Annotators work in an interdisciplinary environment alongside engineers, managers, linguists, and others, and can find opportunities to further their careers in the technology industry. This course offers a taste of annotation work for those curious about the field, and provides practical training and experience for those who want to apply for such a position.

In this course, students will gain a basic understanding of the annotation process, learn how different annotation schemes are created, work with annotation software commonly used in the industry, and learn how annotations are evaluated. Along the way, students will gain familiarity with common text-processing tools and data formats.