Python for Linguists: Introduction to Natural Language Processing

Prof. Dr. Kevin Tang, Winter 2025, Course Catalog

Course Description

Natural Language Processing (NLP) plays a big role in our digital lives. We will demystify some of the everyday tasks that rely on it, such as spelling and grammar correction, document classification, dialogue systems, machine translation, and forensic linguistics. On the practical side, we will focus on applying off-the-shelf tools that are widely used in the computational modelling of language data. Armed with these skills, you will be able to model language data quantitatively and ask measurable research questions.

By the end of the course, you will be able to perform:

  • Pre-processing of text files (cleaning up raw text files),
  • Automatic linguistic annotation, such as Part-of-Speech tagging (automatically adding labels such as Noun or Adjective to each word), Named Entity Recognition (identifying proper names, times, dates, places, and events), and Sentiment Analysis (detecting emotions such as fear, anger, happiness, surprise…),
  • Frequency analyses: counting, and performing basic statistical tests such as the chi-square test, the t-test, and the correlation test.
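
To give a flavour of what these skills look like in practice, here is a minimal sketch of pre-processing, counting, and a chi-square statistic, using only the Python standard library. It is not taken from the course materials: the two example texts, the helper names `tokenize` and `chi_square_2x2`, and the simple regular-expression tokenizer are all assumptions made for illustration. The toy texts are far too small for a meaningful statistical test.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic, without continuity correction,
    for the 2x2 table [[a, b], [c, d]] (hypothetical helper)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Two tiny made-up "corpora", for illustration only.
text_a = "The cat sat on the mat. The cat slept."
text_b = "A dog barked at the cat, and a dog ran."

# Pre-processing: turn raw text into clean lowercase tokens.
tokens_a = tokenize(text_a)
tokens_b = tokenize(text_b)

# Frequency analysis: count how often each word occurs.
freq_a = Counter(tokens_a)
freq_b = Counter(tokens_b)

# Compare how often one word occurs in each text:
# rows are the two texts, columns are "this word" vs. "any other word".
word = "the"
a = freq_a[word]
b = len(tokens_a) - a
c = freq_b[word]
d = len(tokens_b) - c
stat = chi_square_2x2(a, b, c, d)

print(freq_a.most_common(3))
print(round(stat, 3))
```

In real analyses, an existing library would normally handle tokenization, annotation, and significance testing, in line with the off-the-shelf approach described above.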

Prerequisite

The programming language used in this course is Python. We do NOT assume any background knowledge of Python; everything will be covered as needed. We will focus on using Python to run existing Natural Language Processing libraries. Many of the scripts will be pre-written, and examples will be given.