Data Science for Linguistics

Postgraduate course

Course description

Objectives and Content

This course aims to equip students with practical and theoretical knowledge of data science as it relates to language and linguistics. It introduces the concept of reproducible workflows, covering data management, the processing of large datasets, statistical analysis, data visualization and the reporting of results. Students will learn how to set up a clean, shareable project workflow in a transparent format. They will also learn the basics of a programming language (e.g., R) for retrieving, processing and statistically analyzing data, and for visualizing the results.

Learning Outcomes

Knowledge

The candidate is able to:

  • describe and motivate the goals of reproducible and open science;
  • critically assess practical and methodological choices in data management, processing and analysis.

Skills

The candidate is able to:

  • set up a research project in a reproducible workflow from start to finish;
  • perform basic data retrieval, processing and analysis using a programming language.

General competence

The candidate is able to:

  • critically assess data management and statistical analysis in linguistics;
  • adhere to best practices with regard to reproducible scientific workflows.

ECTS Credits

15

Level of Study

Master

Semester of Instruction

Fall

Place of Instruction

Bergen

Credit Reduction due to Course Overlap

LING310 (5 credits).

Access to the Course

Open to all who have been admitted to study at master's level at UiB.

Teaching and learning methods

Lectures and hands-on, computer-oriented seminars.

Compulsory Assignments and Attendance

A passing grade on one mandatory assignment is required before the final assignment may be submitted. The mandatory assignment is, for example, a short written draft of the final report of approximately 500 words (or equivalent).

Forms of Assessment

The exam consists of two parts, each accounting for 50% of the final grade:

  • An individual term paper in the form of a project report detailing all the steps of an empirical analysis of a dataset. The paper should be up to 3000 words in length (not counting references, appendices etc.) and must be submitted in a well-documented project workflow structure, e.g., with linked or attached analysis scripts that reproduce the data (pre-)processing, analysis and visualization.
  • An oral examination covering both the paper and the rest of the syllabus.

The term paper may be submitted in Norwegian, Swedish, Danish, or English.

Candidates must pass both parts of the exam in the same semester.

Grading Scale

A-F

Assessment Semester

Fall

Reading List

The reading list comprises about 500 pages of literature, resources and documentation.

Course Evaluation

Course evaluation will be conducted in accordance with the University of Bergen's quality assurance system.

Programme Committee

Programme Committee for Linguistics

Course Coordinator

Programme Committee for Linguistics

Course Administrator

Department of Linguistic, Literary and Aesthetic Studies.