COMP42415: Text Mining and Language Analytics

It is possible that changes to modules or programmes might need to be made during the academic year, in response to the impact of Covid-19 and/or any further changes in public health advice.

Type: Tied
Level: 4
Credits: 15
Availability: Available in 2023/24
Module Cap: None
Location: Durham
Department: Computer Science

Prerequisites

  • None

Corequisites

  • None

Excluded Combinations of Modules

  • None

Aims

  • To introduce students to cutting-edge techniques for automated analysis of textual data and their applications

Content

  • Preparation of textual data for machine learning
  • Representation and modelling of textual data
  • Advanced machine learning techniques for natural language analysis
  • Application of natural language analysis techniques within data science, e.g. sentiment analysis, social media analysis, text classification and clustering

Learning Outcomes

Subject-specific Knowledge:

Upon successful completion of the module, students will:

  • Have a critical appreciation of how natural language texts can be effectively represented for machine learning
  • Have an advanced understanding of automated natural language analysis through machine learning
  • Understand how natural language analysis can be applied effectively within data science

Subject-specific Skills:

Upon successful completion of the module, students will:

  • Be able to prepare natural language texts for machine learning
  • Be able to train and apply machine learning models based on real textual data

Key Skills:

  • Effective written communication
  • Planning, organising and time-management
  • Problem solving and analysis
  • Reflecting and synthesising from experience

Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module

  • This module will be delivered by the Department of Computer Science
  • Learning outcomes are met through practical workshops, supported by online resources. The workshops consist of a combination of taught input, group work, case studies, discussion and computing labs. Online resources provide preparatory material for the workshops, typically consisting of directed reading and video content.
  • The summative assessment is an individual written assignment based on the development of a program to analyse a real natural language data set. This is designed to test students' skills in problem identification, their theoretical understanding, and their ability to analyse the situation in order to categorise the potential solutions.
  • Teaching on this module will be delivered in a blended mode, with specific elements delivered online where student numbers make online teaching the most effective method.

Teaching Methods and Learning Hours

Activity       | Number | Frequency                              | Duration | Total | Monitored
Lectures       | 8      | 2 times per week (Term 2, weeks 16-19) | 1 hour   | 8     |
Workshops      | 8      | 2 times per week (Term 2, weeks 16-19) | 2 hours  | 16    |
Online Surgery | 12     | 3 times per week (Term 2, weeks 16-19) | 1 hour   | 12    |
Self study     |        |                                        |          | 114   |
Total          |        |                                        |          | 150   |

Summative Assessment

Component: Assignment
Component Weighting: 100%

Element | Length / Duration | Element Weighting | Resit Opportunity
Individual written assignment based on the application of techniques to a specific problem | 1500 words | 100% |

Formative Assessment

A range of formative assessment methods will be used, including case-study-based exercises, group presentations, group discussions and simulation exercises. Oral and written feedback will be provided on an individual and/or group basis as appropriate.

More information

If you have a question about Durham's modular degree programmes, please visit our Help page. If you have a question about modular programmes that is not covered by the Help page, or a query about the on-line Postgraduate Module Handbook, please contact us.

Prospective Students: If you have a query about a specific module or degree programme, please Ask Us.

Current Students: Please contact your department.