COMP42315: Programming for Data Science

It is possible that changes to modules or programmes might need to be made during the academic year, in response to the impact of Covid-19 and/or any further changes in public health advice.

Type Tied
Level 4
Credits 15
Availability Available in 2023/24
Module Cap None.
Location Durham
Department Computer Science


Prerequisites

  • None


Corequisites

  • None

Excluded Combinations of Modules

  • None


Aims

  • To provide knowledge of, and the ability to apply, popular Python software packages currently used in industry settings.
  • To give students an understanding of how to programmatically gather, manipulate and process real-world data.
  • To introduce students to the key concepts of data analysis and data visualisation.


Content

  • Programming in Python using popular packages such as Pandas, NumPy, SciPy and Matplotlib.
  • Reading, writing and parsing files in different formats.
  • Obtaining a data set through the use of web scraping.
  • Data munging, cleaning and preparing a dataset for analysis and visualisation.
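
As an illustration of the kind of task the topics listed above cover, the following is a minimal sketch of parsing, cleaning and re-writing a small dataset. It is not taken from the module materials: the sample data, the column names and the `clean_rows` helper are all hypothetical, and it uses only the Python standard library, whereas the module itself names packages such as Pandas for this kind of work.

```python
import csv
import io
import json

# Hypothetical raw data with stray whitespace, a missing value and a duplicate row.
RAW_CSV = """city,temperature_c
Durham, 12.5
 Leeds,
Durham, 12.5
York,9.0
"""


def clean_rows(text):
    """Parse CSV text, strip whitespace, and drop incomplete or duplicate rows."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        city = (row["city"] or "").strip()
        temp = (row["temperature_c"] or "").strip()
        if not city or not temp:
            continue  # skip records with a missing field
        record = (city, float(temp))
        if record in seen:
            continue  # skip exact duplicates
        seen.add(record)
        cleaned.append({"city": city, "temperature_c": float(temp)})
    return cleaned


rows = clean_rows(RAW_CSV)

# Write the cleaned records out in a different format (JSON).
as_json = json.dumps(rows)
```

Here `rows` holds the cleaned records as a list of dictionaries and `as_json` holds the same records serialised as a JSON string, ready to be written to a file.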

Learning Outcomes

Subject-specific Knowledge:

By the end of this module, students should:

  • Understand advanced concepts of programming in Python.
  • Have a critical appreciation of the main strengths and weaknesses of a range of Python packages and understand how to use them.
  • Have a critical appreciation of how to acquire and clean datasets for analysis.
  • Understand how to manipulate potentially large datasets in an efficient manner.

Subject-specific Skills:

By the end of this module, students should:

  • Be able to write computer programs in Python using industry-standard packages.
  • Be able to select appropriate data structures for modelling various data science scenarios.
  • Be able to select the appropriate algorithm and programming package for a given problem.
  • Be able to write a computer program in Python to collect or read data from available sources, and clean these datasets using the appropriate packages.

Key Skills:

  • Effective written communication
  • Planning, organising and time-management
  • Problem solving and analysis

Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module

  • This module will be delivered by the Department of Computer Science.
  • Learning outcomes are met through practical workshops, supported by online resources. The workshops consist of a combination of taught input, case studies, discussion and computing labs. Online resources will typically consist of directed reading and a programming environment with example code.
  • The summative assessment is an individual written report on the design, implementation and analysis of a program designed to solve a specific data science problem.
  • Teaching on this module will be delivered in a blended mode, with specific elements delivered online where student numbers make online teaching the most effective method.

Teaching Methods and Learning Hours

Activity        Number  Frequency                             Duration  Total hours
Lectures        12      3 times per week (Term 2, weeks 1-4)  1 hour    12
Workshops       4       1 time per week (Term 2, weeks 1-4)   3 hours   12
Online Surgery  12      3 times per week (Term 2, weeks 1-4)  1 hour    12

Summative Assessment

Component: Assignment
Component Weighting: 100%

Element                                                          Length / Duration  Element Weighting  Resit Opportunity
Individual written assignment based on development of a program  2000 words         100

Formative Assessment

The formative assessment consists of classroom-based exercises on specific computer science topics relevant to the learning outcomes of the module. Oral feedback will be given on a group and/or individual basis as appropriate.
