# MATH42715: Introduction to Statistics for Data Science

It is possible that changes to modules or programmes might need to be made during the academic year, in response to the impact of Covid-19 and/or any further changes in public health advice.

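| Type | Level | Credits | Availability | Module Cap | Location | Department |
|---|---|---|---|---|---|---|
| Tied | 4 | 15 | Available in 2023/24 | None | Durham | Mathematical Sciences |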

## Prerequisites

• None

## Corequisites

• None

## Excluded Combination of Modules

• None

## Aims

• To introduce the fundamentals of statistics needed for data science.

## Content

• Introduction to probability: independent events, conditional probability, expectation, probability distributions, computing probabilities.
• Exploratory statistics: descriptive statistics, data types and data collection.
• Statistical inference: sampling distributions, estimation, and hypothesis testing.
• Linear models: assumptions, estimation, inference, prediction.
• Classification and clustering methods.
• Transferable skills including teamwork, time-management, presentation, communication, organisation, and prioritisation skills.

## Learning Outcomes

Subject-specific Knowledge:

• By the end of the module students should be able to demonstrate the following statistical and computing skills:
• Understanding of principles of probability.
• Creation of descriptive and graphical data analyses, and knowledge of their use in making inferences.
• Creation of appropriate statistical models, with emphasis on formatting, presentation, and interpretation of data.
• Interpretation of statistical models including diagnostics and validation.
• Understanding the use of statistical methods as an underpinning of classification methods.

Subject-specific Skills:

• Students should be able to:
• Use the statistical software R to conduct basic data analysis and manipulation, including the creation of graphical data summaries and appropriate statistical models.
• Justify modelling approaches as well as draw appropriate conclusions and make appropriate recommendations.
• Critically assess the quality of models derived and conclusions drawn, and make recommendations for improvement.

Key Skills:

• Sufficient mastery of statistical concepts to enable engagement with data science methods.
• Ability to clearly communicate statistical models and relevant conclusions through writing and oral presentation.
• Understanding of how to function effectively as an individual and as a member or leader of a team.
• Ability to organise, prioritise, and manage time effectively.
• Ability to advance and extend their knowledge through significant independent learning and research.
• Ability to produce a clear and detailed written report with appropriate presentation.

## Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module

• This module will be delivered by the Department of Mathematical Sciences.
• Teaching will be delivered primarily by workshops. Workshops describe theory and its application to concrete examples, enable students to test and develop their understanding of the material by applying it to practical problems, and provide feedback and encourage active engagement.
• Workshops are delivered in hybrid mode and are a combination of live lectures, computer practicals, problem classes, tutorials, and guided group work.
• Workshops will be supported by the distribution of materials such as video content, directed reading, e-assessments, reflective activities, opportunities for self-assessment, and peer-to-peer learning within a tutor-facilitated discussion board.
• Students will be able to obtain further help in their studies via scheduled office hours or surgeries as well as by approaching their lecturers by email.
• Students will be expected to work in between workshops, and to discuss their own work during the workshops. This work will be guided by the module leader, but will be organised by the students themselves, thereby enabling them to demonstrate their time management skills.
• Students will undertake independent research to further their knowledge of the topic and self-directed learning to further their technical and transferable skills.
• The workshops also provide opportunities for module leaders to monitor progress and to provide feedback and guidance on the development of ideas for the project, and for students to gauge their progress throughout the duration of the module.
• Student performance will be assessed through two individual assignments, short group reports that include reflective feedback, a group presentation, and a final group report.
• The individual assignments will provide the means for students to demonstrate their acquisition of subject knowledge and the development of their problem-solving skills.
• The group reports and presentation will provide the means for students to demonstrate their acquisition of subject knowledge and subject-specific skills, as well as key skills.

## Teaching Methods and Learning Hours

| Activity | Number | Frequency | Duration | Total | Monitored |
|---|---|---|---|---|---|
| Workshops | 10 | Once per week (Term 1, weeks 1-10) | 2 hours | 20 | |
| Total | | | | 150 | |

## Summative Assessment

Component: Assignment (Component Weighting: 50%)

| Element | Length / Duration | Element Weighting | Resit Opportunity |
|---|---|---|---|
| Individual assignment 1 | | 50% | Yes |
| Individual assignment 2 | | 50% | Yes |

Component: Project (Component Weighting: 50%)

| Element | Length / Duration | Element Weighting | Resit Opportunity |
|---|---|---|---|
| Continuous group assessment | 3 short reports | 30% | Yes |
| Group project presentation | | 30% | Individual video recording |
| Group project report | | 40% | Individual project report |

## Formative Assessment

Workshop discussion of students' ideas and experiences; informal discussions of student progress with module leader when necessary; interim feedback on group project via continuous group assessment.