Online Master of Data Science: Course Structure

Course structure details

12–16 subjects required

You can complete your online Master of Data Science course in just 14 months with 12 subjects if you choose to study full time and have an undergraduate degree in a related field. Many students graduate within two and a half years, but if your schedule becomes challenging, you have the choice to take up to five years to complete the course.

If your undergraduate degree is in an unrelated field, you’ll learn everything you need to know with four fundamentals subjects. You’ll graduate with 16 subjects so you can jump straight into this growing sector.

For more information about the duration of the program or the course structure, speak with an enrolment advisor on (+61 3) 9917 3009 or request more information now.

Core

The Academic Integrity Module will introduce you to academic integrity standards, so you’re informed about how to avoid plagiarism and academic misconduct.  You’ll complete four parts that cover academic misconduct and academic integrity decisions, such as cheating, plagiarism and collusion.  You’ll learn about the text-matching tool, Turnitin, that is used at La Trobe, how to get help and where to go to develop referencing skills.

This subject starts with an overview of the architecture and management of database systems, and a discussion of different existing database models. The main focus includes relational database analysis, design, and implementation. The students learn: relational algebra as the formal foundation of relational databases; relational conceptual design using an entity-relationship diagram; relational logical database design; security and integrity; and SQL implementation of relational database queries. Students will also learn advanced normalization theory and the techniques to remove data anomalies and redundancies. In this subject, students are required to design a database application that meets the needs of a system requirement specification, and to implement the system using a commercial standard database system such as ORACLE or POSTGRESQL. In addition, a selection of advanced topics in databases will be introduced and discussed.

In this subject, you will be introduced to the steps involved in designing and creating software solutions for a range of practical problems. To enable you to design and implement solutions, you will be introduced to methods for analysing requirements, developing the overall structure of a solution and identifying its key parts, and, on this basis, incrementally building and testing the solution. To develop your problem-solving skills, problems of increasing complexity drawn from different domains will be presented for your practice. You will be introduced to the concepts of class and object, which represent real-world objects in order to solve problems arising from an application domain. Python is used as the programming language in the subject. The strengths of Python, in particular its support for quick testing of ideas, are exploited to facilitate the development of your problem-solving skills and effective software development practice.

Important mathematical ideas which underpin the theory and techniques of data science are introduced and consolidated in this subject. Matrices are used to store and work with quantitative information, and the methods of calculus are used to find extreme values and accumulation. The Gamma and Beta functions are introduced, as are eigenvalues, eigenvectors and the rank of a matrix. Emphasis is placed on the relevance of the mathematics to data science applications (such as least squares estimators and calculation of variance in data), and on the development of clear communication in explaining technical ideas. This is a foundational subject for the Master of Data Science.

This subject develops an understanding of probability and statistics applied to Data Science. Probability topics include joint and conditional probability, Bayes’ Theorem and distributions such as the uniform, binomial, Poisson and normal distributions, as well as properties of random variables and the Central Limit Theorem. Statistical inference and data analysis are also considered, covering, among other topics, significance testing and confidence intervals, with an introduction to methods such as ANOVA, linear and nonlinear regression and model verification. Applications to data science are considered, and students will be exposed to the R statistical package as well as the mathematical typesetting package LaTeX.

In this subject you will be provided with the specialist knowledge and tools required to formulate solutions to complex data problems encountered by data scientists. You will learn various data exploration techniques and analysis tools. Selected topics include data cleaning, data normalisation, data visualisation and data exploration. One or more applications associated with each problem will also be discussed. You will learn the fundamentals of exploratory data analysis techniques, statistical learning, and correlation analysis to solve these problems. You will also learn to implement data exploration methods and analysis tools using the R programming language.

The purpose of this subject is to outline the basic principles of Entrepreneurship. It will examine the steps required in developing an idea into a business and will explore the tools and necessary insights to make a successful venture. The subject will involve theory, case studies and guest speakers on start-up issues, pitfalls, and ingredients for success. Students will also develop professional skills related to ethical and moral decision making and evaluate the social implications of their work and the broader global context. The subject requires active participation in group discussions and activities.

Core choice specialisation: Data Modelling and Analytics

Select 60 credit points

Repeated measures data is used commonly in many disciplines including health, psychology, economics and biology. This subject provides students with the knowledge of how to perform the appropriate statistical analysis in a repeated measures data environment by using models such as the linear mixed model, correlated random effects model and marginal model. Students will learn how to examine research questions by applying these models using the R statistical package.

The literature abounds with findings that collectively may offer important new insights for the betterment of the medical, psychological and life sciences, to name just a few. This subject is designed to provide students with the ability to combine estimated measures of evidence, known as effects, from comparable studies to increase power. Estimators are introduced which are commonly found in meta-analytic research and pitfalls are discussed. On completion of the subject, the student will have an understanding of the different effects that can be collected from the literature as well as an appreciation of how effect sizes arising from data measured on different scales can be combined. Importantly, this subject also shows students how meta-regression can be used to account for study-specific covariates that cannot be adequately accounted for using random-effects models. The freely available software packages R and RevMan are used throughout the subject.

Advances in omics technology have produced an exponential increase in the volume of biological data in the last ten years. Statistical models play important roles in drawing conclusions from and making sense of complex and often noisy omics data. This subject will introduce students to statistical issues and potential solutions to problems commonly encountered at various stages of omics data analysis, from data acquisition, alignment and quality control through to data analysis, visualization and interpretation. Topics covered will include an introduction to next-generation sequencing and microarray technologies, batch effects and other unwanted variation, multiple hypothesis testing problems, statistical tests and models for high-dimensional data, data visualization, and the use of biological databases via pathway-based analysis. Students will also be introduced to the R programming language at an intermediate level, including writing customized scripts and functions, developing R packages and working with the ‘pipe’ operator. Bioconductor packages (www.bioconductor.org) and other freely available bioinformatics software will be used for all lab sessions.

Quantitative analysis plays an important role in industrial data analytics and knowledge engineering, which makes it very useful to develop computing skills for data regression and classification. This subject covers the fundamentals of machine learning techniques in theory and practice. The subject is designed to focus on solving industrial data modelling problems using neural networks. You will learn how to test various learning algorithms and compare their performance. Some advanced machine learning techniques for data classification will also be addressed. You will work with industrial data modelling in labs and assignments to consolidate your knowledge and gain hands-on experience with machine learning applications.

Data Mining refers to various techniques which can be used to uncover hidden information from a database. The data to be mined may be complex data including big data, multimedia, spatial and temporal data, biological and health data. Data Mining has evolved from several areas including: databases, artificial intelligence, algorithms, information retrieval and statistics. This subject is designed to provide you with a solid understanding of data mining concepts and tools. The subject covers algorithms and techniques for data pre-processing, data classification, association rule mining, and data clustering. The subject also covers domain applications where data mining techniques are used.

The subject introduces you to spatial data analysis. It surveys the theory of spatial random processes, spatial statistics models, and their applications to a wide range of areas, including image analysis and GIS (geographic information system). The subject will cover the methodology and modern developments for spatial-temporal modelling, estimation and prediction, spectral analysis of spatial processes and working with big spatial data. All the methods presented will be introduced and illustrated in the context of specific datasets with GRASS and R software. You will get experience with analysis of real-world data.

Core choice specialisation: Big Data and Cloud Computing

Select 60 credit points

Quantitative analysis plays an important role in industrial data analytics and knowledge engineering, which makes it very useful to develop computing skills for data regression and classification. This subject covers the fundamentals of machine learning techniques in theory and practice. The subject is designed to focus on solving industrial data modelling problems using neural networks. You will learn how to test various learning algorithms and compare their performance. Some advanced machine learning techniques for data classification will also be addressed. You will work with industrial data modelling in labs and assignments to consolidate your knowledge and gain hands-on experience with machine learning applications.

Data Mining refers to various techniques which can be used to uncover hidden information from a database. The data to be mined may be complex data including big data, multimedia, spatial and temporal data, biological and health data. Data Mining has evolved from several areas including: databases, artificial intelligence, algorithms, information retrieval and statistics. This subject is designed to provide you with a solid understanding of data mining concepts and tools. The subject covers algorithms and techniques for data pre-processing, data classification, association rule mining, and data clustering. The subject also covers domain applications where data mining techniques are used.

Creating web sites that scale to serve hundreds of millions of users with acceptable response times is a very challenging task. The main focus of this subject is on the cloud computing concepts and tools that are needed to make web sites scalable. This subject assumes that HTML, CSS and basic JavaScript have already been taught in CSE4IFU. The subject will cover topics such as frontend fundamentals (Git, responsive web design, popular frontend frameworks and the React framework), advanced frontend and backend development (Redux, Docker, REST APIs, stateless web servers and Node.js), and web server storage and deployment in Microsoft Azure (fundamental cloud computing concepts, continuous integration and delivery with Microsoft Azure, database and NoSQL storage with Microsoft Azure, authentication and authorization, and integration of third-party services such as Twitter, Google Maps and Weather).

The subject introduces you to spatial data analysis. It surveys the theory of spatial random processes, spatial statistics models, and their applications to a wide range of areas, including image analysis and GIS (geographic information system). The subject will cover the methodology and modern developments for spatial-temporal modelling, estimation and prediction, spectral analysis of spatial processes and working with big spatial data. All the methods presented will be introduced and illustrated in the context of specific datasets with GRASS and R software. You will get experience with analysis of real-world data.

The literature abounds with findings that collectively may offer important new insights for the betterment of the medical, psychological and life sciences, to name just a few. This subject is designed to provide students with the ability to combine estimated measures of evidence, known as effects, from comparable studies to increase power. Estimators are introduced which are commonly found in meta-analytic research and pitfalls are discussed. On completion of the subject, the student will have an understanding of the different effects that can be collected from the literature as well as an appreciation of how effect sizes arising from data measured on different scales can be combined. Importantly, this subject also shows students how meta-regression can be used to account for study-specific covariates that cannot be adequately accounted for using random-effects models. The freely available software packages R and RevMan are used throughout the subject.

Electives

Select 30 credit points

As data becomes ever more complex to analyse, the need for tools that help integrate the user’s knowledge and inference capability into the analytical process becomes important. In analytics, this is generally referred to as visualisation, which this subject will cover in detail from various perspectives, for example temporal, spatial, spatial-temporal, multivariate, text/documents, and graphs and networks. These perspectives will be covered with various business applications in mind, including statistical and summary reporting, trend spotting and projections, process capture, and real-time reporting. These applications will be discussed with reference to various visual analytics frameworks and theories as well as case examples.

This subject introduces you to the various techniques of data wrangling, with a strong focus on hands-on experience in R and Structured Query Language (SQL) programming. It will cover the basic concepts in relational database design, including Entity Relationship (ER) modelling, and SQL as a tool for basic data wrangling. You will also learn about various types of data sources and common data formats. The subject teaches you the R programming language so that you can perform data wrangling tasks, including data import and export, basic data integration and data assessment. Upon completion, you will be able to perform a variety of data wrangling tasks using SQL and R for different kinds of data types.

Request more information

Our enrolment team is here to support you and answer your questions about the application process, entry requirements, tuition fees, study assist options and specific course details.
