Common Data Analysis Approaches

Time to complete: 1 hour

What will this topic cover?

This topic forms part of a wider learning pathway and is designed to help you gain advanced digital skills and apply them in your role. We would recommend completing the intro to data pathway first. This particular pathway is more in-depth and can take longer to complete as it will involve terminology, mathematical approaches and information regarding reporting data. However, if you are already familiar with some of the content covered, the topic is designed so that you can move freely between steps/activities based on your needs.

Please note that the topics within the advanced data pathway involve more in-depth analysis, mathematical approaches and discussion. Please be aware that these pathways are designed so that you can jump to the appropriate steps when needed but they may take more time to complete.

This topic will focus on four key approaches to data analysis, although there are, of course, further options beyond these four. For each approach, it gives an overview of why you would choose it and some of the key steps for implementing it.

By the end of this topic, you will be able to:

Understand and highlight some key data analysis approaches
Explore these approaches and discuss how they may be applied to your discipline
Use key terminology when discussing data analysis approaches

How to use this topic page

This topic page is split up into different sections. Each section has a step and an activity to complete. These include scenarios and links off to instructions to try elements for yourself. Each topic also has a reflective section to think about how this will be used within your own practice.

Step 1: Regression Analysis

Regression analysis is a statistical method used to understand the relationship between two or more variables. In simple terms, it helps you figure out how changes in one variable (independent variable) affect another variable (dependent variable). This is usually achieved via trendlines.

What is a trendline?

Trendlines reveal the direction of the relationship between the variables. A positive slope means a positive relationship, while a negative slope means a negative relationship. The steepness of the slope shows how much the dependent variable changes for each unit change in the independent variable, while the strength of the relationship is shown by how closely the data points cluster around the line (often summarised by the R-squared value).

Positive and negative trendlines

A positive slope in terms of a trendline indicates that as the independent variable increases, the dependent variable also increases. This suggests a direct relationship between the variables, meaning that higher values of one variable are associated with higher values of the other. For instance, in a business context, a positive slope could indicate that an increase in marketing expenditure is associated with an increase in sales revenue. The steeper the slope, the larger the change in the dependent variable for each unit increase in the independent variable.

This image shows a graph with a positive slope trendline. The graph demonstrates that a positive slope trendline will move from the bottom left of the graph up to the top right of the graph, indicating a positive relationship between the two variables.

Negative trendlines

A negative slope in terms of a trendline indicates that as one variable increases, the other variable decreases. This suggests an inverse relationship between the variables. For example, a negative trendline might show that as the cost of living increases, consumer spending decreases.

This image shows a graph with a negative slope trendline. The graph demonstrates that a negative slope trendline will move from the top left of the graph down to the bottom right of the graph, indicating a negative relationship between the two variables.

What types of trendlines are there?

There are several different types of trendline that can be applied to charts to highlight key differences. An overview of each trendline is given below.

Linear Trendline: Used when data points have a straight-line relationship, indicating a constant rate of change.

Exponential Trendline: Ideal for data that increases or decreases at a continually growing rate.

Polynomial Trendline: Best for data that fluctuates, suitable for analyzing large datasets with varying trends.

Logarithmic Trendline: Useful when the rate of change in the data increases or decreases quickly and then levels out.

Power Trendline: Applied to data that increases at a specific rate, often used in scientific data.

Moving Average Trendline: Smooths out short-term fluctuations and highlights longer-term trends or cycles.
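As a rough illustration of how some of these trendline types can be calculated, the sketch below fits an exponential trendline by fitting a straight line to the logarithm of the values, and computes a simple moving average. The data, the helper names fit_line and moving_average, and the window size of 3 are all assumptions made for this example; this is not a description of how Excel calculates its trendlines.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical data that follows y = 2 * e^(0.5 * x) exactly.
xs = [1, 2, 3, 4, 5, 6]
ys = [2 * math.exp(0.5 * x) for x in xs]

# Exponential trendline y = a * e^(b * x):
# fit a straight line to ln(y) against x, then convert the intercept back.
b, ln_a = fit_line(xs, [math.log(y) for y in ys])
a = math.exp(ln_a)
print(f"Exponential trendline: y = {a:.2f} * e^({b:.2f}x)")

# Moving average trendline over hypothetical weekly values (window of 3).
weekly_values = [10, 14, 9, 15, 12, 18]
smoothed = moving_average(weekly_values, 3)
print("Moving average:", smoothed)
```

The same straight-line fit can be reused for other types: a logarithmic trendline fits y against ln(x), and a power trendline fits ln(y) against ln(x).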

Example

Imagine you want to know how study hours (independent variable) affect student grades (dependent variable). You may follow the steps below to look at common trends.

Steps:

Collect Data: Gather data on study hours and grades for a sample of students.
Enter Data into Excel: Organise the data in a table format. (Create and format tables – Microsoft Support)
Create a Scatter Plot: Use Excel to create a scatter plot of study hours vs. grades. (Visualising data – Digital Services (web))
Add a Trend Line: Add a linear trend line to the scatter plot. (Add a trend or moving average line to a chart – Microsoft Support)
Interpret the Results: Use the trend line to understand the relationship between study hours and grades.
Make Predictions: Predict grades for students based on their study hours and identify those who may need additional support.
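If you would like to see the calculation behind a linear trendline, the steps above can be sketched in Python. The study hours, grades, student names and pass mark of 60 below are invented for illustration only, and the code is a minimal least-squares sketch rather than a replacement for the Excel steps.

```python
# Hypothetical sample: weekly study hours and grades for five students.
hours = [1, 2, 3, 4, 5]
grades = [52, 58, 61, 67, 72]

n = len(hours)
mean_hours = sum(hours) / n
mean_grades = sum(grades) / n

# Sums of squares used by the least-squares formulas.
sxx = sum((x - mean_hours) ** 2 for x in hours)
syy = sum((y - mean_grades) ** 2 for y in grades)
sxy = sum((x - mean_hours) * (y - mean_grades) for x, y in zip(hours, grades))

# Linear trendline: grade = intercept + slope * hours
slope = sxy / sxx
intercept = mean_grades - slope * mean_hours

# R-squared: how closely the points cluster around the line (0 to 1).
r_squared = sxy ** 2 / (sxx * syy)

print(f"Trendline: grade = {intercept:.1f} + {slope:.1f} * hours")
print(f"R-squared: {r_squared:.3f}")


def predict_grade(study_hours):
    """Predicted grade from the fitted trendline."""
    return intercept + slope * study_hours


# Hypothetical students and an assumed pass mark of 60.
planned_hours = {"Student A": 2, "Student B": 5}
PASS_MARK = 60
needs_support = [
    name for name, h in planned_hours.items() if predict_grade(h) < PASS_MARK
]
print("May need additional support:", needs_support)
```

In this invented sample, each extra hour of study is associated with a grade about 4.9 points higher, and the high R-squared value suggests the points sit close to the line.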

Activity

Scenario

A school administrator aims to identify students who may require additional support by analysing their study habits. By examining the relationship between the amount of time students spend studying and their academic performance, the administrator can make informed decisions about which students might need extra help to improve their grades.

What type of trend line do you think would be beneficial for this case study?

Answer:

For the case study involving the analysis of study hours and student grades, the most appropriate type of trendline to use on a graph would be a linear trendline.

A linear trendline is suitable because it helps to identify and illustrate the direct relationship between the two variables – study hours (independent variable) and student grades (dependent variable). Since the goal is to understand how changes in study hours affect grades, a linear trendline can effectively show whether there is a positive or negative correlation between these variables.

By adding a linear trendline to the scatter plot, the school administrator can visually assess the strength and direction of the relationship. If the trendline slopes upwards from left to right, it indicates a positive correlation, meaning that as study hours increase, grades tend to improve. Conversely, a downward slope would suggest a negative correlation.

Step 2: Hypothesis-Based Analysis

What is a hypothesis-based analysis?

This approach is a way to test if certain ideas (or hypotheses) about a group of people or processes are true. It helps you determine whether there is enough evidence to support your theories and make better-informed decisions. For example, you might want to test if a new teaching method improves student performance.

Within hypothesis-based analysis, we often use the terms Null Hypothesis and Alternative Hypothesis. Data sets will enable you to correlate information and draw more accurate conclusions for data-informed decision making.

Null Hypothesis

The null hypothesis (H0) is a statement that there is no effect or no difference, and it serves as the default or baseline assumption. For instance, in a higher education setting, a researcher might want to test if a new online learning platform affects student performance. The null hypothesis would be that the new online learning platform does not significantly impact student performance compared to traditional in-person classes. Essentially, it assumes that any observed differences are due to random chance rather than the new platform’s efficacy.

Alternative Hypothesis

In contrast, the alternative hypothesis (described as either H1 or Ha) is the statement that there is an effect or a difference, opposing the null hypothesis. The goal of hypothesis testing is to determine whether there is enough statistical evidence to reject the null hypothesis in favor of the alternative hypothesis.

For example, in a higher education setting, a researcher might want to test if a new online learning platform positively affects student performance. The alternative hypothesis would be that the new online learning platform significantly impacts student performance compared to traditional in-person classes. Essentially, it assumes that any observed differences are due to the new platform’s efficacy rather than random chance.
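To make the idea of "enough statistical evidence" more concrete, here is a minimal sketch of one way to test the null hypothesis: a permutation test. The scores are invented, the function name permutation_test is hypothetical, and the 0.05 significance level is a common convention assumed here rather than something prescribed by this topic. Other tests, such as a t-test, may be more appropriate for your data.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """One-sided permutation test of whether mean(group_a) > mean(group_b).

    Under the null hypothesis the group labels are interchangeable, so we
    shuffle the pooled scores and count how often a difference at least as
    large as the observed one appears by chance.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if diff >= observed:
            at_least_as_extreme += 1
    # Add 1 to both counts so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical end-of-term scores.
online_platform = [68, 74, 71, 77, 80, 73, 69, 76]
in_person = [62, 65, 70, 61, 66, 64, 68, 63]

observed_diff, p_value = permutation_test(online_platform, in_person)
ALPHA = 0.05  # assumed significance level

print(f"Observed difference in mean score: {observed_diff:.3f}")
print(f"Estimated p-value: {p_value:.4f}")
if p_value < ALPHA:
    print("Reject the null hypothesis (H0).")
else:
    print("Not enough evidence to reject the null hypothesis (H0).")
```

A small p-value means a difference this large would rarely arise by chance alone if the null hypothesis were true. It does not by itself prove that the platform caused the difference.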

Activity

Scenario

A Director of Teaching and Learning aims to evaluate if a new teaching method enhances student performance compared to traditional methods. To investigate this, the director implements the new teaching method in a few selected classes while retaining traditional methods in others. Data on student performance, such as grades, participation, and retention rates, are collected over a term. This involves formulating hypotheses, collecting data, and interpreting the results to determine the effectiveness of the new approach.

Can you think of a Null and Alternative Hypothesis which would suit this particular scenario?

Answer:

Whilst there are a few potential hypotheses, one example may be:

Null Hypothesis (H0):

The new teaching method does not significantly improve student performance compared to traditional methods. Any observed differences in performance are due to random chance.

Alternative Hypothesis (H1 or Ha):

The new teaching method significantly improves student performance compared to traditional methods. Any observed differences in performance are due to the efficacy of the new method.

Did you come up with an alternative?

Step 3: Performing Descriptive & Predictive Analysis

What is the difference between descriptive and predictive analysis?

There are many different approaches that we can use with data. You can even combine these approaches to provide a comprehensive understanding of the data.

Descriptive Analysis

This approach summarises historical data to understand past trends. For example, analysing average grades over the past five years can help identify patterns in student performance. Educational institutions can achieve this by collecting and organising grade data, then using statistical tools to summarise and interpret the results.
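As a small illustration, the sketch below summarises five years of grades using Python's built-in statistics module. The years and grades are invented for the example.

```python
import statistics

# Hypothetical grades for one module over five academic years.
grades_by_year = {
    2020: [55, 60, 65],
    2021: [58, 62, 66],
    2022: [60, 64, 68],
    2023: [59, 63, 70],
    2024: [62, 66, 70],
}

# Summarise each year with a few descriptive statistics.
summary = {
    year: {
        "mean": statistics.mean(grades),
        "median": statistics.median(grades),
        "stdev": statistics.stdev(grades),  # sample standard deviation
    }
    for year, grades in grades_by_year.items()
}

for year, stats in summary.items():
    print(
        f"{year}: mean={stats['mean']:.1f}, "
        f"median={stats['median']:.1f}, stdev={stats['stdev']:.1f}"
    )

# Change in the average grade between the first and last year.
first_year, last_year = min(summary), max(summary)
change = summary[last_year]["mean"] - summary[first_year]["mean"]
print(f"Change in mean grade {first_year} to {last_year}: {change:+.1f}")
```

Note that this only describes what has already happened. It does not say why the average changed or what will happen next.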

Predictive Analysis

Predictive analysis uses historical data to forecast future trends. For instance, predicting which students are at risk of dropping out based on their attendance, grades, and participation can enable proactive interventions. By analysing historical data, institutions can identify at-risk students early and implement targeted support programmes. Predictive analysis approaches should always be partnered with a prescriptive approach; in other words, one that focuses on potential solutions to support your intended outcome.

For example, if the predictive analysis highlights that at-risk students will improve with wider support, you would think about the actions that can be used to provide that support. Institutions can achieve this by developing intervention strategies and continuously monitoring development.
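A very simple version of this idea can be sketched as follows: estimate how often students in each attendance band dropped out in the past, then use those rates to flag current students. The records, the band boundaries (60% and 80%), the 0.5 risk threshold and the student labels are all assumptions for illustration. A real predictive model would normally use more variables and be validated before use.

```python
def attendance_band(attendance_pct):
    """Group an attendance percentage into an assumed band."""
    if attendance_pct < 60:
        return "low"
    if attendance_pct < 80:
        return "medium"
    return "high"


# Hypothetical historical records: (attendance %, dropped out?).
historical = [
    (45, True), (52, True), (58, True), (55, False),
    (62, False), (70, True), (75, False), (78, False),
    (82, False), (88, False), (91, False), (95, False),
]

# Count records and dropouts in each band.
counts = {}
for attendance, dropped_out in historical:
    band = attendance_band(attendance)
    total, dropouts = counts.get(band, (0, 0))
    counts[band] = (total + 1, dropouts + int(dropped_out))

# Historical dropout rate per band, used as the predicted risk.
dropout_rate = {band: d / t for band, (t, d) in counts.items()}
print("Historical dropout rate by band:", dropout_rate)

# Predictive step: apply the historical rates to current students.
current_students = {"Student A": 55, "Student B": 72, "Student C": 91}
RISK_THRESHOLD = 0.5  # assumed cut-off for offering extra support
at_risk = [
    name
    for name, attendance in current_students.items()
    if dropout_rate[attendance_band(attendance)] >= RISK_THRESHOLD
]
print("Flagged for proactive support:", at_risk)
```

The flagged list is where the prescriptive step begins: deciding which interventions to offer and monitoring whether they help.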

Together, these approaches allow educational institutions to create a dynamic, data-driven environment that adapts to student needs. By understanding past trends, anticipating future challenges, and implementing targeted solutions, universities can foster continuous improvement, leading to better student outcomes and a more effective educational system.

Activity

Scenario

A university aims to improve student retention and academic performance by implementing a peer mentoring program. The institution recognizes the importance of providing additional support to students who may be struggling academically or socially. By establishing a peer mentoring system, where experienced students guide and assist their peers, the university hopes to foster a more collaborative and supportive learning environment. This program could enhance students’ sense of belonging and engagement, ultimately leading to better retention rates and academic success. The institution needs to understand the potential benefits and best strategies for execution.

Which analysis would be best for this scenario?

Answer:

In the provided scenario, a university aims to improve student retention and academic performance by implementing a peer mentoring programme. To understand the potential benefits and best strategies for execution, the most suitable type of analysis would be Predictive Analysis.

Why Predictive Analysis is the Best Fit

In this scenario, we would recommend predictive analysis as this leverages historical data to forecast future trends and outcomes. In the context of the peer mentoring program, this approach can provide insights into which students might benefit most from the program and predict its impact on student retention and academic performance.

By anticipating student needs and challenges, the university can proactively offer support, enhance student engagement, and ultimately improve retention and academic success. This analysis ensures that the program is not only well-informed but also adaptable to the evolving needs of the student body.

Step 4: Sensitivity and Scenario Testing

What does sensitivity and scenario testing mean?

Sensitivity and scenario testing are techniques used to evaluate the robustness of decisions by identifying potential weaknesses in assumptions or options. In higher education, these methods can be used to assess the impact of new approaches or changes.

To conduct sensitivity and scenario testing, we need to work out best, worst, and moderate case scenarios, which involves a systematic approach to evaluating different variables and their impacts. We have given you some advice below on elements to consider for the three different types of scenarios, to help you draw more accurate conclusions for data-informed decision making.

Best Case Scenario

In a best case scenario, we assume optimal conditions where all key variables align favourably. This could include variables such as high engagement rates, low costs, and full funding availability. To determine this scenario, we would need to:

Identify Key Variables: Determine the factors that will influence the outcome.
Assume Optimal Conditions: Consider the most favourable conditions for each variable (e.g., highest engagement, lowest costs).
Calculate Outcome: Use these optimal conditions to project the best possible outcome.

Tips:

Be Realistic: While optimistic, ensure the assumptions are still within the realm of possibility.
Consider External Factors: Include favourable external factors like market growth or technological advancements.

Worst Case Scenario

The worst case scenario involves unfavourable conditions where key variables are misaligned. This could include variables such as low engagement rates, high costs, and partial funding availability. To determine this scenario, we would need to:

Steps:

Identify Key Variables: Determine the factors that will influence the outcome.
Assume Adverse Conditions: Consider the worst possible conditions for each variable (e.g., lowest engagement, highest costs).
Calculate Outcome: Use these adverse conditions to project the worst possible outcome.

Tips:

Prepare for Contingencies: Develop strategies to mitigate risks identified in the worst case scenario.
Stress Test: Ensure your business can survive under these conditions.

Moderate Case Scenario

In a moderate case scenario, we assume average conditions where key variables are balanced. This could include variables such as moderate engagement rates, balanced costs, and sufficient funding availability. To determine this scenario, we would need to:

Identify Key Variables: Determine the factors that will influence the outcome.
Assume Average Conditions: Consider the most likely conditions for each variable (e.g., average engagement, typical costs).
Calculate Outcome: Use these average conditions to project the most probable outcome.

Tips:

Use Historical Data: Base your assumptions on historical performance and trends.
Adjust for Current Trends: Consider current market conditions and trends to refine your assumptions.

By performing sensitivity testing, where one key variable is adjusted at a time, and scenario testing, where multiple variables are changed simultaneously, both positive and negative impacts can be anticipated and strategies developed to address them. This comprehensive analysis allows for informed decision-making and effective implementation, ensuring resilience to varying conditions.

For example, a university may want to implement a programme to improve student performance. By identifying key variables such as student participation rates, programme costs, and funding availability, the university can create a base case scenario with the most likely values. Performing sensitivity testing by adjusting one key variable at a time shows how changes impact the programme's success, whereas scenario testing by changing multiple variables at once helps the university understand the best and worst case outcomes. This thorough analysis enables the university to make informed decisions and implement the programme effectively, ensuring it is resilient to varying conditions.
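The difference between the two techniques can be sketched with a small model. Everything in it is an assumption for illustration: the outcome measure (how many students the programme can support), the function name students_supported, the 200 eligible students, and the values chosen for each scenario.

```python
def students_supported(eligible, participation_pct, cost_per_student, funding):
    """Hypothetical outcome: students the programme can actually support.

    Limited either by how many students take part (demand) or by how many
    places the funding can pay for (capacity).
    """
    demand = eligible * participation_pct / 100
    capacity = funding / cost_per_student
    return min(demand, capacity)


ELIGIBLE_STUDENTS = 200  # assumed cohort size

# Assumed values for each key variable under each scenario.
scenarios = {
    "best": {"participation_pct": 80, "cost_per_student": 100, "funding": 20_000},
    "moderate": {"participation_pct": 50, "cost_per_student": 150, "funding": 15_000},
    "worst": {"participation_pct": 20, "cost_per_student": 250, "funding": 5_000},
}

# Scenario testing: change all variables together.
scenario_outcomes = {
    name: students_supported(ELIGIBLE_STUDENTS, **values)
    for name, values in scenarios.items()
}
print("Scenario testing:", scenario_outcomes)

# Sensitivity testing: start from the moderate (base) case and change
# one variable at a time to its worst and best value.
base = scenarios["moderate"]
sensitivity = {}
for variable in base:
    sensitivity[variable] = {}
    for case in ("worst", "best"):
        adjusted = dict(base, **{variable: scenarios[case][variable]})
        sensitivity[variable][case] = students_supported(ELIGIBLE_STUDENTS, **adjusted)

for variable, outcomes in sensitivity.items():
    print(f"Sensitivity to {variable}: {outcomes}")
```

In this made-up model, improving any single variable on its own does not raise the outcome above the base case, because another variable becomes the limiting factor. That kind of insight is what one-at-a-time sensitivity testing is intended to surface.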

Activity

Scenario

A Programme Leader aims to implement a peer-assisted learning support programme and needs to evaluate the risks and impacts of various assumptions. The evaluation process will involve identifying potential challenges, assessing resource requirements, and determining the effectiveness of peer-assisted strategies in enhancing student learning outcomes. Additionally, stakeholder feedback and pilot testing may be incorporated to ensure the programme meets educational standards and addresses any unforeseen issues.

Can you identify the potential best, worst and moderate case scenarios of the approach mentioned?

Answer:

All the answers below show different examples of what you may want to consider. Did you find wider variables that you would like to include?

Best case scenario

We experience high participation, low costs, and full funding. This combination ensures optimal resource utilization and maximum engagement from all stakeholders. It represents the ideal situation where everything aligns perfectly to achieve our goals.

Worst case scenario

We face low participation, high costs, and partial funding. This outcome poses significant challenges as it requires managing limited resources with minimal engagement. The disparity between needs and available support creates obstacles that must be addressed strategically.

Moderate case scenario

This involves average participation, moderate costs, and partial funding. This balanced approach provides a realistic outlook where we can manage expenses effectively while ensuring reasonable levels of involvement. Although not ideal, it offers a practical path forward that allows us to make progress despite certain limitations.

Step 5: Reflection

What have I discovered from this learning topic?

This step is designed to help you think about what you have learned and how this applies to your own practice and context. The activity will ask you some questions to help you with this reflection.

Activity

Reflect

Use the following questions to help you think about your own practice.

Can you think of any data where these techniques would be useful?
Have you used any of these techniques before and what was the output of those techniques?
Do you have a list of common scenarios that can be used for sensitivity and scenario testing?
If you want to do scenario testing, can you think of any wider departments which it would be useful to get input from to ensure the scenario is suitable?

