Note: This article has been adapted from an Advanced Performance Management article. Whilst the article does not use examples related to sustainability, it does provide key information in relation to types and methods of data analytics and the ethical considerations around data. Data analytics is widely used in sustainability matters, so understanding the approaches is useful; they can then be applied to scenarios with a sustainability focus.
Types of data analytics
The amount of data available to organisations has increased significantly in recent years. However, for this data to be valuable to organisations, they need to be able to analyse it, to identify patterns and trends in it. Data analytics allows organisations to do this.
This article discusses the types of data analytics organisations could use, and the benefits from using them.
As the notion of ‘big data’ highlights, the amount of data available to businesses (volume; variety) is greater than ever before, and the speed with which it is available (velocity) is faster than ever before. Consequently, business decisions are increasingly data-driven, as organisations can use data from a greater variety of sources, and dive deeper into analysis than they have historically been able to.
Data analytics is the process of collecting and examining data, in order to extract meaningful business insights, which can be used to inform decision-making to improve performance. This can be done through a variety of methods, such as statistical analysis (e.g. regression analysis) and machine learning¹, and organisations are likely to use data analytics software (such as Tableau; Microsoft Power BI; or Qlik Sense) to facilitate their analysis. Although the detailed analysis and modelling of data takes place in the software, accountants need to be able to interpret the information provided by the software; for example, assessing whether patterns or trends identified by the software seem realistic or plausible.
Descriptive, diagnostic, and predictive analytics
Each type of analytics serves a different purpose but can be used in conjunction with the others to help an organisation gain a full understanding of the story its data tells. Descriptive analytics is typically carried out first.
Descriptive analytics
Descriptive analytics uses current and historical data to identify trends and relationships; in effect, to identify what has happened. Descriptive analytics is relatively accessible, and basic statistical software – such as Microsoft Excel – can be used to highlight trends in data and to produce visualisations, such as line graphs, bar charts or pie charts.
Nonetheless, descriptive analytics can be very useful in communicating changes over time, and identifying patterns and trends, which can then be analysed further.
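As a minimal sketch of descriptive analytics, the Python below summarises what has happened in a small set of sales records: total sales per month and the month-on-month percentage change. The records, field layout and function names are hypothetical and chosen purely for illustration; in practice this kind of summary would more likely be produced in a spreadsheet or analytics package.

```python
from collections import defaultdict

# Hypothetical sales records: (month, product, revenue).
records = [
    ("2024-01", "Product A", 120),
    ("2024-01", "Product B", 80),
    ("2024-02", "Product A", 150),
    ("2024-02", "Product B", 70),
    ("2024-03", "Product A", 180),
    ("2024-03", "Product B", 75),
]


def monthly_totals(rows):
    """Sum revenue per month: this shows what happened, not why."""
    totals = defaultdict(int)
    for month, _product, revenue in rows:
        totals[month] += revenue
    return dict(sorted(totals.items()))


def month_on_month_change(totals):
    """Percentage change in each month relative to the previous month."""
    months = list(totals)
    return {
        later: round(100 * (totals[later] - totals[earlier]) / totals[earlier], 1)
        for earlier, later in zip(months, months[1:])
    }


totals = monthly_totals(records)
changes = month_on_month_change(totals)
print(totals)
print(changes)
```

The output describes the trend (sales are rising each month) but says nothing about its cause, which is the limit of descriptive analytics.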
ILLUSTRATIVE EXAMPLE 1 – Demand trends
The streaming provider, Netflix, gathers data on users’ behaviour, for example, what programmes they are watching, or what types of programmes they are watching. This allows Netflix’s team to analyse the data to determine which TV programmes and movies are trending at any given time, and Netflix then shares these trends with users.
Not only does this allow users to see what is popular – and to get suggestions about what they might enjoy watching – it also allows the Netflix team to know what types of programmes, themes, or actors are particularly popular at a given time. This knowledge can then help to shape decision-making about future programmes to commission and can also be used in re-targeting campaigns (recommendations of programmes for viewers to watch).
Once an organisation understands what has happened (descriptive analytics), it will want to understand why. That is where diagnostic analytics comes in.
Diagnostic analytics
Diagnostic analytics is the process of using data to determine the causes of trends. Understanding why a trend is developing, or why a problem occurred, is very important when making decisions. However, there is often more than one contributing factor to any given trend. Diagnostic analytics helps organisations understand the range of factors – internal and external – which affect outcomes, and which have the greatest impact, so that managers can focus on these when developing initiatives to improve performance.
It often involves the use of statistical software tools and can involve a variety of techniques including data drilling and data mining, regression analysis and time series analysis². To investigate the root cause of trends, organisations may need to examine a wider range of data sources than they have historically examined, including non-financial as well as financial, and external as well as internal.
Data drilling: The most common type of data drilling is drilling down. Drilling down into summary information can reveal more detail about the data which is driving trends at the summary level. For example, a consolidated sales report may show that overall sales have increased but drilling down into this could reveal that sales of some products are rising rapidly while others are falling. Similarly, the sales team could drill down to get a more detailed view of sales in individual regions, or across different sales channels. This analysis could then help the sales team to decide where to focus its resources to maximise growth going forwards.
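The drill-down described above can be sketched in a few lines of Python. The sales figures, the dimension names and the summarise function are all hypothetical; the point is only that adding a dimension to the grouping reveals detail hidden in the consolidated total.

```python
from collections import defaultdict

# Hypothetical sales lines: (period, product, region, revenue).
sales = [
    ("2023", "Product A", "North", 100),
    ("2023", "Product A", "South", 100),
    ("2023", "Product B", "North", 150),
    ("2023", "Product B", "South", 150),
    ("2024", "Product A", "North", 180),
    ("2024", "Product A", "South", 160),
    ("2024", "Product B", "North", 120),
    ("2024", "Product B", "South", 110),
]

# Position of each dimension within a sales line.
FIELDS = {"period": 0, "product": 1, "region": 2}


def summarise(rows, *dimensions):
    """Total revenue grouped by the chosen dimensions.

    Each extra dimension drills down one level below the summary.
    """
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[FIELDS[name]] for name in dimensions)
        totals[key] += row[3]
    return dict(totals)


summary = summarise(sales, "period")                # consolidated view
by_product = summarise(sales, "period", "product")  # drill down by product
print(summary)
print(by_product)
```

In this illustrative data the consolidated view shows sales rising from 500 to 570, while the product-level view shows Product A growing and Product B declining.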
ILLUSTRATIVE EXAMPLE 1 – Analysing staff turnover
A company’s human resources information showed that one department was hiring significantly more people than any other department, but there was no net increase in the department’s head count, because its staff turnover rate was also much higher than the other departments. Drilling down into the data revealed that many of the positions were for a specific team, which paid its staff less than the industry average. Having identified this, the company reviewed its pay scales for that team, and took other measures to improve retention in the team.
ILLUSTRATIVE EXAMPLE 2 – Understanding customer demand
HelloFresh’s business model centres around selling and delivering pre-packaged meal boxes, each containing the exact ingredients required for a particular meal, including fresh produce, meat, dairy, and seasonings, on a subscription basis.
The perishable nature of the fresh food in the meal kits presents HelloFresh with a daily supply chain challenge: ordering the right amounts of products so that customers receive the ingredients they want, when they want them, whilst avoiding food waste because of over-ordering.
As with any subscription-based business, customer retention is vital, because customer churn erodes revenue and profits, and replacing lost customers adds the cost of attracting and acquiring new ones. Keeping customers satisfied requires an understanding of their preferences, including what recipes, ingredients, and meals each household favours.
By collecting data from customers (for example, when they favour certain types of food over others, when they want to receive their orders) HelloFresh has developed algorithms which it can use to help forecast demand more effectively. At an individual level, if particular customers habitually eat fish on a Friday, this enables HelloFresh to target tailored seafood offers in the latter part of the week. At a wider level, analysing food trends across different regions can inform the marketing of special offers to customers in those regions.
Having an improved understanding of its customers allows the company to tailor product selections to them, limiting the amount of product it needs to order to meet customer requirements, without generating waste.
If customers want to cancel their subscription, during the cancellation process they are asked to provide their reason for cancelling. By also gathering this data, HelloFresh can analyse the most frequent reasons for losing customers – in different regions, or among different demographic groups, as well as at an overall level. In turn, understanding why people are cancelling their subscriptions can help HelloFresh to improve its product and user experience, to help it retain more customers.
Overall, the insights gained from analysing customer demand and customer feedback have helped HelloFresh increase profit margins, improve order volume forecasts, and increase customer retention.
Diagnostic analytics is not only about statistics, though. It also involves thinking laterally, considering external factors that might be impacting the patterns in data, and finding additional sources of data to help build a broader picture. For example, a clothing brand may see an unexpected surge in sales if one of its products is worn by a high-profile celebrity or promoted by a celebrity influencer.
However, it is also important to use professional scepticism when looking at data; for example, asking whether the results and analysis fit with your understanding of a situation. One of the limitations of any kind of data analytics is the quality of the underlying data. If that data is not accurate or current, then the resulting analysis cannot be relied on either.
More widely, we also need a note of caution here, and to be aware of the potential limitations of diagnostic analytics, in particular:
- It relies on past data, which limits its ability to draw conclusions about possible future events; past performance is not necessarily indicative of future results.
- Regression analysis and correlation analysis examine how strongly different variables are linked to each other. However, correlation does not necessarily imply causality.
ILLUSTRATIVE EXAMPLE – Correlation not causality
Monthly data for ice cream sales and the number of shark attacks around the United States shows the two variables are highly correlated, increasing in the summer months and falling in the winter months.
However, this does not mean that eating ice cream causes shark attacks. Rather, people consume more ice cream when it is warmer, and also swim in the ocean more when it is warmer, which explains why the two variables are correlated. But although they are highly correlated, one does not cause the other.
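The point can be illustrated with a short Python sketch. The monthly figures below are invented for illustration (they are not real sales or shark attack data), and the Pearson correlation coefficient is calculated by hand so that the example needs no external libraries.

```python
from math import sqrt

# Hypothetical monthly figures, January to December (illustrative only).
temperature = [5, 7, 11, 15, 20, 25, 28, 27, 23, 16, 10, 6]
ice_cream_sales = [20, 24, 32, 40, 50, 60, 66, 64, 56, 42, 30, 22]
shark_attacks = [0, 0, 1, 2, 3, 5, 6, 5, 4, 2, 1, 0]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# The two outcomes move together...
r_outcomes = pearson(ice_cream_sales, shark_attacks)
# ...because each one follows a third variable, temperature.
r_temp_ice = pearson(temperature, ice_cream_sales)
r_temp_shark = pearson(temperature, shark_attacks)
print(round(r_outcomes, 2), round(r_temp_ice, 2), round(r_temp_shark, 2))
```

A high coefficient between ice cream sales and shark attacks only shows that the two series rise and fall together. It is the analyst's knowledge of the situation (warm weather drives both) that identifies temperature as the common cause.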
Predictive analytics
Once you know what happened in the past (descriptive analytics), and you understand why it happened (diagnostic analytics) you can begin to try to predict what is likely to happen in the future based on that information.
Predictive analytics is the process of using data to help understand how trends might unfold in the future, and to help predict future trends and events. It uses statistics, computer modelling and machine learning to determine the probability of various outcomes. Predictive analytics still uses historical data but does so to help predict what is likely to happen in the future.
For example, let us say a mobile phone provider has experienced an increase in customer churn. Diagnostic analytics has revealed this is because the organisation’s promotional deals were not incentivising customers to renew their phone contracts. Predictive analytics could then be used to help predict what kind of promotional deals will result in more renewals. For example, if the diagnostic analytics revealed that the upfront cost was a key factor in influencing customers’ decisions, but customers were less concerned about monthly fees, the company could then use predictive analytics to forecast the impact of different combinations of upfront costs and monthly fees, and which are likely to be more attractive to customers.
Equally, in the retail industry, applying predictive analytics to historical data, market data, demographic data, behavioural data, browsing trends, and more, can help retailers make more accurate demand forecasts. In turn, these forecasts can help inform inventory management, staffing decisions, and advertising campaigns.
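One of the simplest predictive techniques is to fit a straight-line trend to historical data and extend it forward. The Python below is a minimal sketch of this using ordinary least squares; the demand figures are hypothetical, and a real retail forecast would draw on many more variables (market, demographic and behavioural data) than time alone.

```python
# Hypothetical units sold in each of the last six periods.
periods = [1, 2, 3, 4, 5, 6]
units_sold = [102, 108, 121, 129, 141, 149]


def fit_trend(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_trend(periods, units_sold)
# Extend the fitted line one period beyond the historical data.
forecast_period_7 = intercept + slope * 7
print(round(forecast_period_7, 1))
```

The forecast is only as reliable as the assumption that the past trend continues, which is why the accountant's judgement about its plausibility still matters.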
Predictive analytics not only forecasts possible future outcomes but also identifies the likelihood of those events happening. In doing so, it helps organisations with better planning and realistic target setting, as well as avoiding unnecessary risk. One of the most valuable forms of predictive analytics is ‘what-if’ analysis, which involves changing variables or factors to see how those changes will affect the outcome.
For example, two of the key factors that could affect a sales forecast are economic conditions and the industry environment. ‘What-if’ analysis could help an organisation to understand the potential impact of different scenarios on its sales forecasts:
- How fast is the economy growing (or shrinking)? What impact will an increase (or decrease) in economic growth have on sales?
- Are new competitors entering the market? If so, how likely are they to take some of the organisation’s market share? And how much might they capture?
- Are there any opportunities to gain new customers (e.g. from launching a new product, or moving into new markets)?
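The questions above can be turned into a simple ‘what-if’ model. The Python below is a sketch only: the forecasting formula, the sensitivity of sales to economic growth, and every figure in the scenarios are assumptions made for illustration, not an established forecasting method.

```python
def forecast_sales(base_sales, economic_growth, growth_sensitivity=1.5,
                   share_lost_to_entrants=0.0, new_customer_sales=0.0):
    """Illustrative sales forecast under one set of assumptions.

    growth_sensitivity is an assumed multiplier: how strongly sales
    respond to a change in economic growth.
    """
    market_adjusted = base_sales * (1 + growth_sensitivity * economic_growth)
    retained = market_adjusted * (1 - share_lost_to_entrants)
    return retained + new_customer_sales


# Each scenario changes one or more variables to see the effect on sales.
scenarios = {
    "base case": dict(economic_growth=0.02),
    "recession": dict(economic_growth=-0.03),
    "new competitor": dict(economic_growth=0.02, share_lost_to_entrants=0.10),
    "new product launch": dict(economic_growth=0.02, new_customer_sales=50_000),
}

results = {
    name: round(forecast_sales(1_000_000, **inputs))
    for name, inputs in scenarios.items()
}
for name, value in results.items():
    print(f"{name}: {value:,}")
```

Comparing the scenarios side by side shows which variables the forecast is most sensitive to, which helps with planning and with setting realistic targets.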
ILLUSTRATIVE EXAMPLE – Predictive analytics and dynamic pricing in hotels
The aim of revenue management (RM) in the hotel industry is to have the right room for the right person at the right time. RM uses analytics and customer data to help predict customer behaviour, thereby helping hotels to forecast demand and optimise prices.
Hotel chains have information systems which integrate data such as historical and current reservations, occupancy, and daily rates to forecast demand. Additionally, these systems incorporate external data – such as the weather – and analyse competitor pricing, the presence of major events (music or sports events) held in local areas, and booking patterns on other sources, to suggest optimal room rates.
As a result, the hotels can implement dynamic pricing strategies, automatically adjusting room rates so that they increase at times when demand is forecast to be high but reducing them when demand is expected to be lower (to try to increase occupancy rates during these quieter periods).
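A heavily simplified sketch of this logic is shown below in Python. It is not how any particular hotel system works: the forecasting rule (average occupancy on comparable past dates plus an uplift for local events), the uplift itself, and the price floor and ceiling are all assumptions made for illustration.

```python
from statistics import mean


def forecast_occupancy(comparable_history, local_event=False, event_uplift=0.15):
    """Forecast occupancy (0.0 to 1.0) from comparable past dates.

    event_uplift is an assumed increase in demand when a major
    event takes place nearby.
    """
    forecast = mean(comparable_history) + (event_uplift if local_event else 0.0)
    return min(max(forecast, 0.0), 1.0)


def suggest_rate(base_rate, occupancy, floor=0.8, ceiling=1.5):
    """Scale the base rate between an assumed floor and ceiling multiplier."""
    multiplier = floor + (ceiling - floor) * occupancy
    return round(base_rate * multiplier, 2)


# Hypothetical occupancy on the same weekday in previous weeks.
history = [0.60, 0.70, 0.65]

quiet_rate = suggest_rate(100.0, forecast_occupancy(history))
event_rate = suggest_rate(100.0, forecast_occupancy(history, local_event=True))
print(quiet_rate, event_rate)
```

The rate rises when forecast demand is high and falls towards the floor when demand is expected to be low, mirroring the dynamic pricing behaviour described above.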
However, as with diagnostic analytics, we still need to exercise caution here about the potential limitations of predictive analytics: although it is forward looking, it still relies on past data, which could limit its ability to draw conclusions about possible future events, as past performance is not necessarily indicative of future results.
Methods of data analytics, and ethical issues in data analysis
This article will now look at methods of analytics which can help to identify patterns and trends in different types of data. It will also highlight the potential ethical issues around capturing and analysing personal information, and the need for organisations to behave ethically when doing so.
One of the defining characteristics of big data is that it comes from a great variety of sources, both internal and external, and structured and unstructured. Analysing this variety of data types requires distinct processing capabilities and specialist algorithms. However, such analysis could help organisations identify performance issues or trends which they would historically not have been able to identify.
These are some methods of analytics:
- Text (e.g. emails; social media posts)
- Image
- Video
- Voice (e.g. customers’ conversations with a customer support centre)
- Sentiment analysis
Note: Sentiment analysis can be used within text analytics and voice analytics, so it will be discussed in the context of each of them. However, we will now consider the other four methods of data analytics in turn.
Text analytics
Text analytics involves large volumes of text (like emails, social media posts, customer support tickets) being translated into quantitative data to uncover trends or insights in the text.
Having tagged responses (according to the key words in them, and whether their tone is positive, neutral, or negative), text analytics can uncover patterns and insights across a dataset and create charts or reports to display the results.
For example, text analytics tools can be used to identify the main topics or issues being discussed in product reviews (‘topic detection’), or to identify people’s attitudes to a brand or product on social media (‘sentiment analysis’).
Sentiment analysis is a natural language processing technique used to determine whether data is positive, negative, or neutral. Sentiment analysis is often performed on textual data, and by tracking the tone, intent, and emotion behind messages it can reveal how positive or negative customers feel about a business, its products and services, or what customers feel about a business’ competitors, and their products and services.
Consider the following illustration: ‘I needed to go into the bank branch today, because I could not complete the transaction online. There was a long queue, but there were only two cashiers working, so it took forever to get served.’
The emotion behind the comment that ‘it took forever to get served’ is one of dissatisfaction and frustration, so would be identified as negative within sentiment analysis software. However, it could also provide a useful insight to the bank, about the need to monitor the number of cashiers working at different times of day, to reduce the length of time customers typically have to wait before they are served.
Text analytics can also be useful to identify patterns in the content of the text, or topics which are being discussed most frequently. For example, if there has been a sudden increase in negative feedback about a product, text analytics can be used to help understand the reasons behind this, by identifying key words or phrases which recur most frequently in the customer feedback. Having identified this, a business can take action to improve the aspects of the product which are causing the complaints.
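The two ideas discussed above, tagging the tone of a comment and finding the words that recur most often, can be sketched in Python as follows. This is a deliberately naive illustration: the keyword lists are hypothetical, and commercial text analytics tools rely on trained natural language processing models rather than simple word matching.

```python
import re
from collections import Counter

# Hypothetical keyword lists, chosen only for this illustration.
NEGATIVE = {"poor", "slow", "forever", "queue", "late", "damaged"}
POSITIVE = {"helpful", "great", "quick", "excellent", "informed"}


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(text):
    """Tag text as positive, negative or neutral by counting keywords."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def frequent_terms(comments, top_n=3):
    """Most common words across comments, ignoring very short words."""
    counts = Counter(
        word for comment in comments for word in tokenize(comment) if len(word) > 3
    )
    return counts.most_common(top_n)


bank_comment = (
    "There was a long queue, but there were only two cashiers working, "
    "so it took forever to get served."
)
feedback = [
    "Delivery was late again",
    "The delivery box arrived damaged",
    "Late delivery and no update",
]
print(sentiment(bank_comment))
print(frequent_terms(feedback))
```

Here the bank comment is tagged as negative, and ‘delivery’ emerges as the most frequent term in the feedback, pointing to where the business should investigate first.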
ILLUSTRATIVE EXAMPLE – Phone retailer
CallHi, a company which manufactures and retails smartphones, has noticed that its revenue has fallen recently. The company has used text analytics to try to help identify the reasons for this, analysing customers’ comments on social media.
The company’s analytics software uses key words or phrases to categorise comments, according to the nature of their content, and whether they were positive, negative, or neutral. For example, a post saying, ‘Battery life in the new CallHi8 is very poor’ is tagged to ‘Product performance’ and ‘Negative’; while a post saying, ‘The advisor I spoke to was well informed and helpful’ is tagged to ‘Customer support’ and ‘Positive’.
The text analytics results for the last month are shown in the graph below.