Data analytics

Part 2: Methods of data analytics, and ethical issues in data analysis

This is the second of two articles about data analytics. The first article looked at different types of data analytics; this article looks at methods of analytics which can help to identify patterns and trends in different types of data.

This article also highlights the potential ethical issues around capturing and analysing personal information, and the need for organisations to behave ethically when doing so.

Methods of data analytics

In this second article about data analytics, we look at the methods of analytics which organisations could use to help identify patterns and trends in data. 

In the first article, we mentioned that, in order to investigate the root cause of trends (in diagnostic analytics), organisations may need to examine a wide range of data sources, including non-financial data as well as financial. This article now looks at some of those non-financial data sources.  

One of the defining characteristics of big data is that it comes from a great variety of sources, both internal and external, and both structured and unstructured. This variety of data types requires distinct processing capabilities and specialist algorithms to analyse them. However, such analysis could help organisations identify performance issues or trends which they would historically not have been able to identify.

The APM syllabus mentions the following methods of data analytics, and you could be required to discuss how they could be used in helping an organisation understand and manage performance:  

  • Text (eg emails; social media posts)
  • Image
  • Video
  • Voice (eg customers’ conversations with a customer support centre)
  • Sentiment analysis  

Note: Sentiment analysis can be used within text analytics and voice analytics, so it will be discussed in the context of each of them. However, we will now consider the other four methods of data analytics in turn.  

Text analytics

Text analytics involves large volumes of text (like emails, social media posts, customer support tickets) being translated into quantitative data to uncover trends or insights in the text.

Having tagged responses (according to the key words in them, and whether their tone is positive, neutral, or negative), text analytics can uncover patterns and insights across a dataset and create charts or reports to display the results.

For example, text analytics tools can be used to identify the main topics or issues being discussed in product reviews (‘topic detection’), or to identify people’s attitudes to a brand or product on social media (‘sentiment analysis’).

Sentiment analysis is a natural language processing technique used to determine whether data is positive, negative, or neutral. Sentiment analysis is often performed on textual data, and by tracking the tone, intent, and emotion behind messages it can reveal how positive or negative customers feel about a business, its products and services, or what customers feel about a business’ competitors, and their products and services.

Consider the following illustration: ‘I needed to go into the bank branch today, because I couldn’t complete the transaction online. There was a long queue, but there were only two cashiers working, so it took forever to get served.’  

The emotion behind the comment that ‘it took forever to get served’ is one of dissatisfaction and frustration, so would be identified as negative within sentiment analysis software. However, it could also provide a useful insight to the bank, about the need to monitor the number of cashiers working at different times of day, to reduce the length of time customers typically have to wait before they are served.  

Text analytics can also be useful to identify patterns in the content of the text, or topics which are being discussed most frequently. For example, if there has been a sudden increase in negative feedback about a product, text analytics can be used to help understand the reasons behind this, by identifying key words or phrases which recur most frequently in the customer feedback. Having identified this, a business can take action to improve the aspects of the product which are causing the complaints.  
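As a rough illustration of how recurring key words might be surfaced, the short Python sketch below counts the most frequent words across a set of negative comments. It is a minimal sketch only: the comments, the stop-word list and the function name top_terms are all hypothetical, and real text analytics tools use far more sophisticated language processing.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much fuller one.
STOP_WORDS = {"the", "is", "a", "and", "after", "only", "too", "few"}


def top_terms(comments, n=3):
    """Return the n most frequent words across the comments, ignoring stop words."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts.most_common(n)


# Invented examples of negative customer feedback.
negative_feedback = [
    "The battery drains too quickly",
    "Battery is flat after only a few hours",
    "Screen cracked and the battery overheats",
]
most_common = top_terms(negative_feedback, n=1)
```

In this invented data set, 'battery' appears in every comment, so it would be the first issue for the business to investigate.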

WORKED EXAMPLE – Phone retailer
CallHi, a company which manufactures and retails smartphones, has noticed that its revenue has fallen recently. The company has used text analytics to try to help identify the reasons for this, analysing customers’ comments on social media.

The company’s analytics software uses key words or phrases to categorise comments, according to the nature of their content, and whether they were positive, negative, or neutral. For example, a post saying, ‘Battery life in the new CallHi8 is very poor’ is tagged to ‘Product performance’ and ‘Negative’; while a post saying, ‘The advisor I spoke to was well informed and helpful’ is tagged to ‘Customer support’ and ‘Positive’.    
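The tagging described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than a description of CallHi's actual software: the key word lists and the function names contains and tag_comment are hypothetical, and commercial analytics software would use much richer natural language processing than simple key word matching.

```python
import re
from collections import Counter

# Hypothetical key word rules, invented for illustration only.
CATEGORY_KEYWORDS = {
    "Product performance": ["battery", "screen", "camera"],
    "Customer support": ["advisor", "support", "helpline"],
}
POSITIVE_WORDS = ["helpful", "well informed", "excellent"]
NEGATIVE_WORDS = ["poor", "slow", "unhelpful"]


def contains(text, phrase):
    """True if the phrase appears in the text as whole words."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def tag_comment(comment):
    """Return a (category, sentiment) pair for one comment."""
    text = comment.lower()
    category = "Other"
    for name, keywords in CATEGORY_KEYWORDS.items():
        if any(contains(text, word) for word in keywords):
            category = name
            break
    positives = sum(contains(text, word) for word in POSITIVE_WORDS)
    negatives = sum(contains(text, word) for word in NEGATIVE_WORDS)
    if positives > negatives:
        sentiment = "Positive"
    elif negatives > positives:
        sentiment = "Negative"
    else:
        sentiment = "Neutral"
    return category, sentiment


posts = [
    "Battery life in the new CallHi8 is very poor",
    "The advisor I spoke to was well informed and helpful",
]
# Counting the tags gives the figures that a chart or report would display.
tag_counts = Counter(tag_comment(post) for post in posts)
```

Counting the (category, sentiment) pairs across a month of posts would give the kind of totals that could then be charted.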

The text analytics results for the last month are shown in the graph below.

[Graph: CallHi text analytics results for the last month]

You have been asked to advise CallHi how it could use the analytics results to help improve performance.

The analysis shows that the areas customers comment on most frequently are product performance and customer support. This suggests that these areas are likely to be key to the company’s success, so should be covered in CallHi’s critical success factors (CSFs).

The analysis suggests that poor product performance is a key factor in the recent fall in revenue. For example, customers may choose not to buy new CallHi8 phones because of their poor battery life, potentially buying phones from rival manufacturers instead. This suggests that CallHi needs to look at ways to improve product performance (for example, improving battery life) to address the issues causing dissatisfaction among customers.

By contrast, customers appear positive about customer support, so this could be helping to retain customers who might otherwise leave, thereby preventing the decline in revenue becoming even worse. As such, it will be important for the company to maintain its high levels of customer service.

Text analytics could also be applied in customer service. For example, an organisation could use text analytics to analyse the content of emails sent to customer support to help understand customers’ needs, and to help identify (and then address) issues which are the cause of the most frequent problems or queries. 

Using text analytics can help a business:

  • Improve customer satisfaction, by learning what their customers like and dislike about their products (and looking to enhance the things customers like, while addressing things they dislike)
  • Detect product or service issues, and help a business become more responsive to customer feedback or other negative sentiment
  • Monitor brand reputation

ILLUSTRATIVE EXAMPLE – Banks and customer experience
Banks use text analytics to analyse customer complaints. The banks’ analytics software uses natural language processing to analyse emails, survey responses and transcripts of customer calls to understand why a customer is complaining. They then use this insight to make changes to improve the customer experience going forwards.

Image analytics and video analytics

Image analytics uses algorithmic extraction and logical analysis to interpret information from images and graphics. In simple terms, image analytics is a computer’s ability to recognise elements pictured in an image. For example, analytics software identifies whether the elements in an image depict physical features, objects, or movement, and these can then be logically analysed by a computer.

Videos are a series of images, so similar principles apply for video analytics. The ability to interpret and understand images and videos could have important implications for business.

ILLUSTRATIVE EXAMPLE – Video analytics in retail
Retailers have traditionally used point-of-sale data to learn about customer behaviour, but the insights from that are restricted to transaction statistics: what products customers are buying, how much they are spending, how frequent their purchases are. However, retailers can use video analytics to get more insight into customers’ behaviour through their whole shopping experience: how long customers spend in the store, or how long they spend in different sections of the store; which areas of the store are visited most; what proportion of visitors enter a store and leave without making a purchase; what products customers are looking at but not buying.

The store’s management could then use these insights to try to help improve performance: for example, by introducing deals on products which are frequently browsed but not purchased, to convert customers’ interest into an actual sale.
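To illustrate how such an insight might be derived, the sketch below compares browsing counts (which, we assume here, come from video analytics) with purchase counts from point-of-sale data, and flags products with a low conversion rate. The product names, figures, 10% threshold and function name are all hypothetical.

```python
def low_conversion_products(browse_counts, purchase_counts, threshold=0.10):
    """Return (product, rate) pairs where purchases / browses is below the threshold."""
    flagged = []
    for product, browses in browse_counts.items():
        if browses == 0:
            continue  # no browsing recorded, so no rate can be calculated
        rate = purchase_counts.get(product, 0) / browses
        if rate < threshold:
            flagged.append((product, round(rate, 2)))
    # Lowest conversion rate first
    return sorted(flagged, key=lambda item: item[1])


# Invented figures for one week in one store.
browse_counts = {"headphones": 400, "phone cases": 250, "chargers": 120}
purchase_counts = {"headphones": 20, "phone cases": 100, "chargers": 60}
candidates_for_deals = low_conversion_products(browse_counts, purchase_counts)
```

Here only headphones fall below the threshold (20 purchases from 400 browses), so they would be the candidate for a promotional deal.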

Video analytics could also be useful in not-for-profit organisations, such as hospitals. For example, analytics systems could monitor patient rooms to keep track of the nursing staff. If a nurse hasn’t visited a patient within a certain amount of time, the system can notify the nursing team to check on a patient in a particular room.
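The notification rule described here could be sketched as follows. This is a simplified illustration only: it assumes the video analytics system has already logged the time of the last nurse visit to each room, and the two-hour limit, room names and function name are hypothetical.

```python
from datetime import datetime, timedelta


def rooms_needing_check(last_visits, now, max_gap=timedelta(hours=2)):
    """Return rooms whose last recorded nurse visit is older than max_gap."""
    return sorted(
        room for room, visited_at in last_visits.items()
        if now - visited_at > max_gap
    )


# Hypothetical visit times, as a video analytics system might log them.
last_visits = {
    "Room 1": datetime(2024, 1, 1, 9, 30),
    "Room 2": datetime(2024, 1, 1, 6, 45),
}
overdue = rooms_needing_check(last_visits, now=datetime(2024, 1, 1, 10, 0))
```

At 10:00, Room 2 was last visited more than two hours ago, so it is the only room that would be flagged to the nursing team.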

Voice analytics and speech analytics

We have already noted that text analytics can be used to analyse written text. However, organisations also have access to potentially valuable data from conversations with customers (for example, through contact centres), and they can use speech analytics and voice analytics to help analyse this data.

Speech analytics software focuses only on what was said (that is, recording and transcribing the words used in a conversation), to identify and tag key words across conversations; for example, between customers and an organisation’s agents in a contact centre (call centre).

Organisations can use speech analytics to identify customer experience trends. For example, late deliveries could be a key area of performance monitored by logistics companies, so could be tagged as key words in calls. Tracking the numbers of calls where customers are complaining about late deliveries could help the logistics companies monitor performance in this area and identify the need to improve performance if the number of complaints is increasing.  
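A minimal sketch of this kind of tracking is shown below. It assumes that the speech analytics software has already tagged each call with the key words it contains; the call log, tag names and function names are hypothetical.

```python
from collections import Counter


def weekly_tag_counts(calls, tag):
    """Count, for each week, the calls whose tags include the given tag."""
    counts = Counter(week for week, tags in calls if tag in tags)
    return dict(sorted(counts.items()))


def is_rising(counts_by_week):
    """True if the count has increased in every successive week."""
    values = list(counts_by_week.values())
    return len(values) >= 2 and all(
        later > earlier for earlier, later in zip(values, values[1:])
    )


# Hypothetical call log: (week number, key words tagged in the call).
calls = [
    (1, {"late delivery"}),
    (1, {"refund"}),
    (2, {"late delivery"}),
    (2, {"late delivery", "refund"}),
    (3, {"late delivery"}),
    (3, {"late delivery"}),
    (3, {"late delivery", "damaged"}),
]
late_by_week = weekly_tag_counts(calls, "late delivery")
needs_attention = is_rising(late_by_week)
```

In this invented log, complaints about late deliveries rise from one call to three over three weeks, which would signal a need to investigate delivery performance. Note that a week with no tagged calls would simply be absent from the counts in this simple version.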

Equally, having a better understanding of the topics which are important to customers can help organisations build stronger relationships with their customers – for example, by briefing agents in contact centres on ‘hot topics’ to help ensure they can deal with customer queries as efficiently and effectively as possible.

Moreover, some analytics software systems have real-time assistance functions that offer recommendations to agents when certain key word triggers occur during a call. For example, ‘refund’ could be identified as a key word. Therefore, if a customer asks a customer service agent about a refund during a call, the system could display real-time advice to the agent about the company’s policy on refunds or anything the customer needs to do to claim/receive a refund.
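The key word trigger could work along the following lines. This is a simplified sketch, assuming the call is transcribed as it happens; the guidance text, the 'cancel' trigger and the function name prompts_for are hypothetical additions for illustration.

```python
import re

# Hypothetical guidance notes, keyed by trigger word.
GUIDANCE = {
    "refund": "Explain the refund policy and what the customer must do to claim.",
    "cancel": "Offer to review the customer's account before cancelling.",
}


def prompts_for(utterance):
    """Return the guidance notes triggered by key words in one utterance."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    return [advice for trigger, advice in GUIDANCE.items() if trigger in words]


on_screen = prompts_for("Can I get a refund please?")
```

This simple version only matches exact words, so 'refunds' or 'refunded' would not trigger the advice; a real system would need to handle such variations.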

Voice analytics

While speech analytics focuses on what was said in a conversation, voice analytics focuses not only on what was said, but also on how it was said. Voice analytics software analyses the patterns of the conversations themselves to identify different features, such as tone of voice, volume, pitch and speed of a customer or agent’s speech. As such, voice analytics reveals the emotions within the content of the call. (This is another manifestation of sentiment analysis, which we mentioned earlier in relation to text analytics.)

ILLUSTRATIVE EXAMPLE – Calls to a contact centre
Using voice analytics to analyse customers’ calls to a contact centre can provide a more accurate reflection of a customer’s mood than could be obtained by simply analysing the transcript of the conversation (as would be the case in speech analytics).

For example, an extract of a conversation may be:

Agent: I’m afraid your order has been delayed, so you won’t receive it until next week now.

Customer: Great, so now I won’t be able to do anything until next week.

The word ‘great’ normally denotes a positive sentiment, but in this case, it is being used sarcastically, which changes the meaning of the word. Voice analytics solutions include features that identify the context and emotions of conversations.

As such, voice analytics is often used to improve the customer experience, rather than relying on speech analytics alone.
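The difference between the two approaches can be illustrated with the customer's reply in the extract above. In the sketch below, speech_only looks only at the words used, while with_voice also takes account of a tone score. The tone score (from -1 for angry to +1 for pleased), the 0.3 cut-off, the word list and the function names are all assumptions made for illustration; how real voice analytics software measures tone is not covered here.

```python
import re

# Hypothetical list of words treated as positive.
POSITIVE_WORDS = {"great", "thanks", "perfect"}


def speech_only(transcript):
    """Classify sentiment from the words alone (as in speech analytics)."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    return "positive" if words & POSITIVE_WORDS else "neutral"


def with_voice(transcript, tone_score, cut_off=0.3):
    """Let a clear tone of voice override the words (as in voice analytics).

    tone_score is an assumed measure from -1 (angry) to +1 (pleased).
    """
    if tone_score <= -cut_off:
        return "negative"
    if tone_score >= cut_off:
        return "positive"
    return speech_only(transcript)


reply = "Great, so now I won't be able to do anything until next week."
from_words = speech_only(reply)
from_words_and_tone = with_voice(reply, tone_score=-0.7)
```

Based on the words alone, the reply is classed as positive because it contains 'great'; once an irritated tone of voice is taken into account, it is classed as negative.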

We mentioned real-time assistance in relation to speech analytics, but this could also be an important feature in voice analytics. For example, if the software detects that a customer is becoming increasingly angry, or a customer service agent is struggling to answer a query, the software could highlight this on a supervisor’s dashboard, so that the supervisor can then intervene to help the agent with the call.

Ethical issues around capturing and processing data

Although using analytics software can help entities gather more information about their customers to understand customer behaviour more precisely, and to help drive decision-making, gathering this data (for example, by recording and transcribing phone conversations) could also raise significant ethical and privacy issues.

A key challenge for organisations to address is how they can collect, store, and use data ethically, and what rights they need to uphold. Data ethics encompasses the moral obligations of gathering, protecting, and using personally identifiable information, and how it affects individuals.

The following are key issues to consider when collecting and analysing data:

  • Ownership: Individuals (‘data subjects’) have ownership over their personal information. Therefore, it is unlawful and unethical to collect someone’s personal data without their consent. As such, if an organisation is going to collect data about any individuals (eg customers, employees) it needs to ask their permission to do so (for example, through digital privacy policies that ask users to agree to a company’s terms and conditions, or pop-ups with checkboxes that permit websites to track users’ online behaviour with cookies).

  • Transparency: In addition to owning their personal information, data subjects have a right to know how organisations plan to collect, store, and use that information. For example, if a company decides to implement an algorithm to personalise its website experience based on individuals’ buying habits and site behaviour, it should write a policy explaining that cookies are used to track users’ behaviour and that the data collected will be stored in a secure database and train an algorithm that provides a personalised website experience. It is a user’s right to have access to this information so they can decide whether to accept the site’s cookies or decline them.

  • Privacy: Another key ethical responsibility that comes with handling data is the data subjects’ privacy. Although an individual may have consented for an organisation to collect and store data, the organisation still has a responsibility to ensure that personal information is held securely to protect the individual’s privacy (for example, by storing data in a secure database, or by password protecting files containing personal information, or encrypting them).

  • Intention: Before collecting data, an organisation also needs to question why it needs that data, what it will gain from it, and what changes it will be able to make after analysing the data. An important issue, related to this, is that collecting and storing data when it isn’t necessary to do so is unethical. Therefore, organisations should strive to collect the minimum viable amount of data, so they take as little as possible from data subjects while optimising the overall service they offer to them. This presents an inherent conundrum in big data analytics though. On the one hand – as we have said previously – organisations are gathering more data about individuals than ever before. On the other hand, organisations should only be collecting data when it is necessary and should be looking to collect the minimum viable amount of data.

  • Outcomes: Even when intentions are good, the outcome of data analysis can cause inadvertent harm to individuals or groups of people.

This point about outcomes is particularly significant in relation to the use of algorithms in data analytics.

Ethical use of algorithms

Analytics software uses algorithms to sift through data and recognise patterns. Algorithms are sets of instructions that computers use for solving a problem or completing a task. However, although they are used by computers, algorithms are initially written by humans. Therefore, for the algorithm to work properly, the human programmer who writes it must include all the necessary rules and regulations. However, because algorithms are written by humans, bias may be present in them – intentionally or unintentionally.

Biased algorithms can cause serious harm to people, in particular by introducing prejudice against certain socio-economic or demographic groups. Two key ways that bias can creep into algorithms are:

  • Training: Machine-learning algorithms learn based on the data they are trained with. Therefore, an unrepresentative data set can cause an algorithm to favour some outcomes over others. As such, organisations need to ensure that training data is properly representative of the populations who will be affected by the algorithm.

  • Feedback: Algorithms also learn from users’ feedback. As such, they can be influenced by biased feedback. For instance, a job search platform may use an algorithm to recommend roles to candidates. If hiring managers consistently select candidates from one demographic group for specific roles, the algorithm will ‘learn’ and adjust and only provide job listings to candidates in that group in the future. The algorithm learns that when it provides the listing to people with certain attributes, it is ‘correct’ more often, which leads to an increase in that behaviour.
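One simple check on the training point above is to compare each group's share of the training data with its share of the population the algorithm will affect. The sketch below is a minimal illustration of that idea; the group names, figures, five-percentage-point tolerance and function name are hypothetical, and a check like this is only one of many steps needed to assess bias.

```python
def representation_gaps(training_counts, population_shares, tolerance=0.05):
    """Return groups whose share of the training data differs from their
    share of the population by more than the tolerance."""
    total = sum(training_counts.values())
    gaps = {}
    for group, expected_share in population_shares.items():
        actual_share = training_counts.get(group, 0) / total
        difference = actual_share - expected_share
        if abs(difference) > tolerance:
            gaps[group] = round(difference, 2)
    return gaps


# Invented figures: records in the training data, and population shares.
training_counts = {"Group A": 800, "Group B": 200}
population_shares = {"Group A": 0.5, "Group B": 0.5}
gaps = representation_gaps(training_counts, population_shares)
```

In this invented example, Group A makes up 80% of the training data but only 50% of the population, so it is over-represented by 30 percentage points and Group B is under-represented by the same amount.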

A further risk arises with ‘black box’ algorithms, whose inputs and operations are not visible to users or to the people affected by their decisions. Such an algorithm takes a number of data points as inputs and correlates specific data features to produce an output.

However, because the workings of the software cannot easily be viewed or understood, errors can go unnoticed until they cause problems so large that it becomes necessary to investigate them. This is particularly true in relation to bias.

ILLUSTRATIVE EXAMPLE – Predictive policing
PredPol (which is short for ‘Predictive Policing’) is an artificial intelligence algorithm, used by police departments in the USA, which aims to predict where crimes will occur in the future, based on crime data collected by the police (eg arrest counts, number of calls made to police).

PredPol aims to reduce the human bias in police departments, by leaving crime prediction to artificial intelligence. However, researchers discovered that PredPol itself was biased, and it repeatedly sent police officers to particular neighbourhoods that contained a large number of minority groups, regardless of the level of crime in those areas relative to other areas.

Arrest data (which is one of the data sets used by the algorithm) biases predictive tools because, overall, it appears that police arrest more people in neighbourhoods with large numbers of minority groups, compared to neighbourhoods with fewer minority groups. As a result, the PredPol algorithm directs more police to the areas with large numbers of minority groups, which in turn leads to more arrests being made in them, compared to areas where there are fewer police, even though crimes may be being committed in these areas as well.

However, one of the reasons for the higher numbers of arrests being made in the neighbourhoods with large numbers of minority groups was the higher police concentration in them, compared to other neighbourhoods. Consequently, this led to a self-reinforcing bias in the algorithm, which continued to send more police to all regions with a large number of minority groups, and meant that police departments were over-policing areas which didn’t actually need a higher police presence.
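The self-reinforcing effect can be illustrated with a deliberately simplified toy model. It is not a representation of how PredPol actually works. The model assumes two areas with identical underlying crime, an allocation rule that sends 80% of patrols to whichever area has the most recorded arrests, and arrests that depend only on the number of patrols present. All names and figures are hypothetical.

```python
def simulate_patrols(recorded_arrests, rounds=5, patrols=100,
                     hot_spot_share=0.8, arrests_per_patrol=0.5):
    """Return the hot spot's share of all recorded arrests after each round.

    Assumes at least two areas, each with the same underlying crime rate.
    """
    recorded = dict(recorded_arrests)
    shares = []
    for _ in range(rounds):
        # The 'algorithm' treats the area with most recorded arrests as the hot spot.
        hot_spot = max(recorded, key=recorded.get)
        others = [area for area in recorded if area != hot_spot]
        hot_patrols = patrols * hot_spot_share
        other_patrols = (patrols - hot_patrols) / len(others)
        # Arrests depend on police presence, not on any difference in crime.
        recorded[hot_spot] += hot_patrols * arrests_per_patrol
        for area in others:
            recorded[area] += other_patrols * arrests_per_patrol
        shares.append(round(recorded[hot_spot] / sum(recorded.values()), 3))
    return shares


# Area A starts with 60% of historical arrests, purely for illustration.
shares = simulate_patrols({"Area A": 60, "Area B": 40})
```

Even though both areas are assumed to have the same level of crime, Area A's share of recorded arrests rises in every round, because the extra patrols sent there generate the extra arrests that then justify sending more patrols.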

The ethical issues around collecting and analysing data, and the use of algorithms, could also be key issues to consider in the context of an APM exam scenario. If the scenario identifies that an organisation is looking to introduce an analytics system, it is important to consider whether there are sufficient controls over the way data is captured, stored, and used to ensure the process is ethical, and that it avoids bias as far as possible.

Conclusion

These two articles on data analytics have discussed how organisations can analyse the increasing amounts of data available to them to identify trends and patterns in it. This second article has discussed the variety of different sources of data which could be analysed, in helping an organisation understand performance trends or issues (diagnostic analytics), and in helping to predict how trends will unfold in future (predictive analytics).  

However, this second article has also highlighted the need for organisations (and their staff) to appreciate the potential ethical issues in collecting and analysing data, to ensure that they behave ethically – and legally.

Written by a member of the APM examining team