• Predictive Analytics
  • April 04, 2017

Predictive Analytics and Data Science – What Is It All About?

As phenomena, predictive analytics and data science are hardly new. However, we live in an analytical era in which people are thirsty for information, and that information can be accessed and analyzed more easily and in greater quantities than ever before. Predictive analytics and data science offer answers to this thirst. As is evident from the word cloud below, the theme comes with heaps of buzzwords, and it is easy to get confused by them. I wish to clear up some of that confusion and explain what it is all about.

At its simplest, data science means explaining and analyzing phenomena from data with the help of advanced calculation methods, that is, mathematical, statistical and machine learning models. These models are used to identify the essential information and to present it in a comprehensible format.

The purpose of explaining and analyzing phenomena is often to make reliable predictions about future events. This is what is known as predictive analytics. To clarify, these predictions are produced by models and should not be confused with the human-made forecasts that may be familiar from sectors such as financial administration.

Usage expanding rapidly

For some industries and companies, data science and predictive analytics are part of normal, everyday activities. These are analytical industries (e.g. banking and insurance) in which fact-based knowledge management is deeply embedded in the corporate culture. Most industries, however, are only now waking up to the idea of a fact-based culture, where the aim is to support decision-making with as comprehensive a set of information as possible. This raises the following types of questions in the company:

  • Which factors influence decision-making and what is the influence of those factors?
  • How are the factors in question taken into consideration and is there some value in following them?
  • What consequences does a specific decision have?
  • How can the likelihood of success be maximized?

In a company with a fact-based culture, people understand that there is usually not enough time or expertise to assess all the factors influencing operations and decisions. They also understand the value of exploiting the small inefficiencies in business operations and decision-making that go unnoticed by workers or competing companies. This is done by relying on the results produced by analytical models.

Question, learn, utilize

A fact-based culture does not mean that the company needs brainiacs or an army of statisticians. Instead, it means that the company needs to question its own methods. We, the data scientists at Valamis, often assume this role of questioner with our customers.

Let us look at an example. Customers are at the center of a company's operations. Information on customers is collected in the company's databases, CRM and ERP systems. Customers send feedback and e-mails to the company. Customers are active on public forums and in social media. How much of this available information are we actually using to understand the needs and behavior of our customers?

Customers leave traces in a number of places, and with the help of data scientists there are many opportunities to examine data to which we would not otherwise pay attention. Often we have simply assumed that some piece of information is not valuable. We can help identify the valuable information and work out how to utilize it to, for instance, improve customer encounters. We can also help you process English or Finnish text through automation and sift through enormous quantities of text to identify data that is essential to the company's operations. This is known as text analytics.
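
To make the idea of text analytics a little more concrete, here is a minimal sketch in Python that counts the most frequent terms in a handful of customer messages. The messages, the stop-word list and the function name `top_terms` are invented for illustration; real text analytics would typically add language-specific processing on top of something like this.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a fuller, language-specific one.
STOPWORDS = {"the", "a", "is", "was", "and", "to", "my", "i", "it", "of", "in", "for", "very"}

def top_terms(messages, n=3):
    """Return the n most frequent non-stop-word terms across all messages."""
    words = []
    for message in messages:
        # Lowercase the text and keep runs of letters (including Finnish ä, ö, å).
        tokens = re.findall(r"[a-zäöå]+", message.lower())
        words.extend(t for t in tokens if t not in STOPWORDS)
    return Counter(words).most_common(n)

# Invented example feedback, not real customer data.
feedback = [
    "The delivery was late and the package was damaged.",
    "Late delivery again, very disappointing.",
    "Great support, but delivery is slow.",
]

most_common = top_terms(feedback, n=2)
```

Even a simple count like this hints at which topics customers raise most often.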

Technically, this is carried out by utilizing various models. Descriptive models are used to better understand what has happened. The method may be to, for instance, group customers on the basis of behavior with cluster models, or to examine the regularities of customer behavior with association models.
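
As a rough illustration of a cluster model, the sketch below groups customers with a plain k-means procedure. K-means is just one possible clustering technique, chosen here for brevity; the customer figures and the function name `k_means` are made up, and the starting centroids are simply the first k customers so that the result is repeatable.

```python
from math import dist

def k_means(points, k, iterations=20):
    """Group points into k clusters with a basic k-means loop."""
    # Assumption for repeatability: start from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical customers: (orders per year, average basket value in euros)
customers = [(2, 20.0), (40, 310.0), (3, 25.0), (38, 290.0), (1, 15.0), (42, 300.0)]
labels, centroids = k_means(customers, k=2)
```

In this toy data the occasional buyers and the frequent, high-value buyers end up in separate groups. In practice the features would usually be scaled first so that no single measure dominates the distance.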

Models of predictive analytics, in turn, are built to predict specific events on the basis of a set of significant variables. For example, regression models, decision tree models or neural networks may be used to predict customers' future purchases, product exchanges or order cancellations.
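
The following sketch shows what such a predictive model can look like in its simplest form: a logistic regression, one kind of regression model, fitted with plain gradient descent. The data (number of complaints per customer and whether the order was cancelled), the learning rate and the function names are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Difference between predicted probability and observed outcome.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b

# Hypothetical history: complaints filed, and whether the order was later cancelled (1 = yes).
complaints = [0, 1, 2, 3, 4, 5]
cancelled = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(complaints, cancelled)

def cancellation_probability(n_complaints):
    """Predicted probability of cancellation for a customer with n_complaints."""
    return sigmoid(w * n_complaints + b)
```

Calling `cancellation_probability(5)` gives a high probability and `cancellation_probability(0)` a low one, which is the kind of signal a company could act on before the cancellation happens.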

One well-known method, the Bayesian network, structures the variables related to customers and builds a chain of inference on the basis of the probabilistic dependencies between them, which are often interpreted as cause-and-effect relationships. New information on customers updates the existing beliefs and makes the network's estimates more precise. In this respect the Bayesian network works similarly to the human mind: it learns through new information.
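
The updating step can be illustrated with Bayes' rule on a tiny, hand-built example. This is not a full Bayesian network, only the core calculation that such a network repeats across its variables. All probabilities below are invented, and the two pieces of evidence are assumed to be independent of each other given whether the customer churns.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1.0 - prior))

# Hypothetical starting belief: 10% of customers churn.
p_churn = 0.10

# Evidence 1: the customer filed a complaint.
# Assumed: 60% of churners complain, 10% of loyal customers do.
after_complaint = update(p_churn, 0.60, 0.10)

# Evidence 2: the customer paid an invoice late.
# Assumed: 50% of churners pay late, 20% of loyal customers do.
after_late_payment = update(after_complaint, 0.50, 0.20)
```

With these made-up numbers, the estimated churn probability rises from 0.10 to 0.40 after the complaint and to 0.625 after the late payment: each new observation sharpens the estimate.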

Comprehensive understanding

The utilization of predictive analytics and data science is not limited to customers. Among other things, we have studied and helped our customers to better understand their personnel, learning, work processes, machines and equipment, and development projects. As long as people are willing to question existing habits and methods, the opportunities are plentiful. That is why the results produced by predictive analytics and data science are so fascinating.

About The Author

Jens Harju, Lead Data Scientist

Former Lead Data Scientist at Valamis

Jens is responsible for data science-related projects and solutions at Valamis. He has worked in project management, consulting and solution delivery roles in multiple analytics projects across industries. He has also completed a double MSc degree with a specialization in Industrial Engineering, Finance and Econometrics.