Predictive analytics uses both new and historical data to forecast future activity, behavior and trends.
Business applications for predictive analytics include targeting online ads at customers who are likely to be receptive to them, flagging potentially fraudulent financial transactions, identifying patients at risk of developing particular medical conditions and detecting impending parts failures and other problems in industrial equipment before they occur.
At the center of predictive analytics applications are variables that can be measured and analyzed to predict likely behavior by individuals, machinery or other entities. For example, an insurance company is likely to take into account potential driving safety variables such as age, gender, location, type of vehicle and driving record when pricing and issuing auto insurance policies.
As in that example, multiple variables are typically combined into a predictive model so that it can assess future probabilities with an acceptable level of reliability. Training models on historical data and then validating the accuracy of their results against data that was held back from training is key to ensuring that predictive analytics initiatives produce trustworthy information.
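To make the train-then-validate idea concrete, here is a minimal sketch in Python that combines two variables into a logistic regression model and checks it against held-back rows. Everything in it is an assumption made for illustration: the data is synthetic, the two variables (young_driver and prior_incidents) and the rule that generates the labels are invented, and the hand-written training loop stands in for whatever modeling library an analytics team would actually use.

```python
import math
import random

def sigmoid(z):
    # Clamp the score so math.exp cannot overflow on extreme values.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    # Combine the variables into one score, then map it to a probability.
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(rows, labels, epochs=300, learning_rate=0.5):
    # Plain stochastic gradient descent on the logistic loss.
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict_proba(weights, bias, features) - label
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights, bias

def accuracy(weights, bias, rows, labels):
    hits = sum((predict_proba(weights, bias, f) >= 0.5) == bool(y)
               for f, y in zip(rows, labels))
    return hits / len(rows)

# Synthetic policyholders (hypothetical): the label marks a made-up
# "filed a claim" outcome derived from the two variables.
rng = random.Random(0)
rows, labels = [], []
for _ in range(200):
    young_driver = rng.choice([0.0, 1.0])
    prior_incidents = rng.randint(0, 4) / 4  # scaled to the range 0..1
    rows.append([young_driver, prior_incidents])
    labels.append(1 if 0.4 * young_driver + 0.6 * prior_incidents > 0.5 else 0)

# Train on the first 150 rows, validate on the 50 rows held back.
train_rows, train_labels = rows[:150], labels[:150]
valid_rows, valid_labels = rows[150:], labels[150:]

weights, bias = train(train_rows, train_labels)
validation_accuracy = accuracy(weights, bias, valid_rows, valid_labels)
print(f"validation accuracy: {validation_accuracy:.2f}")
```

The point of the split is that the accuracy figure comes from rows the model never saw during training, which is a fairer indication of how it might behave on new data.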
Predictive analytics has grown in prominence alongside the emergence of big data systems. As enterprises have amassed larger and broader pools of data in Hadoop clusters and other big data platforms, they have gained more opportunities to mine that data for predictive insights. Heightened development and commercialization of machine learning tools by IT vendors have also helped expand predictive analytics capabilities.
Marketing, financial services and insurance companies have been notable adopters of predictive analytics, as have large search engine and online services providers. It is also commonly used in industries such as healthcare, retail and manufacturing.
The predictive analytics process
Predictive analytics requires a high level of expertise with statistical methods and the ability to build predictive data models. As a result, it's typically the domain of data scientists, statisticians and other skilled data analysts. In many cases, though, they're supported by data engineers, who help to gather relevant data and prepare it for analysis, and by software developers and business analysts, who aid in creating data visualizations, dashboards and reports and build data products that incorporate predictive models.
As part of predictive analytics applications, data scientists use predictive models to look for correlations between different data elements in website clickstream data, patient health records and other types of data sets. Once the data to be analyzed is collected, a statistical model is formulated, trained and modified as needed to produce accurate results; the model is then run against the selected data to generate predictions. Full data sets are analyzed in some applications, but in others, analytics teams use data sampling -- analysis of a representative subset of data -- to streamline the process. The predictive model is validated or revised as additional data becomes available.
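The data sampling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "full data set" is a synthetic list of clickstream sessions, the 10% purchase rate and the sample size of 2,000 are arbitrary choices, and a simple random sample is used as one possible way to draw a representative subset.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Synthetic full data set (hypothetical): 50,000 sessions,
# each flagged 1 if the session ended in a purchase.
sessions = [1 if rng.random() < 0.10 else 0 for _ in range(50_000)]

def purchase_rate(flags):
    return sum(flags) / len(flags)

# Simple random sample without replacement.
sample = rng.sample(sessions, 2_000)

full_rate = purchase_rate(sessions)
sample_rate = purchase_rate(sample)
print(f"full data: {full_rate:.3f}  sample: {sample_rate:.3f}")
```

With a sample this size the two rates should land close together, which is why sampling can streamline analysis. When some groups in the data are rare, a plain random sample may underrepresent them, and a stratified sample may be the safer choice.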
Once predictive modeling produces actionable results, the analytics team shares them with business executives, usually with the aid of dashboards and reports that present the information and highlight future business opportunities based on the findings. Functional models can also be built into operational applications and data products to provide real-time analytics capabilities, such as a recommendation engine on an online retail website that points customers to particular products based on their browsing activity and purchase choices.
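As a rough illustration of how a recommendation engine might turn browsing and purchase activity into suggestions, the Python sketch below counts how often products appear together in past orders and recommends the most frequent companions of what a customer has viewed. The product names, the sample orders and the co-occurrence approach itself are assumptions chosen for brevity; production systems typically use far richer models.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical past orders, one set of products per order.
baskets = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"sleeping bag", "camp stove"},
    {"tent", "sleeping bag", "camp stove"},
]

def build_cooccurrence(baskets):
    # counts[a][b] = number of orders that contain both a and b.
    counts = defaultdict(Counter)
    for basket in baskets:
        for a, b in permutations(basket, 2):
            counts[a][b] += 1
    return counts

def recommend(counts, viewed, top_n=2):
    # Score every product not yet viewed by how often it was bought
    # together with the viewed products.
    scores = Counter()
    for item in viewed:
        for other, n in counts[item].items():
            if other not in viewed:
                scores[other] += n
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

counts = build_cooccurrence(baskets)
print(recommend(counts, {"tent"}))  # ['sleeping bag', 'lantern']
```

Because the counts are precomputed, a lookup like this is cheap enough to run while a customer is browsing, which is the real-time property the paragraph above describes.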
Applications of predictive analytics
Online marketing is one area in which predictive analytics has had a significant business impact. Retailers, marketing services providers and other organizations use predictive analytics tools to identify trends in the browsing history of website visitors and then serve them advertisements matched to their profiles and predicted interests. In theory, this makes ads better targeted, increasing the likelihood that users on a site will be interested in them.
Predictive maintenance is emerging as a valuable application for manufacturers looking to monitor a piece of equipment for signs that it may be about to break down. As the internet of things (IoT) develops, manufacturers are attaching sensors to machinery on the factory floor and to finished products in the field, such as elevators and industrial mailing systems. Predictive analytics tools running machine learning algorithms enable data scientists to analyze the data streaming in from the sensors in an effort to forecast when maintenance and repair work should be done to head off possible problems.
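One simple way to forecast when maintenance is due is to fit a trend line to recent sensor readings and project when it will cross an alarm level. The Python sketch below does that with ordinary least squares. The hourly vibration readings, the alarm threshold of 4.0 and the function names are all hypothetical, and a straight-line trend is a deliberate simplification of the machine learning algorithms mentioned above.

```python
def fit_line(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t

def hours_until(threshold, values):
    """Hours after the latest reading at which the fitted trend reaches
    the threshold, or None if the trend is flat or falling."""
    slope, intercept = fit_line(values)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))

# Hypothetical hourly vibration readings from one machine (arbitrary units).
readings = [2.0, 2.15, 2.18, 2.32, 2.41, 2.48, 2.63, 2.69, 2.81, 2.9]

remaining = hours_until(4.0, readings)
print(f"projected hours until alarm level: {remaining:.1f}")
```

A maintenance team could schedule work inside the projected window rather than waiting for the alarm itself. In practice the estimate would be refreshed as each new reading streams in.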
IoT also enables similar predictive analytics uses for monitoring oil and gas pipelines, drilling rigs, wind farms and various other industrial installations. Localized weather forecasts for farmers, based partly on data collected from sensor-equipped weather stations installed in farm fields, are another IoT-driven predictive modeling application.
Analytics tools and techniques
A wide range of tools and techniques is used in predictive modeling and analytics. IBM, Microsoft, SAS Institute and many other software vendors offer predictive analytics tools, including machine learning software and related technologies that support deep learning applications.
In addition, open source software plays a big role in the predictive analytics market. The open source R analytics language is commonly used in predictive analytics applications, as are the Python and Scala programming languages. Several open source predictive analytics and machine learning platforms are also available, including MLlib, the library of machine learning algorithms built into the Apache Spark processing engine.
Analytics teams can use the base open source editions of R and other analytics languages or pay for commercial versions offered by vendors such as Microsoft. The commercial tools can be expensive, but they come with technical support from the vendor, while users of pure open source releases are typically on their own when working through problems with the technology.
Whichever predictive analytics tools enterprises choose, those tools rely heavily on advanced statistical algorithms and techniques to analyze data and make predictions. These include logistic regression, time series analysis and decision trees.
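To give a feel for how a decision tree works, the Python sketch below fits the smallest possible tree: a single split, sometimes called a decision stump, on one numeric variable. It is a simplified illustration under stated assumptions. The transaction amounts and fraud labels are invented, and the split is chosen by counting misclassified rows, whereas full decision tree algorithms usually apply an impurity measure and split repeatedly across many variables.

```python
def fit_stump(values, labels):
    """Find the threshold that misclassifies the fewest rows under the
    rule: predict 1 when value > threshold, otherwise predict 0."""
    candidates = sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    # Try the midpoint between each pair of neighboring values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != bool(y)
                     for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors

# Hypothetical transaction amounts and whether each was later
# confirmed as fraudulent (1) or legitimate (0).
amounts = [12, 25, 40, 55, 300, 80, 950, 1200, 60, 1500]
is_fraud = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]

threshold, errors = fit_stump(amounts, is_fraud)
print(f"flag transactions above {threshold} ({errors} training errors)")
```

On this toy data the rule separates the two groups cleanly, which real transaction data rarely allows. That is one reason practical trees split on several variables and are validated on held-back data.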
This was last updated in October 2016