How AI can augment prediction and human decision making

We spend much of our time here at Red Marble exploring ways that AI and machine learning can elevate human performance. One area of particular interest is how AI can augment prediction and human decision making.

But before we can design AI, we need to understand the differences between how software and human brains make decisions.

How do humans make decisions?

Humans make thousands of decisions every day, subconsciously combining inputs from multiple parts of the brain, blending real-time data with historical information from memory, and weighing rational thought against emotional cues.

In “Thinking, Fast and Slow,” Daniel Kahneman (drawing on decades of research with Amos Tversky) explains how fast, instinctive and emotional decisions are blended with others that are made logically and deliberately - all with an assessment of risk, probability and judgement based on experience.

This is a pretty sound way to make decisions - however, it takes time, and the quality of the decisions can depend on outside factors (e.g. the person’s health or mental state).

How does machine learning make decisions?

Machine learning (ML) models aim to emulate elements of this decision making, but clearly some areas are more accessible than others.

ML refers to the ability of software to ‘fit’ a model to a set of data by learning specific weightings for different parts of the data (called ‘features’). The model then applies those learned weightings to new data, extrapolating from the past to predict future events.
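To make that concrete, here’s a minimal sketch in Python (using scikit-learn, with invented toy data and feature names) of what ‘fitting’ looks like: the model learns one weighting per feature, then applies those weightings to an unseen example.

```python
# A minimal sketch of 'fitting': learning one weighting per feature.
# The data and feature meanings below are invented for illustration.
from sklearn.linear_model import LinearRegression

# Each row is one historical example; the columns are the 'features'.
X = [[25, 3], [30, 1], [45, 4], [50, 2]]   # e.g. [temperature, machine_age]
y = [110, 118, 160, 172]                   # the outcome we want to predict

model = LinearRegression()
model.fit(X, y)                 # 'fit' learns a weighting for each feature

print(model.coef_)              # the learned weightings
print(model.predict([[40, 3]])) # extrapolate to an unseen example
```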

A machine can process vast amounts of historical data and make predictions with a defined probability - but it can’t apply an emotional lens (yet) to those decisions, and it struggles when context changes.

Where machine learning works well

A nice application of ML is to model the rational human decision making process and to make those decisions at scale. Let me share an example.

We recently worked with a client who was making predictions about stock levels of spare parts for machinery. Were they holding enough in stock? When spare parts were ordered, would they arrive on time? They needed to know they would have the parts when required.

Looking at the “health” of each material made the assessment fairly intuitive and simple for the human. A track record of late deliveries from suppliers, highly variable stock levels in the warehouse, and parts being used for breakdowns (rather than planned maintenance) all led to a fairly simple judgement by the human worker.

The challenge is applying that process across 800,000 materials every day. Clearly not something a human can do.

Applying an ML model here does the heavy lifting superbly and creates a list of priorities that the human can work through and apply judgement to.
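As a rough illustration (not our client’s actual model), a few lines of pandas can turn those health signals into a single risk score and sort the materials into a prioritised worklist. The column names, values and weightings below are hypothetical; in practice, a model would learn them from historical data.

```python
# Illustrative sketch: scoring material "health" and ranking the results.
# The columns, values and weightings are hypothetical.
import pandas as pd

materials = pd.DataFrame({
    "material_id":        ["M-001", "M-002", "M-003"],
    "late_delivery_rate": [0.40, 0.05, 0.25],  # share of past orders delivered late
    "stock_variability":  [0.80, 0.10, 0.55],  # normalised variance in stock levels
    "breakdown_usage":    [0.60, 0.00, 0.30],  # share of usage due to breakdowns
})

# Mimic the human judgement: each risk signal contributes to a single score.
weights = {"late_delivery_rate": 0.5, "stock_variability": 0.2, "breakdown_usage": 0.3}
materials["risk_score"] = sum(materials[col] * w for col, w in weights.items())

# The prioritised worklist a human can apply judgement to, riskiest first.
print(materials.sort_values("risk_score", ascending=False))
```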

We apply similar models in our AI-driven employee engagement and digital adoption work. The software can analyse many users and predict which information is most valuable to help each individual use their technology to its full potential and to succeed in their role. It models the human analysis, and applies it at scale.

Where do ML models fall short? 

Machine learning takes historical data but can lack immediate context. Here’s another example.

We help one of our clients predict customer conversion for certain products and prices, helping them maximise margin. The model is trained on historical data, but within the context of COVID, historical data no longer reflects current buying behaviour. Companies need to adapt their models to make use of real-time context and the most recent data as it comes in.
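One simple way to do that is to weight recent examples more heavily during training. Here’s a minimal sketch using scikit-learn’s sample_weight; the data, feature names and 30-day half-life are assumptions for illustration.

```python
# Sketch: recency-weighted training, so the model leans on current buying
# behaviour rather than pre-COVID history. All values here are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[899, 1], [799, 0], [949, 1], [699, 0]])  # e.g. [price, promo_flag]
y = np.array([0, 1, 0, 1])                              # did the customer convert?
age_days = np.array([400, 300, 30, 7])                  # age of each record

# Exponential decay: a record's influence halves every 30 days.
sample_weight = 0.5 ** (age_days / 30)

model = LogisticRegression()
model.fit(X, y, sample_weight=sample_weight)
print(model.predict_proba([[850, 1]]))  # conversion probability for a new offer
```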

Machine learning can also fail to understand the behavioural aspects of decision making. Human decision making has many nuances: emotion, the time available, and the paradox of choice outlined by Barry Schwartz, to name a few. AI models can be trained on aspects of these - and they will get better as the technology improves - but for now they struggle to put it all together into cohesive reasoning.

When is emotional decision making a problem? 

There are areas where prediction models can expose harsh realities which the human mind might wish to overlook.

One of our areas of work is predicting the success of corporate projects. By tracking data throughout a project and applying a model trained on historical information, we can make clear predictions about which projects will succeed and which will fail to deliver the predicted business benefits. However, many projects are “pet projects” of a particular group or person, and the human mind may already have an ideal end state in view - which biases the decision making.

An ‘emotionless’ ML model has no qualms about exposing the hard facts: “This project only has a 30% chance of delivering the desired benefits” quickly cuts to the reality and can save companies enormous amounts of time and expense.

Human decision making + AI: what’s the best way to combine them?

Humans are amazing, and our ability to make complex decisions quickly remains well ahead of what software can do. However, add AI’s capacity to apply models rapidly, constantly and at huge scale to human decision making, and you have some very exciting possibilities.


What IS GPT-3? And why do you need to know?

At Red Marble, our core belief is that artificial intelligence will transform human performance.

And as part of our everyday work, we’re continually coming across (and creating) ways that AI is improving workforce productivity.

Our projects generally fall under five technical patterns of AI: prediction, recognition, hyper-personalisation, outlier detection and the one I’ll talk about here...

Conversation and Language AI

Broadly, conversation and language AI deals with language and speech. There are three aspects to this pattern:

  1. The ability to have a conversation with software, either via text or voice. Common examples of this include Alexa and Siri, but we’re seeing an increasing number of voice-based interfaces within enterprises.
  2. The ability to understand and analyse language; for example, we recently worked on a project where we analyse text in work notes to understand if any contractual clauses may have been triggered (a toy sketch follows this list).
  3. The ability to generate language - to create a natural language narrative based on input data, for example auto-generating a project status update narrative based on data collected.
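
To make that second aspect concrete, here’s a deliberately naive sketch of flagging possible clause triggers in work notes. The clause names, trigger phrases and example note are all hypothetical, and a production system would use a trained language model rather than keyword matching.

```python
# Toy sketch: flag work notes whose wording may indicate a contractual
# clause has been triggered. Clause names and phrases are hypothetical.
import re

CLAUSE_TRIGGERS = {
    "delay_penalty":   [r"\bdelay(ed)?\b", r"\bbehind schedule\b"],
    "scope_variation": [r"\bout of scope\b", r"\badditional works?\b"],
}

def flag_clauses(note: str) -> list[str]:
    """Return the clauses whose trigger phrases appear in a work note."""
    return [clause
            for clause, patterns in CLAUSE_TRIGGERS.items()
            if any(re.search(p, note, re.IGNORECASE) for p in patterns)]

print(flag_clauses("Site access was delayed again; crew quoted additional works."))
# -> ['delay_penalty', 'scope_variation']
```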

There’s been a huge advance recently in natural language generation. It’s based on software called GPT-3 (Generative Pre-trained Transformer 3), developed by California-based AI research lab OpenAI.

This technology was flagged in a research paper in May 2020, and released for a private beta trial in July 2020.

What IS GPT-3?

GPT-3 is a ‘language model’, which means that it is a sophisticated text predictor.

A human ‘primes’ the model by giving it a chunk of text, and GPT-3 predicts the statistically most appropriate next piece of text. It then uses its output as the next round of input, and continues building upon itself, generating more text.
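
Here’s what that prime-predict-append loop looks like in code. GPT-3 itself was only available through OpenAI’s private beta, so this sketch uses GPT-2 - its smaller, publicly available predecessor - via the Hugging Face transformers library.

```python
# Sketch of the prime -> predict -> append loop described above,
# using GPT-2 as a publicly available stand-in for GPT-3.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "Machine learning augments human decision making by"  # the human 'prime'
for _ in range(20):                     # extend the text one token at a time
    inputs = tokenizer(text, return_tensors="pt")
    logits = model(**inputs).logits     # a score for every possible next token
    next_id = logits[0, -1].argmax()    # pick the statistically most likely one
    text += tokenizer.decode(next_id)   # the output becomes the next input

print(text)
```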

It’s special primarily because of its size. It’s the largest language model ever created, with around 175 billion learned weights (known as ‘parameters’ in this context). Essentially, it’s been fed a huge slice of the internet to learn what text goes where in response to a given input prime.

What can GPT-3 do? 

Some beta-testers have marvelled at what it can do - medical diagnoses, generating software code, creating Excel functions on the fly, writing university essays and generating CVs, just to name a few.

Others have rejoiced in posting examples showing that GPT-3, though sophisticated, remains easy to fool. After all, it has no common sense!

https://twitter.com/raphamilliere/status/1287047986233708546

https://twitter.com/an_open_mind/status/1284487376312709120

https://twitter.com/sama/status/1284922296348454913

Why is GPT-3 important? 

For now this is an interesting technical experiment. The language model cannot be fine-tuned - yet. But it’s only a matter of time before industry-specific variants emerge, trained to skilfully generate excellent quality text in a specific domain.

Any industry that generates text-based outputs or reports - market research, web development, copywriting, medical diagnoses, property valuation, higher education, to name a few - could be impacted.

Is GPT-3 intelligent? 

This is the big question for us and cuts to the essence of what we think makes for great AI.

In our view, GPT-3 is great at mathematically modelling and predicting what words a human would expect to see next. But it has no internal representation of what those words actually mean. It lacks the ability to reason within its writing; it lacks “common sense”.

But it’s a great predictor of what a human might deem to be acceptable language on a particular topic, and - we believe - that means that through all the hype, it’s a legitimate and credible model. We’ll be keeping a keen eye on it!