The power of “small data”

8th October 2021

Authors:
Evan Hurwitz & George Cevora

Limited data is better than no data when it comes to business decision-making. Arca Blanca data scientists found useful relationships between the UK labour market and the performance of workforce reskilling programmes using only 14 data points. The resulting estimates of performance were 53% better than those from the alternatives, allowing business decision-makers to assess accurately whether these programmes offer value for money.

The benefits of using data to support business decisions are now generally well understood across most industries. However, large amounts of high-quality data are a luxury that many decision-makers do not have. Arca Blanca recently encountered one such example: a client who needed to understand the performance of workforce reskilling programmes.

Attention to detail enables success

The usual way to estimate the performance of this kind of programme is to look at the historical performance of comparable programmes. However, the dual shock of Brexit and Covid-19 has had significant ramifications for UK employment, with an unknown impact on programme performance and therefore on a programme’s value for money. Understanding the relationship between workforce supply, demand and programme performance would allow the business to plug in the most up-to-date forecasts of these metrics from government sources and then estimate performance. However, once it became clear that only 14 useful data points exist (there are only 14 known comparable situations), the idea of modelling this relationship was initially rejected.

Nevertheless, using only the 14 data points available, Arca Blanca’s data scientists built a reliable model relating workforce demand and supply to reskilling programme performance. Success came from careful consideration of every aspect of the dataset and close attention to detail: all stages of the data science process were tailored to the dataset and the objectives of the project. The full technical detail of the approach can be found here.
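
This article does not describe the model itself, so what follows is only a minimal sketch of how a model built on 14 points might be compared with the usual historical-average approach. It assumes an ordinary least-squares fit checked by leave-one-out validation, and it runs on synthetic data. The function names, coefficients and noise level are hypothetical and do not represent Arca Blanca’s method or results.

```python
import numpy as np


def loo_mae(X, y, fit_predict):
    """Leave-one-out mean absolute error for a fit-and-predict function."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i  # hold out point i, train on the other n - 1
        pred = fit_predict(X[train], y[train], X[i])
        errors[i] = abs(pred - y[i])
    return errors.mean()


def mean_baseline(X_train, y_train, x_new):
    """Stand-in for 'historical performance': predict the average past outcome."""
    return y_train.mean()


def linear_model(X_train, y_train, x_new):
    """Ordinary least squares with an intercept (an assumed model choice)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return coef[0] + x_new @ coef[1:]


# Synthetic stand-in for the 14 comparable situations (NOT real data):
# column 0 plays the role of workforce demand, column 1 of workforce supply.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(14, 2))
y = 0.5 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(0.0, 0.02, size=14)

baseline_mae = loo_mae(X, y, mean_baseline)
model_mae = loo_mae(X, y, linear_model)
improvement = 1.0 - model_mae / baseline_mae  # fractional error reduction

print(f"baseline MAE: {baseline_mae:.3f}")
print(f"model MAE:    {model_mae:.3f}")
print(f"improvement:  {improvement:.0%}")
```

Leave-one-out validation suits very small datasets because every point is used for testing exactly once while the model is still trained on the remaining 13. The improvement figure printed here depends entirely on the synthetic data and should not be read as a reproduction of the 53% reported above.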

Imperfect data, better than no data

As a result, the small data approach to the problem enabled proactive, rather than reactive, guidance based on changes in the labour market. Small data is not the only undervalued approach that could benefit businesses. Another is building a dataset through manual research or manually correcting low-quality data. “More is more” is a common mantra in the data industry, and it is true, but “imperfect data is better than no data” should be repeated more often.

 

About The Authors

Evan Hurwitz has researched applications of Artificial Intelligence for over a decade in both academia and industry, applying his insights to popular fields such as deep learning as well as lesser-known fields such as meta-heuristic problem-solving.

George Cevora has spent 10 years at the forefront of Artificial Intelligence research and has a deep understanding of the strengths, weaknesses and potential of the technology.
