How do we define data and why does it matter? In this blog, Research Fellow Dr Matthew Collins explores why interpretation and storytelling are key to infusing raw data with both meaning and context.
Tricky, slippery data
The words we use reveal a lot about the world in which we live. Take the word ‘data’: data can ‘flow’, we can ‘handle’ data, we ‘collect’ it, ‘store’ it and ‘transfer’ it. How we describe data reveals much about our understanding, use, and even feelings towards this slippery term.
The word ‘data’ can conjure up images of spreadsheets, figures running to several decimal places and dazzling bar charts. What did it make you think of? Was that what brought you to this blog in the first place?
So how do we define data? First, consider a question that recurs for researchers like me: is the word data singular or plural? Should we say data shows or data show? This semantic question reveals the difficulty of pinning down exactly what we mean by ‘data’.
All sorts of phenomena and ephemera can qualify as data, not all of it easily countable. Our own research involves capturing the experience and behaviour of school pupils. The data we must capture is undefined and fleeting. Marshalling experiences and behaviours into distinct, analysable categories – never mind logical arguments and hypotheses that underpin research results – can be a tricky and perilous task.
Data is everywhere
What broad definitions of data highlight is how pervasive it is. Survey the journalism, industry blogs, academic papers, textbooks or courses on data analysis and you find data, data, everywhere.
In The Rime of the Ancient Mariner (1798), the poet Coleridge tells a tale of sailors cast adrift, with:
Water, water, everywhere,
Nor any drop to drink.
Like Coleridge’s sailors, researchers can be cast adrift in an ocean of information by an overabundance of data.
An abundance of data is no substitute for the right kind of data. Just like saltwater, vast quantities alone are worthless. Data requires refinement to make it salient (to continue the nautical metaphor). Look up the use of the word ‘data’ in the British National Corpus and you see that it is most commonly linked to the word ‘analysis’. In other words, strongly associated with data is the idea that data requires interpretation.
An editorial on the Office for National Statistics (ONS) website (also titled ‘Data, data everywhere’) relates that “ONS releases contain a veritable plethora of great data, all of which act as a barometer of life in the UK.”
Those datasets include all sorts of demographic, educational, and health information. We have already made great use of the detail provided by these datasets, profiling schools for a spread of participants, and bringing in geographic, educational, and socio-economic factors. “Veritable plethora” hints at the danger of large datasets which, the editorial admits, can be a “nightmare […] that may overwhelm and confuse users”.
The value and power of data
The extensive, pervasive, ephemeral qualities of data give our project some particular challenges, beyond simply taking a qualitative or quantitative approach.
Qualitative methods can help us unearth the intuitions we want to corroborate through quantitative surveys, just as qualitative methods can provide a richer understanding of the trends and figures we collect through our quantitative measures. But our project is not only about gathering new data; it is also about generating new data by developing new measures. For any data, a set of measures is the essential toolkit that determines what we collect and how we analyse it.
The researcher’s task is a kind of alchemy – transforming data into information through analysis and interpretation.
In today’s world, data is a valuable commodity, with The Economist declaring in 2017 that “the world’s most valuable resource is no longer oil, but data”. You can see this demonstrated in the ways in which we talk about ‘datastores’ and ‘data mining’.
Unearthing the value of data means that the researcher’s task is much more than simply organising and analysing the data. Two of the most effective means of turning raw data into compelling information are storytelling and visualisation.
Narrative has long been a principal means by which we transmit information: its power lies in the cause-and-effect logic at its heart. In the past, literary epics served as data stores, ensuring the transmission of data across generations of listeners.
Today, there are plenty who champion the demystifying of hard data through the power of storytelling. One of our tasks, as research fellows, is to decide not only how to navigate these waters but also what story we will tell.
Most importantly, including pupils and teachers in our research means we take on a further responsibility in charting these waters – it is their stories, after all, that we are telling.