Last year, a successful Turing-LIDA Data Study Group took place – an intensive two-week hackathon with 62 researchers, at different stages of their careers and from a number of different countries, rising to one of six challenges set by different organisations. Some amazing research was done in a very short time, and five of the six reports into their findings have now been published:
Data Study Groups are a perfect opportunity for organisations (Challenge Owners) to test questions/problems quickly and with minimal resource, gaining insight into potential ways to further develop those areas of research. They also give researchers, at all stages of their careers, the chance to collaborate with others from different institutions and different disciplines, and apply their knowledge to real-world problems. For more information, visit The Alan Turing Institute’s Data Study Groups page.
CDRC have recently replaced our venerable map website, CDRC Maps, with CDRC Mapmaker! This new flexible repository for visualising many of CDRC’s open datasets was initially developed by Geolytix and further augmented by our own developers. The website is backed by the CARTO data platform and is cloud hosted.
Web mapping technology has come a long way since CDRC Maps was first released in 2015, with responsive, multi-platform high resolution mapping now commonplace, and CDRC wants to take advantage of these advances to create a better way of visualising our many datasets. Using a new platform also simplifies CDRC’s own processes for adding new maps, allowing our researchers to manage their map configuration using GitHub, and add and update their data using CARTO’s web-based front-end to its data platform. The “command line” is no longer needed to produce maps, and the power of the cloud ensures that our maps remain visible to the world regardless of university infrastructure changes or local issues.
Technical users might be interested to know that the website is built in ES6 JavaScript, using the Vue, Mapbox GL and Bootstrap frameworks, and built and deployed using npm and GitHub. The data is delivered to your browser in the Mapbox Vector Tiles (MVT) format.
The website is structured around presenting two types of maps: metric maps, which show various continuous variables associated with a particular dataset, sliced into groups, and classification maps, which categorise each area into a single value (sometimes with a hierarchy of levels) and generally include a pen portrait description of the category.
Users can filter maps based on one or more classification categories or on multiple metric value ranges, and a PDF report can be easily produced with a view of the current map, a key and accompanying text and direct link. Clicking many of the maps will not only present the metrics or portrait, but include statistics on proportions in the current administrative area or a custom drawn region. The user interface is deliberately simple with standard pan/zoom controls, map selector, postcode search and layer toggles – that’s it.
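For technical readers, the filtering described above can be pictured as a simple predicate applied to each area. The following is a minimal sketch only, not Mapmaker's actual implementation (which runs in the browser on vector tiles); the `Area` class, the `matches` function, the field names and the sample values are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Area:
    """A hypothetical map area with one classification category and some metrics."""
    code: str
    category: str
    metrics: dict = field(default_factory=dict)


def matches(area, categories=None, ranges=None):
    """Return True if the area passes every active filter.

    categories: optional set of accepted classification categories.
    ranges: optional dict mapping a metric name to an inclusive (low, high) pair.
    """
    if categories and area.category not in categories:
        return False
    for name, (low, high) in (ranges or {}).items():
        value = area.metrics.get(name)
        # An area with no value for a filtered metric is excluded.
        if value is None or not (low <= value <= high):
            return False
    return True


# Hypothetical sample data, for illustration only.
areas = [
    Area("A1", "Group 1", {"score": 12.0, "speed": 40.0}),
    Area("A2", "Group 2", {"score": 25.0, "speed": 80.0}),
    Area("A3", "Group 3", {"score": 15.0, "speed": 60.0}),
    Area("A4", "Group 1", {"score": 18.0}),
]

selected = [
    a.code
    for a in areas
    if matches(a, categories={"Group 1", "Group 2"}, ranges={"score": (10, 20)})
]
print(selected)
```

Here several categories can be selected at once, and each metric range narrows the selection further, mirroring the "one or more classification categories or multiple metric value ranges" behaviour described above.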
For our initial release of Mapmaker, there are around 30 maps, covering CDRC classifications such as Consumer Vulnerability and the Internet User Classification (IUC), CDRC metric products such as Access to Healthy Assets and Hazards (AHAH) and Residential Mobility (Churn), and some popular government datasets like the Index of Multiple Deprivation (IMD), VOA building ages and Ofcom broadband speeds/availability. Some of our legacy maps remain on the old CDRC Maps platform which can still be accessed through a special menu option.
We plan to continue to refine and improve CDRC Mapmaker, including a tighter integration with our main CDRC Data platform soon, and make fuller use of the CARTO data platform as a canonical data store for our outputs.
We hope CDRC Mapmaker forms a useful visualisation tool for some of CDRC’s many data assets, and its filtering and reporting functionality allow CDRC’s data to be viewed and used in new ways.
New datasets from growing partnership with MIAC Analytics
The CDRC are pleased to announce the acquisition of new datasets and a developing data partnership with MIAC Analytics.
The House Price Index and the Rental Index contain more than 25 years of monthly county-level data from January 1995 to the present. The data is available in a research-ready state, having been compiled and cleaned by MIAC Analytics. It will be of particular interest to researchers, including Masters students, who are examining questions relating, for example, to movements in the housing market, gentrification of neighbourhoods and geospatial economic indicators.
The datasets can be found in the CDRC Data Store. Both are Safeguarded data – access is restricted because of licence conditions, but data are not considered ‘personally-identifiable’ or otherwise sensitive. Access is available via a remote service with registration and project approval requirements.
Professor Mark Birkin, CDRC Director, said: “CDRC is delighted to announce these latest data acquisitions arising from our long-standing collaboration with MIAC. The data represent a timely and welcome addition to the CDRC’s data store, increasing the diversity of our data assets across our core research themes of urban analytics, sustainable and ethical consumption, and healthy lifestyles. We look forward to continued growth in this partnership, supporting new forms of social research and the development of skills and capacity for both business and the academic sector.”
MIAC Analytics is an independent asset valuation service provider, specialising in property analytics, behavioural modelling, model validation and stress testing. Established in 1989, the company has its head office in New York, with its UK & Europe office, based in Twickenham, working in over 16 countries. It also has an office in Bangalore, India.
David Pickles, Managing Director at MIAC, said: “MIAC place great value on the relationship we already have with CDRC, having previously provided our Property Analytics for use within their Academic Research function. As we look to the future, we plan to explore more ways to work together to address data and modelling challenges such as quantifying climate change risks to the financial system. We are also keen to further our engagement with CDRC in terms of internships – a fruitful way to enhance student experience and career opportunity for students interested in coming into this market.”
Fruit and Veg Findings – IGD behavioural insights report follow-up
November 2021 brought a series of sustainability goals to the forefront with COP26, and linked initiatives such as the Earthshot Prize, rightly monopolising media attention. Tackling climate change was not only of great importance to policy makers in Glasgow, but also to companies and consumers around the world.
Our global food system is the second biggest contributor to climate change [1]. Retailers and manufacturers in the UK food industry are responding to this and are working hard to be a driving force of change. IGD’s Healthy and Sustainable Diets Project Group and WWF’s recent Retailers’ Commitment for Nature are examples of the brilliant collaboration happening in the sector to reduce the toll that our weekly food shops have on the environment.
This November also saw the release of the Institute of Grocery Distribution’s (IGD) Behavioural Insights Report. The report shares the first findings from our ongoing research, in partnership with IGD and their Healthy and Sustainable Diets Project Group, where we look at how healthy choices and sustainable choices can be one and the same.
Can food retailers and manufacturers make the shift towards a healthy and sustainable diet easy, accessible and appealing to consumers at the point they are purchasing and planning their food for the week?
To investigate this, interventions have been co-designed by IGD, members of IGD’s Healthy and Sustainable Diets Project Group and the University of Leeds, to trial one or several of the behavioural levers below.
A 4-week national price reduction trial to encourage greater fruit and veg, tested year on year
The first exciting trial we have been studying is Sainsbury’s ‘60p Fruit and Veg’ campaign.
Looking to increase the amount and variety of fruit and vegetable products in shoppers’ baskets, Sainsbury’s ran a promotional intervention for four weeks in both January 2020 and January 2021. These promotions reduced the price of selected fruit and vegetables to 60p in store and online.
Signposting and placement were used alongside incentivisation to draw attention to the offer. Our research aimed to determine whether this multi-levered approach led people to make healthier and more sustainable choices, and monitor whether any change was sustained in the nine months following the trial.
The trial was varied each year in the selection of fruit and vegetables chosen for the offer: thirteen products were on offer in 2020 and seven in 2021. Furthermore, the first trial in 2020 was outside of national lockdowns, whereas the second trial, in 2021, occurred during one.
Studying 23.4 million baskets
For our initial findings, we analysed national unit sales data for 23.4 million baskets that contained a promoted fruit or vegetable item between January 2019 and March 2021. Following the sales over this period allowed us to observe purchasing trends and establish a comparative January 2019 baseline for each product.
Our findings
Our analysis is presented in terms of portions, reminding us of the impact our consumer choices have in terms of our plate. To calculate this, each unit sold was translated into portions as defined by product weight, where one portion of fruit or vegetables is equivalent to 80g [2].
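As a rough illustration of this conversion, the sketch below turns unit sales into portions using the 80g portion size stated above. The function name and the example pack weight and sales figure are hypothetical and are not taken from the trial data.

```python
PORTION_GRAMS = 80  # one portion of fruit or vegetables, as defined above


def units_to_portions(units_sold, pack_weight_grams):
    """Convert unit sales of one product into portions based on its pack weight."""
    return units_sold * pack_weight_grams / PORTION_GRAMS


# Hypothetical example: 1,000 packs of a 400g product.
portions = units_to_portions(1000, 400)
print(portions)
```

Under these assumed numbers, each 400g pack counts as five portions, so 1,000 packs correspond to 5,000 portions.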
An impactful intervention year on year
Our first finding was pleasing, yet slightly expected: when on offer, sales increased for the promoted items.
Promoting fruit and veg increased sales by over two million portions, compared to the control year
Each year we saw an uplift in sales during the promotional period, well above the January 2019 baseline for the selected products. In total, 2.8 million more portions of promoted fruit and vegetables were sold in 2020 during the four weeks of the promotion than the previous year, across 101 stores which ran the intervention in both years. Similarly, 2.1 million more portions were sold during the 2021 promotion than in the same period in 2019, for the 101 stores.
Less impact during the national lockdown
Consumers engaged with the promotion each year, purchasing promoted products well above their 2019 levels. While both years of the promotion were successful, the second year saw slightly less impact than the first. Sales of the promoted items increased from baseline by 78% (2.8 million portions) in 2020 and by 56% (2.1 million portions) in 2021.
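The percentages and portion counts above can be related with a little arithmetic. The sketch below is only a back-of-envelope check: it derives the baseline implied by each pair of published figures, which are rounded, so the results are approximate. The function names are illustrative, and the implied baselines are not figures reported by the study.

```python
def implied_baseline(extra_portions, uplift_fraction):
    """Baseline portions implied by an absolute uplift and a fractional uplift."""
    return extra_portions / uplift_fraction


def percent_uplift(promo_portions, baseline_portions):
    """Percentage increase of promotional sales over the baseline."""
    return 100 * (promo_portions - baseline_portions) / baseline_portions


# Published (rounded) figures: +2.8 million portions = +78% in 2020,
# and +2.1 million portions = +56% in 2021.
baseline_2020 = implied_baseline(2.8e6, 0.78)
baseline_2021 = implied_baseline(2.1e6, 0.56)

print(round(baseline_2020 / 1e6, 2), round(baseline_2021 / 1e6, 2))
print(round(percent_uplift(baseline_2020 + 2.8e6, baseline_2020)))
```

The two implied baselines differ, which is to be expected given that a different selection of products was on offer in each year.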
The 2021 trial took place during a national lockdown, when food shopping became a more pragmatic procedure. Placement and signposting were therefore probably not as strong at directing shoppers’ attention and influencing their basket.
Prior to starting our analysis, we anticipated that the data might reflect effects of the pandemic alongside the effects of the trial. It is important to keep this context in mind when interpreting the data. The lower portions sold in the second year may be partly attributed to the different food landscape; however, we also have to bear in mind the differing items on promotion.
Below you can see a timeline of average portions sold per store from 2019-2021, across the 101 stores (noting it only considers items offered in the promotion). The figure illustrates seasonal shifts in consumer behaviour, essential to developing a holistic understanding of the trial’s impact.
Looking at the highlighted intervention periods, we observed a sales spike in both January 2020 and January 2021 during the promotion, emphasising those 2+ million additional portions sold in each year.
Impactful, but not sustained
The timeline above plots weekly sales data. Curiously, the high level of engagement drops off in the final week of each trial. This decline may reflect people’s finances prior to payday, or may suggest that using placement and signposting only interrupts behaviour for a short time before going unnoticed by shoppers [3].
Some products were more appealing than others
Higher-value items were most popular, such as kale, kiwi fruit, pineapple and mango. Noticeably, these items also require minimal preparation and mostly encompassed exotic fruits. In contrast, swedes, radishes and red grapefruits were less popular in the promotion.
January – a popular month for fruits and vegetables
With the goal of encouraging people to embrace healthier and more sustainable diets, two questions are important to answer: were customers adding items to their basket to up their fruit and vegetable content overall? Was this a shift within their usual purchasing habits?
For the last two years, around four million more units of fruit and vegetables were sold in January than the following February. Promoted items made up 9% of produce purchases during the four-week intervention period in January 2020 (dropping down to 5% in the four weeks that followed), and 8% during the four-week intervention period in January 2021 (dropping down to 6% in the four weeks that followed).
These figures indicate that the promoted items contributed to the high fruit and vegetable sales in January, but did not fully account for the uplift. There are many reasons why people may have picked up more fruit and vegetables in January, such as the growing Veganuary movement or a healthy eating New Year’s resolution. So, how do we determine the promotion’s success apart from a January health kick? Studying one year pre-intervention allowed us to establish baseline sales for the selected products. The 2+ million portions sold above baseline indicate that the sales spike in January 2020 and 2021 can be largely attributed to the promotion.
Next stages
So, are promotional interventions successful? Sainsbury’s 60p Fruit and Vegetable promotion was extremely successful initially, causing a short-term sales uplift for items on offer. However, we saw this was not sustained as sales declined in the fourth week.
Our next step is then to look across people’s baskets and see what other items they were purchasing. Was their basket closer to the Eatwell Guide (the model diet we use as a metric to measure success) when engaging with the promotion?
There are many exciting developments in the wider work ahead for this partnership. More trials are underway, so watch this space!
About the Author: Alexandra Dalton currently works as a Data Scientist at the Consumer Data Research Centre following her data science internship at the Leeds Institute for Data Analytics (LIDA). Alex and a team of researchers from the University of Leeds have worked in collaboration with IGD, major retailers and UK manufacturers to evaluate strategies to promote healthier and more sustainable dietary choices. She is interested in consumer data insights in the intersectional field of sustainability, nutrition and lifestyle analytics.
Data News: County Court Judgements (new dataset available and potential project ideas)
With our data partner Registry Trust, CDRC can provide access to data on County Court Judgements, which offer a key measure of financial health at an individual level and also at an area and country level. The data are available either at an aggregated level (MSOA and Local Authority District) as a Safeguarded product, or aggregated to LSOA level and at individual Judgement level as a Secure data product.
These data can be used as input data to a wide range of analyses looking at financial health and a range of other factors. Millie Corless, Data Analyst at Registry Trust, works with these data and performs a wide variety of analyses. However, she only has limited time available to analyse these data, so there are many more potential projects that could be done, with some examples listed below and on the data page.
Prior to working at Registry Trust, Millie completed her masters dissertation through the CDRC Masters Dissertation Scheme (MDS), working with the Registry Trust, which gave her the time to analyse the data in new and interesting ways. Millie is very interested in health and she looked at the CCJ data and explored its relationship with health.
Her dissertation project assessed geographic and temporal patterns in consumer (individual level) County Court Judgment (CCJ) rate (as an indicator of financial vulnerability), and considered the extent to which general health influences personal financial vulnerability across England and Wales. The project then considered the influence of additional socio-economic variables, such as Tenure and Employment Status, on financial vulnerability. The outcomes highlighted spatio-temporal locales where specific socio-economic variables influence financial vulnerability more, and thus where the implementation of health-improving policy will tackle the instability. More details are available in her blog post.
There are a number of potential topics listed on the CDRC Data page that could be undertaken with CCJ data. These would make a good Masters Dissertation project (Registry Trust will be offering projects through the MDS) or you are welcome to apply to access the data and complete a different project independently.
Potential projects could include:
Work on a way to derive and publish a set or range of economic health indicators
Predict the future trend of these economic health indices
Use data to highlight exceptions and process inefficiencies in public sector entities e.g. exception reporting on court timelines, outcomes that are outside expected benchmarks, highlighting court inefficiencies, bottlenecks or process flaws
Improve existing data accuracy and gaps, e.g. impute missing or inaccurate data
Explore issues around CCJs and fraud – tackle the myth of the ‘unsound’ CCJ
Look at the effect of politics on indebtedness – what relationships are there between Government, national political representation, local representation and indebtedness?
Develop the Financial Stress Tracker produced by Registry Trust, to include the self-employed, those on low income, those who have been impacted by COVID and other factors.
Focus solely on Scotland or Northern Ireland, as these regions have had less focus at Registry Trust.
Get a closer insight into those taking out a judgment, for example which are the most forthright? Why might this be? Are there spatial or temporal trends?
The CDRC is made up of numerous researchers from a range of different disciplines, faculties and organisations. There are constantly multiple research projects in progress. In fact, there’s always so much great work going on that it’s hard to keep on top of what’s happening!
Cue the CDRC Research Review 2020-21, where we’ve brought together many of the projects our researchers have been working on in the last couple of years.
Take a look for yourself at what we’ve been doing in the areas of:
The CDRC’s Local Data Spaces project has been chosen as the winner of the prestigious ONS Research Excellence Project Award 2021 from the 400+ projects that used Office for National Statistics (ONS) data this year! This award recognises innovative research that has delivered public good or informed policy decisions.
The Local Data Spaces project combined ONS datasets with the CDRC’s own consumer data, creating ten innovative reports for each English Local Authority. These reports delivered highly tailored insights about how their communities were affected by the pandemic, which could then help to shape policy strategies.
For example, in Liverpool, the reports identified that sections of the community with low confidence in using internet technologies were less likely to make use of lateral flow tests. This led to local efforts to promote testing beyond social media, feeding these insights into the national roll-out of lateral flow testing that helped re-open workplaces and schools following the January lockdown.
In Norfolk, the analysis showed that furloughed workers were three times more likely to have caught COVID-19 than individuals who were employed or self-employed, suggesting that more could have been done to prevent COVID-19 outside of workplaces. A similar pattern was observed in most places nationally.
The project also found that, across England, COVID-19 did not significantly vary between men and women within different work sectors or occupations. One exception was the higher rate for COVID-19 in women employed in personal services jobs (e.g. hairdressers, barbers, cleaners, beauticians) at the start of the second wave. Findings from this work were presented to SAGE (Scientific Advisory Group for Emergencies) to help inform national policy around gender inequalities.
“The pandemic has shown the importance of getting the right data into the right hands,” said Dr Mark Green, lead researcher. “Local Data Spaces has opened up new sources of data to local authorities and helped them proactively respond to COVID-19.
“We are honoured to receive this prestigious ONS award. I am really proud of how our team worked at pace to support urgent policy needs during a global crisis.”
Professor Alex Singleton, CDRC Co-Director, added: “The strategic ESRC funding enabling the Local Data Spaces project perfectly illustrates the value and impact that can be unlocked by the social sciences when integrating consumer and government data within trusted research environments.”
All 10 reports for each Local Authority across England are freely available from the CDRC website and tackle a variety of themes related to the local impacts of COVID-19, from demographic and occupational inequalities, through excess mortality, to economic vulnerabilities.
New partnership pilots trials to help change eating habits
What we choose to put into our shopping baskets and how we make those choices will come under the microscope in a series of pilot trials designed to encourage healthy and sustainable diets.
Data analysts from the University of Leeds have joined forces with social impact organisation, the Institute of Grocery Distribution (IGD), to test different ways to encourage healthy and sustainable eating.
They are working in partnership with 20 leading retailers and manufacturers, including Morrisons, Sainsbury’s and Aldi, to trial different strategies, including signposting better choices, the positioning of products in shops and online and the use of influencers and recipe suggestions.
Some have already begun to use some of those techniques in real-life settings as part of the research designed and implemented by the Leeds Institute for Data Analytics (LIDA) and the Consumer Data Research Centre (CDRC).
Researchers from LIDA and CDRC will analyse the results by capturing and measuring sales data from each intervention, enabling the project group to see exactly what is going on in people’s shopping baskets and assess what truly drives long-term behaviour change.
Dr Michelle Morris, who leads the Nutrition and Lifestyle Analytics team at LIDA and is a CDRC Co-Investigator, said: “I am passionate about helping our population move towards a diet that is both healthier and more sustainable. I believe that unlocking the power of anonymous consumer data, collected by retailers and manufacturers, is a really important step towards this goal.
“Working with the IGD and its members to evaluate their healthy and sustainable diets programme is very exciting – testing strategies to change purchasing behaviour and evaluating the wider impact of these changes.”
The pilot trials have been funded by IGD and form a key part of the charity’s Social Impact ambition to make healthy and sustainable diets easy for everyone.
Hannah Pearse, Head of Nutrition at IGD, said: “We want to lead industry collaboration and build greater knowledge of what really works. Our Appetite for Change research tells us that 57% of people are open to changing their diets to be healthy and more sustainable, and they welcome help to do it. But we also know that people don’t like to be told what to do and information alone is unlikely to change behaviour.
“We believe consumers will make this transition if we make it easier for them; that’s why we are delighted to be partnering with our industry project group and our research partners at the University of Leeds, to pilot this series of interventions over the coming months. The team at LIDA are experts in capturing, storing and analysing big data and have a variety of academic specialties that will be critical for this work.”
The work being carried out by CDRC researchers at the University of Leeds is unique because it will use the secure infrastructure at LIDA to allow retailers and manufacturers to share anonymised transaction data over a sustained period of time.
It is hoped that the results of the first pilot trial will be published towards the end of this year.
CDRC to adopt key role in powerful new COVID-19 data alliance
The Consumer Data Research Centre will work through its parent organisation Leeds Institute for Data Analytics to provide a new COVID-19 data alliance with scientific expertise and access to global academic research networks.
Leeds Institute for Data Analytics (LIDA) has worked alongside consortium-leader Rolls-Royce to develop the concept and will take a founding position in a new alliance of data analytics experts challenged with finding new, faster ways of supporting the response to COVID-19 and subsequent global recovery.
Together, the initial wave of members brings all the key elements of open innovation: data publication, licensing, privacy, security; data analytics capability; and collaborative infrastructure, to kick off its early work and grow its membership.
Emergent will combine traditional economic, business, travel and retail data sets with behaviour and sentiment data, to provide new insights into – and practical applications to support – the global recovery from COVID-19. This work will be done with a sharp focus on privacy and security, using industry best practices for data sharing and robust governance.
As part of LIDA’s involvement in Emergent, researchers will have the opportunity to access these data sets using collaborative platforms which have been established by CDRC. The academic community will be encouraged to articulate and engage in projects to help understand the changes we are seeing in human activity and social behaviour as a result of COVID-19.
Emergent models will help get people and businesses back to work as soon as possible by identifying lead indicators of economic recovery cycles. Businesses small and large around the world, as well as governments, can use these insights to build the confidence they need to take early decisions, such as investments or policies, that could shorten or limit the recessionary impacts from the pandemic.
The alliance is voluntary and insights will be published for free.
Professor Mark Birkin, who leads both the Consumer Data Research Centre and Leeds Institute for Data Analytics, commented:
“Increasing numbers of academics and other commentators are now recognising the potential for commercial organisations to share important data to help in the battle against COVID-19.
“An established investment in data sharing capability and analytics capacity makes LIDA ideally placed to lead such conversations.
“We are delighted to bring our skills and expertise as a founder member in the Emergent consortium, which offers such enormous potential to deliver benefits to society – and which are so badly needed at this difficult time.”
Connecting business and the academic community
The Consumer Data Research Centre was created in 2014 from a substantial award in the ESRC Big Data Network. Leeds Institute for Data Analytics at the University of Leeds was then established from the union of the CDRC (Leeds) with the MRC Centre for Medical Bioinformatics.
Since then, both LIDA and the CDRC have been actively promoting the mutual benefit of collaborative projects between corporate partners and the academic community, with researchers working in cross industry teams to undertake scientific research that produces real world insights.
The COVID crisis has further highlighted the importance of these types of collaboration, with governments and their advisers seeking real world insights into mobility, behaviour and human contact networks.
LIDA and IBM will be providing the infrastructure to enable alliance partners to share and compute their data.
Where there is a need to use secure data, partners will be granted access to LIDA’s ISO accredited infrastructure, which will enable them to perform analysis in a safe and controlled environment. Partners using the LIDA infrastructure will be supported by project management and technical support teams from the Consumer Data Research Centre.
For projects using public data, partners will use IBM’s environment and any non-sensitive data will be shared via emergentalliance.org.
Join Emergent
Caroline Gorski, Global Director, R2 Data Labs, the Rolls-Royce data innovation catalyst which started the alliance, said: “We want the global economy to get better as soon as possible so people can get back to work. Our data innovation community can help do this and is at its best when it comes together for the common good.
“People, businesses and governments around the world have changed the way they spend, move, communicate and travel because of COVID-19 and we can use that insight, along with other data, to provide the basis for identifying what new insights and trends may emerge that signify the world’s adjustment to a ‘new normal’ after the pandemic.”
The first challenges have already been issued by the alliance, including one to identify lead indicators of economic recovery which businesses can use to build the confidence they need for investment or activities that will shorten or limit any recessionary impact from the virus.
Emergent hopes to rapidly expand its network of data owners and has set up a website for potential members to register their interest at emergentalliance.org.
CDRC (Leeds) also encourages prospective academic participants to contact us directly at k.r.norman@leeds.ac.uk to receive further updates.
Work has been transformed by the coronavirus crisis with remote working now the norm for millions of workers. But distance from the office is also providing some opportunities to take a wider perspective of the data landscape and to scan business horizons using data sources that we might have overlooked or never investigated in detail.
The CDRC Data Store remains open for business, and our Open and Safeguarded data products are available as normal. Our Secure labs are closed for the duration of the crisis, but we are still accepting Secure data applications for access when things return to normal.
For students, our Masters Dissertation Scheme is still running with a record number of projects for students to complete in the coming months using business and CDRC data. The scheme gives Masters students registered at any UK university a unique opportunity to engage with horizon scanning or other business problems using novel datasets and interesting business perspectives on applied problem-solving. In the past, many participating students have carried out work at the business’s office, but this year students are being offered opportunities to work with businesses through homeworking for the duration of the crisis. The Scheme still brings together the best of academic and business perspectives upon applied problem-solving. Academic supervisors similarly gain the opportunity to collaborate on potentially high impact research with the business community.
So… if you are a Master’s student interested in collaborating with business, but can no longer do this through fieldwork or primary data collection, why not click here to see if any of the CDRC projects interest you? A number of the organisations that we work with are very keen to use part of their homeworking to coach students in the workings of business, especially if you have relevant skills and ways of working to offer!
We also have the CDRC Data Store which has a wide range of data sets available, some of which may be very useful in your dissertation or current research.