The growing influence of citizen science and interesting challenges ahead
Written by GeDIA member Suvodeep Mazumdar // Lecturer in Data Analytics at the University of Sheffield’s Information School.
As we are locked in our homes during a global pandemic, one can’t help but appreciate the role technology plays in our daily lives. In such extraordinary times, we resort to technology to stay connected, interact with our friends and family, collaborate with colleagues, follow current news, and entertain ourselves. In fact, I can remember many lively discussions on how technology and the Internet are taking over our lives, changing the nature of how we live, and in some ways, making us ‘less human’. A few years down the line, it is this technology that in many ways is helping us stay sane, stay connected and ‘be human’. How the tables have turned!
With this in mind, it is fascinating to consider how much we take technology for granted and how seamlessly it is interwoven into the fabric of our daily lives. Take the Internet, for example: having access to a huge repository of knowledge and content that we can tap into is incredible. We often drift off into thought bubbles of strange questions, end up typing these questions into our favourite search engines and get immediate answers.
I find myself doing this without thinking about the seamless nature of the technology that enables me to express my queries, find relevant answers, reach websites and hear the voices of people I have never met. While we often think of the web as an ever-growing resource for knowledge and information, we rarely think about how we also contribute to this repository.
The traditional web was designed as a web of documents, connected via hyperlinks to (often static) pages, refreshed every year or so. The modern web, on the other hand, is far more powerful. Web pages are no longer static but highly dynamic. On most popular websites, much of the content is driven by user-generated information, and honestly, I very often find myself crawling through all kinds of user-generated content (videos, blogs, product reviews, discussion threads, YouTube comments). In fact, that’s what I believe makes the web such a fascinating space!
What might be interesting to discuss here is how much information we end up contributing to the Internet, leaving digital traces on every platform we visit and share information on.
This volume of information is now available for studying how we interact with our environments and how we experience our lives. All of this data is being collected, stored and archived at the same time. Imagine the incredible level of detail being captured about our present lives, which will be available to future historians to understand how we felt about events as they occurred. In fact, we already do that now: social media analytics is an area of research that studies how (some) citizens and communities react to events, share opinions, and ‘feel’ about specific topics.
The impact of social media is immense, to the extent that we often find ourselves rushing to social media to voice our concerns, or to hear from others what’s happening around them. While such opportunities have given rise to a new normal, it is essential to note that in the process of treating the voices of millions of citizens as a default source of information, many millions are left unheard, losing their voice in the process. One might ask: how representative is this collective voice that we listen to? Enormous challenges lie ahead in understanding who and what we are listening to and who among us is deprived of a voice. In our analyses of opinions, we need to be aware of how different demographics, genders, occupations, minorities and deprived populations are represented. We also urgently need to think about how social media can serve all of us better, as a mechanism for bringing equality and equity among all of us.
Citizen science, or the participation of members of the public in scientific projects, is an area of research that has benefited immensely from rapid developments in information and communications technologies. Citizen science is a rapidly evolving field of research thanks to the increasing footprint of the Internet, yet it has a rich history going back many decades.
Citizen scientists are ordinary citizens who typically volunteer their time and resources to conduct scientific research, often using crowdsourcing to collect data. There are many ways in which citizen scientists can participate in research: they can provide data via crowdsourcing as observers of certain phenomena (for example, cataloguing different types of birds and animals visible in their neighbourhoods), or offer their cognitive and visual perception abilities (for example, spotting different types of craters on the Moon’s surface from satellite imagery).
A significant majority of citizen science research involves collecting data from citizens via crowdsourcing, where research projects are already designed and managed by professional scientists. With increasing access to the Internet and mobile devices, citizen science projects now have the potential to reach millions of citizen scientists all across the globe. 
This brings us to an interesting direction: an increasing number of citizens and communities are engaged in citizen science in various ways. Indeed, for a significant number of projects, many traditional data collection sources can now be complemented by crowdsourced data. This is highly encouraging: citizens and communities are not just passive observers of scientific research but have more and more opportunities to be involved and engaged. In fact, within citizen science we are observing an increasing interest in community and (bottom-up) citizen-led projects where citizens and communities aren’t just data providers and collectors but are involved in every stage of the scientific process, acting as managers, designers and organisers of scientific research. This is an exciting time for citizen science!
Yet, enormous concerns are raised about the process of engaging citizens. It is important to note that a significant proportion of citizen scientists volunteer their time and efforts. An obvious concern is often raised — how can we collect, store, analyse and process citizen data in a legal, ethical and privacy-aware manner? This is particularly important as we see the need for more and more projects being organised by non-experts.
To deal with this challenge, we can envisage an increasing co-production of science where citizens and professional scientists work collaboratively in developing standards and frameworks for facilitating the management and organisation of crowdsourced data.
At the same time, a more profound problem remains: while citizen science provides opportunities for citizens to participate in scientific research, very rarely do we consider what the facilitators for such activities are. As mentioned, since much of the activity in citizen science is voluntary, we can expect citizens and communities with sufficient resources to be the ones who participate. This raises a concern: the very nature of citizen science risks missing the voices of large populations.
Bottom-up citizen science initiatives (where citizens define their own problems, gather evidence and analyse data to influence policy or decision-making) are an excellent means of solving problems of great importance to communities. However, we also need to ask: what happens if this approach becomes the expected mechanism for communities to influence change? How do we make sure communities that are left out due to lack of resources or time still have a voice?
As a data scientist, I have often had to deal with data gaps, missing data, messy or noisy data — we often use different techniques to spot outliers or data that doesn’t conform to particular expectations. We often find ourselves distant from the problem itself and try to develop mechanisms to understand an issue, essentially using numbers, graphs and charts to tell a story, which we hope is the story of the community the data represents.
However, when it comes to citizen science, how do we know that the stories of the communities left out (whether due to a lack of access to resources or time, a lack of specific knowledge or skills, or even the wish to stay silent) are included?