Alex Kharouk

Research Methods - Reviewing 5 Big Data Projects for the Social Good

I am currently studying for my Master's (Computer Science & Artificial Intelligence) at the University of York, where I'm in my second module. It's all about Research Methods and learning how to research. At first, I thought this would be a trivial module, as I'd rather just deal with the practical aspects of engineering. It turns out that diving into the theoretical aspects of research is quite breathtaking. It reminds me of my Bachelor's in English Literature!

Sometimes I'll include some of my research work on my blogfolio, both so that I have public evidence of it and so that I can (hopefully) see an improvement in my research, development and critical thinking. This post reviews this article, providing a quick summary and some thoughts with regard to research design and particular methodologies.

If that's not your cup of tea, I understand.

# Five projects that are harnessing big data for good

The article begins with a contrast. On the one hand, Australian scientists and data analysts are busy improving the economy using tools such as AI, ML, and big data. On the other hand, there is concern about unethical data collection, particularly the commercial sale of people's data.

The authors' main argument is that this data science boom should not be confined to business insights or the benefit of profit margins. Instead, it can be used to solve society's social and environmental issues. They give us five examples of projects that currently pursue that mission.

### Humanitarian hot spots

Using social media data, their team was able to find the humanitarian hot spots within Victoria. Volunteering and charity activity is concentrated in and around Melbourne CBD and its eastern suburbs. The point of this insight is to "help local aid organisations channel volunteering activity in times of acute need." Such needs include the "long-term struggle with drought" within the rural areas of Australia.

In this case, the research method revolved around scraping Instagram posts. Other methods could have been used: quantitative ones such as surveys (asking whether one volunteers), and qualitative ones such as interviews (asking why one decided to volunteer and what sparked their interest). The latter could help bring in more volunteers, if that were the goal of the research.

This research project fits the Constructivist worldview, as it tries to look into a marginalised community, comparing those affected by drought with those who volunteer or provide support.

### Fire safety in homes

The research for this project aims to raise awareness of the difficulties of gathering data on fire safety (specifically fire fatalities, and targeting homes without fire alarms). The team, Enigma Labs, used quantitative measures and instruments such as census data, a geocoder tool, and analytics. They combined the data into a single risk score.
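To make the idea of a "single risk score" a bit more concrete, here is a minimal sketch of one way such a score could be built: a weighted sum of area-level features. This is not Enigma Labs' actual model. The feature names, the weights, and the sample values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Area:
    """One geographic area with hypothetical census-style features (each 0..1)."""
    name: str
    pct_homes_built_before_1980: float
    pct_low_income_households: float
    pct_residents_over_65: float


# Hypothetical weights (an assumption, not from the article).
# They sum to 1, so the score stays between 0 and 1.
WEIGHTS = {
    "pct_homes_built_before_1980": 0.5,
    "pct_low_income_households": 0.3,
    "pct_residents_over_65": 0.2,
}


def risk_score(area: Area) -> float:
    """Combine the area's features into one number using a weighted sum."""
    return sum(weight * getattr(area, feature) for feature, weight in WEIGHTS.items())


def rank_by_risk(areas: list[Area]) -> list[tuple[str, float]]:
    """Return (name, score) pairs, highest risk first."""
    scored = [(area.name, round(risk_score(area), 3)) for area in areas]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sample areas, purely for illustration.
areas = [
    Area("Block A", 0.8, 0.5, 0.25),
    Area("Block B", 0.2, 0.1, 0.5),
]
ranking = rank_by_risk(areas)
print(ranking)
```

A ranking like this is what would let a fire service decide which areas to visit first, which is the practical appeal of collapsing many data sources into one number.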

I feel like the research method is robust, and qualitative measures might not be as impactful as the data currently provided. However, if the point of the research were to showcase a technology that prevented fires (using quantitative evidence), combining it with a qualitative narrative approach that paints a before and after could be interesting.

### Mapping police violence in the US

The team working on the Mapping Police Violence project combines data from crowd-sourced databases, social media, obituaries, criminal record databases, and more. From a Transformative worldview standpoint, it tells a narrative about the current political environment within the United States with regard to police brutality.

The research method draws on a mixed-methods approach, with a slight narrative design, that cleverly makes their point. It's a powerful yet subtle piece. It is difficult for me to come up with a different way of gathering data.

If I were to contribute, I would analyse news reports from the days on which violence occurred, and investigate which stations mention it, react to it, or ignore it. Further research could then perhaps explore geolocation.

### Optimising waste management

This is an IoT project that gathers data on waste, using solar-powered bin compactors "that regularly compress the garbage inside". According to the project, the bin eliminates waste overflow and reduces unnecessary carbon emissions, with an 80% reduction in waste collection.

What I enjoy reading is that the project doesn't stop at cloud-powered bins that help improve the environment; it goes further in terms of data collection. It uses a tool known as CLEAN that collates trends in waste overflow, which helps with bin placement and the planning of collection services.

To me, this project is a candidate for Postpositivism: it gathers knowledge and statistics that benefit the environment, while accepting that this knowledge is still growing and can be improved on. The researchers could approach this project differently by analysing the data they have on bin placement and seeing which areas have the greatest returns in terms of reduction in waste.

### Hotbeds of street harassment

This project uses anonymised, crowd-sourced, geolocated data to identify hotbeds where harassment incidents have occurred. The research was done to help prevent further cases of harassment, thus lending itself to a Constructivist worldview. Combining volunteer supporters, anonymously gathered data, and the mapping of that information shows a method of data collection that supports a good cause.

The authors comment on this project, stating that "mapping and informing are essential data science techniques for addressing social problems". This is worth pointing out, as it shows a perspective shared by data science techniques and mixed-methods techniques when research tries to address sociopolitical issues.
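As a rough illustration of what "mapping" crowd-sourced reports can look like in practice, here is a small sketch that groups anonymous latitude/longitude reports into grid cells and flags the cells with the most reports. This is only my assumption of one simple approach, not the project's actual method. The grid size, the threshold, and the coordinates are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical grid resolution in degrees (an assumption, not from the article).
CELL_SIZE_DEG = 0.01


def cell_for(lat: float, lon: float, cell_size: float = CELL_SIZE_DEG) -> tuple[int, int]:
    """Map a coordinate to the integer index of the grid cell that contains it."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def hotspots(reports: list[tuple[float, float]], min_reports: int = 2) -> list[tuple[tuple[int, int], int]]:
    """Count reports per grid cell and keep cells with at least min_reports, busiest first."""
    counts = Counter(cell_for(lat, lon) for lat, lon in reports)
    return [(cell, n) for cell, n in counts.most_common() if n >= min_reports]


# Made-up anonymous reports: only a location, no personal details.
reports = [
    (10.001, 20.001),
    (10.002, 20.003),
    (10.004, 20.008),
    (10.055, 20.051),
]
result = hotspots(reports)
print(result)
```

Working with grid cells rather than exact points also fits the anonymised nature of the data, since a cell says roughly where incidents cluster without pinpointing any single report.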

### Summary

By highlighting these five projects, the authors of the article show that data science for "social good" is a difficult effort. While data privacy laws protect us from unethical gathering of our data, they also make things challenging for those who work with big data on projects such as those listed above. Unethical data collection is a lose-lose situation: it affects us when it happens, and it affects those who strive for good when the resulting laws restrict them.

However, projects like these won't stop. Proper digital and data literacy, the separation of the objective mathematics and algorithms that drive data science from subjective human bias, and the adoption of transparency within data analytics could "result in a better connected and a more caring society."