In recent years, the geography community, like many scientific fields, has had to confront a pressing challenge: the reproducibility crisis. The explosion of available data and the abundance of tools for working with it have raised several fundamental questions:
- How can we generate reliable knowledge?
- How can we enhance the transparency of research outcomes?
- How can we recognize and mitigate bias?
- How can we identify and eliminate misrepresentation?
In this context, reproducibility and replicability have emerged as foundational concepts (National Academies of Sciences, Engineering, and Medicine [NASEM] 2019). According to NASEM, “Reproducibility is obtaining consistent results using the same input data, computational steps, methods, and code, and conditions of analysis… Replicability is obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data” (2019).
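The computational side of NASEM's definition of reproducibility can be illustrated with a minimal sketch. It assumes a small hypothetical dataset (`elevations_m`) and two illustrative helper functions (`fingerprint` and `bootstrap_mean`), none of which come from the sources cited here. The idea is simply that the same input data, the same code, and the same conditions of analysis (here, a fixed random seed) should yield the same result.

```python
import hashlib
import random
import statistics


def fingerprint(values):
    """Return a SHA-256 digest of the input data (illustrative helper).

    Publishing this digest alongside the results lets someone rerunning
    the analysis confirm that they started from the same inputs.
    """
    payload = ",".join(repr(v) for v in values).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def bootstrap_mean(values, n_resamples=200, seed=42):
    """Estimate the mean by bootstrap resampling with a fixed seed.

    The seed is part of the "conditions of analysis": with the same
    data, code, and seed, the estimate is identical on every run.
    """
    rng = random.Random(seed)  # local generator, no hidden global state
    resample_means = [
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.fmean(resample_means)


# Hypothetical sample data, standing in for a real spatial dataset.
elevations_m = [12.5, 48.0, 33.2, 27.9, 51.4, 19.8]

first_run = bootstrap_mean(elevations_m)
second_run = bootstrap_mean(elevations_m)

print("input fingerprint:", fingerprint(elevations_m))
print("runs agree:", first_run == second_run)
```

Replicability, by contrast, cannot be checked this way: it would require a separate study with independently collected data that addresses the same question.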
Open source GIS, driven by principles of openness and collaboration, aligns seamlessly with the reproducibility goals geographers and the scientific community are increasingly pursuing.
Rey’s 2009 study “Show me the code: Spatial analysis and open source” highlights the transformative potential of open source to enhance reproducibility in geography and spatial analysis. Rey characterizes open source as “a revolutionary collection of tools and processes through which individuals create, share, and apply new software and knowledge”, and observes that this movement has coincided with the resurgence of spatial analysis (Rey 2009).
Open source has several clear advantages. It provides access to geographic data and analytical tools, introduces a wider audience to new skills, fosters collaboration and education, and encourages knowledge sharing. Rey notes that “the value of the open source community is nested in not what you own, but in what you share and contribute” (2009). An open source community that nurtures innovation removes barriers to science and can ultimately lead to better results.
In fact, the open source framework broadens the range of research questions being asked and allows developers and scientists to adapt their tools to their projects, removing their dependence on closed software (Rey 2009). If open source continues to gain widespread acceptance, it has the potential to end the perpetual race among researchers and to foster a healthier, more productive, and more innovative research environment (TEDxTalks 2021).
However, for open source GIS to reach its potential, the scientific community must confront several challenges. As NASEM emphasizes, transparency throughout data collection, preparation, and sharing is essential to effective scientific reporting (2019). In an open source environment, scientists must be transparent about their work and document it thoroughly.
Another challenge, as Rey points out, is the “‘developer-centric’ nature of open source projects, which can foster technological elitism in the sense that only those individuals with adequate programming skills can participate in the development” (2009). To avoid creating an echo chamber for engineers and developers, open source GIS must expand participation and prioritize the needs of users.
Finally, open source science must challenge the predominant academic model of “publish or perish” (TEDxTalks 2021). Open source science presents an alternative to the competition, prestige, and blame that characterize academia, but its success depends on recognition and adoption by prestigious institutions. If we can overcome these challenges, open source science offers an opportunity to foster a new era of collaborative spatial science. Together, open source GIS and a commitment to openness and transparency can bring us closer to solving the reproducibility crisis in geography.