Virginia Tech® home

An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software

Research Paper Showcase 2025

Abstract

Background: Governments worldwide are considering data privacy regulations. These laws, such as the European Union's General Data Protection Regulation (GDPR), require software developers to meet privacy-related requirements when interacting with users' data. Prior research describes the impact of such laws on software development, but only for commercial software. Although open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance, we do not know how such laws impact open-source software development.

Aims: To understand how data privacy laws affect open-source software (OSS) development, we focus on the European Union's GDPR, as it is the most prominent such law. We investigated how GDPR compliance activities influence OSS developer activity (RQ1), how OSS developers perceive fulfilling GDPR requirements (RQ2), the most challenging GDPR requirements to implement (RQ3), and how OSS developers assess GDPR compliance (RQ4).

Method: We distributed an online survey to explore perceptions of GDPR implementations from open-source developers (N=56). To augment this analysis, we further conducted a repository mining study to analyze development metrics on pull requests (N=31,462) submitted to open-source GitHub repositories.

Results: Our results suggest GDPR policies complicate OSS development and introduce challenges, primarily regarding the management of users' data, implementation costs and time, and assessments of compliance. Moreover, we observed negative perceptions of the GDPR from OSS developers and significant increases in development activity, in particular metrics related to coding and reviewing, on GitHub pull requests related to GDPR compliance.

Conclusions: Our findings provide future research directions and implications for improving data privacy policies, motivating the need for relevant resources and automated tools to support data privacy regulation implementation and compliance efforts in OSS.


Authors

  • Lucas Franke, Virginia Tech
  • Huayu Liang, Virginia Tech
  • Sahar Farzanehpour, Virginia Tech
  • Aaron Brantly, Virginia Tech
  • James C. Davis, Purdue University
  • Chris Brown, Virginia Tech

Publication

  • Venue: International Symposium on Empirical Software Engineering and Measurement (ESEM 2024)
  • Date: 10/24/2024

Related Papers

CTINEXUS: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

Principled and Automated Approach for Investigating AR/VR Attacks

Security Enhancement in UAV Swarms: A Case Study Using Federated Learning and SHAP Analysis

Scale-MIA: A Scalable Model Inversion Attack against Secure Federated Learning via Latent Space Reconstruction

S2M3: Split-and-Share Multi-Modal Models for Distributed Multi-Task Inference on the Edge

"This is not a scam!": Assessment of an awareness raising program tackling older adults' scam victimization in a multi-method study

Unraveling the Complexities of MTA-STS Deployment and Management in Securing Email