In this episode of the Beyond Data podcast series, Tessa Jones (Calligo’s Chief Data Scientist) and Peter Matson (ML Solution Architect) are joined by Martin Hoskin, Chief Technologist at VMware and Advisory Board Member for the Centre for Data Ethics & Innovation. The discussion covers data sovereignty and its implications for ethical data use, and explores how federated learning offers a promising solution to the challenges involved.

Understanding Data Sovereignty

Data sovereignty covers data residency, access control, and governance. The dominance of American cloud providers, which are subject to U.S. laws, raises concerns about data privacy and security, particularly in the European context. For certain organizations, such as government agencies and defense suppliers, data sovereignty is a critical factor. VMware has introduced a program to certify partners as Sovereign, meaning that where data is stored and processed, and how it is governed, are explicitly specified. This differentiates those partners from the major hyperscale cloud providers.

The Challenge of Data Sharing

Data sovereignty also touches upon the ethical dilemma of sharing data for legitimate purposes like law enforcement investigations. Striking a balance between data privacy and the greater good is complex. For instance, the case of Apple’s cloud security raises questions about when governments should access personal data to combat serious crimes. 

Federated learning emerges as a promising solution to data sharing challenges. This approach enables entities to collaboratively train machine learning models without sharing raw data. Instead, each participant trains a local model on its own dataset, and only the resulting model updates are sent to a central server, which aggregates them into a shared model. This preserves privacy and protects sensitive data, making it suitable for applications like fraud detection in the banking industry.
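To make the idea concrete, here is a minimal sketch of one common aggregation scheme, federated averaging, written in plain Python. It is an illustration only and not the method discussed in the episode: the one-parameter model, the function names (local_update, federated_average, train), and the two toy "bank" datasets are all assumptions made for the example. Each client fits its own copy of the model on private data, and the server only ever sees the resulting weights and sample counts.

```python
from typing import List, Tuple

# One client's private data: (x, y) pairs that never leave the client.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for the toy model y ~= w * x on local data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(updates: List[Tuple[float, int]]) -> float:
    """Server side: average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(clients: List[Dataset], rounds: int = 50) -> float:
    """Alternate local training and server aggregation for a fixed number of rounds."""
    w = 0.0  # shared global weight, broadcast to every client each round
    for _ in range(rounds):
        # Only (weight, sample count) pairs are sent to the server, never raw data.
        updates = [(local_update(w, data), len(data)) for data in clients]
        w = federated_average(updates)
    return w


if __name__ == "__main__":
    # Hypothetical datasets held by two separate organizations (both follow y = 3x).
    bank_a = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
    bank_b = [(4.0, 12.0), (5.0, 15.0)]
    print(train([bank_a, bank_b]))
```

In this sketch the server learns a weight close to 3 without either dataset being pooled. Real deployments typically add protections such as secure aggregation or differential privacy on top of this basic loop, since model updates can themselves leak information.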

Experimenting with Federated Learning

The Centre for Data Ethics & Innovation (CDEI) conducted an experiment using federated learning for government-provided services. The CDEI set up two data sets: one for detecting fraud in financial transactions using SWIFT data and another for studying the spread of COVID-19. The experiment highlighted the complexities of sharing data, including obtaining government buy-in and ensuring data anonymization to protect privacy.

While federated learning is ingenious, it comes with its own set of challenges. One concern is that the aggregator's output could be reverse engineered to extract sensitive information about the underlying data. That said, the scale of data involved in real-world applications may make such reverse engineering more difficult.

As data continues to play a critical role in various industries, addressing data sovereignty and privacy concerns remains paramount. Federated learning offers a way to enable collaboration without compromising data privacy. However, continuous innovation is necessary to tackle challenges like reverse engineering and fully realize the potential benefits of this approach. 

Ethical Considerations in AI and Data Technology

The conversation takes a broader turn, exploring the intersection of AI, data, and ethics. AI development should consider risks, probabilities, and potential biases to build robust and ethical systems. Ethical implications of sharing genetic data and the responsibility of pharmaceutical companies in handling such information are discussed. 

Regulating AI Ethics and the Divide between Academia and Industry

The need for clear regulations to define and enforce ethical standards in AI and data technology is acknowledged. Balancing philosophical, academic perspectives with industry practicality becomes essential as AI systems grow more capable and gain self-learning abilities.

Navigating Legal Frameworks and Data Sharing in Healthcare

Enforcing ethical standards and regulations on a global scale, especially where rogue states are involved, poses challenges. Collaboration through forums like Gaia-X can build trust and data security while leaving room for individual interpretations of shared frameworks. Standardized data-sharing frameworks and data portability regulations can address data sharing challenges in healthcare.

Autonomous Weapons and the Role of Global Forums

Deploying AI in autonomous weapons, especially for life-and-death decisions, raises profound moral dilemmas. The hosts stress the importance of engaging in public discourse and involving the global community to shape the future of AI and robotics.

The Impact of Social Media on Data Privacy

The podcast concludes with a discussion of the influence of social media on data privacy and the ethical considerations surrounding its use. The speakers highlight its impact on young minds and the potential implications for decision-making, including voting rights for 16- and 17-year-olds.

In conclusion, data sovereignty, AI ethics, and federated learning are crucial components of an evolving data landscape. Ethical considerations must be at the forefront of AI development and data sharing to ensure responsible and equitable data-driven futures. By embracing ethical practices and fostering interdisciplinary collaboration, we can harness the potential of AI while respecting individual rights and privacy. Establishing global forums and transparent public discussions will play a pivotal role in shaping the future of AI and robotics in a manner that benefits humanity as a whole. 
