Non-personal data
Non-Personal Data (NPD) is electronic data that does not contain any information that can be used to identify a natural person. It is either data that held no personal information to begin with (such as weather data, stock prices, or data from anonymous IoT sensors), or data that originally contained personal data and was subsequently pseudonymized (for example, by substituting identifiable strings with random strings) or anonymized (for example, by irreversibly removing all personal data).[1] NPD is part of the overall data governance strategy of a region or country. While personal data is covered by data protection legislation such as the GDPR, other kinds of data fall under the scope of NPD regulation.
Importance of Non-personal Data
It has been argued that the future is data-driven.[2] Much of the current innovation in domains such as machine learning and artificial intelligence is fueled by data, which is needed to calibrate complex models (neural networks as well as other kinds). The larger the volume, diversity and quality of the data, the higher the quality of the model, leading to better predictions and explanations.
However, there is a flip side to data availability. The growing awareness of privacy and the consequent need for strong data protection regulations (such as the GDPR) make it increasingly difficult, and sometimes impossible, to obtain data in the quantities required. These two demands conflict, and the only way out would be to remove all personal data from data sets (either by data anonymization, or by pseudonymization coupled with noise injection), at which point the data becomes NPD.
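As an illustration of pseudonymization coupled with noise injection, the following is a minimal Python sketch, not a prescribed method. The function names `pseudonymize` and `inject_noise`, the sample records, and the choice of Gaussian noise are all assumptions made for this example; the text above does not specify which noise distribution or token scheme should be used.

```python
import random
import secrets


def pseudonymize(records, id_field):
    """Replace each distinct identifier with a random token.

    The same identifier always maps to the same token within one call,
    so records belonging to one person remain linkable to each other.
    """
    mapping = {}
    result = []
    for rec in records:
        original = rec[id_field]
        if original not in mapping:
            mapping[original] = secrets.token_hex(8)  # 16 hex characters
        new_rec = dict(rec)
        new_rec[id_field] = mapping[original]
        result.append(new_rec)
    return result


def inject_noise(records, numeric_field, scale, rng=None):
    """Add zero-mean Gaussian noise to one numeric field (an assumed choice)."""
    rng = rng or random.Random()
    return [
        {**rec, numeric_field: rec[numeric_field] + rng.gauss(0, scale)}
        for rec in records
    ]


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Alice", "age": 34},
    {"name": "Bob", "age": 51},
    {"name": "Alice", "age": 34},
]

released = inject_noise(pseudonymize(records, "name"), "age", scale=2.0)
```

In this sketch the mapping from names to tokens is discarded after the call, so the released records cannot be reversed by table lookup. Whether the result is safe to release still depends on what other fields remain in the data.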
Therefore, many innovation-friendly countries are introducing regulatory regimes that protect personal data while allowing non-personal data to be extracted from it, so that innovation is fostered. In other words, NPD 'unlocks' value that was locked away in data sets containing personally identifiable information. Once the regulations are in place, multiple NPD data sets are expected to become available from different providers on a free or commercial basis.
Emerging Regulatory Frameworks
Non-Personal Data has significant uses that may be economic, social, political or security-related. Several countries and regions are in the process of regulating the use of NPD. In May 2019, the European Union operationalized its Regulation on the free flow of non-personal data.[3] In 2019, India announced a nine-member expert committee to make recommendations on the regulation of NPD; the committee published its first report in mid-2020. The report was opened for public comments, after which it was revised and published in December 2020.[4]
Objectives of the Proposed Indian NPD Regulatory Framework
The following were the objectives of the proposed Indian regulation as per the revised report:
- Sovereignty: India has rights over the data of India, its people and organisations.
- Benefit India: Benefits of data must accrue to India and its people.
- Benefit the world: Innovation, new models and algorithms for the world.
- Privacy: Misuse, reidentification and harms must be prevented.
- Simplicity: The regulations should be simple, digital and unambiguous.
- Innovation and entrepreneurship: The data should be freely available for innovation and entrepreneurship in India.
Concerns
The major concern in the use of NPD is that techniques (statistical or AI-based) may exist by which multiple data sets can be combined to extract personally identifiable data, re-identifying individuals in data that was considered anonymous.[5]
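One simple way such re-identification can happen is by joining a data set without names to a second data set that shares a few attributes with it. The Python sketch below is only an illustration of that idea. The function `link_records`, the field names, and the sample rows are hypothetical, and real attacks may use statistical or machine-learning methods rather than an exact match.

```python
def link_records(anonymized, public, shared_fields):
    """Pair each anonymized row with a public row when the shared
    fields match exactly one public row (an unambiguous match)."""
    index = {}
    for row in public:
        key = tuple(row[f] for f in shared_fields)
        index.setdefault(key, []).append(row)

    matches = []
    for row in anonymized:
        key = tuple(row[f] for f in shared_fields)
        candidates = index.get(key, [])
        if len(candidates) == 1:
            matches.append((row, candidates[0]))
    return matches


# Hypothetical data: a health data set with names removed...
health = [
    {"zip": "560001", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "560002", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
]
# ...and a separate public list that still carries names.
public_list = [
    {"name": "Person A", "zip": "560001", "birth_year": 1985, "gender": "F"},
    {"name": "Person B", "zip": "560003", "birth_year": 1990, "gender": "M"},
]

reidentified = link_records(health, public_list, ["zip", "birth_year", "gender"])
for health_row, public_row in reidentified:
    print(public_row["name"], "->", health_row["diagnosis"])
```

Here the first health record matches exactly one named person, so the diagnosis is linked to a name even though the health data set contained no names.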
References
- "Explained: What is non-personal data?". 27 July 2020.
- Colgan, Maria (29 June 2020). "The Future is Data-Driven". blogs.oracle.com. Oracle Database Insider. Retrieved 4 October 2022.
- "Non-personal data | Shaping Europe's digital future".
- "Report by the Committee of Experts on Non-Personal Data Governance Framework" (PDF). mygov.in. Ministry of Information and Technology. 16 December 2020. Archived (PDF) from the original on 17 August 2022. Retrieved 4 October 2022.
- "Why 'Anonymized Data' Isn't So Anonymous". 25 April 2019.