This is the tip of the iceberg. Every time we shop online, reserve a ticket or make a financial transaction, data is generated. Additionally, smart services for homes, businesses, manufacturing and governments require sharing masses of personal data between individuals and organizations or between individuals and governments.
The amount of data we create will only keep growing. Research by the International Data Corporation (IDC) estimates that by 2021, 75% of enterprise applications will use artificial intelligence (AI). As AI technologies evolve and are used to improve products, services and our lives, we now need to ensure the privacy and security of all our personal data.
The role of standards
Many questions remain. For instance, how much of our personal data already exists, where is it, who can access it, who is using it and for what purpose? How anonymous is our data? In other words, can we be identified from this data, or from the different data sets which already exist, if algorithms are able to put this information together? Finally, what recourse is there if the data is used in a way that has negative consequences?
One way to protect data is through standardization. IEC and ISO develop international standards for information technology through their joint technical committee (ISO/IEC JTC 1), which covers different aspects of data, including privacy, security and storage. Work is also under way to address ethics and other societal concerns, such as transparency and accountability, as well as bias in the data sets that feed algorithms, which are then used in health, financial and many other applications.
Interview with Ian Oppermann, Chief Data Scientist and CEO of the NSW Data Analytics Centre
e-tech caught up with Ian Oppermann, President of the JTC 1 Strategic Advisory Committee in Australia, to learn about the need to develop international standards for data sharing frameworks.
What are the main issues with data?
We live in a world where we share data all the time, including our personal preferences, and we like it, provided we don’t think there are detrimental outcomes. Social media is a great example, where we connect and share with friends. This is all good, but what if someone uses social media to understand your voting preferences and nudge you in a certain direction?
Would you consent to the authorities monitoring your whereabouts? In order to answer the question, you would need to know why they wanted the information. If it were to improve transport services you may agree; if it were to fine you for jaywalking, you may think twice. The issue of how much personal data can be used and in what circumstances is a debate which needs to happen in society. It will differ from country to country, because of diverse customs and attitudes.
The examples of data sharing are growing by the day and the picture is complicated. You have to think of the entire data set about you that is out there, who you shared it with and what they could do with it.
How could a data sharing framework help?
These issues are so important that JTC 1 has decided to create an advisory group on data usage, which will conduct a study of potential standards for data sharing frameworks. The study would describe factors to consider when sharing data, including identifying concerns relating to data sharing frameworks, existing standards that address these concerns, and any gaps, such as:
- lack of guidance and best practices for data sharing
- hesitancy among many data custodians to share data (for cultural, economic or other reasons)
- privacy, security and safety concerns raised by advocates as the capability of data analytics increases
The group will cooperate with other IEC and ISO technical committees on definitions and relationships between personal information and personally identifiable information. It will also work with other standards development organizations involved in data sharing framework standardization.
It is really important to distinguish between personal information and personally identifiable information. An example of the former is a person’s features, such as hair or eye colour. Personally identifiable information would be the different sets of information about a person which, if successfully connected together by an algorithm that can mine masses of separate data sets, can uniquely identify that person.
This distinction doesn’t exist at the moment and we need to develop a quantifiable measure to describe and assess the risk and potential outcomes of the use of data sets. The data sharing framework would ensure that while we don’t know all the possible detrimental outcomes, we should be able to understand one outcome versus another, based on what we choose to do with our data. It would also mean that if we decide to prevent a particular outcome, we would know what not to do.
JTC 1 will also ensure appropriate actions are identified to clearly describe these two very distinct concepts of personal information and personally identifiable information, within the relevant standards developed by its subcommittee for IT security techniques.
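To make the distinction between personal information and personally identifiable information more concrete, here is a minimal Python sketch of one possible way to quantify how identifying a set of attributes is. It is an illustration only, not a measure defined by JTC 1 or any standard: the records, the field names and the two helper functions (`unique_fraction`, `smallest_group`) are all hypothetical. The idea, borrowed from the general notion of k-anonymity, is simply to count how many people share the same combination of values.

```python
from collections import Counter

# Hypothetical toy records; every field and value is invented for illustration.
records = [
    {"hair": "brown", "eyes": "blue", "postcode": "2000", "birth_year": 1980},
    {"hair": "brown", "eyes": "blue", "postcode": "2000", "birth_year": 1975},
    {"hair": "brown", "eyes": "green", "postcode": "2010", "birth_year": 1980},
    {"hair": "black", "eyes": "brown", "postcode": "2010", "birth_year": 1990},
    {"hair": "black", "eyes": "brown", "postcode": "2000", "birth_year": 1990},
    {"hair": "brown", "eyes": "blue", "postcode": "2010", "birth_year": 1975},
]


def _keys(rows, attributes):
    """Build one tuple per row holding that row's values for the chosen attributes."""
    return [tuple(row[a] for a in attributes) for row in rows]


def unique_fraction(rows, attributes):
    """Share of rows whose combination of attribute values is held by no other row."""
    keys = _keys(rows, attributes)
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


def smallest_group(rows, attributes):
    """Size of the smallest group of rows sharing the same attribute values."""
    return min(Counter(_keys(rows, attributes)).values())


# Assumed scenario: each extra attribute comes from another linked data set.
for attrs in (
    ["hair"],
    ["hair", "eyes"],
    ["hair", "eyes", "postcode"],
    ["hair", "eyes", "postcode", "birth_year"],
):
    print(attrs, unique_fraction(records, attrs), smallest_group(records, attrs))
```

In this toy data, hair colour alone singles out no one, while the four attributes taken together single out every record. A real risk measure would also need to account for what other data sets an attacker could link to, which this sketch does not attempt.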
What other steps need to be taken?
The world is changing rapidly. Many factors drive this pace of change, such as growing urbanization, aging populations and climate change. Digitalization continues to change how we live and work and we will need to adapt to it much faster than we are currently doing.
This couldn’t be more true for standards development, which is a consensus-building process that happens over time, through confidence and trust building. However, there may well be parts that need to adapt at much faster rates, in order to keep up with new unforeseen risks and situations that arise as technologies evolve, and for standardization bodies to remain relevant.
At the same time as developing frameworks to define what personal data is and how to best protect it, we also need to think urgently about what we do when things go wrong with our data and we suffer as a consequence. In other words, what happens if someone is wrongfully refused a job or a mortgage because an algorithm was biased, or someone hacks one of our smart appliances? Even if we have rules about what personal data can be gathered, used and stored by organizations, we will need to develop a framework for a non-digital right of redress, to deal with people who don’t follow the rules and do bad things with our data.