Sunday, 24 February 2013

How Does De-Identification Software Work?


To protect the privacy of the individuals behind the data, many companies and research institutes de-identify their data using software designed for the job. This type of software measures the risk of re-identification as it de-identifies information, preserving individual privacy while retaining the data's value.
De-identification software offers the following benefits:
* Applies a peer-reviewed, scientifically validated, risk-based methodology to determine how much de-identification the data needs
* Keeps data utility high so the data remains usable by as many analysts as possible
* De-identifies longitudinal, cross-sectional and geospatial data optimally
* Accurately measures re-identification risks under various scenarios
* Provides tools that simulate re-identification attacks so users can test alternative assumptions and perform sensitivity analysis
* Generates certificates documenting that a data set has a very small risk of re-identification
* De-identifies data sets of any size, from small local data sets to massive databases
* Saves the de-identification specifications so they can be rerun on other databases
Beyond these benefits, it helps to know how the software works so that you understand the entire process. It comes down to four simple steps that you can carry out yourself:
Choose the indirect identifiers - Choose and rank the variables that could be used to re-identify the data. The software uses this ranking during de-identification to decide on the optimal anonymization, balancing data utility against re-identification risk.
Set the re-identification risk limit - The software lets the user adjust the re-identification risk threshold to an acceptable level, based on the profile of the organization or person requesting privacy. This threshold is what balances data granularity against privacy: the risk-based process is designed to safeguard individual privacy while maintaining data utility.
Carry out the risk analysis - Calculate the data set's risk for three types of re-identification attack: marketer, journalist and prosecutor.
De-identify to protect your data - Several de-identification techniques are available. These include suppression, the removal of high-risk values from your database, and generalization, the reduction of a given field's resolution. The software automatically de-identifies the data to bring the re-identification risk down to the level set by the user, in compliance with the applicable legislation. It can also create a data sharing agreement for the de-identified data set.
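The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular product works: the toy records, the field names, the threshold of 0.5 and the helper functions are all assumptions made for the example. Prosecutor risk is taken here as 1 divided by the size of the smallest group of records that share the same indirect identifiers, and marketer risk as the number of such groups divided by the number of records, which assumes the data set covers the whole population. Journalist risk is left out because it needs group sizes from a wider population that the toy data does not have. Suppression is simplified to dropping whole records rather than individual values.

```python
from collections import Counter

# Hypothetical toy records; the field names are illustrative only.
RECORDS = [
    {"age": 34, "zip": "90210", "diagnosis": "flu"},
    {"age": 36, "zip": "90211", "diagnosis": "asthma"},
    {"age": 35, "zip": "90213", "diagnosis": "flu"},
    {"age": 52, "zip": "10001", "diagnosis": "diabetes"},
    {"age": 58, "zip": "10002", "diagnosis": "flu"},
    {"age": 71, "zip": "60601", "diagnosis": "asthma"},
]

# Step 1: choose the indirect identifiers (assumed for this example).
QUASI_IDENTIFIERS = ("age", "zip")

# Step 2: set the re-identification risk limit (assumed value).
RISK_THRESHOLD = 0.5


def class_sizes(records, keys):
    """Count how many records share each combination of indirect identifiers."""
    return Counter(tuple(r[k] for k in keys) for r in records)


def prosecutor_risk(records, keys):
    """Step 3: worst-case risk, 1 / size of the smallest group."""
    sizes = class_sizes(records, keys)
    return 1 / min(sizes.values()) if sizes else 0.0


def marketer_risk(records, keys):
    """Step 3: groups / records, assuming the data covers the whole population."""
    return len(class_sizes(records, keys)) / len(records) if records else 0.0


def generalize(record):
    """Step 4a: reduce resolution (age to a decade band, zip to 3 digits)."""
    out = dict(record)
    decade = record["age"] // 10 * 10
    out["age"] = f"{decade}-{decade + 9}"
    out["zip"] = record["zip"][:3] + "**"
    return out


def suppress(records, keys, threshold):
    """Step 4b: drop records whose group is still too small (simplified)."""
    sizes = class_sizes(records, keys)
    return [
        r for r in records
        if 1 / sizes[tuple(r[k] for k in keys)] <= threshold
    ]


def deidentify(records, keys, threshold):
    """Generalize every record, then suppress what remains above the limit."""
    generalized = [generalize(r) for r in records]
    return suppress(generalized, keys, threshold)


if __name__ == "__main__":
    print("risk before:", prosecutor_risk(RECORDS, QUASI_IDENTIFIERS))
    safe = deidentify(RECORDS, QUASI_IDENTIFIERS, RISK_THRESHOLD)
    print("risk after: ", prosecutor_risk(safe, QUASI_IDENTIFIERS))
    print("records kept:", len(safe), "of", len(RECORDS))
```

On this toy data every record starts out unique, so the prosecutor risk is 1.0. After generalization and suppression, five of the six records remain and the risk falls to the 0.5 limit. A real tool would also weigh the ranking of the identifiers when deciding what to generalize first, which this sketch does not attempt.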
If you work in a field that uses big data regularly, installing this kind of software will come in handy. In medical research, for example, privacy and data utility are equally important, so users take every measure to achieve both goals.
Now that you know how de-identification software works, it's time to apply what you have learned and start working on your computer!

