Differential privacy tools from MS Research and Harvard

Data not only drives our modern world; it also bears enormous potential. Data is necessary to shape creative solutions to critical challenges including climate change, terrorism, income and racial inequality, and COVID-19. The concern is that the deeper you dig into the data, the more likely that sensitive personal information will be revealed.

To overcome this, we have developed and released a first-of-its-kind open source platform for differential privacy. This technology, pioneered by researchers at Microsoft in a collaboration with the OpenDP Initiative led by Harvard, allows researchers to preserve privacy while fully analyzing datasets. As a part of this effort, we are granting a royalty-free license under Microsoft’s differential privacy patents to the world through OpenDP, encouraging widespread use of the platform, and allowing anyone to begin utilizing the platform to make their datasets widely available to others around the world.

Cynthia Dwork, Gordon McKay Professor of Computer Science at Harvard and Distinguished Scientist at Microsoft, said, “Differential privacy, the heart of today’s landmark milestone, was invented at Microsoft Research a mere 15 years ago. In the life cycle of transformative research, the field is still young. I am excited to see what this platform will make possible.”

Differential privacy achieves this through a mathematical framework that combines two mechanisms to protect personally identifiable or confidential information within datasets (both are illustrated in the sketch after this list):

  • A small amount of statistical “noise” is added to each result to mask the contribution of individual data points. This noise works to protect the privacy of an individual while not significantly impacting the accuracy of the answers extracted by analysts and researchers.
  • The amount of information revealed from each query is calculated and deducted from an overall privacy budget to halt additional queries when personal privacy may be compromised.
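To make those two mechanisms concrete, here is a minimal Python sketch of the general technique. It is not the platform’s actual API; the class, method, and parameter names are hypothetical, and it uses the classic Laplace mechanism together with a simple budget counter.

```python
import numpy as np

# Illustrative sketch only: hypothetical names, not the OpenDP / SmartNoise API.
class PrivateQueryEngine:
    """Answers bounded-sum queries with Laplace noise and a privacy budget."""

    def __init__(self, data, total_epsilon=1.0):
        self.data = np.asarray(data, dtype=float)
        self.remaining_epsilon = total_epsilon  # overall privacy budget

    def noisy_sum(self, lower, upper, epsilon=0.1):
        # Mechanism 2: deduct this query's cost from the budget, or refuse it.
        if epsilon > self.remaining_epsilon:
            raise RuntimeError("Privacy budget exhausted; no further queries allowed.")
        self.remaining_epsilon -= epsilon

        # Mechanism 1: add noise calibrated to how much one record can change the result.
        clipped = np.clip(self.data, lower, upper)
        sensitivity = upper - lower
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return clipped.sum() + noise

# Example: a differentially private sum over values known to lie in [0, 100]
engine = PrivateQueryEngine([23, 45, 31, 50, 12], total_epsilon=0.5)
print(engine.noisy_sum(lower=0, upper=100, epsilon=0.2))
```

The noise scale grows with the query’s sensitivity (how much a single person’s record can change the answer) and shrinks as more of the budget is spent on one query, which is the accuracy-versus-privacy trade-off described above.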

Through these mechanisms, differential privacy protects personally identifiable information by preventing it from appearing in data analysis altogether. It further masks the contribution of an individual, essentially rendering it impossible to infer any information specific to any particular person, including whether the dataset utilized that individual’s information at all. As a result, outputs from data computations, including analytics and machine learning, do not reveal private information from the underlying data, which opens the door for researchers to harness and share massive quantities of data in a manner and scale never seen before.
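For context, this guarantee is usually stated formally as ε-differential privacy (the definition below is standard background, not part of the original announcement): a randomized mechanism $M$ is ε-differentially private if, for every pair of datasets $D$ and $D'$ differing in one individual’s record and every set of outputs $S$,

$$\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S].$$

A small ε means the output distribution is nearly the same whether or not any one person’s data is included, which is the precise sense in which individual contributions are masked.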

“We need privacy enhancing technologies to earn and maintain trust as we use data. Creating an open source platform for differential privacy, with contributions from developers and researchers from organizations around the world, will be essential in maturing this important technology and enabling its widespread use,” said Julie Brill, Chief Privacy Officer, Corporate Vice President, and Deputy General Counsel of Global Privacy and Regulatory Affairs.

Over the past year, Microsoft and Harvard worked to build an open solution that utilizes differential privacy to keep data private while empowering researchers across disciplines to gain insights that possess the potential to rapidly advance human knowledge.

“Our partnership with Microsoft – in developing open source software and in spanning the industry-academia divide – has been tremendously productive. The software for differential privacy we are developing together will enable governments, private companies and other organizations to safely share data with academics seeking to create public good, protect individual privacy and ensure statistical validity,” said Gary King, Weatherhead University Professor and Director of the Institute for Quantitative Social Science at Harvard University.

Because the platform is open source, experts can directly validate the implementation, while researchers and others working within an area can collaborate on projects and co-develop simultaneously. The result is that we will be able to iterate more rapidly to mature the technology. Only through collaboration at a massive scale will we be able to combine previously unconnected or even unrelated datasets into extensive inventories that can be analyzed by AI to further unlock the power of data.

Large and open datasets possess an unimaginable amount of potential. The differential privacy platform paves the way for us to contribute, collaborate and harness this data, and we need your help to grow and analyze the world’s collective data repositories. The resulting insights will have an enormous and lasting impact and will open new avenues of research that allow us to develop creative solutions for some of the most pressing problems we currently face.

The differential privacy platform and its algorithms are now available on GitHub for developers, researchers, academics and companies worldwide to test, build upon and support. We welcome and look forward to feedback on this historic project.

Tags: AI, artificial intelligence, data privacy, Data Protection, Open Data, Privacy
