Differential privacy tools from MS Research and Harvard

栏目: IT技术 · 发布时间: 5年前

内容简介:Data not only drives our modern world; it also bears enormous potential. Data is necessary to shape creative solutions to critical challenges including climate change, terrorism, income and racial inequality, and COVID-19. The concern is that the deeper yo

Data not only drives our modern world; it also bears enormous potential. Data is necessary to shape creative solutions to critical challenges including climate change, terrorism, income and racial inequality, and COVID-19. The concern is that the deeper you dig into the data, the more likely that sensitive personal information will be revealed.

To overcome this, we have developed and released a first-of-its-kind open source platform for differential privacy. This technology, pioneered by researchers at Microsoft in a collaboration with the OpenDP Initiative led by Harvard, allows researchers to preserve privacy while fully analyzing datasets. As a part of this effort, we are granting a royalty-free license under Microsoft’s differential privacy patents to the world through OpenDP, encouraging widespread use of the platform, and allowing anyone to begin utilizing the platform to make their datasets widely available to others around the world.

Cynthia Dwork, Gordon McKay professor of CS at Harvard and Distinguished Scientist at Microsoft said, “Differential privacy, the heart of today’s landmark milestone, was invented at Microsoft Research a mere 15 years ago. In the life cycle of transformative research, the field is still young. I am excited to see what this platform will make possible.”

Differential privacy does this via a complex mathematical framework that utilizes two mechanisms to protect personally identifiable or confidential information within datasets:

  • A small amount of statistical “ noise ” is added to each result to mask the contribution of individual data points. This noise works to protect the privacy of an individual while not significantly impacting the accuracy of the answers extracted by analysts and researchers.
  • The amount of information revealed from each query is calculated and deducted from an overall privacy budget to halt additional queries when personal privacy may be compromised.

Through these mechanisms, differential privacy protects personally identifiable information by preventing it from appearing in data analysis altogether. It further masks the contribution of an individual, essentially rendering it impossible to infer any information specific to any particular person,­ including whether the dataset utilized that individual’s information at all. As a result, outputs from data computations, including analytics and machine learning, do not reveal private information from the underlying data, which opens the door for researchers to harness and share massive quantities of data in a manner and scale never seen before.

“We need privacy enhancing technologies to earn and maintain trust as we use data. Creating an open source platform for differential privacy, with contributions from developers and researchers from organizations around the world, will be essential in maturing this important technology and enabling its widespread use,” said Julie Brill, Chief Privacy Officer, Corporate Vice President, and Deputy General Counsel of Global Privacy and Regulatory Affairs.

Over the past year, Microsoft and Harvard worked to build an open solution that utilizes differential privacy to keep data private while empowering researchers across disciplines to gain insights that possess the potential to rapidly advance human knowledge.

“Our partnership with Microsoft – in developing open source software and in spanning the industry-academia divide – has been tremendously productive. The software for differential privacy we are developing together will enable governments, private companies and other organizations to safely share data with academics seeking to create public good, protect individual privacy and ensure statistical validity,” said Gary King, Weatherhead University Professor, and Director Institute for Quantitative Social Science, Harvard University.

Because the platform is open source, experts can directly validate the implementation, while researchers and others working within an area can collaborate on projects and co-develop simultaneously. The result is that we will be able to iterate more rapidly to mature the technology. Only through collaboration at a massive scale will we be able to combine previously unconnected or even unrelated datasets into extensive inventories that can be analyzed by AI to further unlock the power of data.

Large and open datasets possess an unimaginable amount of potential. The differential privacy platform paves the way for us to contribute, collaborate and harness this data, and we need your help to grow and analyze the world’s collective data repositories. The resulting insights will have an enormous and lasting impact and will open new avenues of research that allow us to develop creative solutions for some of the most pressing problems we currently face.

The differential privacy platform and its algorithms are now available on GitHub for developers, researchers, academics and companies worldwide to use for testing, building and support. We welcome and look forward to the feedback in response to this historic project.

Tags:AI, artificial intelligence , data privacy , Data Protection , Open Data , Privacy


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

Linux内核设计与实现

Linux内核设计与实现

拉芙 / 陈莉君、唐华、张波 / 机械工业出版社 / 2006-1 / 38.00元

《Linux内核设计与实现》基于Linux2.6内核系列详细介绍Linux内核系统,覆盖了从核心内核系统的应用到内核设计与实现等各方面的内容。主要内容包括:进程管理、系统调用、中断和中断处理程序、内核同步、时间管理、内存管理、地址空间、调试技术等。本书理论联系实践,既介绍理论也讨论具体应用,能够带领读者快速走进Linux内核世界,真正开发内核代码。 本书适合作为高等院校操作系统课程的教材......一起来看看 《Linux内核设计与实现》 这本书的介绍吧!

JSON 在线解析
JSON 在线解析

在线 JSON 格式化工具

RGB转16进制工具
RGB转16进制工具

RGB HEX 互转工具

在线进制转换器
在线进制转换器

各进制数互转换器