A recently posted critique of a critique of differential privacy reminded me of this example, which I've named after Ursula Le Guin's thought experiment "The Ones Who Walk Away from Omelas." (If you haven't read that, you should. And it probably doesn't help to know that Omelas is "Salem, O." spelled backwards.)

Suppose that we have a society that values the privacy of its citizens' medical records, yet wants to publish data that will enable epidemiologists and other researchers to find correlations between diseases and their causes, etc. And the way that they collect and publish their medical data is the following. They hold a lottery every year to select one citizen, and they publish everything that is known or knowable about that one citizen. His or her medical records. His or her name and home address. His or her sexual history. His or her school records. His or her complete genome. High-resolution photographs of every square inch of his or her body's surface, and scans of its interior. Everybody else has complete privacy (nothing about them is published), but that one person has none.

Would you want to take part in such a society, or make use of the records provided in this way? I suspect not. But it would definitely provide usable data, especially if there were, say, 100 lottery winners instead of only one. And, as far as I can see, it satisfies the definition of differential privacy, which is that adding or removing one person from the pool of people whose data is being aggregated can only change the probability of any event by a tiny amount. That's true here: if the population is large (say 300 million), then adding or removing one person changes the probability of any event by only a small amount (at most 1/population), because the person you add or remove is unlikely to be the lottery winner, and that is the only outcome that could change an event's probability.
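
The 1/population bound can be checked by brute force on a toy population. The following is only a sketch under assumptions of my own: citizens are distinct integer IDs, the lottery picks a uniformly random set of winners, and the function names `lottery_distribution` and `max_event_change` are made up for illustration. It measures the additive change in event probabilities, which is the reading of the definition used above. The usual ε-differential privacy definition bounds the ratio of probabilities instead, and the last lines show that the lottery has no such bound.

```python
from fractions import Fraction
from itertools import combinations


def lottery_distribution(citizens, winners=1):
    """Exact output distribution of the lottery: every set of `winners`
    citizens is equally likely to be the one whose records are published."""
    outcomes = list(combinations(sorted(citizens), winners))
    p = Fraction(1, len(outcomes))
    return {frozenset(outcome): p for outcome in outcomes}


def max_event_change(dist_a, dist_b):
    """Largest additive change in the probability of any event between two
    output distributions (their total variation distance)."""
    support = set(dist_a) | set(dist_b)
    return sum(abs(dist_a.get(o, 0) - dist_b.get(o, 0)) for o in support) / 2


# Toy population of 10 citizens; compare the lottery with and without citizen 9.
with_person = lottery_distribution(range(10))
without_person = lottery_distribution(range(9))
print(max_event_change(with_person, without_person))  # 1/population

# With several winners the additive bound scales to winners/population.
with_person_3 = lottery_distribution(range(10), winners=3)
without_person_3 = lottery_distribution(range(9), winners=3)
print(max_event_change(with_person_3, without_person_3))

# The same bound at the scale mentioned in the post (one winner).
print(float(Fraction(1, 300_000_000)))

# The event "citizen 9 is published" has probability 0 without citizen 9 and
# 1/10 with them, so no finite ratio relates the two probabilities.
print(without_person.get(frozenset({9}), 0), with_person[frozenset({9})])
```

Enumeration is only feasible for tiny populations, but the pattern it shows (winners divided by population) is what the argument above relies on.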

This doesn't mean that differential privacy is a bad idea. But to me it does mean that there's something missing from differential privacy, some way in which it doesn't capture the essential idea of privacy. And it makes me worry that when researchers prove differential privacy results for more reasonable-looking aggregation schemes, they may be missing the same thing.

109 November 3rd, 2014
Who even comes up with crap like this? By that logic I could introduce differential baby killing: given a large enough sample...

wizzard0 November 3rd, 2014
Umm, you missed the point of the post :)

The post is a pointed mockery of exactly the definitions of privacy that some companies use when they claim not to be violating it.

109 November 3rd, 2014
That in no way invalidates my point, and in a sense even confirms it: the concept of "differential privacy" is meaningless, delusional bullshit :)

