The essential questions of Fair Data Use
Today is Independence Day in the United States, which seems like a great time to return to the subject of liberty, privacy, and fair data use. I continue to believe:
- New technologies for information creation, gathering, and analysis offer dire new possibilities for abuse.
- Our law- and policy-makers need to create effective new safeguards in response.
- That’s not going to happen unless we in the technology community help them.
In this matter – as in many others – I think getting the questions right is at least as important and difficult as then choosing the answers. What’s more, I think that the questions naturally fall into the domain of the technologists – we know better what is possible, what will be possible in the future, and which distinctions lead to true differences. The answers, on the other hand, lie more properly in the domain of those whose expertise is the crafting of actual laws.
For my first draft of suggested Fair Data Use Questions, I am dividing things into three categories:
- The questions themselves.
- Different kinds of data (for which the questions may have different answers).
- Other qualifiers that could change the answers to the questions.
Suggested additions and other comments will be gratefully received. I intend for this to be a community effort.
Essential questions of Fair Data Use
- Who may the data be given to?
- Government (classified)
- Government (law enforcement)
- Government (general)
- Certain regulated businesses (e.g. financial services, health care)
- All businesses
- Anybody at all
- What may the data be used for?
- Legal evidence
- In criminal cases
- In civil cases
- In regulatory proceedings
- Marketing
- Choosing what advertisements to offer.
- Choosing what prices and deals to offer – note that if one person gets a particularly great deal somebody else is arguably suffering from price discrimination.
- Decisions about whether to provide or deny a service
- Mortgages and other credit – obviously, this is a longstanding business for credit bureaus.
- Rental – can landlords discriminate if you have rowdy pictures on Facebook? If you have pictures on Facebook with friends of multiple races? If you’re in a medical high-risk group?
- Insurance – what kinds of risk factors may insurers take into account?
- Fraud detection or prediction – of course, fraud prediction has a lot to do with decisions about whether to deny a service
- Credit
- Insurance
- Telecommunications/cell service
- Online gambling
- (And more)
- Employment
- Hiring
- Retention
- Promotion
- Are there any procedural restrictions on obtaining, retaining, or aggregating the data?
- Notifications?
- Subpoenas?
- Consent?
Kinds of data
The answers to questions such as those above surely depend at least in part upon the nature of the data in question. Major kinds of data include:
- Transaction and movement. Between them, a number of commercial enterprises (especially credit card and cell phone companies) and governments amass or could obtain huge amounts of information about your activities, including:
- Purchases and other financial dealings — “credit bureaus” aggregate and supply this information to be used for a lot more than just credit-granting decisions.
- Location data
- Anything that can be inferred from fare and toll payment.
- Anything captured from mobile devices.
- Information that can be inferred from your transactions.
- Information from police/security cameras and the like.
- Anything else captured by cameras, other imaging devices, listening devices, etc. – including those that look through windows or even walls.
- Data from any kind of in-home sensor – for example, your electricity meter may provide substantial clues as to how you entertain yourself on an hour-by-hour basis.
- Communication, entertainment and attention. Much – indeed most — of the data that is or could be gathered about you pertains to what you read, write, say, look at, or listen to, and who you communicate with. Major subcategories include:
- Connection information from phone records, email, instant messages, social networks, etc. – intelligence agencies already mine these heavily.
- The actual contents of your phone calls, email, text messages, and so on. In many countries, at least phone calls have substantial legal protections.
- What you post on the web.
- Which web sites and web pages you visit, what you enter on forms when you do, and perhaps even the precise pixels you touch with your mouse or fingers.
- In particular, which terms you search on, and which search results you select.
- What you watch, listen to, or play, and how much attention you appear to be paying when you do.
- Health care.
- Diagnoses and treatments.
- Doctors’ and other caregivers’ observations – e.g., notes in medical records.
- Test results.
- Prescription and over-the-counter medical data – what was prescribed and when you bought it.
Qualifiers
To what extent do the answers to any of these questions change:
- Depending on when the data was collected (e.g., before the passage of certain regulatory legislation)?
- If the recipient is outside the country? Many countries already have rules in this regard.
- If the data originates outside the country?
- If the data is already in some way public? Aggregating already-known information can lead to surprisingly intrusive results.
- If you provided or created the information anonymously?
- Depending on the age of the subject? Minors traditionally get additional protections.
- Depending on privacy settings and privacy promises?
- Depending on notice or consent?
- Depending on who owns the equipment and networks on which the data was created or transmitted?
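One way to see how the three lists fit together is to treat each question as a combination of recipient, purpose, and kind of data, with the qualifiers as modifiers that can change the answer. The following is only a rough sketch under that assumption. Every name in it (`Question`, `Answer`, `answer`, `EXAMPLE_RULES`, `EXAMPLE_OVERRIDES`, the qualifier strings) is hypothetical, and the example rules are placeholders, not statements of what any law says or should say.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Recipient(Enum):
    GOVERNMENT_CLASSIFIED = "government (classified)"
    GOVERNMENT_LAW_ENFORCEMENT = "government (law enforcement)"
    GOVERNMENT_GENERAL = "government (general)"
    REGULATED_BUSINESS = "certain regulated businesses"
    ANY_BUSINESS = "all businesses"
    ANYBODY = "anybody at all"


class Purpose(Enum):
    LEGAL_EVIDENCE = "legal evidence"
    MARKETING = "marketing"
    SERVICE_DECISION = "provide or deny a service"
    FRAUD_DETECTION = "fraud detection or prediction"
    EMPLOYMENT = "employment"


class DataKind(Enum):
    TRANSACTION_AND_MOVEMENT = "transaction and movement"
    COMMUNICATION_AND_ATTENTION = "communication, entertainment and attention"
    HEALTH_CARE = "health care"


class Answer(Enum):
    ALLOWED = "allowed"
    FORBIDDEN = "forbidden"
    UNDECIDED = "undecided"  # nobody has written a rule for this case yet


@dataclass(frozen=True)
class Question:
    recipient: Recipient
    purpose: Purpose
    data_kind: DataKind
    qualifiers: frozenset = frozenset()  # e.g. {"subject_is_minor"}


def all_questions():
    """Enumerate every recipient/purpose/data-kind combination."""
    return [Question(r, p, k) for r, p, k in product(Recipient, Purpose, DataKind)]


def answer(question, rules, qualifier_overrides):
    """Look up an answer; a matching qualifier override wins over the base rule."""
    for qualifier in sorted(question.qualifiers):
        if qualifier in qualifier_overrides:
            return qualifier_overrides[qualifier]
    key = (question.recipient, question.purpose, question.data_kind)
    return rules.get(key, Answer.UNDECIDED)


# Placeholder rules for illustration only; not a claim about any real law.
EXAMPLE_RULES = {
    (Recipient.REGULATED_BUSINESS, Purpose.SERVICE_DECISION,
     DataKind.TRANSACTION_AND_MOVEMENT): Answer.ALLOWED,
}
EXAMPLE_OVERRIDES = {"subject_is_minor": Answer.FORBIDDEN}

if __name__ == "__main__":
    undecided = [q for q in all_questions()
                 if answer(q, EXAMPLE_RULES, EXAMPLE_OVERRIDES) is Answer.UNDECIDED]
    print(f"{len(undecided)} of {len(all_questions())} combinations have no rule yet")
```

Even with only the top-level categories from the lists above, the grid of combinations is large, and a rule set that covers a handful of cells leaves most of them undecided. That is a reason to enumerate the questions explicitly before choosing answers.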
Comments
15 Responses to “The essential questions of Fair Data Use”
I think there is a basic question missing: who owns the data about a person? These are great categories of what can largely be done with the data, but it will all bake down to who owns the right to the information irrespective of how it is collected. Answer that question, and the rest will fall into place.
Michael,
I don’t think that paradigm works.
If you collect data about me in the ordinary course of business, at the minimum you have some rights to use it. So to call me the “owner” is confusing at best.
But if we call anybody else the “owner”, then we have to question the circumstances under which they have a right to create it.
Basically, the data “ownership” paradigm adds confusion without, so far as I can tell, resolving anything.
Michael/Curt
I think that Michael’s paradigm works to some extent. When you and I exchange information about each other in a transaction, we have both allowed the other to use some of that information for their own purposes. So, if I buy a research report from you, I have agreed to some terms and conditions giving you a “right to use” data about me in some way. It might be as restrictive a right as to count that another purchase has occurred, or it might be a right to contact me again at some future time, or it might be a right to track my purchase history with you, so you can see how good a customer I am.
The data about me is mine. The data about you is yours. The data about the transaction is ours and subject to the contractual terms we have agreed, which can in turn be superseded by law (“did you purchase Curt’s research report on or about the 5th of July?”), as long as there is due process and it isn’t simply a fishing expedition…
It is thornier than that, of course, but at least we have a start.
Chris/Michael,
My first objection is that defining data ownership and licensing rights leaves an awful lot of important issues unaddressed, including:
1. Anything where the government’s investigative or subpoena powers could come into play.
2. Anything where we want to limit people’s use of public domain information (just like today we limit their use of their knowledge of our address, phone numbers, or skin colors).
My second objection is that overreliance on that approach probably leads to some of the answers coming out wrong. E.g., it could well leave merchants with the right to exert influence to gain comprehensive licensing rights to our data as a condition of doing business with them — EULAs on steroids, with a real potential for outcomes we don’t like.
Curt, this is great! I hope that anyone concerned with these issues reads this. In particular, anyone proposing a policy decision, regulation, or law should consider all the cases you lay out, to help make sure that the new decision/regulation/law will have the effects it is intended to have.
The previous commentators argue for simplifying all this. I think there’s nothing wrong, per se, with someone’s putting forth a statement that groups together some of the specific issues from your list. But looking at the list helps you make sure that your statement really applies to everything in the group.
When you say “the questions naturally fall into the domain of the technologists”, I understand your reasoning. It’s what I’d call a mechanism/policy dichotomy: we think about the mechanism (how it works and what it’s capable of doing), and they think about policies based on the available mechanisms.
People thinking about law and ethics might come up with other questions that are valid and reasonable, based on their expertise, that we might not anticipate. They might not be in the form of “May X use Y data for Z purpose?” There will probably be iteration. Well, maybe this is too detailed for your blog posting. If you turn this into a more extended paper, you might want to consider such things.
It’s a great bunch of lists: you covered so many of the interesting cases! Here are a few ideas:
On your list of uses, here are some other ones that I think should be considered. (1) “prevention of crime”: can the data be used to find people planning to bomb Times Square? (2) “identifying people who are trying to be anonymous”, particularly political dissidents; (3) political advertising in particular, such as using the data to target different mailings to different people. These might be subsumed in other categories but calling them out as subcategories might be desirable.
The USA has anti-discrimination laws for certain special groups. These come up frequently enough in real court cases that you might want to say something about it (I’m not sure what, exactly).
The following is about clarity rather than substance. In “Who may the data be given to”, I’m not sure about the use of the word “given”. To you and me, it is clear that for present purposes, it’s irrelevant whether the data is actively pushed to someone, or whether that someone has the ability to grab the data. But in colloquial English, for people not familiar with all this, “give” specifically means to actively push something. If I let you borrow tools from my garage and you borrow a hammer, it’s not that I “gave” it to you but that you “took” it from my garage. Suggested replacement: “Who can see the data?”, perhaps.
Terrific stuff – it’s great being in the debate/discussion. So here’s a real life story from the last few days and a fun one to play with.
This is an actual post I made to a site that tracks scams. The company that got my attention was Motor Vehicle Services; the callback number is (866) 238 6239.
“Again, no actual phone call made by these people. But I got the letter too. Interestingly about a year after I bought a new vehicle. The flyer looks official, but clearly isn’t. Scare tactics – “Your factory warranty has expired or is about to expire and you may extend your warranty coverage on the vehicle.”
So for grins and giggles (and I was bored), I called them back. I asked them which of my vehicles was about to expire and then listed 4 vehicles in our household. Trouble is none of the vehicles I mentioned exists. I mentioned a RAV-4 that I bought last year (except of course I didn’t) and the guy on the phone said, “yes that’s the one” and then went into a spiel about how much it would cost me if I had to replace stuff….
Clearly a scam, and so when I told the guy on the phone that it was a scam, he became quite belligerent. That’s one of the nice things about skype – callerid doesn’t come through, so they have no clue. I didn’t reveal the code that came with the letter, nor was I asked for it. That alone made me even more suspicious.
I don’t know who this company is, I don’t know where they are licensed, I can’t (or am too lazy to) find them. I just want to make sure that people do not fall for this kind of a scam.
There I feel better now!”
Dissecting the situation from a privacy point of view, we see the following (and some of this has to be hypothesis).
How do they know I am a candidate? Is it coincidence that it is about a year after purchasing a vehicle? Why don’t they ask to identify me when I call them from a number without callerid? What would they do with any data that I provide them if I am foolish enough to sign up?
The (very) small print says, “You may have been selected to receive this special limited time offer because of information in your consumer report or other data”.
This to me gets to the very essence of the privacy discussion. I can see no reason why a credit reporting agency should divulge any of my data to anyone. Nor should they be legally able to. They perhaps can collect the data – and provide answers like, “yes this person is credit worthy”, but that data absolutely should not be used for solicitation.
However, how can we stop it? It’s asymmetric. It is easy for companies to abuse, but hard to get them to desist. There are too many of them for any agency to fight with. It’s like being overrun by soldier ants – or like playing “whack-a-mole”. You can tire yourself out trying to beat them – you might kill a few, but more and more will keep coming.
Chris Bird: You say “I can see no reason why a credit reporting agency should divulge any of my data to anyone.” – and in that very phrase you reveal the complexity of the situation (ignoring, of course, the fact that a credit reporting agency that couldn’t divulge any data would rapidly cease to exist as a business). In what way is it “your” data? It’s data *about* you – but there’s tons of that around (like my opinion of you, for example) which isn’t in any reasonable sense “yours”.
But then it’s equally data about whoever you dealt with. And, if some third party cleared the transaction – checked your credit rating, advanced money in your stead to the purchaser – it’s about *them* as well. And, if the credit reporting agency used that information – of complex provenance – to compute something new and interesting, how is that result “your data”? If some of the data used was ownership data that is, by law, public – whose is what when all is said and done?
A relatively simple version: You buy a toaster and charge it on your credit card. The store you bought it from notices that you might be interested in other home appliances and starts targeting you with ads for ice cream makers and can openers. They, of course, claim that they own the transaction data as much as you do, so have the perfect right to target ads on that basis.
Ah, but then the credit card processor decides to get in on the deal, and sells information about people buying home appliances to the store’s competitor. Now the store gets upset, because the credit card processor is selling “its” data to its competitors! This has actually come up and led to a lawsuit. (I don’t know how it came out.)
If the credit card processor can do this – how about UPS, if you order online and they can infer what they are delivering to you? How is what’s readily apparent from the outsides of boxes that UPS is carrying at your request “your” data?
— Jerry
Generally, I think rules that say intellectual property “belongs” entirely to somebody and not at all to somebody else are hard to make workable. I think it’s usually more practical to speak in terms of who has the rights to do what. For example, there’s one principle of fair use applying to sufficiently long sequences of words I write in this blog, and a slightly different set of rules for the comments you guys write (e.g., you don’t have the right to take them down once you post them here). But in neither case does the principal owner of the words (namely the author) have unlimited authority to control them.
I actually can’t think of a law about intellectual property that really fits the ownership paradigm, nor can I think of a contract that does unless it’s a draconian work-for-hire. Creating new “property” classes is IMO a wrong turn.
Jerry, I absolutely agree that the toaster store knows about the transaction. But that doesn’t mean they should know about me.
And in your observation about UPS I have another story. Many years ago I wanted to send a gift from Amazon to a friend. I wanted the gift to be anonymous. I made that clear when I ordered the item. Imagine my surprise when my friend called me to thank me for it! Amazon had partially honoured my request – by not including their customary gift card, but failed by putting my return address on the package.
I still argue that basic data about me (my blood type, my iris, my fingers, the placement of distinguishing characteristics) is mine and mine alone. I may choose to let others see it (because there is a transaction that I deem to be to my advantage – e.g. seeing a doctor, wanting to register for the fast way through Heathrow, wanting to come into the USA, and no I won’t give a distinguishing mark case!).
Now just saying that I own the data doesn’t necessarily make that worth a damn! Because there is little I can do to enforce that ownership.
Transaction data is different from the essential data. When I buy something with a credit card, there are three interests in that data (and 4 if I fill out the warranty slip). They are me, the merchant that I bought the thing from and the credit card company. I have entered into an agreement by signing up for the card that some of MY data is available to the card company. I have not entered into that agreement with the store. The store can only be assured that the credit card is valid and will cover the costs of the transaction. The credit card company is acting as a proxy for me. It knows that I spent the money at the store, but has no need to know what it was that I bought. Only I know everything about the transaction. Who I am, what I bought, and what card I used.
Again quite theoretical because the store will argue that it has the right to ensure that the card holder is the card submitter. It may or may not, but it should destroy that data immediately. PCI anyone?
Curt, words are equivalent in this discussion to the transactions. I don’t claim ownership of the words, nor can I. I have chosen to publish them, but they are not inherently me. These comments are by me, but they are not a fundamental property of me.
I like the French concept of “tutoyer”: the idea that you can only use the familiar form when you have been given permission to do so. Up until that time you use the vous form (and not the tu form) when addressing someone. That’s the granting of permission.