The Volokh Conspiracy

Tentative Thoughts on the Use of Genealogy Sites to Solve Crimes

Prompted by the apparent catching of the Golden State Killer.

As many readers know, the Golden State Killer appears to have been finally caught using a genealogy site, GEDmatch.com. This raises some really interesting issues that I haven't thought much about before, and I thought I would explore them more. I'll focus on two questions. First, was this process legal? And second, is it troubling that the police were able to do this? Here are some tentative thoughts.

The Facts. First, here's an overview of the facts to get us all on the same page. The website, GEDmatch.com, allows people who have obtained their DNA profiles to upload them and see if they can find a match with others on the site. About 950,000 users have voluntarily posted their profiles there in an apparent effort to find matches with others who have posted profiles. Investigators apparently took DNA of the Golden State Killer collected long ago from a crime scene, created a profile from it, searched the site with that profile, and were able to figure out the killer's great-great-great grandparents who lived in the early 1800s. From that point, the Washington Post has reported, "a team of five investigators spent four months building out family trees, name by name. They pored over census records, newspaper obituaries, gravesite locaters, and police and commercial databases to find each relative and, ultimately, DeAngelo" (Joseph James DeAngelo Jr., a 72-year-old former police officer).

So was it legal? And from a policy standpoint, what should we think of this?

The Law. As far as I can tell, the legal issues are relatively straightforward. I don't know of any plausible arguments that the investigation (or at least what I have described above as the investigation) was unlawful. There are no Fourth Amendment issues here that I can see, as the government was just using a service to check against data that others had voluntarily made publicly available. You can think of this as a kind of undercover operation, in that the police were perhaps "posing" as the Golden State Killer looking for family DNA matches. But that's not a Fourth Amendment issue, as the police have always been allowed to do that under the Fourth Amendment. See Hoffa v. United States, 385 U.S. 293 (1966).

Some have wondered if the police using a genealogy site might be criminal under the Computer Fraud and Abuse Act if it violated the site's terms of service. But the answer to that is clearly no, as the statute provides an exception for law enforcement investigations in 18 U.S.C. § 1030(f): "This section does not prohibit any lawfully authorized investigative . . . activity of a law enforcement agency of the United States, a State, or a political subdivision of a State[.]" And that's even assuming that violating terms of service can be a crime at all, an issue on which there is a circuit split and on which the Ninth Circuit (which of course includes California) has said violating terms of service is not a CFAA offense. See United States v. Nosal, 676 F.3d 854 (9th Cir. 2012) (en banc).

Assessing the Privacy Threats. But what about the broader set of privacy questions? The benefit to law enforcement in terms of solved cases involving DNA is weighty, and I think we all get that, especially in such an incredibly disturbing case as this one. I take the policy question to be how we consider the tradeoffs in more general cases: how do we weigh the benefit to public safety against the privacy harms? In doing that, I would think there are three distinct privacy harms to consider:

  1. The invasion of privacy when the government collects a DNA sample and creates a profile of it in order to be able to make a match;
  2. The invasion of privacy when the government uses the genealogy site to obtain information about a person who submitted his or her own DNA profile to the site; and
  3. The invasion of privacy when the government uses the genealogy site to obtain information about a relative of the person who submitted a profile to the site — when that relative did not submit a profile of his or her own.

Let's consider each in turn, with my apologies if I am not using all of the correct scientific terminology in how I am describing how the different issues might break down.

1. Sample Collection. I think there are serious privacy issues generally raised by the police obtaining DNA samples, although I gather none of those issues are raised in the use of the genealogy site. In particular, I think there are troubling issues raised by the use of so-called "abandoned DNA," as well as the rules that the government must follow more generally to collect DNA samples. Regular readers may recall that I once wrote a cert petition arguing for greater Fourth Amendment limits on the collection and testing of DNA samples taken from a suspect. DNA evidence can of course be pretty powerful, and it seems to me that the rules on collecting samples need significantly more attention than they have received so far.

With that said, I gather those issues are not raised in the use of the genealogy site, which presupposes that a sample has already been collected. Further, I understand that the Golden State Killer's initial DNA material was collected at the crime scene when the crimes occurred, and it's hard to argue that there was a problem with the government collecting the DNA from the crime scene. There's the separate issue of how the government obtained DeAngelo's abandoned DNA at the end of the investigation after the genealogy site was used. But I gather no one is objecting to that at this point, so I'll move on to other issues.

2. Gathering Information on the Person Who Submitted A Profile to the Site. The second issue is the invasion of privacy when the government uses the genealogy site to obtain information about a person who submitted his or her own DNA profile. My tentative thinking is that this isn't particularly problematic. I say "tentative thinking" advisedly, because this is a new issue for me. I'm still working through the questions myself, and I may later decide that my initial thinking was wrong. But at least right now, it seems to me that collecting evidence about a person who voluntarily submitted it to a million or so people — and who submitted that information with the goal of having others among that million or so people match data with his own — is pretty unobjectionable. As a result, I tend to think it falls into the category of good policework and not creepy overreach.

There are counterarguments to this, I realize, and here's one particularly worth discussing: A person who submits his own profile to a site like GEDmatch.com presumably does so hoping that others will seek matches with it for a particular purpose, and this government use falls outside that purpose. That is, the person who posts his profile on a genealogy website is hoping to find out about his genealogy, not to help the government gather evidence of his connection to a crime. That raises concerns that the information is being misused, in that people submitted their information for one purpose but the government is using it for another.

I see the argument. But I confess it doesn't seem particularly weighty to me in this context. The problem, I think, is that when you decide to expose your information to the entire world, with the express purpose of having anyone in the world find an identifying connection to your information, your basis on which to complain about what a particular person is doing with that identifying information seems somewhat limited. Here, the government was doing what anyone else in the world could do on a site open to the world.

Granted, I gather people didn't foresee this sort of use. But if so, I wonder if that's more a story of technological growing pains than government overreach. I see it as sort of like the weev case, where AT&T published information hoping that only certain people would see it. By posting the information on the web, AT&T assumed the risk that others would pick it up. If you don't want to risk that happening, the lesson would go, it's probably best not to put the information out there.

3. Gathering Information About Non-Submitting Relatives of those Who Posted Profiles to the Site. The final question, and the one that seems to have drawn the most attention, is the potential privacy invasion when the government collects information about the relatives of profile-submitters — relatives who did not themselves submit a profile. Here the concern is that the person identified did not take any affirmative steps to reveal their identity. The government is learning information about someone who didn't submit their information. As law professor Elizabeth Joh told The New Republic:

Do you realize, for example, that when you upload your DNA, you're potentially becoming a genetic informant on the rest of your family? And then if that's the case, what if you're the person who didn't personally upload the DNA, but you discover that your family member has done that?

I understand this concern, but I find it mitigated, at least in part, by two dynamics. First, I gather that the non-submitting relative here is the person whose sample was collected and whose profile was submitted but whose identity is unknown. If that's right, then it's not clear to me why the privacy protection shouldn't come at the initial evidence-collection stage rather than at the profile-matching stage. As I wrote for Point 1, I think the collection of a DNA sample and its subsequent testing raise serious privacy concerns. But if we assume that those privacy concerns have been addressed, and we're okay with the collection and creation of a profile, I'm not sure I'm all that bothered by the subsequent additional identity information collected using voluntarily submitted profiles of relatives.

Put another way, when the government uses a genealogy site like this, there are three potential privacy-invading stages: collecting the initial DNA sample to profile; seeking a familial match on the genealogy site to narrow down suspects; and collecting a final DNA sample to get an exact match. I'm not sure it's the second stage, as compared to the first or third, that is the one we should be most concerned about.

The second dynamic that mitigates the privacy concern, at least to my tentative thinking, is that people who voluntarily give up their own privacy also giving up some privacy as to their family members strikes me as something that's actually pretty common. Think of familiar examples from everyday life. Say you don't want it to be easy for people to know your home address. A family member with the same last name might post their home address on the web or in the phone book. If someone wants to find you, a quick Google search of your last name may give them an inkling of where you live because of what your family member posted. Or say you don't want a picture of you to be online. You don't post one, but a friend or colleague might post a public picture of a group of people that includes you without your permission. As a practical matter, maintaining privacy often requires the cooperation of others.

A difference is that the technological uses here are new. The third-party privacy risk may not have been appreciated. But this goes back to the growing pains idea above. When technologies are new, they often seem like a magic black box. It's easy to miss the privacy implications of their use. But as we understand the technology better, we get better at anticipating those implications. Perhaps the Golden State Killer investigation is a step in warning users about a use of data that many people didn't think about but that should be on their radar screen going forward when they make choices (or revisit past ones) about what to do with their data and the data of others.

Thoughts Going Forward. Of course, none of this means that there shouldn't be some sort of regulation of whether or how criminal investigators use genealogy sites. Depending on how you see the balance of privacy and security interests, you could imagine different rules. For example, you could have a rule that legal process is required before the government can use such sites. Perhaps you could require the government to first show probable cause to believe that the DNA profile to be submitted is evidence of a serious crime (such as a homicide or a sexual assault). You could have something like the Electronic Communications Privacy Act for using a genealogy site — say, a DNA Genealogy Privacy Act — that would limit how investigators use such sites, require court orders, prohibit voluntary disclosure, and the like. There are a lot of options, depending on what uses you foresee and how you measure the costs and benefits.

Anyway, I really meant it when I said that my views were tentative. I'm new to the issues and just feeling my way, and maybe I'm missing something obvious (or less obvious) that will lead me to reverse course. I look forward to hearing what others have to say and learning from your insights.