An article in the current issue of Science reports that researchers at the Whitehead Institute in Cambridge, Massachusetts, combined publicly available genetic data from the 1000 Genomes Project with genealogical data available elsewhere on the Internet to identify specific individuals. From the press release:
[Whitehead Fellow Yaniv] Erlich and colleagues began by analyzing unique genetic markers known as short tandem repeats on the Y chromosomes (Y-STRs) of men whose genetic material was collected by the Center for the Study of Human Polymorphisms (CEPH) and whose genomes were sequenced and made publicly available as part of the 1000 Genomes Project. Because the Y chromosome is transmitted from father to son, as are family surnames, there is a strong correlation between surnames and the DNA on the Y chromosome.
Recognizing this correlation, genealogists and genetic genealogy companies have established publicly accessible databases that house Y-STR data by surname. In a process known as “surname inference,” the Erlich team was able to discover the family names of the men by submitting their Y-STRs to these databases. With surnames in hand, the team queried other information sources, including Internet record search engines, obituaries, genealogical websites, and public demographic data from the National Institute of General Medical Sciences (NIGMS) Human Genetic Cell Repository at New Jersey’s Coriell Institute, to identify nearly 50 men and women in the United States who were CEPH participants.
Previous studies have contemplated the possibility of genetic identification by matching the DNA of a single person, assuming the person’s DNA were cataloged in two separate databases. This work, however, exploits data between distant paternally-related individuals. As a result, the team notes that the posting of genetic data from a single individual can reveal deep genealogical ties and lead to the identification of a distantly-related person who may have no acquaintance with the person who released that genetic data.
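For readers curious about the mechanics, the "surname inference" step described in the press release can be sketched in a few lines of Python. This is only a toy illustration, not the Erlich team's actual method: the database entries, surnames, and repeat counts are invented, the function names are hypothetical, and the one-mismatch tolerance is an assumption standing in for whatever matching rules the real genealogy databases use. Only the marker names (DYS19, DYS390, DYS391, DYS393) are real Y-STR loci.

```python
from collections import Counter

# Hypothetical toy database of (surname, Y-STR profile) records.
# Marker names are real Y-STR loci; surnames and repeat counts are invented.
DATABASE = [
    ("Archer", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Archer", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
    ("Baker", {"DYS19": 15, "DYS390": 23, "DYS391": 10, "DYS393": 12}),
    ("Cooper", {"DYS19": 16, "DYS390": 25, "DYS391": 11, "DYS393": 14}),
]


def mismatches(query, profile):
    """Count query markers whose repeat count differs from the profile.

    A marker missing from the profile counts as a mismatch.
    """
    return sum(profile.get(marker) != count for marker, count in query.items())


def infer_surname(query, database, max_mismatches=1):
    """Return the surname with the most near-matching profiles, or None.

    max_mismatches is an assumed tolerance for illustration only.
    """
    votes = Counter()
    for surname, profile in database:
        if mismatches(query, profile) <= max_mismatches:
            votes[surname] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# An "anonymous" research genome's Y-STR profile (invented values).
anonymous = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
print(infer_surname(anonymous, DATABASE))
```

Note that the query profile never has to belong to anyone in the database. It only needs to sit close to profiles contributed by paternal relatives who share the surname, which is why one man's posted data can point to a distant cousin.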
Yawn. Really, what is the big deal? If some portion of the public is spooked over the vacuous concept of "genetic privacy," researchers who listened to the ditherings of certain bioethicists have only themselves to blame. An accompanying policy article in Science does note:
The general expectations of the public about privacy and confidentiality may be subtly shifting as well. In addition to social media outlets (e.g., Facebook) that have led to more pervasive sharing of personal details, patient-centric organizations (e.g., PatientsLikeMe) now provide the means to share in-depth information about health status and to identify research opportunities for motivated individuals.
Well, yes. And there's lots more of that kind of genetic self-revelation and sharing coming down the pike. More and more people are realizing that concerns over genetic privacy are way exaggerated, as I explained in my article, "I'll Show You My Genome. Will You Show Me Yours?" If you're interested, click on over to SNPedia and take a look at my many genetic flaws.