Sunsetting openSNP - a personal retrospective
a black and white photo of a person taking a photo out of car on the passenger-side, the person and camera are visible in the side-mirror[1]
As this blog post is being published, we are sending out emails to all _openSNP_ users with some news: _openSNP_[1] will be turned off on April 30, 2025, and all the data stored on it will be deleted along with it.
Given that the project has been part of my life for so many years and shaped how I view (data) commons and open & free culture, it seemed just right to take a bit of space for a retrospective - but also the reasons why I think it's time to pull the plug now.
The idea for what would become _openSNP_ started almost exactly 14 years ago: Going back through my emails, I find that I emailed Philipp[1] on April 13, 2011 to let him know that I had just provided my saliva sample to send it off to 23andMe.
Those email conversations would lead to "someone should make a platform where people can donate their genetic & phenotypic data".
_Et voilà_.
Well, with our then limited _Ruby on Rails_ skills, we threw together something half-working by October of that year. And luckily Helge[1] came to our rescue, doing a lot of work to get us to a "more-than-half-working" state.
From the very beginning, my aspiration for _openSNP_ was to not only follow along a practical data repository angle, but also embody a more radical political dimension: At a time when genetic data was locked into the commercial siloes of "direct-to-consumer" (DTC) genetic testing companies - and only made accessible to the pharma companies that could afford buying access to it - _openSNP_ should open up access to everyone. Regardless of financial means and institutional status or credentials, it should provide free access to the data.
And equally important: It would give the individual the choice to contribute to this open data resource, instead of having researchers or companies broker the access.
Of course, this ambition came from a (in retrospect naïve) data-centric belief that genetic data would be a key driver for improving human health and medicine.
In 2025, my view on that is a lot more sober (and bleaker): Today it seems clear to me that the biggest impact on improving health - even in the rich, allegedly 'developed' nations - would come from providing food security and access to stable housing.
And not from trying to find genetic confounders of common diseases that are a lot more rooted in those environmental & societal factors.
This deeper-seated criticism aside, _openSNP_ has been quite successful at some of these academic & political endeavours: Over the 14 years, _openSNP_ has grown to become one of the largest (if not _the largest_) resources for this type of data, despite a lot of more _academic_ projects with lots of funding trying to do the same.
The data that people have contributed to _openSNP_ has also been used quite productively: Researchers, both within academia and outside of it, have used the data for their own studies. And the data was also used widely for teaching students around the globe how to work with and interpret human genotyping data.
One example of how the data has been used that's of particular note to me personally was the attempted replication of a published finding on the "genetic origins of Chronic Fatigue Syndrome"[1] (CFS).
The original study claimed to have found a lot of genetic variants strongly associated with CFS.
In fact, these alleged associations were so strong that the community of CFS patients didn't believe those to be true findings.
Which is why we were approached by a member of that community, to see if the data in _openSNP_ could be used to try to replicate that paper.
And with a bit of our help, the data on _openSNP_ could indeed be used for that. We could show those original claims to be a technical artifact rather than a true finding.
To me, this was _the_ showcase for the potential to enable a more "democratic" reasoning about genetic claims.
Another big aspect of _openSNP_ that I'm personally quite proud of is that we started out with it being an independent project and have managed to keep it that way for its full duration: Basically since the launch of _openSNP_ we have been approached about selling the platform itself or the data.
Big pharma companies reached out, as did start-ups. And even forensics companies that wanted privileged access for law enforcement use.
But we always made the promise of not selling _openSNP_ or the data contained within it, and we have stayed true to that. There was never even a question about it.
On that note, it is also worth pointing out how _openSNP_ has managed to stay independent and operate that way on the proverbial shoe-string budget (even now the costs come out at less than 100 €/month for storing and serving the data).
A large credit for keeping the costs that low goes to Helge, as he always advocated against using _the cloud_, and so _openSNP_ has always run on small and cheap virtual servers with a predictable cost instead.
Which is why we never needed big money.
And so, while we won some small amounts of prize/grant money after the initial launch (and had a small sponsorship for a year), most of the cost of running _openSNP_ has been covered by individuals: At first, mostly out of our own pockets, even when we were students with little income.
Later on, also thanks to our supporters on Patreon, who cover around half of our bills.
This way of running _openSNP_ has been formative for how I (critically) view approaches to _"scaling"_ and "_sustainability_": These days, it seems like virtually every open science project is keen to rush towards applying for large amounts of funding to "scale up" their efforts, only to then very quickly start worrying about the "sustainability" of their project. Naturally, those two issues are deeply intertwined:
Funders ask for "_scale_" & "_impact_", and once an organisation has grown accustomed to the money, it needs more and more money to keep the lights on. And as capitalism-brain affects philanthropies just as well, there always remains a pressure to "_scale_" even more.
I think _openSNP_ has demonstrated that a different way of doing open science – and running infrastructure for it – is possible: Despite hosting at least as much data as other initiatives in the space, we never had to rush to find the next grant or funder to cover us.
As a result, our financial sustainability was never a serious concern. Of course, this is at least in part due to the privilege of being able to use some of our own income and time to continue operating this way. But this has also meant that we never had to play nice with some of the large and highly problematic[1] funders - such as the Chan Zuckerberg Initiative or the Gates Foundation – that operate in this space to advance billionaire interests.
With all of this said, why are we now shutting down _openSNP_? The proximate reason is the bankruptcy of _23andMe_[1].
Ever since the possibility of them going under made the news last year, we have been thinking about how to proceed with _openSNP_.
On a pragmatic level, this is because the vast majority of data sets are provided by people who got their own genetic data through _23andMe_, so the potential for adding more data to _openSNP_ in the future will be quite diminished.
But the ultimate reason is that, compared to 14 years ago, more and more people now worry about how their data will be used.
Even outside _openSNP_, the largest use case for DTC genetic data was not biomedical research or research in big pharma. Instead, the transformative impact of the data came to fruition among law enforcement agencies, who have put the genealogical properties of genetic data to use, leading to the purchase of genetic genealogy platforms like _GEDMatch[1]_.
Similarly, some forensic companies[1] have started to engage in "DNA phenotyping" – e.g. predicting skin color and facial features based on DNA data – something that's best understood as a 21st-century re-run of long-debunked craniometry[2].
This is not the only anti-science/knowledge backlash around in 2025.
Across the globe there is a rise in far-right and other authoritarian governments.
While they are cracking down on free and open societies, they are also dedicated to replacing scientific thought and reasoning with pseudoscience across disciplines.
In the US, medicine in particular (alongside climate change) has been targeted by removing trustworthy sources and replacing them with the crackpot theories of those in power[1]. All while large corporations that are aligned with those governments are strip-mining and polluting our digital commons and straining our open culture infrastructures - such as free & open[2] software repositories[3], Wikipedia[4] and others - at the same time.
Which is all to say: The risk/benefit calculus of providing free & open access to individual genetic data in 2025 is very different compared to 14 years ago. And so, sunsetting _openSNP_ – along with deleting the data stored within it – feels like it is the most responsible act of stewardship for these data today.
My huge thanks go to Philipp & Helge. Working together with them on this project has been a huge pleasure – and it wouldn't have been possible without them.
And, of course, a huge thanks also to all the people that have supported _openSNP_ over the years: from donating their data and spreading the word about the project, to contributing to the code and supporting us on a small scale with some donations.
This work would not have been possible without these contributions.
I hope that many of you will continue to be active in the spirit of mutual support and aid, even if _openSNP_ comes to an end.