In 2009, the U.S. federal government’s open data portal, Data.gov, launched with 47 open data sets. Today, there are over 194,000. It’s increasingly an anomaly for a city or state not to have its own open data portal. Meanwhile, more than half of the countries around the world have their own open data initiatives. And while the United States has long been a leader in the volume of records its governments make available digitally, many other countries are picking up steam.
But that’s just the growth in open data, by which we mean data in fully accessible, machine-readable formats. And that ultimately amounts to a relatively small fraction of the public data that’s coming online. Indeed, a 2016 survey of public data sets by the World Wide Web Foundation found just 10 percent were open. Far more public records and information are being made digitally available to the public in searchable databases and less structured forms.
There are fundamental tailwinds that are driving and will continue to drive this growth. Governments are finding that making their data accessible helps reduce costs and improve services. It can help catalyze new economic innovation. And at the same time, the costs and barriers to hosting it are in decline thanks to dedicated providers like Socrata. Perhaps most importantly, though, there’s a growing consensus that the information our government generates is public property that belongs to all of us, and the presumption should be that it will be made available unless otherwise justified.
These forces are driving more openness in public data, and that offers tremendous opportunity. Openness and transparency in data are a powerful force for accountability in our public institutions, and have the potential to be a particularly important check on corruption.
But it’s not enough for it just to be out there. Because if it’s out there and doesn’t get used, then it isn’t really adding value. All of this information is available, but if you have to go out and find each data set, figure out how to use it, search it, and then go on to the next one, it’s going to take quite a while. And you aren’t going to do it. And so that data will go unused or underused, because of the difficulty in accessing it.
The main problem with making use of all of these newly available public data sets is that they mostly exist in separate silos: individual, differently structured databases that you have to search one by one.
In the past, the way we’ve typically tried to solve that is by dumping the data into another, bigger silo. But with illumis, we’re doing something different: we’re accessing the information where it is hosted. illumis connects our users with the information they need from public records databases, when they need it, through live search integrations with thousands of public data sets.
Before we started illumis, I’d worked for 10 years doing research, investigations and due diligence, on campaigns and then for my own firm. Which means I’d spent a whole bunch of time searching through all of these individual data sets, one by one, over and over. The information I needed was buried in hundreds of different silos, and it could take hours to pull it all together — or to even know if it was there at all.
As I grew my small firm, I was lucky enough to get to hire some really, really smart people…who would then spend their time doing the same thing I had: going out to all of the databases we needed to search, and searching them manually, one by one by one. They were essentially fetching data, rather than doing the thing smart folks should be doing, which is making sense of it. I thought this was a bit crazy. That’s why we started building illumis.
illumis is a data access platform for public records and public data sets. With so much more information available and coming online, if you’re relying on manual searches, you just won’t have the time to access most of it. illumis lets our users access all the data sets they need in a single search, in seconds, by running thousands of simultaneous searches in real time. It’s a powerful tool for research, investigations, and diligence, and for the reporters, analysts and researchers who do this critical work.
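To make the idea of one query fanning out into many simultaneous searches concrete, here is a minimal sketch in Python. It is not illumis’s implementation: the source names, the sample records, and the `search_source`, `search_all`, and `run_search` functions are all hypothetical, and each "source" is just an in-memory list standing in for a live database search.

```python
import asyncio

# Hypothetical stand-ins for individual public-records sources. A real
# integration would query each source's own search interface instead.
SOURCES = {
    "court_records": ["Jane Doe v. Acme Corp", "State v. John Roe"],
    "business_filings": ["Acme Corp annual report", "Doe Holdings LLC"],
    "property_records": ["12 Main St, owner: Jane Doe"],
}


async def search_source(name, records, query):
    """Search one source; the short sleep simulates network latency."""
    await asyncio.sleep(0.01)
    q = query.lower()
    return name, [r for r in records if q in r.lower()]


async def search_all(query):
    """Send the query to every source concurrently and collect the hits."""
    tasks = [search_source(n, recs, query) for n, recs in SOURCES.items()]
    results = await asyncio.gather(*tasks)
    # Keep only the sources that returned at least one match.
    return {name: hits for name, hits in results if hits}


def run_search(query):
    """Synchronous convenience wrapper around the concurrent search."""
    return asyncio.run(search_all(query))


if __name__ == "__main__":
    print(run_search("doe"))
```

Because the searches run concurrently, the total wait is roughly that of the slowest source rather than the sum of all of them, which is the difference between searching one by one and searching everything at once.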
A big part of the reason that illumis is such a powerful platform is that it’s easily extensible to integrate nearly any searchable data source that our users might need. So all of that data that’s coming online now? We’re ideally positioned to catch it as it gets here.
But we believe that access is really just the first half of the battle — because if we want to make all of this newly available information meaningfully and broadly accessible, we actually need to integrate it. And that’s where things get really exciting with illumis.
illumis’s search platform sits on top of an API, letting us integrate data from illumis into any platform, workflow or even website. We started off building a way to search for records, but what we built ended up being ideal for monitoring for new information as well. And if we can integrate any data source into illumis, and then integrate that data feed into any platform or website, we can really connect our users with the information they need, when and where they need it: connecting any data source to any user.
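One way to see why a search tool lends itself to monitoring: rerun the same search on a schedule and surface only the records that were not there last time. The sketch below illustrates that idea under stated assumptions. The `new_records` function, the record shape, and the sample case IDs are hypothetical and do not describe the illumis API.

```python
def new_records(seen_ids, results):
    """Return (records not seen before, updated set of seen ids)."""
    fresh = [r for r in results if r["id"] not in seen_ids]
    return fresh, set(seen_ids) | {r["id"] for r in results}


# Hypothetical results from running the same saved search on two days.
monday = [{"id": "case-101", "title": "Doe v. Acme Corp"}]
tuesday = monday + [{"id": "case-102", "title": "Acme Corp v. Roe"}]

seen = set()
alerts, seen = new_records(seen, monday)    # first run: everything is new
alerts, seen = new_records(seen, tuesday)   # second run: only case-102 is new
```

The only state the monitor needs between runs is the set of record identifiers it has already seen, so the same feed of new records could be delivered to a website, a workflow tool, or an alert.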
That’s its own form of news — an opportunity to vastly expand the sort of records that are “reported” on and known. And we’re very excited to be helping bring this to life — to take illumis from a powerful search tool, to a platform that’s truly delivering the information our users need, when and where they need it. It’s a massive opportunity to expand the coverage that exists and is possible today.
We’re with the Washington Post: democracy dies in darkness. The result of today’s siloed data landscape is that records remain fragmented, facts remain hidden, and critical and valuable stories go untold. We believe the truth should never be a victim of logistics, and so we’re building a platform that makes real, meaningful public records access and transparency a reality.
Interested in illumis? Sign up for a free trial at illumis.com, and subscribe to our mailing list.
We’re also looking for a few more great developers to join our small but growing team. Love data? Love hard problems? Love building powerful tools that people rely on every day to do important work? Join us!
Note: this piece has been edited to reflect our name change from Vigilant to illumis in 2020.