In today’s recording of the monthly Semantic Link podcast series, we returned to a common theme – ‘open data’ and ‘linked data’ initiatives. We discussed the possible implications of the US Congress’ desire to cut back on the ‘Electronic Government Fund’, and in particular the impact that would have on initiatives such as data.gov, USASpending.gov and the IT Dashboard.
There is growing support for public and untrammelled access to the vast datasets held and curated by public authorities, but less concern about the role those same agencies have in putting the data in context. Bluntly put: before or when releasing such datasets to the public, is there a public service imperative to deliver an information service as well as just a data product? Where should we draw the line between letting anyone mash up, remix, re-analyse and re-position any public data – admitting the possibility (nay, given the global nature of the Internet, the likelihood) that it will be misused, abused and deliberately taken out of context to serve any one person’s ends – and ensuring that public datasets at least carry a Government Health Warning that “The Tolstoy Syndrome is extremely addictive. Selective use of public data will reinforce the habit”?
You will find a high correlation between references to ‘open data’ and ‘disintermediation’ – ‘cutting out the middle man’ – as the movement is driven precisely by a desire to ‘get at the facts and the truth’. But facts don’t speak for themselves. Today’s conversation with Evan Sandhaus from the New York Times underlined my conviction: technology tools are great, but they have their greatest impact in the hands of skilled craftspeople. rNews is of limited value if journalists simply cream off the top and choose not to apply the scientific rigour that we have come to expect from first-class investigative journalism. In the words of the Carnegie Corporation in an article back in 2006, “it’s fair to ask whether the news organisations of today – and tomorrow – are up to the task of sustaining the informed citizenry on which democracy depends”. Institutions such as the New York Times and The Guardian probably are, and have shown how capable they are through continual reinvention and investment in technology. Your average open data enthusiast, working alone with little or no training or access to informed context, probably isn’t. But then I never bought into the “wisdom of crowds” or the noosphere either.