
Big Data, the Library of Babel or the butter on our daily bread?

By: The Leader
Published: Thursday, March 21, 2013 - 18:35 GMT

As Birmingham gears up for Big Data Week, and Science Capital offers illumination with food and a drink, the Leader asks: "Big Data? Why give a damn?"

The Leader has just read a tweet from @bengoldacre asking if (we) should expect private organisations (to) donate big data for (the) public good as a new form of philanthropy? @willperrin thinks we should.

"Big data" as currency? Well, yes, actually. All humans are (just) packages of data, the result of the expression of 21,000 genes that occasionally malfunction, predisposing us to thousands of genetic diseases. This data can be decoded (for around £700 per person) and used to tailor personal treatment options. The data has a commodity cost that can be priced and, once acquired, it can be used to produce a valuable outcome or output. This is just one of thousands of ways in which data can have simple, direct, commercial value.

500,000 adults in Britain are now answering questions about their daily habits. Questions posed by UK Biobank, a big (but not that big) research programme funded by the Wellcome Trust, the Medical Research Council and the government. It's a puzzle to The Leader why this research needs public funding. The quite big data that comes out of the programme will be worth a footballer's salary. Surely the public purse could loan the money at a decent rate of interest?

In the 1980s the regulatory authorities grudgingly allowed that the value of "brands" could be included on a balance sheet. How long before the value of a company's data holdings is included in the company accounts? It is not too fanciful to imagine a day when data credits are traded on a "currency" exchange.

Also in the 1980s Richard Saul Wurman, the author of Information Anxiety and the godfather of information design, noted that a single 1980s copy of the Sunday edition of the New York Times contained more printed information than the entire world contained in 1400 AD. When Wurman made this observation the personal computer was hardly into puberty and the internet was still in nappies. Wurman was an information visionary and a groundbreaking information designer, but he had no idea what was just around the corner.

Being connected not only gives us access to more data (a challenge in itself) and allows the creation of larger and larger databases (useless until made useful) but also facilitates the production of more and more data with a reproductive energy that even rabbits might admire.

People now talk about big data as though the world were populated by super-massive chunks of discrete data towards which we might travel and from which we might extract or manipulate some part or parts to be utilised for our benefit or illumination. This idea misses the point completely. Big data is much bigger and more complex than that.

The Argentinian writer Jorge Luis Borges wrote, in The Library of Babel, of an infinite library which contains an infinite number of volumes, each of which is a variation on the others, differing from them in one or an infinite number of ways. It holds the Complete Works of Shakespeare as is, the Complete Works of Shakespeare with one extra comma, with two extra commas, with three fewer commas, and so on ad infinitum.

People talk of mountains of data, oceans of data, continents of data, but all to no avail.

Borges was far-sighted in many ways, but The Library of Babel was arguably his finest imaginative leap, and it is the most useful paradigm for "big data" the Leader has found. In it, even the words which might describe something unimaginably huge are contained in a data package (a book) which is itself part of a "set" which is infinite (and so, having no beginning and no end, cannot be a set).

Feeling lost? Don't worry. The question is not how we may comprehend "big data" (don't try, for that way madness lies) but how we might make it useful. How can we, as ordinary mortals who might come up with one novel thought on a good day, make meaningful use of it?

Borges, working with what came to hand, envisioned the library and the books. But the vast majority of the books were useless: meaningless jumbles of letters and punctuation, or complete texts with the crucial word left out or misplaced. Occasionally, in a lifetime of searching, researchers in The Library of Babel would find a book with a complete sentence or paragraph or even a whole page which made sense. But the rest was nonsense.

So it is with big data. We face a lifetime of fruitless research "in the stacks" of The Library of Babel unless we can link data through shared analytical and visualisation tools for shared understanding and benefit. This, the Leader would like to suggest, might usefully be called The Babel Problem.

FOLLOW THE LEADER ON TWITTER @THELEADERSPEAKS

The views and opinions expressed in this article are those of the author(s) and do not necessarily reflect the official policy or position of The Information Daily, its parent company or any associated businesses.
