Rob Blackhurst

RobBlackhurst.com/2007/historyinthetrash-ft

FT Magazine: Will History End Up In the Trash?

March 17, 2007

Will history end up in the Trash? A new breed of digital archivist is tracking the e-mails, computer files and electronic ephemera that might otherwise be lost forever


Deep beneath the British Library, Jeremy John, a dandyish forty-something with a floppy sweep of collar-length straw hair, is holding the keys to the digital scriptorium. He unlocks a heavy metal gate, revealing a cramped, chilly white vault furnished only with a standard-issue desk and chair. It would feel as cloistered as a monk's cell if not for the clatter overhead of conveyor belts carrying books to the library's reception. John heads for a reinforced steel shelf and carefully lifts off the dust covers from a line of ancient computers.

Like Leonardo da Vinci's notebook, love-letters between Oscar Wilde and Lord Alfred Douglas, and John Lennon's scrawled first draft of "Ticket to Ride", these superannuated machines, and the equally venerable computer files boxed next to them, are now part of the world's greatest library collection. Digital texts - whether e-mails, research projects or literary drafts - are easy to create and even easier to discard. But as John, the library's first curator of digital manuscripts, is aware, they constitute an increasingly large part of our cultural record - treasures which, if not properly archived, could soon be lost to future generations.

It's a sobering thought that the Domesday Book, written in 1086 on pages of stretched sheepskin, has lasted more than 900 years. Scholars with a permission slip and a sound grasp of Latin can visit the Public Record Office in Kew, leaf through the book's pages and decipher its inventory of the manor houses and livestock in William the Conqueror's Britain just as they did in the 11th century. But the BBC's attempt to create a new Domesday Book chronicling British life in 1986 - capturing fleeting historical records such as adolescent diaries and a video tour of a council house - was more problematic. The £2.5m project, stored on huge laser discs and readable only by a brick-like, mid-1980s vintage BBC microcomputer, became obsolete within a decade. Both the laser-disc player and the software it relied on have long since been abandoned. A specialist team from the National Archives had to spend more than a year rewriting the software to rescue it from oblivion.

That latter-day Domesday project is a metaphor for the carelessness with which we're treating the digital information created during the past 20 years. The first telegram ever sent has been preserved in a frame; the first e-mail, sent in the 1960s using a mainframe computer the size of a room, has been lost. Will future generations look back at this period as a "digital dark age" - a modern equivalent of the early Middle Ages, which left barely a trace on the written historical record?

Blinking under the fluorescent strip-lighting, John admits that "it takes a while to get priorities changed." So far, his collection - an attempt to capture the digital files of working artists and scientists before it is too late - is hardly an embarrassment of riches. In predicting what future generations will be interested in, the library says it has opted for the "quietly influential" over the "distinguished".

The collection began in 2000 when, following the death of the evolutionary biologist Bill Hamilton, his estate bequeathed the library 200 boxes of paper and 26 boxes of computer files. Hamilton was lionised as the greatest evolutionary thinker since Darwin for his insight that red-in-tooth-and-claw competition between genes could still produce altruism in the animals they formed - a theory later popularised by Richard Dawkins. Some of the material is so primitive that it can't be saved on any modern computer. Jeremy John pulls a cardboard box from a shelf that is filled with the most rudimentary computer technology - reels of tightly packed five-hole paper tape - produced in the mid-1960s to analyse sex ratio in different species. "People will want to know how the program was created in order to do his analysis. In terms of computer power, it's pathetic, but it's historically interesting." This paper tape can still be deciphered by reading from the holes. The boxes of magnetic cards from the 1970s are more problematic: "We can't read these. That's one of our challenges."

There are about three million websites hosted in the UK. The Legal Deposit Libraries Act, passed in 2003, gives the British Library and other legal deposit libraries (the six institutions in Britain and Ireland that are entitled to collect every book published in Britain) the right to "harvest" selected material from this online morass for posterity. But they cannot venture into private territory. E-mail accounts and text documents are kept at the individual's discretion, and are most at peril. Consider how much of our lives is now channelled into a keyboard rather than on to paper. Whereas once we might have stumbled across old notepads or postcards stuffed at the back of a desk drawer, digital "dear diary" entries and missives could in the future remain buried in passworded files that no one will ever take the trouble to unlock.

And what of contemporary writers? Are their messages being carefully stored for hardback editions of their collected e-mails 20 years from now? Gillon Aitken, a veteran literary agent whose clients include V.S. Naipaul, Pat Barker, Sebastian Faulks and Germaine Greer, doesn't bother holding on to their every word: "If a message by an author of significance was itself significant I would make sure to keep it. But if it's a 'thank you for this' e-mail I would not bother, honestly. I think that the day of constructed correspondence - where you get a reasoned letter - is going to pass." He says he can't see collected volumes of e-mail "happening in quite the same way".

But Jonny Geller from the agency Curtis Brown, who represents writers including Tracy Chevalier, Hari Kunzru and Phil Whitaker, "pathologically" holds on to his e-mails. "Most of my writers do as well, which surprises me. In some ways the archiving of authors' correspondence is safer and easier now than when there were letters - where you didn't always get both sides of the correspondence. The new generation of writers might use e-mail in different ways - musing and thinking about it more. You can get into a discussion with a writer. For instance, Hari Kunzru writes a piece in a paper and I'll engage in a discussion about it. Now who knows if that's interesting to anybody? But if that had been Samuel Beckett 60 years ago then it would be interesting now."

For romantics, the replacement of well-crafted letters by instant e-mails means that, even if our digital legacy does survive, we will leave a far more anonymous trail than previous generations. Regimented lines of text will seem bloodless compared with the creased manuscript stained by the bottom of a wine glass, or the letter infused with the scent of cigar smoke that transports us back to a late night in Churchill's study.

The Bodleian Library in Oxford holds the papers of five 20th century prime ministers, and it has recently started a trial - the "Paradigm Project" - to collect the digital files of prominent politicians. Richard Ovenden, keeper of special collections and associate director at the Bodleian, admits that there has been a loss: "The thing that may go is the handwritten aside. We have memos in the Asquith archive from when he was in the Cabinet during the first world war. His annotations and notes to himself are actually the interesting thing about them - rather than their typewritten content. That's what I hope we won't lose - the paper, the printouts of those memos which then get taken into a meeting and are scribbled on by politicians. The danger with e-mail is that you think 'oh, I can just throw the paper away because I've got a copy of the memo on my system.'"

There is also a risk that the transition to e-mail means that long, reflective letters and memos simply aren't written. Anthony Seldon, a biographer of John Major and Tony Blair, says there has been a price: "E-mails are not as discursive as letters and it's all a lot more immediate. There is a loss - perhaps more for the biographer than someone writing an institutional history, because individual character does not come across so well in e-mail."

It's not just the republic of letters that is endangered. A survey commissioned in 2005 by the Digital Preservation Coalition, a UK non-profit company, found that 28 per cent of organisations had lost important data and 38 per cent had used file formats that are now obsolete. Across the Atlantic, AIIM International and Kahn Consulting surveyed 1,000 respondents from a range of industries in the US about their e-mail use. They found that 60 per cent of organisations had no formal policy on storage of e-mails and 54 per cent did not tell their employees where, how, or by whom e-mails should be retained. Another survey, of local authorities in England, Scotland and Wales, found that only 23.5 per cent had any historical electronic records held in their archives. As David Thomas, director of collections and technology at the Public Record Office, says: "I can't see - unless things change radically in the next 10 years - how in 50 years' time people are going to be able to write the local history of Britain from 2000 to 2010. The stuff won't be there."

Only a decade ago, when communication was still paper-based, large organisations employed platoons of filing clerks, copying documents and delivering them by trolley to the right filing cabinet. Now e-mail record-keeping is dependent on individual conscientiousness. As Chris Rusbridge, director of the Edinburgh-based Digital Curation Centre, says: "E-mail has elevated the gossip level of interchange into a recordable form. But also it has dropped some of the stuff that would have been in the formal discourse down into the informal. It has made it more difficult to distinguish between these two. This has made the task of the institutional record-manager much harder. Formal decisions such as the minutes of the board are going to get treated properly, but small business decisions - such as employing X rather than Y - are now on individual e-mail accounts and are routinely lost."

Apart from the temptations of the delete key, digital files have to avoid a host of natural threats if they are to remain intact. If Robert Falcon Scott had keyed his eerie final message - "For God's sake look after our people" - into a PalmPilot, it is unlikely that it would have survived the Antarctic cold anything like as well as his paper journal - found perfectly preserved in his tent.

According to Simon Bains, digital library manager at the National Library of Scotland: "The difference between digital and print is that you can get away with not looking after print material. You can leave a book on a shelf and if the paper is acid free and there are no floods or fires, it will still be there in 100 years. Whereas with digital you have to be very proactive from the start to avoid losing it. There is no chance whatsoever that in 100 years the material will still be accessible." Even CDs, with a lifespan of 20 years before they degrade, will turn to dust far more quickly than paper.

Perhaps the most immediate problem is the speed of technological change, which leaves data in old formats marooned by time. In January of this year, floppy discs were added to the crowded computer cemetery when the PC World chain announced they would no longer be sold once existing stocks ran out. Already, anyone with data on a floppy disc will struggle to find a new computer equipped with a disc-drive that can read it. This is hardly new: PC users from the 1980s onwards will probably have a box of relics in their attic that their new computer can't understand - from 3in Amstrad discs to once-mighty software such as Wordstar or Lotus 1-2-3.

This, says the digital preservation lobby, could happen to documents we're writing in 2007, if Microsoft's grip on the computer market is broken. Najla Semple, executive secretary of the Digital Preservation Coalition, says: "We can pretty much guarantee that we are going to be able to read rich text or HTML in the future. Anything else is dubious. It's not in some software companies' interests to preserve because they want to sell products year after year. We take it for granted that these companies will still be around - but all companies fold, and who will take on the responsibility for maintaining and updating the software after that?"

The millions who rely on web-based e-mail systems like Yahoo or Hotmail may also find that they have placed the storage of their personal history in unsentimental hands. As Maureen Pennock from the Digital Curation Centre says: "The licences that you sign up to when you click through these things tend to leave the question of ownership at best ambiguous. At worst, you don't actually own your messages. And there isn't much publicly available advice on the best way to take your e-mails out of the system."

The inherent difficulties of finding information squirrelled away on an obscure disc-drive mean that curators can no longer wait until after death to collate digital pasts. "People naturally tend not to keep materials," says Jeremy John. "Scientists can make mistakes, and writers certainly lose files. By approaching them early, we can help them keep them."

While collecting the computers and early digital files of James Lovelock - the far-sighted prophet of global warming - British Library technicians have already run into problems. In desperation, they have appealed to vintage computer enthusiasts for help in finding long-extinct 5.25in floppy-disk drives and manuals from the 1970s and 1980s to help them decode his work.

Another worry is that it will be impossible to chart the progress of a work through its various drafts, as handwritten comments and alterations on manuscripts disappear. "People tend to press 'save' rather than 'save as'," says John, "so you don't get earlier versions - although with computer forensics we can recover some of their earlier drafts. The good thing about digital files is that you almost always get a date."

But with a typical inbox bridging everything from travel arrangements to the intimacies of family life, there is a natural reluctance among the relatives of public figures to hand all their e-mails over to the archivists. The British Library has to agree to keep some material confidential for up to 70 years, just as it has traditionally done for diaries and letters. Fear - of exposure or litigation - is inextricable from the threat to our digital heritage. It would seem sensible, for example, for companies to archive every message in case it proves useful in the future - but under data protection legislation this would probably be illegal. In Britain, under the 1998 Data Protection Act, personal data - including names, phone numbers and e-mail addresses - must only be kept for an "appropriate length of time" and "no longer than necessary".

Equally, it is unclear under European law whether managers have the right to read and archive their workforce's e-mails. As Maureen Pennock from the Digital Curation Centre says: "You could make a very good case that your e-mail account belongs to your employer. But it's a grey area because the European Convention stipulates that everyone has the right to respect for 'his private life, his family life, his home, and his correspondence'. It could be argued that under the terms of the Human Rights Act all e-mail is confidential in principle."

At the same time, auditors and company law demand that organisations retain documents (including e-mails) for a period of time. In December last year, the Securities and Exchange Commission, the New York Stock Exchange and NASD fined five US companies - including Deutsche Bank Securities, Goldman Sachs, Morgan Stanley, Salomon Smith Barney and US Bancorp Piper Jaffray - a total of $8.25m for failing to preserve e-mail communications.

In Britain, the Freedom of Information Act, passed in 2000, gives the public a "right to know" about the activities of 100,000 public sector organisations - from local primary schools to the Foreign Office. The law makes it a criminal offence for civil servants to destroy files "with the intention of preventing their disclosure". Some historians fear that the effect of this legislation in Whitehall could be the opposite of its intention - leading to a razing of the archives. Anecdotally, the practice among senior civil servants is to keep only a handful of e-mails in their inbox and delete the rest immediately, so that if there is a subsequent freedom of information request, there will be no relevant documents to disclose. This scorched-earth policy is perfectly legal. Under the legislation, civil servants have the discretion to wipe any e-mail that they don't think has any "record-keeping" or "archival" value - as long as it is done before any Freedom of Information request.

Government departments are responsible for archiving their own e-mails until, after 30 years, they are transferred to the National Archives in Kew. Government departments refuse to comment on how they do this. At the Treasury, all correspondence going in or out of a minister's private office (including e-mails and letters) gets automatically archived. Otherwise, any message sent is automatically deleted after several months, unless a civil servant makes the effort in an idle moment to save it in the "archive folder". As Maurice Frankel from the Campaign for Freedom of Information says: "Nobody expects the minutes from the Christmas card committee to be preserved for 30 years. But there is a psychologically unhealthy obsession with hygiene - like washing your hands every 10 minutes."

In the US, the age of e-mail innocence is long over. The previous three presidents - Ronald Reagan, George Bush and Bill Clinton - fought unsuccessfully to have their administration's e-mails destroyed at the end of their terms of office. President Clinton only ever sent two e-mails himself, while junior members of his administration were instructed that they should make phone calls if they were planning to say anything sensitive. Such defensiveness will impoverish the archives, according to the historian Antony Beevor: "Civil servants are that much more conscious at an early stage that things are going to be leaked. If I was asked to do a history of the Iraq war now I wouldn't touch it with a barge-pole. Before, with paper documents, the clerks would box everything up for 25 years. I can't prove it - but with e-mail I think that so much will have been deleted."

But historians and biographers shouldn't despair. The sheer volume of material sent electronically, along with the frenetic pace of political life, means that, according to Whitehall historian Peter Hennessy, it will in fact be a struggle for politicians and civil servants to cover their tracks. "Given there's an information explosion these days, there's a high chance that quite a lot will survive one way or another. The pressure of business means that they will have to do it on e-mail." And, as David Thomas from the Public Record Office says, electronic records are far more difficult to expunge than paper: "With e-mails there isn't just one copy. It might be slightly harder for people to look because they're going to have to look more widely - but it's actually hard to hide evidence and destroy it." Far from an empty archive, future generations are likely to face the problem of recovering interesting nuggets from a fathomless ocean of information. However guarded the Clinton administration was, it has still bequeathed a collection of magnetic tapes containing almost 40 million e-mails, fired off by 2,000 employees. When the tapes are declassified, teams of researchers will be able to create an intricately detailed first draft of history, with administration decisions chronicled minute-by-minute.

And, while we'll lose the scribbles and reflections of public figures on paper, we'll have the quick-fire immediacy of e-mail culture. Whereas the process of writing in longhand and sealing an envelope leaves time for self-censoring second thoughts, the send button conveys a message instantly. It is, for instance, unimaginable that the notorious e-mail sent by Labour special advisor Jo Moore on September 11 2001 telling press officers that it was a "good day to get out anything we want to bury" would have been committed to paper. Conversations that would once have been whispered in corridors, lost to the ether as phone calls, or held over a half-remembered wine-fuelled lunch can be preserved for posterity via e-mail, with a precise date, time and copy-list.

Because e-mails appear as if sent and delivered between two computers without any intermediary, they give a false sense of privacy, combining, fatally, the easy informality of a private conversation with the legal status of a written record. To the horror of senior Downing Street officials, the Hutton Inquiry published candid e-mails recording their worries over the weakness of the case for war and outlining stratagems for changing public opinion. "Much of the evidence we have is largely circumstantial so we need to convey to our readers that the cumulation of these facts demonstrates intent on Saddam's part," wrote Foreign Office diplomat Daniel Pruce. "The more they can be led to this conclusion themselves rather than have to accept judgements from us, the better." Tom Kelly, a Downing Street press officer, described the escalating row with the BBC as a "game of chicken with the Beeb. The only way they will shift is (if) they see the screw tightening."

These messages create a real-life picture of Downing Street under siege with a nervy, kinetic atmosphere straight out of the television satire The Thick of It. This hothouse mood - or the sense that the prime minister was preparing British opinion for war whilst the UN weapons inspections were still continuing - might never have seeped into official paper memos. According to Blair's biographer Anthony Seldon: "I don't think the rich harvest of documents in the Public Record Office in 2037 will be any less because of e-mails. There'll be slightly different kinds of material - in some ways inferior - but in other ways superior because more of it is created."

The deep-seated sense of ownership of the office computer that we feel means that, perhaps irrationally, we don't expect our Friday-afternoon rant about the boss to be read or preserved by anyone else. Former employees of the collapsed energy corporation Enron were astonished to find that, as part of its investigations, the Federal Energy Regulatory Commission released 200,000 employee e-mails on the web. They show the recklessness with which e-mail accounts are treated in the workplace. Messages confirming meetings or invoice payments are mixed seamlessly with Christmas wish-lists, advice about wedding planners, casual office bitching, and tales of one-night stands. This colourful slice of office life around the millennium may be mortifying for its authors, but it is paradise for historians.

There will also be new sources to mitigate the loss of letters and manuscripts. A novelist writing using a modern version of Microsoft Word saves for future generations an exact knowledge of when the work was last altered as well as the total "editing time" spent on the document since its creation. And though pencil marks are missing from digital documents, the authors may well have used the "track changes" function that marks each amendment to the text in a different colour, complete with a date and time signature. If we had this kind of information about Shakespeare's plays we'd be able to discover which plays he slaved over and which were written in lightning bolts of inspiration, as well as who wrote which line in his collaborations with other playwrights.

Perhaps, too, we should be sceptical about the more apocalyptic preachers of digital doom. For all the claims that our data is likely to face rapid obsolescence, the growth of a huge consumer market that will demand (and pay for) continued access to their old files makes it unlikely that the Word or AppleWorks files that we're using today will face extinction anytime soon. Chris Rusbridge, director of the Digital Curation Centre, suspects that the risks have been over-hyped: "There is an urban myth that says a lot of stuff has been lost. I'm sure some of it is true - paper documents have been lost too. Generally when you track it down, if it's really important, someone's gone to the effort of saving it." And with multiple copies of the same e-mail held on different computers - which could be in different countries - digital information is at least protected from flood, fire and the kind of sacking that destroyed the ancient royal archives of Baghdad in the chaos following the 2003 invasion.

But before complacency sets in, we should remember that, if there's a yet-to-be-discovered teenage Shakespeare growing up in 2007, he's probably sending his sonnets to his Dark Lady via text message rather than e-mail - which he regards as hopelessly old-fashioned. And, since no one bothers to preserve these texts, they have already been lost to the British Library forever.

PAPER TIGERS

In a world that has all but abandoned paper and ink, politics could be the last refuge of the Luddite. Despite delivering endless speeches effusing about the "information super-highway", Tony Blair has left his computer idle during the 10 years of his premiership. When asked by a parliamentary committee whether he was a technophobe, he replied: "I am afraid that is fair actually, yes."

Blair's only brush with e-mail was inauspicious. He once sent a message to his friend Paddy Ashdown, reading "HIGH PAT! You will be amazed that I have now come into the modern world. Tony." When Ashdown's office staff read the e-mail, they deleted it, thinking, in their words, that "it was the work of one of those e-mail nutters." Only when Blair wrote a letter asking if Ashdown had read his missive was the message rescued from the trash bin.

Alastair Campbell is another self-confessed "pen and paper man", managing to run government press operations without ever sending an e-mail or reading a newspaper article online. Campbell's aides would print out e-mails sent to him and type up his handwritten replies. Since leaving Downing Street he has bought himself a BlackBerry - from which he accidentally sent an obscene message to the Newsnight office.

This lack of know-how isn't entirely an accident. The Whitehall support network of red boxes, diaries and assistants insulates those at the top of politics from personal communications - and their accompanying technologies. When Peter Hain arrived at the Foreign Office he had to make a revolutionary request for a computer that operated using Windows, as the dinosaur PC on his desk couldn't handle Word documents.

Such technophobia will change as the Milibands and Camerons - the new generation of politicians in their early forties, who were brought up with IT - make their presence felt. And a surprising number of long-out-of-office politicians put their younger colleagues to shame. Already, the Bodleian Library has preserved the e-mails and draft speeches of Barbara Castle, the left-wing Labour politician and pension campaigner who died in 2002. Richard Ovenden, keeper of special collections at the Bodleian, says that the library is also preserving the electronic files of an anonymous shadow cabinet minister for posterity. "It's the kind of stuff that any individual would have on their desktop. It might include stuff from official visits as well as snapshots of their grandchildren playing. Young Johnny on the swing in the back garden is something that historians in 200 years might find a great deal of meaning in."

Tagged: Financial Times Reportage

Posted on 17th March 2007.

Last changed at 03:41 UTC, 28th May 2008.