Wednesday, October 22, 2014

Evaporating Web Records

You’re hip, you’re cool – ‘business enablement’ is your middle name, and you’ve got social media accounts, blogs, forums, Atom/RSS feeds and wikis rocking and rolling. As Chief Records Officer (CRO) you help your agency move into the 21st Century full speed ahead.

Except. Things are never that easy, and what we find is that web records are sadly missing.

We know that a short URL was used in this tweet, but two years later the URL has been re-used and no longer points where it did when we created the tweet. Worse yet, we relied on identifying records at creation and we missed one: it never got recorded, and Facebook’s API is refusing to give it to us.

The wiki system was migrated to a new platform and the old edit history has been lost. Worse, the new system tracks comments in a different form, so they have been lost too.

An ill-considered rollout of a new website neglected to ensure that all of our old URLs were migrated, and apart from losing Google ranking, we also now can’t identify what content a user might have seen on a given date for a given URL.

In other words, our web records are evaporating. It’s not your EDRMS that’s failing; it’s the fact that all of these web systems exist outside the EDRMS, and compliance needs are seen as a secondary (unimportant?) requirement for replacement systems. Practical needs for delivering services now are overwhelming the old centralised compliance needs.

The “Review of Social Media and Defence” report in 2011 by George Patterson Y&R is a good example of the sorts of problems agencies face:

“Given the dynamic nature of social media communications and the collaborative approach to the creation of user generated content, Defence will need to take particular care to ensure that such content is properly identified as a Commonwealth record as and when it is created. An accurate and authentic copy of such content will need to be captured and saved as a record so as to ensure that obligations under the relevant auditing, recordkeeping and disclosure legislation can be met. This is likely to require the development of a specific Defence social media records policy that provides guidance for each particular social media channel to be used by Defence during Professional Use.”
Review of Social Media and Defence, p.102

“The simplest interpretation of international record-keeping policy is that all outgoing communication should be housed on an official website that provides both a credible source for the community and a method of archiving content. The content can then be shared easily into social media, and important or significant conversations can be selected for archiving.”
Review of Social Media and Defence, p.124

“Because the National Archives of Australia (NAA) considers social media to simply be channels in which Commonwealth records can be shared, existing record management and archiving protocols need to be followed. The challenge lies in identifying commonwealth records worthy of archiving but also in the resourcing and processes required to ensure compliance. The government’s response to the Government 2.0 Taskforce (p. 15) states explicitly that the Archives will produce guidance on what constitutes a Commonwealth record in the context of social media. The NAA should be consulted to provide greater clarification for DEOC.”
Review of Social Media and Defence, p.157

Rebecca Stoks produced an academic paper in October 2012 that summarised a survey of actual recordkeeping practices for social media records amongst Australian government agencies: mostly state (33), but also some local (20) and federal (9) agencies. Her summary was damning:

“The transient nature of social media opposes traditional recordkeeping methods; consequently, most government agencies are not meeting their legal obligation to keep records.”
Taming the Wild West: Capturing Public Records Created on Social Media Websites, p.8

“In this study, only a minority of government agencies were found to be capturing social media records. Most of those capturing records were not very confident that they are meeting their legal obligations or that their methods are sustainable. Within the sample, the level of internal support, be it strong or lacking, was found to affect the degree to which social media records were being captured. Although well regarded as a resource, the guidance provided by PROs did not seem to have an impact on whether or how agencies were capturing records, with several respondents expressing a desire for more practical advice.”
Taming the Wild West: Capturing Public Records Created on Social Media Websites, p.48

What do we want to know about web records when we capture them?

  • URL
  • AGLS metadata (author, publish date/time, country, copyright, etc.)
  • Re-use (trackbacks, retweets, inbound links, ratings, likes, votes)
  • Outbound links, and their status (if they redirect, then to what URL? do they have meta-tags set like NoFollow?)
  • Linked resources (images, JavaScript, iframes, Flash files, video/audio) – not always useful, but worth bringing images into content as embedded images at the very least
  • Conversations started by the record (comments, replies, threads in general)
  • Relative site-map location compared to other web records (requires the concept of a site, perhaps leverage Google site maps?)

Much of this comes from Atom/RSS feeds, but some of it requires post-capture processing.
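As a rough illustration of the split between what the feed gives us and what needs post-capture processing, here is a minimal Python sketch using only the standard library. The sample feed, its URLs and the `capture_entries` and `LinkCollector` names are all invented for this example; a real feed will carry more (and messier) data, and this is not a complete capture tool.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace (RFC 4287)

# Hypothetical sample feed, invented purely for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Agency News</title>
  <entry>
    <id>tag:agency.example,2014:post-1</id>
    <title>Service update</title>
    <link rel="alternate" href="http://agency.example/news/service-update"/>
    <published>2014-10-22T09:00:00Z</published>
    <author><name>Web Team</name></author>
    <content type="html">&lt;p&gt;See &lt;a href="http://short.example/abc" rel="nofollow"&gt;details&lt;/a&gt;.&lt;/p&gt;</content>
  </entry>
</feed>"""


class LinkCollector(HTMLParser):
    """Post-capture step: collect outbound links from an entry's HTML body."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        if "href" in attr:
            rel = (attr.get("rel") or "").lower().split()
            self.links.append(
                {"href": attr["href"], "nofollow": "nofollow" in rel}
            )


def capture_entries(feed_xml):
    """Return one dict per Atom entry with the fields the feed provides."""
    root = ET.fromstring(feed_xml)
    records = []
    for entry in root.iter(ATOM + "entry"):
        # The entry's own URL is the link with rel="alternate" (the default).
        url = None
        for link in entry.findall(ATOM + "link"):
            if link.get("rel", "alternate") == "alternate":
                url = link.get("href")
                break
        collector = LinkCollector()
        collector.feed(entry.findtext(ATOM + "content", default=""))
        records.append({
            "url": url,
            "title": entry.findtext(ATOM + "title"),
            "author": entry.findtext(ATOM + "author/" + ATOM + "name"),
            "published": entry.findtext(ATOM + "published"),
            "outbound_links": collector.links,
        })
    return records


records = capture_entries(SAMPLE_FEED)
```

The URL, author and publish date come straight from the feed. The outbound links only appear after parsing the entry body, and working out where a short URL actually redirects to would be a further step (a network request at capture time) that this sketch leaves out.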

How do we want to see web records that we capture?

  • As an HTML page, even though stored as XML.
  • As a PDF, even though originally seen as an HTML page and stored as XML.
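To make the "stored as XML, viewed as HTML" idea concrete, here is a small Python sketch. The `<record>` layout, the field names and the page template are assumptions made up for this example, not a description of how any particular EDRMS stores web records.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical stored record; the element names are assumptions.
STORED_RECORD = """<record>
  <url>http://agency.example/news/service-update</url>
  <title>Service update &amp; outage notice</title>
  <author>Web Team</author>
  <published>2014-10-22T09:00:00Z</published>
  <body>Services resume at 9am.</body>
</record>"""

# Deliberately plain page: the original site design is not recoverable
# from the stored XML alone.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<h1>{title}</h1>
<p>By {author}, published {published}</p>
<p>Original URL: <a href="{url}">{url}</a></p>
<div>{body}</div>
</body>
</html>"""


def render_html(record_xml):
    """Render a stored XML record as a standalone HTML page."""
    rec = ET.fromstring(record_xml)
    names = ("url", "title", "author", "published", "body")
    # Escape every field so stored text cannot inject markup into the page.
    fields = {n: escape(rec.findtext(n, default="")) for n in names}
    return PAGE_TEMPLATE.format(**fields)


page = render_html(STORED_RECORD)
```

A PDF view could then be produced from the rendered HTML with a separate conversion tool, which is outside the scope of this sketch.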

Of course this only gets us 80% of what we need. There will always be the missing context of what the page design looked like when that content was displayed (and what other content was dynamically displayed alongside it). With social media there is also the context of any responses, retweets, likes, shares or trackbacks to consider.

Do we organise web records by the site they belong to, the Atom/RSS feed they come from, or by some other more definite measure?

I don’t know anyone who has all the answers to those questions. I’m not even sure I know that many people who care about all those questions! However, I do know that without those answers there are essential government records that are literally evaporating every minute of the day, never to be seen again or known about. They may not be important now, they may not ever be important, but our lack of care with them is likely to be lamented by future generations seeking to understand what motivated, inspired and drove us into action (or not).

Tuesday, October 21, 2014

Collaborate 2014 Roundup

We just finished another Collaborate conference for Objective Corp's international customers and it was a great effort by all: staff, partners and customers included.

The amount of effort that goes into the event is extraordinary, but the payoff is an event that informs, excites and inspires our customers to get more value from their Objective products (ECM, ECC and Connect) and change the way their organisations impact the world.

A big part of that is helping them drive business process innovation so they can deliver better services and products to the public.

Business Process Innovation

Something we are doing across our products is looking to deliver great user experiences, especially ones that are customised for our customers' organisations so that their end users see them as credible.

One of the challenges that raises is what an old colleague of mine used to call the "pink flamingo effect":

Pink Flamingos

This is the tendency of some people to go wild with any sort of HTML customisation and add what they assume is an attractive element (e.g. pink flamingos) to an otherwise functional page.

We can't prevent that altogether, but a big part of the move we're making is to educate customers about the sorts of things they should consider customising (logos, fonts, colours) and how to consider the user experience when customising things that we have carefully designed, such as our responsively designed sample email templates.

To that end the slide above was used to help explain the difference between user interfaces (UI) and user experiences (UX).

Edit: See more about UI vs UX at my colleague David Eade's blog.

Updated blog template

It's been a few years since I updated the template my blog uses, so I've taken the chance now to simplify things and apply a template that seems less busy and more modern. I've also updated the fonts to make them more readable on modern screens.