Back in January 2010 I wrote about what I called the ‘semantic office’. The concept was based on the fact that most contemporary digital documents are made up of XML, which in theory lends itself to cross-linking and alternative ways of viewing digital content.
The last five years have seen very little change to the long-held method of organising documents in folders and relying on document names to identify content, with search still consigned to second place as a way to find and group related information.
But using folders to store content, and finding content in those folders by document name, are both deeply flawed methods. The flaw in the former becomes evident when users store documents in folders in SharePoint document libraries. SharePoint includes a wonderful option, available in every view, to ‘Show all items without folders’. When this option is selected, all the documents that had been squirrelled away in the various folders become visible. The value of folders to ‘categorise’ the content seems questionable, and any duplicates are revealed.
Users also rely on being able to identify a document by its name, and some quite elaborate naming ‘conventions’ have been offered over the years and continue to persist. My advice, in relation to such conventions, is to name a document in a way that helps a future user know what it’s about. The reality is that no matter what you name a document, I may not search for it by the same name.
This last problem becomes evident in systems that have both a ‘title’ search, and a ‘document content’ search option. In almost every case, a search for exactly the same words by ‘document content’ will produce significantly more results. So why do we rely on a document name or title?
If the use of folders is overly restrictive and makes it hard to find documents, and document content searches reveal significantly more search results, why do we bother with putting documents in folders?
Are there other ways to see the information?
Data visualisation seems to be one possible way to achieve this, but examples are few and far between and mostly seem to be based on indexing of the content. Word maps or word clouds such as Wordle (www.wordle.com) are one example for text within a single work.
But we are not just talking about the words within a single document.
It would be so much more interesting to visually examine the content – by the words – across documents and the way documents relate to each other. If we were able to do this, we should see logical groups of financial records, committee or meeting records, and IT type records. Some may have connectors with each other; for example an IT project document with financial information presented at a meeting.
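As a rough illustration of relating documents by their words, the sketch below scores each pair of documents by cosine similarity over simple word counts and lists the pairs that share enough vocabulary to be connected. The document names, sample text and the 0.2 threshold are all made up for the example; a real tool would read actual file content and would almost certainly use something more sophisticated than raw word counts.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample documents; the names and text are illustrative only.
DOCUMENTS = {
    "budget_report": "annual budget forecast expenditure finance report",
    "finance_minutes": "meeting minutes committee budget finance approval",
    "it_project_plan": "project plan server upgrade network budget",
    "committee_agenda": "committee meeting agenda minutes actions",
}


def word_counts(text):
    """Count lowercase word occurrences in a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    shared = set(a) & set(b)
    dot = sum(a[word] * b[word] for word in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_pairs(documents, threshold=0.2):
    """Return (doc, doc, score) for pairs at or above the threshold.

    The threshold is an arbitrary assumption chosen for this example.
    """
    counts = {name: word_counts(text) for name, text in documents.items()}
    pairs = []
    for left, right in combinations(sorted(counts), 2):
        score = cosine_similarity(counts[left], counts[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    # Strongest relationships first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    for left, right, score in related_pairs(DOCUMENTS):
        print(f"{left} <-> {right}: {score}")
```

The resulting pairs are the connectors: each could be drawn as a line between two documents in a network diagram, with the score setting how close together they sit.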
Some products such as Recommind do this type of textual relationship analysis but present the results in a traditional text-based format that resembles folders, perhaps because this is the way most end users would want to see it.
Are there other ways to visualise the contents of our network drives, email or SharePoint sites, including by drawing on the underlying XML-based metadata structure of the documents? The former, maybe; the latter I have not yet seen in operation.
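To give a sense of what that underlying XML metadata looks like, here is a minimal sketch that pulls the core properties (title, creator, keywords and so on) out of an Office Open XML file such as a .docx. It covers only the extraction step, not the visualisation. It assumes the properties sit at docProps/core.xml inside the package, which is where Microsoft Office normally writes them; the function name and the selection of fields are my own choices for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of an Office Open XML package.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly name -> element to look for (an illustrative selection only).
FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "subject": "dc:subject",
    "keywords": "cp:keywords",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(path):
    """Return the core metadata of a .docx, .xlsx or .pptx file as a dict.

    Assumes the conventional part name docProps/core.xml; returns an
    empty dict if the package has no such part.
    """
    with zipfile.ZipFile(path) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {}
    root = ET.fromstring(xml_bytes)
    properties = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            properties[name] = element.text
    return properties
```

Run across a folder of documents, the resulting dictionaries could be used to group or link documents by author, keyword or date rather than by where they happen to be stored.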
The image below is a visualisation of my LinkedIn contacts in early 2014. What if we could see the content of folders this way? Would a visualisation like this tell a thousand words about the type of content I might find in my folders? Would it be possible to use this visualisation to help with retention and disposal?