Posted in Artificial Intelligence, Compliance, Electronic records, Information Management, Microsoft 365, Microsoft Viva, Records management, SharePoint Online

Changing patterns of office work

A recent (September 2021) Verge post titled ‘File Not Found’ by Monica Chin (@mcsquared96) noted how pre-defined storage locations for digital content have become mostly redundant for the younger generation that grew up using Google to find content – ‘the concept of file folders and directories, essential to previous generations’ understanding of computers, is gibberish to many modern students.’

The post resonated for two specific reasons:

  • The way people create and access content in the modern digital environment, especially how they do this out of the office on mobile devices.
  • The enduring ‘requirement’ in many organisations to create pre-defined aggregations to store records, with an expectation that end-users will voluntarily and consistently store content in those aggregations.

In my opinion, there is a fundamental clash between these two things in the modern digital world.

It reminds me of the transition to Web 2.0 from around 15 years ago. The ‘semantic web’ changed the way we interacted with the internet.

A diagram from 2009 showing how Web 2.0 would work

According to the University of Melbourne in a blog post in 2008, ‘Web 2.0 is the term used to describe a variety of web sites and applications that allow anyone to create and share online information or material they have created. A key element of the technology is that it allows people to create, share, collaborate and communicate. Web 2.0 differs from other types of websites as it does not require any web design or publishing skills to participate, making it easy for people to create and publish or communicate their work to the world.’

Fast forward ten years, and end-users can do just that, creating and publishing as much web content as they like through a host of apps, none of which requires a pre-defined folder structure to manage the content.

Is this the future for the modern digital office?

Aggregate or collate?

Expecting end-users to capture all digital content about the same subject in pre-defined aggregations has been a challenge since computers first appeared in the workplace. The fact that email systems have always been physically separate from other digital content stored on file shares didn’t help.

Organisations attempted to resolve this dilemma, with varying degrees of success, through ‘controlled’ folder structures on file shares, or acquiring EDM/ERM systems.

The success of these models was (and remains) often linked to the degree of compliance required by regulators, or the penalties for non-compliance. That is, the best document and records management systems were usually (and continue to be) in organisations that had a high compliance requirement. Everything else was about managing risk (and reputation).

The reality, for most organisations, is that not all relevant digital content is (or can be) captured in pre-defined aggregations. Some of the key ‘missing’ content may include emails, text and chat messages, content created in ‘personal’ spaces, and content not created using corporate systems. This is why FOI and discovery requests inevitably involve asking end-users to locate and produce relevant content.

So, is it just a waste of time to try to aggregate (some) digital records?

I think there is still a comfortable place for the creation and management of content in pre-defined aggregations, where it makes sense to do so and/or there is a specific compliance requirement to do this AND it works.

But, with some exceptions, pre-defined aggregations have a key shortcoming – they may not capture all digital content. Yahoo attempted to categorise the internet from 1994 (as ‘Jerry and David’s Guide to the World Wide Web’), when the reality was that most people preferred the simple Google search bar – without worrying about how the information may have been aggregated or collated.

The modern workplace is heading in a similar direction. The illusion that all related digital content, of whatever type or format, can be grouped in pre-defined aggregations makes no sense. There is simply too much digital content and a lot of it cannot even be copied to or saved to a single aggregation point. Phone text messages have always been a case in point here.

What makes more sense, like Web 2.0, is to use technology to automatically find the connections between disparate digital objects and collate them in a way that makes sense in the context in which we want to see them. We already see this model in many online platforms that draw on our digital exhaust (and possibly our voices) to work out our interests or potential interests.

The digital office

The 2 November 2021 announcement of a range of updates to the Microsoft portal, ‘Creating a hub for your content with’, is consistent with the need to address changing patterns of work and the shift away from pre-defined aggregations as the primary method of storing and finding all relevant content. Instead, the focus is on ease of creation, quick access to content that is relevant to the user based on their digital activities, search, and recommended actions.

The new portal

As noted already, this new approach does not remove the requirement to create pre-defined aggregations; it simply needs to be understood that these may only contain some of the content or records. The rest may remain stored in email accounts or other systems, in many cases out of reach (thanks to access controls) of the people who may want to see it.

However, behind the scenes, the Microsoft Graph makes use of this information (‘signals’) to draw inferences and make recommendations including for content that the end-user can access. For example, Fred and Mary may chat in Teams or exchange emails about a given subject, but Fred is unaware that Mary is working on a document that he has access to. The Graph works out that this may be of interest to Fred and recommends it to him.
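The Fred and Mary scenario can be sketched in a few lines of Python. This is a toy illustration only and not the Microsoft Graph API: the `Signal` and `Document` types, the `recommend` function and the sample data are all hypothetical, and the real Graph draws on far richer inference. The sketch assumes a recommendation needs two things: an interaction signal linking two people on a subject, and existing read access to the document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signal:
    """Hypothetical interaction signal, e.g. a chat or email between two people."""
    person_a: str
    person_b: str
    subject: str


@dataclass
class Document:
    """Hypothetical document with an author, a subject and a read-access list."""
    title: str
    author: str
    subject: str
    readers: set[str] = field(default_factory=set)


def recommend(user: str, signals: list[Signal], documents: list[Document]) -> list[str]:
    """Suggest documents written by the user's contacts on subjects they discussed.

    Only documents the user is already permitted to read are returned.
    """
    # Collect (contact, subject) pairs from every signal that involves the user.
    discussed = set()
    for s in signals:
        if s.person_a == user:
            discussed.add((s.person_b, s.subject))
        elif s.person_b == user:
            discussed.add((s.person_a, s.subject))

    return [
        d.title
        for d in documents
        if (d.author, d.subject) in discussed and user in d.readers
    ]


# Assumed sample data: Fred and Mary have chatted about "Project X".
signals = [Signal("Fred", "Mary", "Project X")]
documents = [
    Document("Project X plan", "Mary", "Project X", readers={"Mary", "Fred"}),
    Document("Project X budget", "Mary", "Project X", readers={"Mary"}),
]
fred_suggestions = recommend("Fred", signals, documents)
```

In this sketch Fred is offered the plan, which he can already open, while the budget stays hidden because he has no access to it. That mirrors the point that recommendations are limited to content the end-user can access.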

Microsoft 365 provides the capability to recommend content and, with products like Viva Topics, has the ability to automatically group relevant content beyond the scope of pre-defined aggregations.

It may not be able, or ever need, to completely replace pre-defined aggregations, but it is heading in a direction that makes that scenario increasingly likely.

Feature image: Photo by Rebecca Diack from Pexels

Posted in Artificial Intelligence, Classification, Electronic records, Information Management, Microsoft Viva, Products and applications, Records management

Are auto-generated topic cards the future for aggregations of records?

Humans have a natural instinct for grouping, classification and categorisation of things. It helps us find what we are looking for and gives us a sense of satisfaction, whether it be household items, computer storage, or much broader social and population groupings.

Humans have created and kept records ever since we developed a way to record them, on stones, clay shards, papyrus, bamboo sheets, vellum, paper and various other means. Multiple records were aggregated in ways that made sense to the people who created or kept them and wanted to find them again.

The introduction of computers at work from the late 1980s/early 1990s began the decline of traditional ways of aggregating records about a particular subject together in a physical ‘file’, although that practice has persisted to the present day because it was and still is easier to refer to. Lawyers (or more often the legal clerks) still attend digital courtrooms armed with printed copies of (usually digital) evidence and other materials for this reason.

Lawyers off to court – Image credit Sven Vik (NYC TV News Videographer)

The ‘problem’ of digital aggregations

While physical files provided the ability to store anything (printable) about a given subject in the one location, digital ‘files’ (or aggregations) suffered from the fact that emails and other content are created or stored in completely different locations.

The only way to keep emails together with other content about the same subject was for end-users to copy them to a network file share folder location or a digital recordkeeping system. In almost every case, the original email remained in the mailbox where it might still have an active life. Some email mailboxes became a primary (or alternative) storage location for both emails and attachments (as did some desktops!).

Keeping all digital records about a given subject in a single aggregation was never an easy task. It was never possible to be sure that everything was captured because it relied on end-users.

The email mailbox – SharePoint conundrum

In the same way that organisations decided to store copies of emails in network file shares or EDRM systems, it was easy to see SharePoint as the replacement for both.

But Microsoft have never made it easy to ‘natively’ copy an email from Outlook to SharePoint. There isn’t even a download option for emails. Emails can be dragged and dropped to synced document libraries, and various third-party products exist, but the process usually relies on end-users (a) to copy the emails and (b) to copy them consistently. Neither of these can be guaranteed.

And, of course, the records created and captured in Microsoft 365 are not just in Outlook mailboxes and SharePoint. A number of other apps create content that could be records (for example Yammer conversations, Teams chats, calendar entries, Planner tasks, even Whiteboard diagrams). Few of these records can be saved to SharePoint.

So, are digital aggregations impossible?

There is nothing stopping organisations doing whatever they can or want to group related records together. In Microsoft 365, the most logical way to do this is in SharePoint document libraries (the ‘Files’ tab in Teams channels). An entire SharePoint site (or the Team connected to it) provides a form of meta-grouping; that is, multiple document libraries grouped by the SharePoint site/Team.

But if we stand back for a moment, to look at the (Microsoft 365) forest, what we see is not just individual trees (SharePoint sites, Exchange mailboxes and so on). Just as in a forest the roots of all the trees connect via mycorrhizal networks, sometimes known as ‘wood wide webs’, something similar happens in Microsoft 365 (and many other online systems, including Facebook).

Trees networking

The equivalents of these networks in such systems are the ‘graphs’.

Like other graphs, the Microsoft Graph draws on all the rich data created and stored by end-users, in this case across the Microsoft 365 ecosystem – our corporate relationships, who we connect with and how, what we are communicating or writing, what we like, the way we use our time and so on. The graph learns what is popular or trending and makes suggestions (while respecting permissions) as to what we might want to see or know about.

Project Alexandria and Viva

According to a post in the Microsoft Research blog published in April 2021 and titled ‘Alexandria in Microsoft Viva Topics: from big data to big knowledge’, Project Alexandria is ‘a research project within Microsoft Research Cambridge dedicated to discovering entities, or topics of information, and their associated properties from unstructured documents’.

The blog post also noted that ‘Alexandria technology plays a central role in the recently announced Microsoft Viva Topics, an AI product that automatically organizes large amounts of content and expertise, making it easier for people to find information and act on it’.

The Alexandria pipeline – from unstructured text to structured knowledge (From the blog post above)

The outcomes sound similar to traditional ‘manually’ created aggregations, although they don’t replace them. In fact, the more that content is manually curated, the more likely that Viva Topics can accurately connect them and other related content that might otherwise be missed.

While Viva Topics might appear to be primarily focussed on supporting knowledge management outcomes and is currently limited to content stored in SharePoint, the technology has potential implications for records management, in particular for the age-old issue of how to find all information about a given subject (or know that a pre-defined aggregation contains all relevant information).

Viva Topic cards

As noted already, there is nothing stopping organisations from creating aggregations in ways that make sense to them and their end-users. SharePoint document libraries are the most logical form of aggregation that also happens to allow complex metadata, versioning and other features typically associated with EDRM systems. SharePoint document libraries are just one of several ways that content may be aggregated; Exchange mailboxes are another.

But, in most organisations, potentially relevant information AND records are frequently hidden from view in personal mailboxes and OneDrive accounts, in Teams chats, and in other applications (e.g., Planner). Viva Topics has the potential to leverage this information.

Once set up (as described in ‘Set up Microsoft Viva Topics’), Microsoft Viva begins to work its magic, discovering topics. An example of a discovered topic (from ‘Manage topics at scale in Microsoft Viva Topics’) is shown below.

While Topics are still limited to SharePoint content and people, there is potential to extend this model even further by including details about emails, chat messages or other content across the Microsoft 365 ecosystem – even if that information cannot be seen. For example:

  • Topic Name
  • Suggested people (perhaps grouped by AD manager or business area)
  • Suggested files and pages (you can see)
  • Authors of (n number of) emails that are related to the topic with an indication of volume over given periods (e.g., ‘251 emails in the past 6 months’) or a graphic representing this activity
  • Names of Teams that contain (n number of) chat messages related to the topic.
  • Participants in Teams 1:1 chats that contain (n number of) messages related to the topic.
  • Volume and date range of other related content (e.g., Tasks, Whiteboards, Forms, Yammer conversations).
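The fields listed above could be modelled along the following lines. This is a speculative sketch of the extended card proposed here, not an existing Viva Topics schema: the class names `ExtendedTopicCard` and `ActivitySummary`, their fields and the sample values are all hypothetical. The key assumption is that content the viewer cannot open is represented only as counts, people and date ranges, never as the content itself.

```python
from dataclasses import dataclass, field


@dataclass
class ActivitySummary:
    """Hypothetical summary of related content the viewer may not be able to open."""
    source: str        # e.g. "emails", "Teams chat messages", "Planner tasks"
    people: list[str]  # authors or participants
    item_count: int    # number of related items
    period: str        # human-readable date range


@dataclass
class ExtendedTopicCard:
    """Hypothetical topic card extended beyond SharePoint content and people."""
    name: str
    suggested_people: list[str] = field(default_factory=list)
    suggested_files: list[str] = field(default_factory=list)  # only files the viewer can see
    hidden_activity: list[ActivitySummary] = field(default_factory=list)

    def summary_lines(self) -> list[str]:
        """Render each activity summary as a short line of volume over a period."""
        return [
            f"{a.item_count} {a.source} in {a.period}"
            for a in self.hidden_activity
        ]


# Assumed sample values, for illustration only.
card = ExtendedTopicCard(
    name="Project X",
    suggested_people=["Mary", "Fred"],
    suggested_files=["Project X plan"],
    hidden_activity=[
        ActivitySummary("emails", ["Mary"], 251, "the past 6 months"),
        ActivitySummary("Teams chat messages", ["Fred", "Mary"], 40, "the past month"),
    ],
)
lines = card.summary_lines()
```

Keeping visible files and hidden activity in separate fields makes the distinction explicit: the card can point to the existence and volume of related content without exposing anything the viewer is not permitted to read.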

Could Topic cards be the new aggregations?

Topic cards have the potential to resolve the age-old dilemma of digital aggregations, but they are unlikely to replace pre-defined ways to aggregate records, including copying emails to SharePoint document libraries. Those older methods will continue to exist for a long time.

But more importantly, they have the potential to draw out or highlight content that would otherwise be hidden from view – even if that content remains inaccessible.

When configured, Viva Topics already appear in search results, enhancing search outcomes.

It is only a matter of time before the probabilistic programming techniques of Project Alexandria, combined with expert human curation, begin to provide the type of high-precision knowledge base construction, first described by Microsoft researchers in May 2019, for all relevant content about a given subject.

Perhaps they may even support or link with retention and disposal processes, highlighting records due for disposal within a given period or even preventing their premature disposal.