Records management standards (see below) state that a defining feature of records is that they are associated with metadata – both ‘point of capture’ metadata and ‘process’ metadata that continues to evolve throughout the life of the record.
For at least two decades, the requirement to capture and store metadata for digital records has driven the implementation of centralised electronic document and records management (EDRM) systems, many of which began life as databases used to record metadata about physical records (files and boxes).
EDRM systems were (and still are) used to store copies of digital records created or captured natively in other systems, primarily network file shares and email. End-users were required to copy individual records to the EDRMS, a process that mirrored the storage of records (including printed digital records) in physical files.
Network file shares and email systems were not considered suitable as recordkeeping systems because they could not ensure the authenticity, integrity and reliability of records over time, or manage and preserve metadata about the records stored in them.
The increasing implementation of Office 365, and in particular the use of SharePoint for the storage of records, has raised the question of the extent to which recordkeeping metadata can – or even should – be applied to the content stored in that system.
This post discusses the need for metadata in records stored in Office 365, including in Exchange/Outlook, MS Teams, and SharePoint/OneDrive for Business. It concludes that most records stored in Office 365 do not need additional metadata but, where such metadata is required, SharePoint provides almost unlimited capability to add it.
Records and metadata
The international standard for records management, ISO 15489:2016, defines a record as ‘information created, received, and maintained as evidence and as an asset by an organization or person, in pursuit of legal obligations or in the transaction of business’.
Records are said to be different from ‘non-records’ because they are associated with, or described by, (mostly added) metadata that describes ‘the context, content and structure of records and their management through time’.
The standard for recordkeeping metadata is ISO 23081:2017. One records management professional (link at the end of the post) noted that there has been reluctant adoption of this standard, mostly because it was ‘too complex’ and ‘academic’, and used ‘foreign terms’. Unspecified vendors were said to have been dismissive of the standard.
Standard for managing digital records – ISO 16175
Part 2 of the standard ISO 16175:2011, ‘Guidelines and functional requirements for digital records management systems’, contains multiple requirements relating to metadata, across three broad categories:
- Point of capture metadata. This includes metadata that forms part of the ‘metadata payload’ of the original record (e.g., date created, creator), other metadata added at point of capture, and metadata that provides additional context for the records.
- Process metadata. This is metadata that records activities and changes to both the record and metadata over the life of the record.
- The need to manage and control metadata over time.
This standard appears to reinforce the requirement for records to be stored and managed in dedicated recordkeeping systems.
Metadata versus enterprise ‘graphs’
While this post was in draft, James Lappin published a very interesting and informative post titled ‘Project Cortex and the future of document management in Office 365’. The post highlighted a key difference between on-premise and cloud era document management, in relation to the way metadata is managed:
- On premise document and records management systems: These systems use metadata schema that specify metadata fields to be used in the system.
- Cloud systems including Office 365: These systems can make use of enterprise ‘graphs’ that map people to documents and topics. The graph is built from the interactions of people with content across the different workloads of the suite.

Most people now accept the algorithmic capabilities of Facebook, LinkedIn, eBay, Amazon and similar online systems to automatically connect us with information relevant to us, without having to add any metadata.
Given the volume and types of digital content, almost all of which has metadata ‘payloads’, how can we ever hope to add the required recordkeeping metadata?
Can’t we just rely on the algorithms and graphs?
How much metadata do you really need?
The answer to this question may depend largely on business, regulatory/compliance and/or government recordkeeping requirements relevant to the organisation and its jurisdiction. In my experience, across multiple very large and also very small organisations:
- Most private sector organisations will likely have minimal metadata requirements beyond the basic ‘point of capture’ and ‘process’ metadata already recorded in the system where the records are created or captured (including email), unless more is required for specific compliance or regulatory purposes, or where there is risk associated with poor recordkeeping. For example, in a major food processing company, records relating to the manufacture of food were very well documented and managed, while corporate records were managed haphazardly.
- Most public sector organisations are required, for government accountability and transparency (and information retrieval) purposes, to apply a minimum set of both ‘point of capture’ and ‘process’ metadata for non-permanent records. Many government agencies have struggled to manage digital records effectively.
- A small percentage of records captured or created in government agencies may require more extensive metadata, especially if those records are to be transferred to archival institutions for permanent retention.
Office 365 ‘workloads’
In Office 365, most business records will be created or captured in either Exchange/Outlook (includes MS Teams chats), or SharePoint or OneDrive for Business (for ‘working’ or personal content).
Exchange is a recordkeeping system in that it stores records with consistent metadata. The primary ‘weakness’, in terms of recordkeeping, is that ‘personal’ Exchange mailboxes aggregate records on a range of subjects by an individual user rather than by business subject. The mailboxes of Office 365 Groups, on the other hand, can be used to aggregate records about a business function/activity or subject.
SharePoint is a recordkeeping system that has extensive default metadata and almost unlimited additional metadata capability (see below). OneDrive for Business is a SharePoint service that has the same extensive default metadata capability.
There is, generally speaking, no requirement for organisations that have implemented Office 365 to allow the continued use of network file shares because the ‘save’ and ‘save as’ options in Office/Windows 10 point to SharePoint and OneDrive as the default save locations.
Metadata in Exchange mailboxes/MS Teams
Every email carries the same set of metadata fields in its header:
- Message ID
- Subject
- Sender (From)
- Recipients (To, including CC and BCC)
- Sent (date/time)
- Received (date/time)
- (Plus more with routing information and security controls including DKIM, SPF, DMARC etc)
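Because these fields sit in a standard header, they can be read programmatically. The following is a minimal sketch using Python's standard `email` module; the raw message and the `header_metadata` helper are hypothetical and only illustrate the fields listed above, not how Exchange stores or exposes them.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw message, used only for illustration.
RAW_EMAIL = """\
Message-ID: <abc123@example.com>
Subject: Invoice approval
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Cc: carol@example.com
Date: Mon, 03 Feb 2020 09:15:00 +0000

Please approve the attached invoice.
"""


def header_metadata(raw: str) -> dict:
    """Return the basic header metadata of a raw RFC 5322 message."""
    msg = message_from_string(raw)
    # Combine the To and Cc headers into one list of (name, address) pairs.
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {
        "message_id": msg["Message-ID"],
        "subject": msg["Subject"],
        "sender": msg["From"],
        "recipients": [address for _name, address in recipients],
        "sent": parsedate_to_datetime(msg["Date"]),
    }


metadata = header_metadata(RAW_EMAIL)
```

Here `metadata["recipients"]` holds the two addresses from the To and Cc headers, and `metadata["sent"]` is a timezone-aware `datetime`.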
However, no other metadata can be added and some (or most) emails may never form part of the collated record of a given subject.
Because of this ‘limitation’, there has been an assumption ever since email was introduced that emails identified as records would have to be copied to a (separate) recordkeeping system.
- In pre-digital days, this meant printing out emails and placing them on a paper file.
- In organisations with EDRM systems, this meant copying the email to the EDRMS where additional metadata would be applied.
The original emails generally remained in place in individual mailboxes where they may be subject to backups and journaling in case they needed to be recovered for whatever reason including subpoenas (eDiscovery).
The Office Graph in Office 365 now provides the ability to connect the content in email with other content across that ecosystem, as noted in James Lappin’s post above. This is new – but it doesn’t rely on metadata or copying emails anywhere.
Metadata in SharePoint
As a SharePoint service, OneDrive for Business has the same default metadata columns. Accordingly, it will not be described separately here.
What metadata is required?
Organisations that plan to manage records in SharePoint should consider the following questions as part of their overall information architecture design to ensure records are kept in logical aggregations rather than randomly. This is important especially if end-users are allowed to create Office 365 Groups or Teams.
- What point of capture and process metadata is required (for compliance, regulatory, recordkeeping purposes)? What is the source of this requirement?
- Is there a difference in the metadata requirements for short-term (retain in the organisation) and permanent records that are to be transferred to archival institutions?
- Do the required metadata columns already exist in SharePoint?
- If they don’t exist, should the additional metadata columns be added as site columns or library columns?
- Does any of the metadata need to be mandatory, and/or can it be a default setting? For example, a metadata column could default to the relevant function and/or activity so the user doesn’t need to add it.
- Where is the process metadata and how do you view or manage it? (See also below on this subject).
Information architecture and metadata
The information architecture of SharePoint, in terms of managing records as objects (e.g., documents, spreadsheets, images, etc), is relatively simple:
- SharePoint site. The primary aggregation that can be linked to a business function (e.g., ‘Financial management’).
- Document library/ies. Logical aggregations or containers of records that can be linked to business activities (e.g., ‘Meetings’).
- Folders, document sets as content aggregations.
- Documents/records.
An effective site architecture can replace the requirement for metadata. For example, the name of the SharePoint site can map to a business function, and library names can map to activities, instead of applying a function and activity pair to each record. The URL address for the record provides the context:
https://tenantname.sharepoint.com/teams/finance/invoices2020/record1.ext
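As a rough illustration of how the URL carries this context, the sketch below splits the example address into its site, library and record parts. The `context_from_url` helper and the assumed path layout (`/<managed path>/<site>/<library>/.../<file>`) are assumptions made for this example; real library URL segments can differ from their display names.

```python
from urllib.parse import urlparse


def context_from_url(url: str) -> dict:
    """Derive site/library/record context from a SharePoint-style URL.

    Assumes the layout /<managed path>/<site>/<library>/[folders...]/<file>.
    """
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) < 4:
        raise ValueError("URL does not match the assumed layout")
    return {
        "site": parts[1],         # could map to a business function
        "library": parts[2],      # could map to a business activity
        "folders": parts[3:-1],   # any folders between library and file
        "record": parts[-1],
    }


context = context_from_url(
    "https://tenantname.sharepoint.com/teams/finance/invoices2020/record1.ext"
)
```

For the example address, this gives `finance` as the site, `invoices2020` as the library and `record1.ext` as the record, with no intermediate folders.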
If additional metadata is still required, SharePoint has extensive and almost unlimited capability.
- Every new SharePoint site comes with a standard set of around 240 metadata ‘site columns’. The metadata columns include the Dublin Core metadata items.
- New metadata columns can be created at the site level (‘site columns’). These can then be used by all libraries and lists on the site. Here is a useful description of how to add new site columns from ShareGate: SharePoint 101: SharePoint Site Columns.
- Every new SharePoint library comes with a standard set of metadata columns – see below. New metadata columns can be created at the library (or list) level, but these columns are only available to that specific library or list.
Default SharePoint document library columns
The default library metadata columns are as follows. Dublin Core metadata items are shown with [DC]:
- App Created By
- App Modified By
- Check In Comment
- Checked Out To
- Comment count
- Compliance Asset Id
- Content Type
- Copy Source
- Created [DC]
- Created By [DC]
- Document ID (when enabled as a feature)
- File Size
- Folder Child Count
- ID
- Item Child Count
- Item is a Record
- Label applied by
- Label setting
- Like count
- Modified [DC]
- Modified By Name [DC]
- Retention label
- Retention label Applied
- Sensitivity
- Title [DC]
- Type [DC]
- Version
How metadata is added to records in SharePoint
Every digital record saved to SharePoint will have some form of native metadata (payload). Additional metadata may be added when the document is saved; this may be optional or mandatory.
- When a digital record is saved to SharePoint, SharePoint only copies the title or name of the record, not the original created date or author.
- When a Microsoft Office document is saved to a SharePoint document library, the Office document stores the library metadata (including the unique Document ID) in its own XML-based properties. This information is retained with the record even when the record is downloaded from SharePoint.
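Modern Office files are ZIP packages containing XML parts, which is why properties can travel with a downloaded file. The sketch below reads only the standard core properties part, conventionally found at `docProps/core.xml`; where SharePoint writes library columns such as the Document ID inside the package is not covered here. The in-memory package and the `read_core_properties` helper are hypothetical stand-ins for a real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the Open Packaging Conventions core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(package: bytes) -> dict:
    """Read a few core properties from an Office package held in memory."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))

    def text(path: str):
        node = root.find(path, NS)
        return node.text if node is not None else None

    return {
        "title": text("dc:title"),
        "creator": text("dc:creator"),
        "created": text("dcterms:created"),
        "modified": text("dcterms:modified"),
    }


# Hypothetical minimal package containing only the core properties part.
CORE_XML = (
    '<cp:coreProperties xmlns:cp="%(cp)s" xmlns:dc="%(dc)s" '
    'xmlns:dcterms="%(dcterms)s">'
    "<dc:title>Meeting minutes</dc:title>"
    "<dc:creator>Alice Example</dc:creator>"
    "<dcterms:created>2020-02-03T09:15:00Z</dcterms:created>"
    "</cp:coreProperties>"
) % NS

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("docProps/core.xml", CORE_XML)

props = read_core_properties(buffer.getvalue())
```

Properties that are absent from the part, such as the modified date in this example, come back as `None`.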
Viewing the metadata
The metadata that describes the content stored in the SharePoint document library may be viewed in multiple ways (via the edit view option), and may be exported (for example if records are to be destroyed or transferred).
Every record includes a version history that provides details of who modified the content, and when (but not what changes were made unless this is recorded).
Process metadata
Process metadata is metadata that records events relating to the record or the aggregation in which it is kept.
Examples of process metadata include when:
- Records are viewed or downloaded (date and by whom).
- Records are modified (date and by whom, and ideally what changes were made).
- Records are copied or moved (date and by whom).
- Security controls were changed (date, by whom, and what changes were made).
- Records are deleted/destroyed (date and by whom, with what authority).
While ISO 16175 describes the general requirement to keep process metadata, the actual requirement is likely to differ between organisations. Organisations with high compliance requirements, such as certain types of businesses or government, are more likely to want process metadata to be created, accessed when required, and protected against unauthorised modification.
Office 365 process metadata
Office 365 records process metadata in multiple ways in Exchange and SharePoint.
- Emails generally cannot be modified after they have been sent. Accordingly, the primary process metadata for emails and Teams chat is likely to be in the deletion records stored in the Office 365 Compliance admin portal audit logs.
- SharePoint/OneDrive process metadata is recorded as follows:
- Viewed or downloaded, modified, copied or moved: This is recorded in the Office 365 Compliance admin portal audit logs.
- Modified: This is recorded in the Date modified and Modified by metadata, as well as the version history (which also keeps the previous actual versions that can be compared if required).
- Security changes: This is recorded in the Office 365 Compliance admin portal audit logs.
- Destroyed: This depends on the circumstances, but generally the information has to be captured manually and stored elsewhere. For example, if the content of a document library is to be destroyed, the metadata (along with details of the original library URL) should first be exported and saved in another location.
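One way to picture the destruction step is a simple export that keeps the library URL alongside each item's metadata. This is a sketch only: it assumes the metadata has already been retrieved from the library by some other means, and the `export_library_metadata` helper, the column names and the sample items are all hypothetical.

```python
import csv
import io

FIELDNAMES = [
    "library_url", "name", "created", "created_by", "modified", "modified_by",
]


def export_library_metadata(library_url: str, items: list, out) -> None:
    """Write one CSV row per item, recording the original library URL."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    for item in items:
        writer.writerow({"library_url": library_url, **item})


# Hypothetical metadata for two records that are about to be destroyed.
items = [
    {"name": "record1.ext", "created": "2020-01-10", "created_by": "Alice",
     "modified": "2020-02-01", "modified_by": "Bob"},
    {"name": "record2.ext", "created": "2020-01-12", "created_by": "Carol",
     "modified": "2020-01-12", "modified_by": "Carol"},
]

out = io.StringIO()
export_library_metadata(
    "https://tenantname.sharepoint.com/teams/finance/invoices2020", items, out
)
lines = out.getvalue().splitlines()
```

In practice the output would be written to a file kept outside the library being destroyed, rather than to an in-memory buffer.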
Note that audit log data in Office 365 is only retained for 90 days with an E3 licence and 365 days with an E5 licence.
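The practical effect of these retention periods can be worked out with simple date arithmetic. The sketch below uses the 90-day and 365-day figures quoted above; the `audit_record_expiry` helper is hypothetical, and actual retention should be confirmed against the organisation's current licence terms.

```python
from datetime import date, timedelta

# Retention periods as stated in this post; confirm against current licensing.
AUDIT_RETENTION_DAYS = {"E3": 90, "E5": 365}


def audit_record_expiry(event_date: date, licence: str) -> date:
    """Return the date on which an audit log entry reaches its retention limit."""
    return event_date + timedelta(days=AUDIT_RETENTION_DAYS[licence])


e3_expiry = audit_record_expiry(date(2020, 1, 1), "E3")  # 2020-03-31
e5_expiry = audit_record_expiry(date(2020, 1, 1), "E5")  # 2020-12-31
```

Any process metadata needed beyond these dates would have to be exported from the audit log before it expires.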
Final thoughts
Exchange/Outlook email has basic metadata. It is unlikely that it will ever be possible to add other metadata, unless email is copied to SharePoint document libraries.
Chats from MS Teams are stored in hidden folders in Exchange mailboxes.
Organisations that need to keep certain emails for specific compliance, recordkeeping or archival purposes should consider capturing these in SharePoint document libraries. Organisations might also consider making more use of Office 365 Group mailboxes for business-specific content, as these Groups include both MS Teams chat and an associated SharePoint site.
The metadata capabilities of SharePoint are almost unlimited, but not all records need the same degree of metadata.
The majority of records can probably be managed in standard SharePoint document libraries using the default metadata columns, or with one or two additional site or library metadata columns added, where required.
The Office Graph will increasingly be able to bring together records dynamically in the context of the business or the end-user via Project Cortex and Delve, respecting security controls that may be in place. The centralised content search and retention policy capability in Office 365 will also enable businesses to find, retrieve and manage content across both Exchange and SharePoint.
Reference
AS/NZS ISO 23081 series, Information and documentation – Records management processes – Metadata for records
Anne Picot, seminar presentation (IT-021-Seminar-Presentation-4-Anne-Picot.pdf)