When people chat in Microsoft Teams (MS Teams), a ‘compliance’ copy of the chat is saved to either personal or (Microsoft 365) Group mailboxes. This copy is subject to retention policies, and can be found and exported via Content Search.
But what happens if there is no Exchange Online mailbox? It seems the chats become inaccessible which could be an issue from a recordkeeping and compliance point of view.
This post explains what happens, and why it may not be a good idea (from a compliance and recordkeeping point of view) to disable the Exchange Online mailbox option as part of licence provisioning.
Licences and Exchange Online mailboxes
When an end-user is allocated a licence for Microsoft 365, a decision (sometimes incorporated into a script) is made about which of the purchased licences – and apps in those licences – will be assigned to that person.
E1, E3 and E5 licences include ‘Exchange Online’ as an option under ‘Apps’. This option is checked by default (along with many of the other options), but it can be disabled (as shown below).
If the checkbox option is disabled as part of the licence assigning process (not after), the end-user won’t have an Exchange mailbox and so won’t see the Outlook option when they log on to the office.com portal. (Note: if they have an on-premise mailbox, that will continue to exist; nothing changes.)
Having an Exchange Online mailbox is important if end-users are using MS Teams, because the ‘compliance’ copy of 1:1 chat messages in MS Teams is stored in a hidden folder (/Conversation History/Team Chat) in the Exchange Online mailbox of every participant in the chat. If the mailbox doesn’t exist, no copy is made, so the chat cannot be accessed for compliance purposes and may be deleted.
If end-users chat with other end-users who don’t have an Exchange mailbox as shown in the example below, the same thing happens: no compliance copy is kept. The chat remains inaccessible (unless the Global Admins take over the account).
The exchange above, between Roger Bond and Charles, includes some specific key words. As we will see below, these chats cannot be found via a Content Search.
(On a related note, if the ability to create private channels is enabled and end-users create a private channel and chat there, the chats are also not saved because a compliance copy of private channel chats is stored in the mailboxes of the individual participants.)
Searching for chats when no mailbox exists
As we can see above, the word ‘mosquito’ was contained in the chat messages between Roger and Charles.
Content Searches are carried out via the Compliance portal and are more or less the same as eDiscovery searches in that they are created as cases.
From the Content Search option, a new search is created by clicking on ‘+New Search’, as shown below. The word ‘mosquito’ has been added as a keyword.
We then need to determine where the search will look. In this case the search will look through all the options shown below, including all mailboxes and Teams messages.
When the search was run, the results area showed the words ‘No results found’.
Clicking on ‘Status details’ in the search results, the following information is displayed – ‘0 items’ found. The ‘5 unindexed items’ is unrelated to this search and simply indicates that there are 5 unindexed items.
Double-checking the results
To confirm the results were accurate, another search was conducted where the end-user originally did not have a mailbox, and then was assigned one.
If the end-user didn’t have a mailbox but the other recipient/s of the message did, the Content Search found one copy of the chat message in the mailbox of the other participants. Only one item was found.
When the Exchange Online option was enabled for the end-user who previously did not have a mailbox (so they were now assigned a mailbox), a copy of the chat was found in the mailbox of both participants, as shown in the details below (‘2 items’).
Summary and implications
In summary:
If end users chat in the 1:1 area of MS Teams and don’t have an Exchange Online mailbox, no compliance copy of the chat will be saved, and so it will not be found via Content Search.
If any of the participants in the 1:1 chat have an Exchange Online mailbox, the chat will appear in the mailboxes of those participants.
If all participants in the 1:1 chat have an Exchange Online mailbox, the chat will be found in the mailbox of all participants.
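The three outcomes above can be expressed as a small model. This is only an illustrative sketch of the behaviour observed in the tests described in this post, not a call to any Microsoft API; the function name and the dictionary of mailbox flags are hypothetical.

```python
def mailboxes_holding_copy(participants, has_mailbox):
    """Return the participants whose mailbox would hold a compliance copy.

    `has_mailbox` maps a participant name to True/False (an assumption made
    for illustration; a real tenant would be queried through admin tools).
    """
    return [p for p in participants if has_mailbox.get(p, False)]


chat = ["Roger Bond", "Charles"]

# Neither participant has a mailbox: nothing for Content Search to find.
print(mailboxes_holding_copy(chat, {"Roger Bond": False, "Charles": False}))  # []

# Only one participant has a mailbox: one copy.
print(mailboxes_holding_copy(chat, {"Roger Bond": False, "Charles": True}))   # ['Charles']

# Both have a mailbox: two copies.
print(mailboxes_holding_copy(chat, {"Roger Bond": True, "Charles": True}))    # ['Roger Bond', 'Charles']
```

The number of entries returned corresponds to the ‘0 items’, one item and ‘2 items’ results seen in the searches above.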
Further to the above:
If end users can delete chats (via Teams policies) and don’t have a mailbox, no copy of the chat will exist.
If end-users with a mailbox can delete Teams chats, but a retention policy has been applied to the chats, the chats will be retained as per the retention policy (in a hidden folder).
And finally, if you allow private channels, end-users can create private channels in the Organisation Team. The chats in these private channels are usually stored in the personal mailboxes of participants (not the Group mailbox) – so these chats will also be inaccessible and cannot be found via Content Search.
The implications for the above are that, if you need to ensure that personal chat messages can be accessed (from Content Search), then the participants in the chat must have an Exchange Online mailbox.
Further, if you allow deletion of chats but need to be able to recover them for compliance purposes, a retention policy should be applied to Teams 1:1 chat.
Records management standards (see below) state that a defining feature of records is that they are associated with metadata – both ‘point of capture’ metadata and ‘process’ metadata that continues to evolve throughout the life of the record.
For at least two decades, the requirement to capture and store metadata for digital records has driven the implementation of centralised electronic document and records management (EDRM) systems, many of which began life as databases used to record metadata about physical records (files and boxes).
EDRM systems were (and still are) used to store copies of digital records created or captured natively in other systems, primarily network file shares and email. End-users were required to copy individual records to the EDRMS, a process that mirrored the storage of records (including printed digital records) in physical files.
Network file shares and email systems were not considered to be suitable as recordkeeping systems because they could not ensure the authenticity, integrity and reliability of records over time, or manage and preserve metadata about the records stored in them.
The increasing implementation of Office 365, and in particular the use of SharePoint for the storage of records, has highlighted the extent to which recordkeeping metadata can – or even should – be applied to the content stored in that system.
This post discusses the need for metadata in records stored in Office 365, including in Exchange/Outlook, MS Teams, and SharePoint/OneDrive for Business. It concludes that most records stored in Office 365 do not need additional metadata but, where such metadata is required, there is unlimited capability to add it.
Records and metadata
The international standard for records management, ISO 15489:2016, defines a record as ‘information created, received, and maintained as evidence and as an asset by an organization or person, in pursuit of legal obligations or in the transaction of business’.
Records are said to be different from ‘non-records’ because they are associated or described with (mostly added) metadata that describes ‘the context, content and structure of records and their management through time’.
The standard for recordkeeping metadata is ISO 23081:2017. One records management professional (link at the end of the post) noted that there has been reluctant adoption of this standard, mostly because it was ‘too complex’ and ‘academic’, and used ‘foreign terms’. Unspecified vendors were said to have been dismissive of the standard.
Standard for managing digital records – ISO 16175
Part 2 of the standard ISO 16175:2011, ‘Guidelines and functional requirements for digital records management systems’ contains multiple requirements relating to metadata, across three broad categories:
Point of capture metadata. This includes metadata that forms part of the ‘metadata payload’ of the original record (e.g., date created, creator), other metadata added at point of capture, and metadata that provides additional context for the records.
Process metadata. This is metadata that records activities and changes to both the record and metadata over the life of the record.
The need to manage and control metadata over time.
This standard appears to reinforce the requirement for records to be stored and managed in dedicated recordkeeping systems.
Metadata versus enterprise ‘graphs’
While this post was in draft, James Lappin published a very interesting and informative post titled ‘Project Cortex and the future of document management in Office 365‘. The post highlighted a key difference between on-premise and cloud era document management, in relation to the way metadata is managed:
On premise document and records management systems: These systems use metadata schema that specify metadata fields to be used in the system.
Cloud systems including Office 365: These systems can make use of enterprise ‘graphs’ that map people to documents and topics. The graph is built from the interactions of people with content across the different workloads of the suite.
Microsoft’s depiction of a graph model and the dynamic interaction between objects.
Most people now accept the algorithm capabilities of Facebook, LinkedIn, eBay, Amazon and similar online systems to automatically connect us with information relevant to us, without having to add any metadata.
Given the volume and types of digital content, almost all of which has metadata ‘payloads’, how can we ever hope to add the required recordkeeping metadata?
Can’t we just rely on the algorithms and graphs?
How much metadata do you really need?
The answer to this question may depend largely on business, regulatory/compliance and/or government recordkeeping requirements relevant to the organisation and its jurisdiction. In my experience, across multiple very large and also very small organisations:
Most private sector organisations will likely have minimal metadata requirements beyond basic ‘point of capture’ and ‘process’ metadata already recorded in the system where the records are created or captured (including email), unless this is required for specific compliance or regulatory purposes, or where there is risk associated with poor recordkeeping. For example, in a major food processing company, records relating to the manufacture of food were very well documented and managed, while corporate records were managed haphazardly.
Most public sector organisations are required, for government accountability and transparency (and information retrieval) purposes, to apply a minimum set of both ‘point of capture’ and ‘process’ metadata for non-permanent records. Many government agencies have struggled to manage digital records effectively.
A small percentage of records captured or created in government agencies may require more extensive metadata, especially if those records are to be transferred to archival institutions for permanent retention.
Office 365 ‘workloads’
In Office 365, most business records will be created or captured in either Exchange/Outlook (includes MS Teams chats), or SharePoint or OneDrive for Business (for ‘working’ or personal content).
Exchange is a recordkeeping system in that it stores records with consistent metadata. The primary ‘weakness’, in terms of recordkeeping, is that ‘personal’ Exchange mailboxes aggregate records on a range of subjects by an individual user rather than by business subject. The mailboxes of Office 365 Groups, on the other hand, can be used to aggregate records about a business function/activity or subject.
SharePoint is a recordkeeping system that has extensive default metadata and almost unlimited additional metadata capability (see below). OneDrive for Business is a SharePoint service that has the same extensive default metadata capability.
There is, generally speaking, no requirement for organisations that have implemented Office 365 to allow the continued use of network file shares because the ‘save’ and ‘save as’ options in Office/Windows 10 point to SharePoint and OneDrive as the default save locations.
Metadata in Exchange mailboxes/MS Teams
Emails have the same metadata options in the header of every email:
Message ID
Subject
Sender (From)
Recipients (To, including CC and BCC)
Sent (date/time)
Received (date/time)
(Plus more with routing information and security controls including DKIM, SPF, DMARC etc)
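As a rough illustration of how this header metadata can be read programmatically, the sketch below uses Python’s standard `email` package on a made-up message. The addresses, subject and date are invented for the example, and the dictionary keys are just labels chosen here.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message used only for illustration.
RAW = """\
Message-ID: <1234@example.com>
Subject: Meeting minutes
From: roger@example.com
To: charles@example.com
Cc: records@example.com
Date: Mon, 06 Jan 2020 09:30:00 +0000

Body text.
"""


def header_metadata(raw):
    """Collect the basic header metadata listed above from a raw message."""
    msg = message_from_string(raw)
    recipients = []
    for header in ("To", "Cc", "Bcc"):
        # get_all returns every occurrence of the header, or [] if absent.
        recipients.extend(msg.get_all(header, []))
    return {
        "message_id": msg["Message-ID"],
        "subject": msg["Subject"],
        "sender": msg["From"],
        "recipients": recipients,
        "sent": parsedate_to_datetime(msg["Date"]),
    }


meta = header_metadata(RAW)
print(meta["subject"], meta["sender"], meta["recipients"])
```

The same fields exist on every email, which is what makes the metadata consistent, even though it cannot be extended.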
However, no other metadata can be added and some (or most) emails may never form part of the collated record of a given subject.
Because of this ‘limitation’, there has been an assumption ever since email was introduced that emails identified as records would have to be copied to a (separate) recordkeeping system.
In pre-digital days, this meant printing out emails and placing them on a paper file.
In organisations with EDRM systems, this meant copying the email to the EDRMS where additional metadata would be applied.
The original emails generally remained in place in individual mailboxes where they may be subject to backups and journaling in case they needed to be recovered for whatever reason including subpoenas (eDiscovery).
The Office Graph in Office 365 now provides the ability to connect the content in email with other content across that ecosystem, as noted in James Lappin’s post above. This is new – but it doesn’t rely on metadata or copying emails anywhere.
Metadata in SharePoint
As a SharePoint service, OneDrive for Business has the same default metadata columns. Accordingly, it will not be described further here.
What metadata is required?
Organisations that plan to manage records in SharePoint should consider the following questions as part of their overall information architecture design to ensure records are kept in logical aggregations rather than randomly. This is important especially if end-users are allowed to create Office 365 Groups or Teams.
What point of capture and process metadata is required (for compliance, regulatory, recordkeeping purposes)? What is the source of this requirement?
Is there a difference in the metadata requirements for short-term (retain in the organisation) and permanent records that are to be transferred to archival institutions?
Do the required metadata columns already exist in SharePoint?
If they don’t exist, should the additional metadata columns be added as site columns or library columns?
Does any of the metadata need to be mandatory, and/or can it be a default setting – for example, a metadata column that has the default function and/or activity so the user doesn’t need to add this.
Where is the process metadata and how do you view or manage it? (See also below on this subject).
Information architecture and metadata
The information architecture of SharePoint, in terms of managing records as objects (e.g., documents, spreadsheets, images, etc), is relatively simple:
SharePoint site. The primary aggregation that can be linked to a business function (e.g., ‘Financial management’).
Document library/ies. Logical aggregations or containers of records that can be linked to business activities (e.g., ‘Meetings’).
Folders, document sets as content aggregations.
Documents/records.
An effective site architecture can replace the requirement for metadata. For example, the name of the SharePoint site can map to a business function, and library names can map to activities, instead of applying a function and activity pair to each record. The URL address for the record then provides the context.
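To illustrate how a URL can carry this context, here is a minimal sketch that splits a record URL into function, activity, folders and record name. The tenant name and path are hypothetical, and the sketch assumes the simple layout `/sites/<site>/<library>/.../<file>`; real SharePoint URLs can differ (for example, the default library often appears as ‘Shared Documents’).

```python
from urllib.parse import urlparse, unquote


def context_from_url(url):
    """Derive function/activity context from a (hypothetical) record URL."""
    parts = [unquote(p) for p in urlparse(url).path.split("/") if p]
    # Expected shape: sites / <site> / <library> / [folders...] / <file>
    if len(parts) < 4 or parts[0] != "sites":
        raise ValueError("URL does not match the assumed /sites/<site>/<library>/... layout")
    return {
        "function": parts[1],      # site name, e.g. a business function
        "activity": parts[2],      # library name, e.g. a business activity
        "folders": parts[3:-1],    # any folders or document sets in between
        "record": parts[-1],       # the document itself
    }


url = "https://contoso.sharepoint.com/sites/FinancialManagement/Meetings/2020/Minutes%20January.docx"
print(context_from_url(url))
```

With this kind of structure, the function and activity do not need to be stored as separate metadata on each record because they can be read from where the record lives.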
If additional metadata is still required, SharePoint has extensive and almost unlimited capability.
Every new SharePoint site comes with a standard set of around 240 metadata ‘site columns’. The metadata columns include the Dublin Core metadata items.
New metadata columns can be created at the site level (‘site columns’). These can then be used by all libraries and lists on the site. Here is a useful description of how to add new site columns from ShareGate: SharePoint 101: SharePoint Site Columns.
Every new SharePoint library comes with a standard set of metadata columns – see below. New metadata columns can be created at the library (or list) level, but these columns are only available to that specific library or list.
Default SharePoint document library columns
The default library metadata columns are as follows. Dublin Core metadata items are shown with [DC]:
App Created By
App Modified By
Check In Comment
Checked Out To
Comment count
Compliance Asset Id
Content Type
Copy Source
Created [DC]
Created By [DC]
Document ID (when enabled as a feature)
File Size
Folder Child Count
ID
Item Child Count
Item is a Record
Label applied by
Label setting
Like count
Modified [DC]
Modified By Name [DC]
Retention label
Retention label Applied
Sensitivity
Title [DC]
Type [DC]
Version
How metadata is added to records in SharePoint
Every digital record saved to SharePoint will have some form of native metadata (payload). Additional metadata may be added when the document is saved; this may be optional or mandatory.
When a digital record is saved to SharePoint, SharePoint only copies the title or name of the record, not the original created date or author.
When a Microsoft Office document is saved to a SharePoint document library, the Office document stores the library metadata (including the unique Document ID) in its own XML-based properties. This information is retained with the record even when the record is downloaded from SharePoint.
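Office documents (.docx, .xlsx, .pptx) are zip packages containing XML parts, so properties that travel with the file can be inspected once it has been downloaded. The sketch below reads only the standard core properties part (`docProps/core.xml`) from a tiny package built in memory for the example; it does not attempt to show where SharePoint writes library columns such as the Document ID, and the sample values are invented.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def core_properties(package_bytes):
    """Read a few core properties from an Office Open XML package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    def text(path):
        element = root.find(path, NS)
        return element.text if element is not None else None

    return {
        "title": text("dc:title"),
        "creator": text("dc:creator"),
        "created": text("dcterms:created"),
        "last_modified_by": text("cp:lastModifiedBy"),
    }


def sample_package():
    """Build a minimal stand-in package (not a complete, openable document)."""
    core = (
        '<cp:coreProperties xmlns:cp="%s" xmlns:dc="%s" xmlns:dcterms="%s">'
        "<dc:title>Minutes</dc:title>"
        "<dc:creator>Roger Bond</dc:creator>"
        "<dcterms:created>2020-01-06T09:30:00Z</dcterms:created>"
        "<cp:lastModifiedBy>Charles</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    ) % (NS["cp"], NS["dc"], NS["dcterms"])
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
    return buffer.getvalue()


print(core_properties(sample_package()))
```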
Viewing the metadata
The metadata that describes the content stored in the SharePoint document library may be viewed in multiple ways (via the edit view option), and may be exported (for example if records are to be destroyed or transferred).
Every record includes a version history that provides details of who modified the content, and when (but not what changes were made unless this is recorded).
Process metadata
Process metadata is metadata that records events relating to the record or the aggregation in which it is kept.
Examples of process metadata include when:
Records are viewed or downloaded (date and by whom).
Records are modified (date and by whom, and ideally what changes were made).
Records are copied or moved (date and by whom).
Security controls were changed (date, by whom, and what changes were made).
Records are deleted/destroyed (date and by whom, with what authority).
While ISO 16175 describes the general requirement to keep process metadata, the actual requirement is likely to differ between organisations. Organisations with high compliance requirements, such as certain types of businesses or government, are more likely to want process metadata to be created, accessed when required, and protected against unauthorised modification.
Office 365 process metadata
Office 365 records process metadata in multiple ways in Exchange and SharePoint.
Emails generally cannot be modified after they have been sent. Accordingly, the primary process metadata for emails and Teams chat is likely to be in the deletion records stored in the Office 365 Compliance admin portal audit logs.
SharePoint/OneDrive process metadata is recorded as follows:
Viewed or downloaded, modified, copied or moved: This is recorded in the Office 365 Compliance admin portal audit logs.
Modified: This is recorded in the Date modified and Modified by metadata, as well as the version history (which also keeps the previous actual versions that can be compared if required).
Security changes: This is recorded in the Office 365 Compliance admin portal audit logs.
Destroyed: Depends, but generally this requires the capture of information manually, then stored elsewhere. For example, if the content of a document library is to be destroyed, then the metadata (along with details of the original library URL) should be exported (manually) first and saved somewhere. This is a manual process.
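The manual export step described for destroyed content could be captured in a simple CSV file. The sketch below is one possible shape for such an export, assuming the library metadata has already been obtained as a list of dictionaries; the column names, library URL and sample rows are illustrative only.

```python
import csv
import io


def export_metadata(library_url, rows, columns):
    """Write library metadata to CSV text, recording the original library URL on every row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["Library URL"] + columns)
    writer.writeheader()
    for row in rows:
        record = {"Library URL": library_url}
        # Missing values are written as empty strings rather than dropped.
        record.update({column: row.get(column, "") for column in columns})
        writer.writerow(record)
    return out.getvalue()


rows = [
    {"Title": "Minutes January", "Created": "2020-01-06", "Created By": "Roger Bond"},
    {"Title": "Minutes February", "Created": "2020-02-03", "Created By": "Charles", "Modified": "2020-02-04"},
]
print(export_metadata(
    "https://contoso.sharepoint.com/sites/FinancialManagement/Meetings",
    rows,
    ["Title", "Created", "Created By", "Modified"],
))
```

Keeping the library URL with each row preserves the context of where the records were stored before they were destroyed.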
Note that audit log data in Office 365 is only retained for 90 days with an E3 licence, or 365 days with an E5 licence.
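The practical effect of these retention periods can be checked with simple date arithmetic. This sketch uses the 90 and 365 day figures stated above and assumes, for illustration, that an event remains available up to and including the last day of the period.

```python
from datetime import date, timedelta

# Retention periods as stated above (days).
AUDIT_RETENTION_DAYS = {"E3": 90, "E5": 365}


def still_in_audit_log(event_date, today, licence):
    """Return True if an audit event would still be within the retention period."""
    return today - event_date <= timedelta(days=AUDIT_RETENTION_DAYS[licence])


event = date(2020, 1, 1)
today = date(2020, 6, 1)  # 152 days after the event

print(still_in_audit_log(event, today, "E3"))  # False
print(still_in_audit_log(event, today, "E5"))  # True
```

If process metadata needs to be kept for longer than these periods, it has to be exported from the audit log before it expires.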
Final thoughts
Exchange/Outlook email has basic metadata. It is unlikely that it will ever be possible to add other metadata, unless email is copied to SharePoint document libraries.
Chats from MS Teams are stored in hidden folders in Exchange mailboxes.
Organisations that need to keep certain emails for specific compliance, recordkeeping or archival purposes, should consider capturing these in SharePoint document libraries. Organisations might also consider making more use of Office 365 Group mailboxes for business-specific content as these Groups also include both MS Teams chat and have an associated SharePoint site.
The metadata capabilities of SharePoint are unlimited but not all records need the same degree of metadata.
The majority of records can probably be managed in standard SharePoint document libraries using the default metadata columns, or with one or two additional site or library metadata columns added, where required.
The Office Graph will increasingly be able to bring together records dynamically in the context of the business or the end-user via Project Cortex and Delve, respecting security controls that may be in place. The centralised content search and retention policy capability in Office 365 will also enable businesses to find, retrieve and manage content across both Exchange and SharePoint.
Reference
AS/NZS ISO 23081 series, Information and documentation – Records management processes – Metadata for records
Most people should be aware that pressing the ‘delete’ option for a file stored on a computer doesn’t actually delete the item, it only makes the file ‘invisible’. The actual file is still accessible on the disk and can be retrieved relatively easily or using forensic tools until the space it was stored on is overwritten.
Traditional legacy electronic document and records management (EDRM) systems have two components:
A database (e.g., SQL, Oracle) where the metadata about the records are stored
A linked file share where the actual objects are stored, most of which are copies of emails or network file share files that remain in their original location.
In most on-premise systems, email mailboxes, network file shares, and the EDRMS database and linked file share are likely to be backed up.
When a digital record comes to the end of its retention and is subject to a ‘destruction’ process, how do you know if the record has actually been destroyed? And even if it has, how can you be sure that the original isn’t still stored in a mailbox, network file share, or a backup?
This post examines what actually happens when a file is ‘deleted’ from a Windows NT File System (NTFS), and questions whether digital records stored in an EDRMS are really destroyed at the end of the retention period.
The Windows NTFS Master File Table (MFT)
Details of every file stored on a computer drive will be found in the NTFS Master File Table (MFT).
In some ways, the MFT operates like a traditional electronic document management system – it is a kind of database that records metadata about the attributes of the digital objects stored on the drive. These attributes include the following:
As noted in the diagram above, the details stored by the MFT include the $File_Name and $Data attributes.
The $File_Name attributes include the actual name of the file as well as when it was created and modified, and its size. This is the information that can be seen via File Explorer and is often copied to the EDRMS metadata.
The $Data attribute contains details of where the actual data in the file is stored on the disk (in 0s and 1s) or the complete data if the file is small enough to fit in the MFT record.
If the MFT record has many attributes or the file data is stored in multiple fragments on a disk (for example as a file is being edited), additional MFT ‘extension’ records may be created.
When a file is deleted, the MFT records the deletion.
If the file is simply deleted, the record will remain on the disk and can be recovered from the Recycle Bin.
If the file is deleted through SHIFT-DEL or by emptying the Recycle Bin, the MFT record will be marked as ‘Deleted’ and the cluster bitmap will be updated to show the file’s clusters (where the data is stored) as free for reuse. The MFT record remains until it is re-used or the data clusters are allocated in whole or part to another file.
So, in summary, ‘deleting’ a file does not actually delete it. It may either:
Store the file in the Recycle Bin, making it relatively easy to recover, or
Change the MFT record to show the file as being deleted but leave the file data on the disk until it is overwritten.
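The second case can be illustrated with a toy model. This is a deliberately simplified sketch and not how NTFS is actually implemented: a list stands in for the disk clusters, another list for the cluster bitmap, and a dictionary for the MFT. All names are hypothetical.

```python
class ToyVolume:
    """A toy volume: clusters hold data, the bitmap tracks allocation, the MFT tracks files."""

    def __init__(self, n_clusters):
        self.clusters = [b""] * n_clusters
        self.bitmap = [False] * n_clusters   # True = cluster is allocated
        self.mft = {}                        # file name -> {"in_use": bool, "clusters": [int]}

    def write(self, name, chunks):
        free = [i for i, used in enumerate(self.bitmap) if not used]
        if len(free) < len(chunks):
            raise OSError("volume full")
        allocated = free[:len(chunks)]
        for index, chunk in zip(allocated, chunks):
            self.clusters[index] = chunk
            self.bitmap[index] = True
        self.mft[name] = {"in_use": True, "clusters": allocated}

    def delete(self, name):
        # Only the bookkeeping changes; the data in the clusters is left alone.
        record = self.mft[name]
        record["in_use"] = False
        for index in record["clusters"]:
            self.bitmap[index] = False

    def recover(self, name):
        # What a forensic tool might read back from the old cluster locations.
        record = self.mft[name]
        return b"".join(self.clusters[i] for i in record["clusters"])


volume = ToyVolume(4)
volume.write("email.msg", [b"evid", b"ence"])
volume.delete("email.msg")
print(volume.recover("email.msg"))   # b'evidence' (still on the 'disk')

volume.write("other.txt", [b"XXXX"])  # reuses the first freed cluster
print(volume.recover("email.msg"))   # b'XXXXence' (partly overwritten)
```

The deleted data stays readable until a later write happens to reuse the same clusters, which is why recovery becomes less likely over time but is not ruled out by deletion alone.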
How does an EDRMS store and manage files?
The following summary relates to a well-known Electronic Document and Records Management System (EDRMS). Other systems may work differently but the point is that records managers should understand exactly how they work and what happens when electronic files are destroyed at the end of a retention period.
Most EDRM systems are made up of two parts:
A database (SQL, Oracle etc) to store the metadata about the record.
An attached file store that stores the actual digital objects.
When EDRM systems are used to register paper or physical records (files and boxes), only the database is used.
When digital records are uploaded to the EDRMS:
The metadata in the original file, including the file type, original file name, date created, date modified and author are ‘captured’ by the system and recorded in the new database record.
Additional metadata may be added, including a content or record ‘type’.
The record will usually be associated with a ‘container’ (e.g., ‘file’). This containment makes the record appear to be ‘contained’ within that container, whereas in fact it is simply a metadata record of an object stored elsewhere.
The original record filename is changed to random characters (to make it harder to find, in theory) and then stored on the attached (usually Windows NTFS) file store, often in a series of folders.
A link is made between the database record and the record object stored in the file store (the MFT record).
When the end-user opens the EDRMS, they can search for or navigate to containers/files and see what appears to be the digital objects ‘stored’ in that container/file. In reality, they are seeing a link to the object stored (randomly) in the file store.
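The relationship between the database record and the stored object can be sketched as follows. This toy model does not represent any particular product: dictionaries stand in for the database and the file store, and the class and field names are hypothetical.

```python
import uuid


class ToyEdrms:
    """A toy EDRMS: metadata lives in a 'database', objects in a separate 'file store'."""

    def __init__(self):
        self.database = {}    # record id -> metadata row
        self.file_store = {}  # randomised stored name -> file content

    def upload(self, original_name, content, container, author):
        stored_name = uuid.uuid4().hex          # original filename replaced by random characters
        self.file_store[stored_name] = content
        record_id = len(self.database) + 1
        self.database[record_id] = {
            "original_name": original_name,
            "author": author,
            "container": container,
            "stored_as": stored_name,           # the link between database and file store
        }
        return record_id

    def open_container(self, container):
        # What the end-user sees: names that appear to be 'in' the container.
        return [row["original_name"] for row in self.database.values()
                if row["container"] == container]

    def fetch(self, record_id):
        # Follow the link from the metadata row to the stored object.
        return self.file_store[self.database[record_id]["stored_as"]]


edrms = ToyEdrms()
record_id = edrms.upload("contract.docx", b"signed contract", "2020/001", "Roger Bond")
print(edrms.open_container("2020/001"))   # ['contract.docx']
print(edrms.fetch(record_id))             # b'signed contract'
```

The container is only a value in the metadata row; the object itself sits in the file store under a name that has nothing to do with the container or the original filename.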
What happens when an EDRMS record is destroyed?
If there is no requirement to extend their retention, or keep them on a legal hold, records may be destroyed at the conclusion of a retention period.
For physical records, this usually means destroying the physical objects so they cannot be recovered, a process that may include bulk shredding or pulping.
For digital records, however, there may be less certainty about the outcome of the destruction. While the EDRMS may flag the record as being ‘destroyed’, it is not completely clear if the destruction process has actually destroyed the records and overwritten them in a way that ensures their destruction to the same level as destroyed paper files.
Also:
If the original associated NTFS file share becomes full and a new one is used, the original is likely to be made read only.
There is likely to be a backup of the EDRMS.
The original records uploaded to the EDRMS probably continue to exist on network file shares, in email, or on backup tapes.
Digital forensics can be used to recover ‘deleted’ files from the associated file share.
Consider this scenario:
An email containing evidence of something is saved to a container in an EDRMS.
The container of records is ‘destroyed’ after the retention period expires.
A legal case arises after the container is ‘destroyed’.
A subpoena is made for all records, including those specific records.
Has the record actually been destroyed, or could it still be recoverable, including from backups or the digital originals?
Is it really possible to destroy digital records, and does it matter?
Yes, records can be destroyed by overwriting the cluster where the record is kept, and some EDRM systems may offer this option.
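As a rough illustration of the idea, the sketch below overwrites a file’s contents with zeros before removing it. It is an assumption-laden example rather than a guaranteed secure deletion method: on SSDs, copy-on-write or journaling file systems, and anywhere backups or other copies exist, the original data may survive elsewhere. The function name is hypothetical.

```python
import os
import tempfile


def overwrite_and_delete(path, passes=1):
    """Overwrite a file in place with zeros, then delete it (illustrative only)."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining:
                chunk = min(remaining, 1024 * 1024)   # write in 1 MiB chunks
                handle.write(b"\x00" * chunk)
                remaining -= chunk
            handle.flush()
            os.fsync(handle.fileno())                 # ask the OS to push the writes to disk
    os.remove(path)


# Usage with a temporary file standing in for a stored record.
fd, temp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"evidence of something")
overwrite_and_delete(temp_path)
print(os.path.exists(temp_path))  # False
```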
But:
Do EDRM systems overwrite the cluster when a digital record is destroyed in line with your records retention and disposal authorities, or simply mark the record as being deleted, when it is still technically recoverable?
Could the record still exist in the network file shares or email, or in backups of these or the EDRMS?
Might it be possible to recover the record with digital forensics tools?
Does it matter?
It might be worth asking IT and your EDRMS vendor.
Microsoft have improved the Classification section in the Office 365 Security and Compliance centre. The change will help to reduce confusion and make it easier for records managers and security administrators to focus on their individual needs.
Previous user interface
The primary change is to the menu interface. The previous menu options, shown in the screenshot below, showed only ‘Labels’ and ‘Label policies’.
When the previous ‘Labels’ option was selected, a new screen with two tabs ‘Sensitivity’ (default) and ‘Retention’ was displayed, as shown below.
The sensitivity or retention tab had to be selected to create or publish a new label. The user interface was unclear and the difference between creating and publishing a label was not obvious.
New user interface
The sensitivity and retention elements have now been separated and placed under the primary ‘Classification’ menu option as shown below.
Now, ‘Labels’ and ‘Label policies’ are two tabs under the relevant section as can be seen below.
The options to create and publish labels remain the same.
A recent (September 2017) article suggested that OneDrive for Business (ODfB), a key application in Office 365, was a potential source of data leaks and/or target for hacking attacks. By extension, the same would apply to SharePoint Online (SPO), as ODfB is a SharePoint-based service.
I don’t disagree that, if not configured correctly, any online document management system – not just ODfB/SPO – could be the source of leaks or the target of external attacks. Especially if these systems, and the security controls that can protect the data in them, are not properly configured, governed, administered, and monitored.
But, I would ask, what controls do most organisations have in place now for documents stored in file shares and personal file folders, not to mention USB sticks, and the ability to send documents via Bluetooth to mobile devices or upload corporate data to third-party document storage systems? Probably not many, because users have no other way to access the data out of the office.
As we will see, the controls available in Office 365 are likely to be more than sufficient to allow users to access their documents out of the office, while at the same time reducing (if not eliminating) the sharing of documents with unauthorised users.
How to stop or minimise sharing from OneDrive for Business and SharePoint Online
There is one simple way to prevent the sharing of data stored in SPO and ODfB with external people – don’t allow it.
There are several ways to control what can be shared, each allowing the user a bit more capability. All these options should be based on business requirements and information security risk assessments, and Office 365 configured accordingly.
In this article I will start with no sharing allowed, and then show how the controls can be reduced as necessary.
External sharing – on or off
This is the primary setting, found in the main Office 365 Admin centre under Settings > Services & add-ins > Sites. If you turn this off, no-one can share anything stored in SPO or ODfB.
The option is shown below:
If you do allow sharing, you need to decide (as shown above) if sharing will be with:
Only existing external users
New and existing external users [Recommended]
Anyone, including anonymous users
The second option is recommended because it doesn’t restrict the ability to share with new users. The last option is unlikely to be used in most organisations and comes with some risks.
The next place to set these options is in the SPO and ODfB Admin centres.
OneDrive admin center
If the previous option is enabled, the following options are available for ODfB. Note that BOTH SharePoint and OneDrive are included here because the latter is a part of the SharePoint environment.
Let users share SharePoint content with external users: ON or OFF.
NOTE: If this option is turned OFF, all the following options disappear.
If sharing with external users is enabled, the following three options are offered:
Only existing external users
New and existing external users [Recommended]
Anyone, including anonymous users
Let users share OneDrive content with external users: ON or OFF
This setting must be at least as restrictive as the SharePoint setting.
If sharing with external users is enabled, the following three options are offered:
Only existing external users
New and existing external users [Recommended]
Anyone, including anonymous users
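The ‘at least as restrictive’ rule noted above can be pictured as a simple ordering of the sharing levels. The following is only a sketch of that logic in Python; the `SharingLevel` names and the `onedrive_setting_is_valid` function are hypothetical and are not part of any Microsoft API.

```python
from enum import IntEnum


class SharingLevel(IntEnum):
    """External sharing levels, ordered from most to least restrictive."""
    DISABLED = 0            # sharing with external users is OFF
    EXISTING_EXTERNAL = 1   # only existing external users
    NEW_AND_EXISTING = 2    # new and existing external users
    ANYONE = 3              # anyone, including anonymous users


def onedrive_setting_is_valid(sharepoint: SharingLevel, onedrive: SharingLevel) -> bool:
    """The OneDrive level must be at least as restrictive as the SharePoint level."""
    return onedrive <= sharepoint
```

For example, if SharePoint is set to ‘New and existing external users’, OneDrive could be set to ‘Only existing external users’ but not to ‘Anyone’.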
If sharing is allowed, there are three sharing link options:
Direct – only people who already have permission [Recommended]
Internal – only people in the organisation
Anonymous access – anyone with the link
You can limit external sharing by domain, by allowing or blocking sharing with people on selected domains.
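To illustrate how an allow list differs from a block list, here is a minimal Python sketch of the domain check. It is an assumption about the logic, not the way Office 365 implements it; the function name `sharing_allowed` and the `mode` values are made up for this example.

```python
from typing import Iterable


def sharing_allowed(recipient: str, mode: str, domains: Iterable[str]) -> bool:
    """Check a recipient's email domain against an allow or block list.

    mode 'allow': only the listed domains may receive shared content.
    mode 'block': every domain except the listed ones may receive it.
    """
    domain = recipient.rsplit("@", 1)[-1].lower()
    listed = domain in {d.lower() for d in domains}
    if mode == "allow":
        return listed
    if mode == "block":
        return not listed
    raise ValueError(f"unknown mode: {mode!r}")
```

An allow list is the more conservative choice because any domain that has not been explicitly named is refused.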
External users have two options:
External users must accept sharing invitations using the same account that the invitations were sent to [Recommended]
Let external users share items they don’t own. [This should normally be disabled]
A final ‘Share recipients’ checkbox allows the owners to see who viewed their files.
SharePoint admin center
The SPO admin center (to be upgraded in late 2017) has two options for sharing.
The first option is under the ‘sharing’ section which currently has the following options:
Sharing outside your organization
Control how users share content with people outside your organization.
Don’t allow sharing outside your organization
Allow sharing only with the external users that already exist in your organization’s directory
Allow users to invite and share with authenticated external users [Recommended]
Allow sharing to authenticated external users and using anonymous access links
Who can share outside your organization
[Checkbox] Let only users in selected security groups share with authenticated external users
Default link type
Choose the type of link that is created by default when users get links.
Direct – only people who have permission [Recommended, same as above]
Internal – people in the organization only
Anonymous Access – anyone with the link
Default link permission
Choose the default permission that is selected when users share. This applies to anonymous access, internal and direct links.
View [Recommended]
Edit
Additional settings (Checkboxes)
Limit external sharing using domains (applies to all future sharing invitations). Separate multiple domains with spaces.
Prevent external users from sharing files, folders, and sites that they don’t own [Recommended]
External users must accept sharing invitations using the same account that the invitations were sent to [Recommended]
Notifications (Checkboxes)
E-mail OneDrive for Business owners when
Other users invite additional external users to shared files [Recommended]
External users accept invitations to access files [Recommended]
An anonymous access link is created or changed [Recommended]
Sharing via the Site Collections option
In addition to the options above, sharing options for each SharePoint site are set in the ‘site collections’ section as follows. Note that the default is ‘no sharing allowed’. A conscious decision must be taken to allow sharing, and what type of sharing.
When a site collection name is checked, the following options are displayed.
Sharing outside your company
Control how users invite people outside your organisation to access content
Don’t allow sharing outside your organisation (default)
Allow sharing only with the external users that already exist in your organization’s directory
Allow external users who accept sharing invitations and sign in as authenticated users
Allow sharing with all external users, and by using anonymous access links
If anonymous access is not permitted (setting above), a message in red is displayed:
Anonymous access links aren’t allowed in your organization
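The red message suggests that the organisation-wide setting acts as a ceiling on what each site collection can allow. A minimal sketch of that relationship in Python follows; the level names in `LEVELS` and both function names are hypothetical, and the ‘most restrictive wins’ rule is my reading of the behaviour rather than a documented algorithm.

```python
# Sharing levels from most to least restrictive (names are illustrative only).
LEVELS = ["none", "existing_external", "authenticated_external", "anonymous_links"]


def effective_site_sharing(organisation: str, site: str) -> str:
    """Return the more restrictive of the organisation and site collection settings."""
    return min(organisation, site, key=LEVELS.index)


def anonymous_links_available(organisation: str, site: str) -> bool:
    """Anonymous links only work if both levels permit them."""
    return effective_site_sharing(organisation, site) == "anonymous_links"
```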
SharePoint Sharing option
The SharePoint Admin Centre has an additional ‘Sharing’ section with the same settings as shown above for ODfB. It is expected that these multiple options will be merged in the new SharePoint Admin Centre due for release in late 2017.
Additional security controls
In addition to all the above settings, there are a range of additional controls available:
All user activities related to SPO and ODfB, including who accessed, viewed, edited, deleted, or shared files, are accessible in the audit logs.
SPO and ODfB content may be picked up by Data Loss Prevention (DLP) policies and users prevented from sending them externally. This is of course subject to the DLP policies being able to identify the content correctly.
SPO and ODfB content may be subject to records retention policies set by preservation policies. These may impact on the ability to send documents externally.
SPO and ODfB content may be subject to an eDiscovery case.
Administrators can be notified when users perform specific activities in both SPO and ODfB.
Sharing (and access to the documents once shared) may be subject to security controls enforced through Microsoft Information Protection.
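As an illustration of the audit log point above, the Python sketch below filters a set of exported audit records for sharing-related activity. The record layout (`Operation`, `UserId`) and the operation names are assumptions made for this example; check them against an actual export before relying on them.

```python
from collections import Counter

# Assumed names of sharing-related operations in the exported audit records.
SHARING_OPERATIONS = {"SharingSet", "SharingInvitationCreated", "AnonymousLinkCreated"}


def sharing_events(records):
    """Keep only the audit records that describe a sharing action."""
    return [r for r in records if r.get("Operation") in SHARING_OPERATIONS]


def shares_per_user(records):
    """Count sharing actions per user."""
    return Counter(r.get("UserId", "unknown") for r in sharing_events(records))


# Made-up sample records, for illustration only.
sample = [
    {"Operation": "FileAccessed", "UserId": "amy@example.com"},
    {"Operation": "SharingInvitationCreated", "UserId": "amy@example.com"},
    {"Operation": "AnonymousLinkCreated", "UserId": "bob@example.com"},
    {"Operation": "SharingSet", "UserId": "amy@example.com"},
]
```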
Conclusion
In summary, the settings above allow an organisation to strongly control what can be shared. If sharing is allowed, certain additional controls determine whether the sharing is for internal users or for users external to the organisation. If the latter is chosen, there are further controls on what external users can do. Audit controls and policies may also control how users can share information externally.
The key takeaway is that organisations should ensure that the sharing options available in Office 365 are based on the organisation’s business requirements and security risk framework.
Office 365 includes a range of information security and protection capabilities. This post focusses on the configuration and implementation of Data Loss Prevention (‘DLP’) capabilities in SharePoint Online and OneDrive for Business (ODfB).
The purpose of DLP is to protect specific and definable types of sensitive company or agency information by preventing (or monitoring) its deliberate or inadvertent exfiltration from the organisation.
Examples of exfiltration methods where DLP can be used include:
Attachments to emails.
Uploads to web-based systems.
Examples of the types of sensitive information that can be protected with DLP include:
Financial data. For example, bank account numbers, tax file numbers, credit/debit card numbers.
Personal and sensitive information (PSI). For example, driver’s licence numbers, tax file numbers, passport numbers.
Medical and Health records. For example, medical account numbers.
The requirement to protect sensitive information is the subject of legislation in a number of countries.
Enabling Data Loss Prevention in Office 365
DLP in Office 365 is enabled through policies that are set in the Security and Compliance Admin Centre of the Office 365 Admin Portal, under ‘Threat Management’ > ‘Data Loss Prevention’.
DLP policies are set by the Office 365 Global Administrator, as well as the Compliance Administrator and/or the Security Administrator if these roles have been configured in the Security and Compliance Admin Centre.
To create a DLP policy, the Administrator clicks on the + icon in the Data Loss Prevention screen. This opens a new window with the following options displayed.
A custom policy is one that is defined by the organisation. It would normally be for content that contains specific values.
The options ‘Financial regulations’, ‘Medical and health regulations’, and ‘Privacy regulations’ include default Microsoft-provided policies. Each of these default policies includes a description, coverage (e.g., what information is protected), and where the information is to be protected (e.g., in SharePoint Online, OneDrive for Business, and Exchange Online).
Enabling and modifying default policies
After selecting a default policy, the authorised user must then identify the services that may store the information that needs to be protected – SharePoint Online, OneDrive.
Note: The option to choose Exchange Online is (as of 13 March 2017) still unselectable.
The next option allows the Administrator to customise the rule that has been chosen. If a default policy has been selected in the previous dialog, options for that policy will display; these may include ‘count sensitivity’ (i.e., how many times the sensitive content must be identified before the rule is triggered). A low count means high sensitivity to sensitive content.
The Administrator may add a new rule or edit one of the default options.
The Administrator may modify the conditions, actions and what happens when there is an incident for each of the default policies – see below for further details.
Defining custom DLP policies
If a custom policy is required, the Administrator clicks on ‘Custom Policy’ from the ‘Data Loss Prevention’ opening dialog screen, and then ‘Next’ at the bottom of the screen. The Administrator must define which services are to be protected (same as for default policies, above).
The next screen allows the Administrator to create a new policy, via the + icon.
In the new window that opens, the Administrator must then define the new DLP rule through Conditions, Actions, Incident reports and General.
Conditions, Actions, Incident reports, General options
For either default or custom policies, the Administrator must set the following rules:
Conditions – what will cause the policy to run?
Actions – what will happen when the policy runs?
Incident reports – how is reporting managed?
General – any other points.
Conditions
For default policies, conditions are pre-defined and are based on (a) the type of content (e.g., credit card numbers, bank account numbers) and (b) whether the content is shared internally or externally.
These pre-defined conditions may be removed or edited, and new conditions may be added. Editing options include the number of times the sensitive content is found (‘Min count’, ‘Max count’), and both maximum and minimum percentage-based ‘confidence levels’.
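To make the ‘Min count’, ‘Max count’ and confidence settings more concrete, here is a small Python sketch of how such a condition might be evaluated. It assumes some detector has already produced a list of detections, each with a percentage confidence. The `Detection` and `Condition` classes are hypothetical and the default values are arbitrary, not Microsoft’s.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    info_type: str      # e.g. "Credit Card Number"
    confidence: int     # percentage, 0-100


@dataclass
class Condition:
    info_type: str
    min_count: int = 1
    max_count: Optional[int] = None   # None means no upper limit
    min_confidence: int = 75
    max_confidence: int = 100

    def matches(self, detections: List[Detection]) -> bool:
        """True if enough sufficiently confident detections were found."""
        hits = [
            d for d in detections
            if d.info_type == self.info_type
            and self.min_confidence <= d.confidence <= self.max_confidence
        ]
        if len(hits) < self.min_count:
            return False
        return self.max_count is None or len(hits) <= self.max_count
```

A ‘Min count’ of 1 makes the rule fire on a single match, which is the ‘high sensitivity’ case; raising it means only documents with several matches are caught.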
For custom policies, the Administrator must define which conditions are to be met:
If you choose ‘Content contains sensitive information’, you must define the information through a + option. This brings up all the default choices provided by Microsoft.
If you choose ‘Content is shared with’, it allows you to define if the information is shared with people inside or outside the organisation.
If you choose ‘Document properties contain any of these values’, you must define the values that would be found in a document. Note that, if this option is selected, the property must be configured in the SharePoint Online search settings.
Actions
For default policies, the actions to be taken are pre-defined and are based on sending a notification.
For custom policies, the Administrator must first decide whether the action will be to (a) block the content or (b) send a notification.
If ‘Block the content’ is selected the user will be unable to send an email or access the shared content.
If ‘Send a notification’ is selected, it offers the same options as for default policies. Note the ability to customise the email notification.
Incident Reports
When ‘Incident Reports’ is selected for both custom and default policies, the following options are available. Incident reports should be sent to the Administrator/s.
General
Default policies are pre-named but the name can be modified. This is also where the policy can be disabled.
Custom policies must be named and a decision made whether to enable it, test it, or turn it off. As noted below it is possible to test the policy first, to collect data.
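The difference between enabling, testing and turning off a policy can be sketched as follows. This Python example is an assumption about the behaviour (in particular, that test mode records an incident without blocking or notifying anyone); the `Mode` enum and `apply_policy` function are hypothetical names.

```python
from enum import Enum


class Mode(Enum):
    ON = "on"
    TEST = "test"
    OFF = "off"


def apply_policy(mode: Mode, action: str, condition_met: bool) -> dict:
    """Decide what happens for one piece of content.

    action is 'block' or 'notify'. In TEST mode (an assumption in this
    sketch) the incident is recorded but the action is not enforced.
    """
    if mode is Mode.OFF or not condition_met:
        return {"blocked": False, "notified": False, "incident_logged": False}
    enforce = mode is Mode.ON
    return {
        "blocked": enforce and action == "block",
        "notified": enforce and action == "notify",
        "incident_logged": True,
    }
```

Running a new policy in test mode first gives an idea of how many incidents it would generate before any user is affected.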
DLP Reporting
Reporting from the DLP policies is accessed from the Security and Compliance Centre > Reports > Dashboard.
Office 365 includes a range of information security and protection capabilities. These capabilities are first set in Azure and then applied across the Office 365 environment, including in Exchange and SharePoint Online. This post focuses on the application of these capabilities and settings to SharePoint Online.
Enterprise E3 and E4 plans include the ability to protect information in Office 365 (Microsoft Exchange Online, Microsoft SharePoint Online, and Microsoft OneDrive for Business). If you don’t have one of those plans you will need a subscription to Microsoft Azure Rights Management.
Enabling Information Protection in Azure
The following steps must be carried out the first time Information Protection is enabled on Azure:
Log on to Azure (as a Global Administrator).
On the hub menu, click New. From the MARKETPLACE list, select Security + Identity.
In the Security + Identity section, in the FEATURED APPS list, select Azure Information Protection.
In the Azure Information Protection section, click Create.
This creates the Azure Information Protection section so that the next time you sign in to the portal, you can select the service from the hub ‘More Services’ list.
Default Azure Information Protection policies
There are four default levels in Azure Information Protection:
Public
Internal
Confidential
Secret
Once set, these levels can be applied as labels to information content. Sub-labels and new labels may also be created, as necessary via the ‘+ Add a new label’ option.
The configuration settings are shown below:
Each of these label/level settings may:
Be enabled or disabled
Be colour-coded
Include visual markings (the ‘Marking’ column)
Include conditions
Include additional protection settings.
Each includes a suggested colour and recommended tip, which are accessed via the three dot menu to the right of each label.
Markings
When selected, this option will place a label watermark text on any document when the label is selected.
Conditions
Conditions may be applied, for example, if credit card numbers are detected in the text. It allows the organisation to define how conditions apply, how often (Occurrences), and whether the label would be applied automatically or is just a recommended option.
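A rough Python sketch of such a condition is shown below. The pattern is a deliberately simple stand-in for credit card detection (four groups of four digits) and is not the detection logic Azure Information Protection uses; the `LabelCondition` class and its field names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LabelCondition:
    label: str
    pattern: str                 # regular expression to look for
    min_occurrences: int = 1     # the 'Occurrences' setting
    automatic: bool = False      # False = only recommend the label

    def evaluate(self, text: str) -> Optional[Tuple[str, str]]:
        """Return ('apply' | 'recommend', label), or None if the condition is not met."""
        count = len(re.findall(self.pattern, text))
        if count < self.min_occurrences:
            return None
        return ("apply" if self.automatic else "recommend", self.label)


# Simplified, illustrative pattern: 16 digits in groups of four.
CARD_PATTERN = r"\b(?:\d{4}[ -]?){3}\d{4}\b"
```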
Global Policy Settings
In addition to the settings per level, there are three global policy settings:
All documents and emails must have a label (applied automatically or by users): Off/On
When set to On, all saved documents and sent emails must have a label applied. The labeling might be manually assigned by a user, automatically as a result of a condition, or be assigned by default (by setting the Select the default label option).
Select the default label:
This option allows the organisation to set the default label to be assigned to documents and emails that do not have a label.
Note: A label with sub-labels cannot be set as the default.
Users must provide justification to set a lower classification label, remove a label, or remove protection: Off/On [Not applicable to sub-labels]
This option allows you to request user justification to set a lower classification level, remove a label, or remove protection. The action and the user’s justification reason are logged in their local Windows event log: Application > Microsoft Azure Information Protection.
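The justification rule can be thought of as a comparison between the current and the new label. The Python sketch below assumes the four default levels listed earlier are ordered from least to most sensitive; the `needs_justification` function is hypothetical, and removing protection is left out for brevity.

```python
# Assumed ordering of the default labels, least to most sensitive.
LABEL_ORDER = ["Public", "Internal", "Confidential", "Secret"]


def needs_justification(current, new) -> bool:
    """True if the change lowers the classification or removes the label.

    None stands for 'no label'.
    """
    if current is None:
        return False          # nothing to lower or remove
    if new is None:
        return True           # the label is being removed
    return LABEL_ORDER.index(new) < LABEL_ORDER.index(current)
```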
Custom Site
A custom site may be set up for the Azure Information Protection client ‘Tell me more’ web page.
Unique ‘Scoped’ Policies
In addition to the default policies listed above, a unique policy may be created. These are called Scoped Policies.
Enabling (and Disabling) Azure Information Protection
The steps above are used to set up the labels. They must then be enabled to provide protection. The steps below also allow protection to be removed.
From the Azure Information Protection section, click on the label to be set, then click on Protect. This action opens the Permission settings section.
Select Azure RMS and ‘Select template’, and then click the drop down box and select the default label template. This will probably show as, e.g., ‘(Your Company Name) – Confidential’.
Click ‘Done’ to enable this label and repeat for the others.
Note: If a new template is created after the Label section is opened, you will need to close this section and return to step 2 (to select the label to change), so that the newly created template is retrieved from Azure.
Removing Protection
To apply a label that has this option, users must have the appropriate permissions to remove Rights Management protection. This requires them to have the Export (for Office documents) or Full Control usage right, or be the Rights Management owner (which automatically grants the Full Control usage right), or be a super user for Azure Rights Management. The default rights management templates do not include the usage rights that let users remove protection.
If users do not have permissions to remove Rights Management protection and select this label with the Remove Protection option, they see the following message: Azure Information Protection cannot apply this label. If this problem persists, contact your administrator.
Additional notes
If a departmental template is selected, or if onboarding controls have been configured:
Users who are outside the configured scope of the template or who are excluded from applying Azure Rights Management protection will still see the label but cannot apply it. If they select the label, they see the following message: Azure Information Protection cannot apply this label. If this problem persists, contact your administrator.
All templates are always shown, even if only a scoped policy is configured. For example, if a scoped policy is configured for the Marketing group, the Azure RMS templates that can be selected will not be restricted to templates that are scoped to the Marketing group – it is possible to select a departmental template that the selected users cannot use. It is a good idea (to help troubleshoot issues later on) to name departmental templates to match the labels in the scoped policy.
Once these settings are made, they need to be published (via the ‘Publish’ option) to become active.
Enabling Information Protection in Office 365
Activating Information Protection in the Office 365 Admin Portal
Once they have been configured and published, it is then necessary to enable the required settings in the Office 365 Admin Portal (Settings > Services & add-ins > Microsoft Azure Information Protection).
To do this, log on to the Office 365 Admin Portal (as a Global Administrator) then click on ‘Services & add-ins’ under Settings. Click ‘Activate’ to activate the service.
Activating Information Protection for Exchange and SharePoint Online
Once the service is activated for Office 365, it can then be activated in the Exchange and SharePoint Admin Centres. In SharePoint Online this is done via the Admin Center section ‘Settings’ and ‘Information Rights Management (IRM)’.
Configuring SharePoint and SharePoint Libraries for IRM
As at 12 March 2017, it is only possible to link Azure Information Protection classification policies with SharePoint Online if a new site is created via the SharePoint end user portal, as it appears as an option when enabled. Sites created via the SharePoint Admin Portal do not (yet) include the option to apply a protection classification.
If the creation of sites via the SharePoint end user portal is enabled, users with appropriate permissions (e.g., Owners with Full Control) can apply Information Rights Management to SharePoint libraries in their sites.
IRM is enabled on each individual library or list where the settings will be applied via Library Settings > Information Rights Management, under Permissions and Management.
Check the box to ‘Restrict permissions on this library on download’. Only one policy can be set per library.
Assigning Information Protection labels to Office documents
When labels are configured and enabled, they can then be automatically assigned to a document or email. Or, you can prompt users to select the label that you recommend:
Automatic classification applies to Word, Excel, and PowerPoint when files are saved, and applies to Outlook when emails are sent. It is not possible to use automatic classification for files that were previously manually labeled.
Recommended classification applies to Word, Excel, and PowerPoint when files are saved.
Applying the policies to Exchange and Office
The site below describes how to apply these policies to Exchange and Office applications. These are not discussed further here.
Until now, the security of information stored in SharePoint on-premise implementations was largely based on access control groups that gave or restricted access to the content on the site. Access to the content, and the ability to do anything with it (e.g., edit, read), depended on what group you belonged to. The main five access control groups are:
SharePoint Administrator/s: Access to everything.
Site Collection Administrator: (Usually) access to everything, but this can be disabled.
Site Owners: ‘Full Control’ access to everything (except for the Site Collection Administration elements in Site Settings).
Site Members: ‘Contribute’ or add/edit access.
Site Visitors: Read only.
Other groups such as Designer and Reader existed for specific purposes.
At any point from the top level Site Collection downwards through all the content, these inherited permissions could be stopped and unique permissions – including for both individuals and new access groups – could be created and applied to control access to content.
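Inherited and unique permissions can be pictured as a walk up the content tree until a node with its own permissions is found. The Python sketch below is only a model of that idea; the `Node` class and the group names in the usage example are hypothetical, not SharePoint’s object model.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """A site collection, site, library, folder or item."""
    name: str
    parent: Optional["Node"] = None
    # None means 'inherit from parent'; a dict means inheritance is broken here.
    unique_permissions: Optional[Dict[str, str]] = None

    def effective_permissions(self) -> Dict[str, str]:
        """Walk upwards until a node with its own permissions is found."""
        node = self
        while node.unique_permissions is None and node.parent is not None:
            node = node.parent
        return node.unique_permissions or {}


# Usage: a folder with unique permissions inside a library that inherits.
root = Node("Site Collection", unique_permissions={
    "Owners": "Full Control", "Members": "Contribute", "Visitors": "Read"})
library = Node("Documents", parent=root)
hr_folder = Node("HR", parent=library, unique_permissions={"HR Team": "Contribute"})
hr_doc = Node("contract.docx", parent=hr_folder)
```

Here `library` still sees the three site groups, while `hr_doc` is visible only to the ‘HR Team’ group because its parent folder stopped inheriting.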
Audit logs supplemented access controls by providing details of who did (including changing security permissions) or accessed what, and when. While the SharePoint Administrator and Site Collection Administrator’s names are not visible to Site Owners, Members or Visitors, they appear in the audit logs if any activity is recorded. System account activity is also recorded in the logs.
New Security Controls in SharePoint Online
SharePoint Online brings a range of new options to protect the security of information, in addition to access controls. These options, some of which are included with SharePoint 2013 and onwards, are:
Information security classifications
Data Loss Prevention (DLP)
Audited sharing
Information Rights Management (IRM)
Shredded storage (new from SP 2013)
Two of these options can be seen in the following Microsoft diagram:
Source: ‘Monitoring and protecting sensitive data in Office 365’ https://msdn.microsoft.com/en-us/library/mt718319.aspx
Information Security Classifications
According to a number of online sources, from at least March 2011, Microsoft has classified its own information into three categories: High Business Impact (HBI), Moderate Business Impact (MBI), and Low Business Impact (LBI).
High Business Impact (HBI): Authentication / authorization credentials (i.e., usernames and passwords, private cryptography keys, PIN’s, and hardware or software tokens), and highly sensitive personally identifiable information (PII) including government provided credentials (i.e. passport, social security, or driver’s license numbers), financial data such as credit card information, credit reports, or personal income statements, and medical information such as records and biometric identifiers.
Moderate Business Impact (MBI): Includes all personally identifiable information (PII) that is not classified as HBI such as: Information that can be used to contact an individual such as name, address, e-mail address, fax number, phone number, IP address, etc; Information regarding an individual’s race, ethnic origin, political opinions, religious beliefs, trade union membership, physical or mental health, sexual orientation, commission or alleged commission of offenses and court proceedings.
Low Business Impact (LBI): Includes all other information that does not fall into the HBI or MBI categories.
Source: ‘Microsoft Vendor Data Privacy – Part 1’ (March 2011) https://www.auditwest.com/microsoft-vendor-data-privacy/
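The three categories work on a ‘highest impact wins’ basis: a site holding any HBI data is HBI, regardless of what else it holds. A minimal Python sketch of that rule follows. The data type names are a small, illustrative subset drawn from the definitions above, and the `classify` function is hypothetical.

```python
# Illustrative subsets only; the definitions above list many more data types.
HBI_TYPES = {"password", "private key", "passport number", "credit card number", "medical record"}
MBI_TYPES = {"name", "address", "email address", "phone number", "ip address"}


def classify(data_types) -> str:
    """Return 'HBI', 'MBI' or 'LBI' for a collection of data type names."""
    types = {t.lower() for t in data_types}
    if types & HBI_TYPES:
        return "HBI"
    if types & MBI_TYPES:
        return "MBI"
    return "LBI"
```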
Microsoft released code (via Github) to apply these classifications to SharePoint on-premise deployments in 2014.
In 2016 Microsoft released a Technical Case Study highlighting how it migrated all its SharePoint content to SharePoint Online – and how information classification formed part of that process.
Source: ‘SharePoint to the Cloud – Learn how Microsoft ran its own migration’ (Case Study – 2016) https://msdn.microsoft.com/en-us/library/mt668814.aspx
In May 2016, Microsoft announced that this form of classification would be added to new SharePoint Online site collections during 2016.
The application of security classifications to SharePoint Online sites has two elements:
Security and compliance policies, set by the SharePoint Administrator via either the ‘Security policies’ or ‘Data management’ section of the Office 365 Security & Compliance Center. [As of 23 May 2016 the only policies are ‘Device management’ and ‘Data Loss Prevention’. While the DLP policies appear to allow the inclusion of security classifications, it is expected that Microsoft will add more options to support the application of security classifications during 2016. See below for more information on DLP.]
A new drop-down, three choice (LBI, MBI, HBI) option in the ‘Start a new site’ dialogue box under the question ‘How sensitive is your data?’ The choice of classification invokes the relevant security and compliance policies.
Microsoft provides examples of the types of information that would be covered by each of these at this interactive site: https://www.microsoft.com/security/data/
The application of these policies will enable organisations to control what happens to information stored in sites assigned these classifications. Among other things, this can prevent users from sending (or trying to send) MBI or HBI classified information to people not allowed to receive or view it, including through DLP policies discussed in the next section.
Data Loss Prevention (DLP)
Data Loss Prevention policies allow organisations to:
Identify sensitive information across both SharePoint Online and OneDrive for Business sites (and in Exchange, through the same settings).
Prevent the accidental sharing of sensitive information, including information classified MBI or HBI.
Monitor and protect sensitive information in the desktop versions of Word, Excel and PowerPoint 2016.
Help users learn how to stay compliant by providing DLP tips.
View reporting on compliance with policies.
DLP Conditions
DLP works by giving Site Administrators the ability to create and apply DLP policies in the Security & Compliance Center for SharePoint (which includes OneDrive for Business; there is a separate Center for Exchange). In the Center, the Administrator navigates from ‘Security policies’ to ‘Data loss prevention’.
The DLP policy area includes a range of ‘ready-to-use’ financial, medical and privacy templates for a number of countries including the US, UK and Australia. Examples of pre-defined Australian sensitive information types include: bank account numbers, driver’s licence numbers, medical account numbers, passport numbers, and tax file numbers.
Specific actions must be set for every DLP policy; that is, what happens if the policy conditions are met. The default actions are:
Block access to content (for everyone except its owner, the person who last modified the content, and the owner of the site where the content is stored) AND send a notification by email.
Suggest a Policy Tip to users. Options are (a) Use the default Policy Tip or (b) Customise the Policy Tip.
Allow override options. There is one main checkable option (‘Allow people who receive this notification to override the actions in this rule’) and two sub options:
A business justification is required to override this rule, and
A false positive can override this rule.
In addition to these actions, where the DLP policy identifies sensitive content in a document stored in SharePoint Online or OneDrive for Business it displays a small warning ‘stop’ sign icon on the document icon. Hovering over the item displays information about the DLP policy and options to resolve it.
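The override options described above combine in a way that is easier to see in code. The Python sketch below is one reading of how the main checkbox and its two sub options might interact; the `override_permitted` function and its parameter names are hypothetical, and the case where neither sub option is ticked is an assumption.

```python
def override_permitted(
    allow_override: bool,
    require_justification: bool,
    allow_false_positive: bool,
    justification: str = "",
    reported_false_positive: bool = False,
) -> bool:
    """Decide whether a user may override a DLP rule."""
    if not allow_override:
        return False
    # Assumption: with neither sub option ticked, any override is accepted.
    if not (require_justification or allow_false_positive):
        return True
    by_justification = require_justification and bool(justification.strip())
    by_false_positive = allow_false_positive and reported_false_positive
    return by_justification or by_false_positive
```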
DLP Incident Reports
Incident reports are designed to alert a compliance officer to details of events triggered by the DLP conditions, and provide reporting on those events.
Audited Sharing
Information sharing is a common activity in SharePoint and, in SharePoint 2016 and SharePoint Online, it is actively encouraged through a new Share option.
In addition to other existing audit options, sharing activity can now be audited in SharePoint Online. The audit logs for Office 365 (which must be enabled) are accessed through the Office 365 Admin Center > Security & Compliance Center > Search & investigation > Audit log search.
Information Rights Management (IRM)
Microsoft’s Information Rights Management capability provides an additional layer of protection for a number of document types at the list and library level in SharePoint Online sites.
Supported document types include PDF, the 97-2003 file formats for Word, Excel and PowerPoint (e.g., Office documents without the ‘x’ at the end of the file extension – ‘word.doc’), the Office Open XML formats for Word, Excel, and PowerPoint (e.g. with the ‘x’ at the end – ‘word.docx’), and the XML Paper Specification (XPS) format.
According to Microsoft, IRM:
‘… enables you to limit the actions that users can take on files that have been downloaded from lists or libraries. IRM encrypts the downloaded files and limits the set of users and programs that are allowed to decrypt these files. IRM can also limit the rights of the users who are allowed to read files, so that they cannot take actions such as print copies of the files or copy text from them.’
IRM is enabled via the Office 365 Admin Center > Admin > SharePoint > Settings > Information Rights Management > ‘Use the IRM service specified in your configuration’ and then ‘Refresh IRM Settings’.
Image source: ‘Apply IRM to a List or Library’ https://support.office.com/en-us/article/Apply-Information-Rights-Management-to-a-list-or-library-3bdb5c4e-94fc-4741-b02f-4e7cc3c54aa1
When IRM is activated on a library, any file that is downloaded is encrypted so that only authorised people can view it. Again, according to Microsoft:
‘Each rights-managed file also contains an issuance license that imposes restrictions on the people who view the file. Typical restrictions include making a file read-only, disabling the copying of text, preventing people from saving a local copy, and preventing people from printing the file. Client programs that can read IRM-supported file types use the issuance license within the rights-managed file to enforce these restrictions. This is how a rights-managed file retains its protection even after it is downloaded.’
Shredded Storage
Shredded storage, as the name suggests, describes the way documents are stored in SharePoint, starting from SharePoint 2013. Instead of storing a document as a single blob, documents are stored in multiple blobs.
This is a more efficient – and possibly more secure – way to manage documents, because when a document is updated only the element/s that changed need to be written. According to a Microsoft presentation on 4 May 2016:
‘… every file stored in SharePoint is broken down into multiple chunks that are individually encrypted. And, the keys are stored separately to keep the data safe. In the future, we would like to give you the ability to manage and bring your own encryption keys that are used to encrypt your data stored in SharePoint. If you want, you can revoke our access to the keys. And we will not be able to access your data in the service’.
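The idea of storing a file as chunks and rewriting only the chunks that change can be sketched in a few lines of Python. This is a simplified illustration using fixed-size chunks and SHA-256 hashes; it is not the algorithm SharePoint uses, it leaves out the per-chunk encryption mentioned in the quote, and the function names and the tiny chunk size are chosen purely for the example.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def shred(data: bytes, size: int = CHUNK_SIZE):
    """Split the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE):
    """Return the indexes of chunks in `new` that differ from `old` or are new."""
    old_hashes = [hashlib.sha256(c).digest() for c in shred(old, size)]
    return [
        i for i, chunk in enumerate(shred(new, size))
        if i >= len(old_hashes) or hashlib.sha256(chunk).digest() != old_hashes[i]
    ]
```

With the chunk size above, changing one byte in the middle of a twelve-byte file means only the second of three chunks would need to be stored again.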
The Microsoft website ‘Monitoring and protecting sensitive data in Office 365’ provides further information about other Information Security options in Office 365, including reporting options to support auditing of activity in the tenant.