Email Archive Migration FAQ
What is an email archive migration?
An email archive migration is the process of copying the content of an existing email archive platform to a new target platform. Many factors create the need for migrations including:
a) Changing the underlying storage platform of the current archive,
b) Consolidation of environments in the event of a merger or acquisition,
c) Splitting a subset of the content to another location (de-merger), and
d) Migrating from an on-premises to a cloud-based solution (Google, Mimecast, Office365, Symantec EV.Cloud etc.).
Email archive migrations can be completed either manually or in an automated fashion using specialist third-party software tools.
What is an automated email archive migration?
An automated email archive migration leverages specialist migration software to simplify and streamline the migration process, thus saving time and money, and significantly reducing the risk of human error. The process involves a careful mapping of source to destination platforms, and a set of tasks that can run continuously to copy and re-point the archive data from source to destination platform. In an automated migration, error handling and detailed reporting are provided, as well as the ability to maintain chain of custody.
An automated migration typically provides a direct means of getting source data to the target. Where the target can be communicated with directly, data can be migrated in one step, directly from source to target without the need for an interim stage to PST/MSG/EML or the like. Where the target can only receive data in PST/EML/MSG, an automated migration will enhance the speed of extraction, and will provide an audit log of all messages moved.
Can I perform my own automated migration? Or do I need to have a specialist service?
An email archive migration is a complex undertaking. An automated migration requires involvement from an experienced and dedicated archive migration specialist, certified in the software tool being leveraged to complete the migration. Each migration is different, with many configuration permutations and combinations that need to be tested to ensure the optimum migration and success levels throughout. A migration service provider will have the experience, qualified skills, and methodology to deliver a successful migration.
Often, migration service providers will provide an end-to-end service, ensuring that all source data has migrated to the target. Some organisations consider contracting a specialist firm to provide design, implementation and initial user migrations, and then performing the balance of the migration themselves. Whilst this is a viable approach, it is not recommended. During a migration, issues will arise that require expertise in the source and target archives, as well as in the software tool being leveraged, in order to understand the underlying problem. Often the additional cost of an end-to-end migration is marginal, with the majority of organisations preferring to leave the migration to a specialist organisation that can deliver an outcome-based migration, i.e. all readable data migrated.
Why can’t I just use PST to migrate to a new archive platform?
Many archive platforms offer the ability to export and import archived mailboxes to and from PST files. At first, a manual approach using PST files as an interim format may seem a good option. However, in the majority of cases, manual migrations via PST files are slow, labour-intensive, and prone to human error. Moreover, they do not offer any error logging or audit capabilities, and are not suitable where a large number of mailboxes need to be migrated.
Some archive target platforms only accept data in PST format. In these circumstances, it is highly recommended that an automated software tool be used to perform the extraction of data to ensure that the issues identified with a manual process can be addressed as far as possible.
There are many issues with a manual PST extraction approach including, but not limited to:
- Slow and manually intensive – Typically, native archive extraction and import tools are single threaded and need to be manually overseen. Seldom can multiple mailboxes be scheduled to run, and they certainly cannot be left unattended. This significantly increases the migration time-frame, manpower, and costs involved.
- Little to no error management – If a PST extract or import fails, the process generally stops with no indication of the underlying problem, and rarely is there an ability to either identify and skip the items that fail, or to resume where the process left off.
- Little to no logging or auditing – When extracting to PST files there is seldom a record of which data has been moved. All checks must be done manually to ensure that all messages in the source have been extracted to the target. Such an approach is very time consuming and subject to human error.
- PSTs risk the security and integrity of your data – Coupled with the lack of built-in checks, the multi-step process involved in PST migrations means that the extracted data needs to be held on interim storage for a length of time, during which it is open to tampering or corruption. Chain-of-custody is broken as a consequence, making a PST-driven migration unsuitable for organisations that have compliance or regulatory requirements.
- Loss of compliance data – When using a journal archive it is likely that “envelopes” containing BCC and distribution list recipients will have been stored in the archive and need to be recreated on extraction. Manual extraction to PST files does not allow for this recreation of envelope information. Accordingly, BCC and distribution list data is lost and cannot be searched for in the target platform.
- Large PST files are prone to corruption – Often the standard tools shipped with an archive product do not allow a mailbox to be split into multiple small PST files. As such, when extracting large mailbox or journal archives, PST files can become quite large and are then prone to corruption. Where this occurs, the source data needs to be extracted again in the hope that it will not be corrupt and can be ingested to the target.
- Requires interim storage – Most email archive platforms provide single instance storage, i.e. only one copy of a message is stored in the database, regardless of the number of recipients. For example, if a mail message with a 10MB attachment was sent to 15 people, it will have been stored as one 10MB message in the archive, which (depending on the source archive and/or storage platform) may have been compressed as well. When extracting this data to PST, a copy of the message will need to be extracted into each recipient’s PST file. Therefore, the message that was previously 10MB will now become 150MB when extracted to 15 users’ PST files. This loss of single instance storage often means that the PST extraction is a minimum of 2.5 times the size of the original legacy archive.
- Limited or no flexibility in what is extracted – When extracting to PST files, typically all mailbox data will be extracted without an ability to filter or be selective over what is extracted and then ingested to the target.
An automated migration can address each of the above issues.
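The interim storage arithmetic in the list above can be checked with a few lines of Python. This is a minimal sketch, assuming every recipient's PST file receives a full, uncompressed copy of the message; the function names are hypothetical and do not come from any migration tool.

```python
def pst_extracted_size_mb(message_size_mb: float, recipients: int) -> float:
    """Size of one single-instanced message once copied into each recipient's PST.

    Assumption: each recipient's PST gets a full, uncompressed copy.
    """
    if recipients < 1:
        raise ValueError("a message needs at least one recipient")
    return message_size_mb * recipients


def expansion_factor(messages: list[tuple[float, int]]) -> float:
    """Ratio of extracted size to stored size for (size_mb, recipients) pairs."""
    stored = sum(size for size, _ in messages)
    if stored <= 0:
        raise ValueError("total stored size must be positive")
    extracted = sum(pst_extracted_size_mb(size, n) for size, n in messages)
    return extracted / stored


# The example from the text: one 10MB message sent to 15 people.
print(pst_extracted_size_mb(10, 15))  # 150
```

Running expansion_factor over a representative sample of messages gives a rough idea of how much interim storage a PST extraction would need relative to the source archive.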
What is Chain of Custody and why is it important?
Chain-of-custody refers to the reliable recording of processes and procedures that occur while evidence (physical or electronic) is being captured, held, transferred, or disposed of. For organisations with strict industry compliance regulations and internal data management policies, maintaining chain of custody is vital.
In relation to a migration, organisations required to maintain chain-of-custody must be able to demonstrate that the data has not been altered in any way as it transits from one archive platform to another. Manual migration methods that rely on problematic PST files are subject to human error, and have no tracking or auditing mechanism to prove that a migration was 100% successful.
Automated migrations that leverage third party software tools include complete auditing of the migration process. In some cases where the data must change to allow for it to be ingested into a new platform, such changes are logged and identified, proving the data itself isn’t altered. An automated migration service provides a full audit trail ensuring data can be seen as defensible evidence and means old archive platforms can be decommissioned without legal concerns.
For example, detailed reports can be provided to show 1:1 mappings of the ID of the item in the source archive and the ID of the new item as it is moved to the destination archive, enabling demonstration of a complete ‘Chain-of-Custody’ for the data. Importantly, migration tools are the only available solution that can preserve BCC recipients and distribution list information from journal mailbox archives. These messages may be part of legal proceedings or may be subject to regulatory requirements, so the loss of such vital information could be extremely costly and potentially damaging.
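To make the 1:1 mapping report more concrete, the following is a minimal sketch of what such an audit record could look like. The source only describes mapping source IDs to target IDs; the SHA-256 content fingerprint, the CustodyRecord type and all function names are assumptions added for illustration, not the format of any particular migration tool.

```python
import csv
import hashlib
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class CustodyRecord:
    source_id: str  # ID of the item in the source archive
    target_id: str  # ID of the new item in the destination archive
    sha256: str     # assumed extra: fingerprint of the message content


def fingerprint(raw_message: bytes) -> str:
    """Return a hex SHA-256 digest of the raw message bytes."""
    return hashlib.sha256(raw_message).hexdigest()


def record_item(source_id: str, target_id: str, raw_message: bytes) -> CustodyRecord:
    """Create the audit record at the moment the item is read from the source."""
    return CustodyRecord(source_id, target_id, fingerprint(raw_message))


def verify_item(record: CustodyRecord, migrated_bytes: bytes) -> bool:
    """True if the migrated content matches what was read from the source."""
    return record.sha256 == fingerprint(migrated_bytes)


def to_csv(records: list[CustodyRecord]) -> str:
    """Render the 1:1 mapping report as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source_id", "target_id", "sha256"])
    for r in records:
        writer.writerow([r.source_id, r.target_id, r.sha256])
    return buf.getvalue()
```

A byte-for-byte comparison like this only applies to items that are migrated unaltered; where the data must change to be ingested, the change itself would need to be logged instead.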
How will shortcuts be handled? Will they work in the new environment?
An automated migration service ensures that shortcuts can be seamlessly converted as part of the migration, allowing them to work with the new archive environment. Alternatively, they can be removed if they are no longer needed.
Why do Exchange journal archives require special care?
Journal archives tend to be extremely large, thus making a manual extraction approach slow and subject to size-related problems when relying on PST files as an interim store. Migration software tools allow journal and large mailboxes to be split into a number of separately handled virtual mailboxes of a user-defined size, allowing multiple processing threads to be applied to the migration of a single mailbox and significantly speeding up the migration task. If PST must be the target platform, an automated migration will ensure that the PST files are of a manageable size and contain all of the source objects.
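The idea of splitting a large mailbox into virtual mailboxes of a user-defined size can be sketched as a simple batching routine. This is an illustration only, assuming each item's size is known up front; the function name and the greedy in-order strategy are assumptions, not how any specific migration product works.

```python
from typing import Iterable, Iterator


def split_into_virtual_mailboxes(
    item_sizes: Iterable[tuple[str, int]], max_bytes: int
) -> Iterator[list[str]]:
    """Yield batches of item IDs whose combined size stays within max_bytes.

    Items are taken in order. An item larger than max_bytes is placed in a
    batch of its own rather than being dropped.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    batch: list[str] = []
    batch_bytes = 0
    for item_id, size in item_sizes:
        # Close the current batch if adding this item would exceed the limit.
        if batch and batch_bytes + size > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(item_id)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch could then be handed to its own worker, which is what lets several threads work on a single journal mailbox at once.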
The majority of organisations that capture emails into Exchange journal mailboxes do so for compliance reasons, and have standardised on using the “Envelope Journaling” feature. This feature was developed by Microsoft as a way to preserve vital header information including BCC recipients and the expanded members of any distribution lists. From a compliance perspective, this data must be preserved and available for access when performing e-discovery.
It is important to note that different archive platforms store the envelope information in different ways. For example, some store this information separately in the archive index, whereas others store it in the archive store itself. This creates a requirement to recreate the envelope information at the time of migration that is not supported by the manual tools provided by the source archive’s native extraction capabilities.
By automating a migration with third party software, organisations can be assured that this data is properly recreated and re-associated with the original email during a migration to a new platform. For more details on whether your particular journal mailbox migration path is supported, please contact us.
How will I know my migration has been successful?
What constitutes a successful migration will vary between organisations, as their requirements differ. Success can be measured by some or all of the following key success factors:
- More than 99.9% of the readable data has been migrated
- Chain-of-custody has been maintained
- Time frame to migrate data has been reasonable given the amount of data and health of the environment, compared with a manual approach
- Internal resources have not been overly consumed by migration project
- Corrupted data has been eliminated during the migration
- Migration project was cost effective
- Users experienced minimal impact on their mail and data retrieval experience
- Archived data has been effectively migrated into the new platform to support future data management and retrieval
A manual extraction cannot address the majority of the above criteria, driving organisations to take an automated approach.
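The first criterion in the list above is simple to check once item counts are available from the migration reports. The sketch below assumes two counts are known, the number of readable items in the source and the number migrated; the function names and the default threshold parameter are hypothetical.

```python
def readable_migration_rate(migrated: int, readable: int) -> float:
    """Fraction of readable source items that reached the target."""
    if readable <= 0:
        raise ValueError("readable must be positive")
    if not 0 <= migrated <= readable:
        raise ValueError("migrated must be between 0 and readable")
    return migrated / readable


def meets_threshold(migrated: int, readable: int, threshold: float = 0.999) -> bool:
    """True if more than `threshold` of the readable data has been migrated."""
    return readable_migration_rate(migrated, readable) > threshold
```

Items that were already unreadable in the source archive are excluded from the readable count, so they do not count against the result.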
What are failed or skipped items?
Failures in message migration may be temporary or permanent.
‘Temporary’ failures are usually due to environmental issues such as poor network bandwidth or high load on the legacy archive system. In such cases, the migration tools will automatically re-process the relevant item(s) for a specified number of times and/or at a different time of day until the object has been migrated.
‘Permanent’ failures are usually attributable to pre-existing problems in the source archive (i.e. not caused by the migration process). As such, it is likely that these items would NOT have been readable by any audit or e-discovery process. These kinds of failures tend to be rare, typically 0.01-0.1% of the overall archive content.
Skipped items typically occur when there are items in the source system that are of a type or format not supported by the target system, and these items cannot be altered to be accepted. These issues typically occur when migrating between Lotus Notes and Microsoft Exchange messaging platforms.
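The difference between temporary and permanent failures can be illustrated with a small retry loop. This is a minimal sketch, assuming the migration step signals the kind of failure by raising one of two exception types; the exception classes, the function name and the status strings are all hypothetical.

```python
import time
from typing import Callable


class TemporaryFailure(Exception):
    """Environmental problem, e.g. poor bandwidth or a busy source archive."""


class PermanentFailure(Exception):
    """Pre-existing problem in the source item; retrying will not help."""


def migrate_with_retry(
    migrate_item: Callable[[str], None],
    item_id: str,
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Try to migrate one item, re-processing it after temporary failures.

    Returns "migrated", "permanent_failure" or "retries_exhausted".
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            migrate_item(item_id)
            return "migrated"
        except PermanentFailure:
            # Log and move on; the item was unreadable before the migration.
            return "permanent_failure"
        except TemporaryFailure:
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # wait before re-processing
    return "retries_exhausted"
```

Items that end in "retries_exhausted" could be queued for a later run at a quieter time of day, in line with the re-processing behaviour described above.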
How long will the migration take?
This is the most common yet difficult question to answer as every environment performs very differently. When performing an automated migration we typically refer to speed in terms of messages per second, experiencing an average performance of approximately 30 messages/second. In some cases, performance can be as high as 200 messages/second, or as low as 10 messages/second.
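To show how the messages-per-second figures translate into elapsed time, here is a rough estimate. It assumes the migration runs continuously at a constant rate, which real environments rarely do; the 100 million message archive and the function name are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def estimated_days(total_messages: int, messages_per_second: float) -> float:
    """Days needed to migrate total_messages at a constant rate, running 24 hours a day."""
    if messages_per_second <= 0:
        raise ValueError("messages_per_second must be positive")
    return total_messages / messages_per_second / SECONDS_PER_DAY


# Hypothetical archive of 100 million messages at the rates quoted in the text.
for label, rate in [("low", 10), ("average", 30), ("high", 200)]:
    print(f"{label:>7}: {estimated_days(100_000_000, rate):.1f} days")
```

Under these assumptions the same archive takes roughly 116 days at 10 messages/second, 39 days at 30 messages/second and 6 days at 200 messages/second, which is why the achievable rate matters so much to planning.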
The main variables that affect migration performance are:
- Available network bandwidth
- Speed of the storage subsystem on which the legacy archive sits as well as the destination storage
- The ingestion performance of the target archive system
- The scheduling of other project elements such as the commissioning of the target environment
The speed at which a migration will take place is often not determined until the migration has commenced and core data has been gathered.