The Text Message Deduplication Dilemma - Part 2
More Devices? More Problems.
In the first part of this series we talked about the differences between email and text communications. Additionally, we discussed the practicality of converting text-based communications into documents for the purposes of deduplication. Frankly, it can end up giving you quite the ediscovery headache, not to mention a completely unnecessary processing bill to pay!
Let's return to the iPhone and iPad scenario we outlined. If a user had deleted a text message from their iPhone but not from their iPad, collecting from both sources would let you fill in the missing pieces of the conversation, right? However, in doing so you would also end up with a lot of duplicate text messages on your hands. Alas, this is not the only problem the chat-to-document conversion approach presents.
A Frankenstein of Text Message Threading
What if you converted all of the messages on the iPhone into chat-based documents first, not knowing there was an iPad with some of the same messages? Later on, it is discovered that there is not only an iPad but an iCloud backup as well. Now what? Suppose you convert all of the newly found data into chat documents and attempt to deduplicate them against the previous data. Do you begin to see the issue? If any messages were missing from any of those sources, you just created a Frankenstein of text message threading. While many of the individual messages are duplicates, the documents as a whole are not.
In a multiple-device scenario, for the "convert to document" deduplication approach to have even a remote chance of success, all of the data sources for each custodian must be received and processed simultaneously.
This is the driver behind the text message deduplication dilemma. When items are converted into documents first and messages are missing from one source or another (or worse yet, there are multiple duplicates), the additional “documents” you just created will not deduplicate against each other. Unless that was your goal from the start, you’ve just increased your review team’s workload unnecessarily.
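To make the problem concrete, here is a minimal Python sketch. It is an illustration only, not how any particular processing tool works: the message fields (`sender`, `sent_at`, `body`), the phone numbers, the message text, and both hashing functions are hypothetical. It compares hashing a whole converted chat document against hashing each message on its own.

```python
import hashlib

def message_key(msg):
    """Hash a single message on sender, timestamp, and body (assumed fields)."""
    raw = "|".join([msg["sender"], msg["sent_at"], msg["body"]])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def document_hash(messages):
    """Hash an entire thread rendered as one chat document."""
    rendered = "\n".join(
        f'{m["sent_at"]} {m["sender"]}: {m["body"]}' for m in messages
    )
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()

# Hypothetical thread as it exists on the iPad (complete).
ipad = [
    {"sender": "+15550001", "sent_at": "2020-03-01T09:00:00Z",
     "body": "Are we still on for lunch?"},
    {"sender": "+15550002", "sent_at": "2020-03-01T09:01:00Z",
     "body": "Yes, noon works."},
    {"sender": "+15550001", "sent_at": "2020-03-01T09:02:00Z",
     "body": "See you then."},
]

# The same thread on the iPhone, where the user deleted the second message.
iphone = [ipad[0], ipad[2]]

# Document-level comparison: one missing message changes the whole hash.
docs_match = document_hash(iphone) == document_hash(ipad)

# Message-level comparison: keep the first copy seen for each message key.
unique = {}
for msg in iphone + ipad:
    unique.setdefault(message_key(msg), msg)

print(docs_match)                        # False
print(len(iphone + ipad), len(unique))   # 5 3
```

The two converted documents do not match, so both would go to review even though the iPhone copy contains nothing the iPad copy lacks. Compared message by message, the five collected messages reduce to three unique ones.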
To make matters worse, some devices may have settings that remove images and video after a set period of time, or that only keep messages for a short period. Data storage has always been a cost driver for phones, and users with older phones were constantly running out of space. To solve this problem, smartphone operating systems gave users the option to delete messages after a given time span or to remove media selectively through device settings.
Because these settings can vary from device to device, a given phone may or may not still hold the multimedia attachments for a message. This creates further frustration when attempting to deduplicate text messages across devices, especially MMS and app-based multimedia messages.
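A short Python sketch shows why purged media gets in the way. Everything in it is hypothetical: the field names, the sample message, and the placeholder attachment bytes. Matching on the core fields alone is shown only as one possible workaround, not a recommended or standard method.

```python
import hashlib

def full_hash(msg):
    """Hash sender, timestamp, body, and any attachment bytes (assumed fields)."""
    h = hashlib.sha256()
    for field in ("sender", "sent_at", "body"):
        h.update(msg[field].encode("utf-8"))
    for blob in msg["attachments"]:
        h.update(blob)
    return h.hexdigest()

def core_hash(msg):
    """Hash the same message with its attachments ignored."""
    return full_hash(dict(msg, attachments=[]))

# Hypothetical MMS as stored on a phone that keeps its media.
phone_copy = {
    "sender": "+15550001",
    "sent_at": "2020-03-01T09:05:00Z",
    "body": "Photo from the site visit",
    "attachments": [b"placeholder-image-bytes"],
}

# The same MMS on a tablet whose retention setting already removed the image.
tablet_copy = dict(phone_copy, attachments=[])

full_match = full_hash(phone_copy) == full_hash(tablet_copy)
core_match = core_hash(phone_copy) == core_hash(tablet_copy)

print(full_match)  # False
print(core_match)  # True
```

Ignoring attachments lets the two copies match, but it comes with a trade-off: two genuinely different messages that differ only in their media would also be treated as duplicates.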
In addition to the above, let’s examine a matter where you have three or four custodians, all with multiple devices and all with varied settings. To further complicate the process, these users also interacted with each other as well as with others outside of their own circle of communication. In this scenario the number of text messages can spread faster than COVID-19 at a beach party in Florida during spring break.
In the next article in our series on the text message deduplication dilemma, we will explore how a unique approach to text message discovery may be the answer. Hint: it has to do with ESI Analyst, but we will let you decide on the best fit for your matter (and your wallet).