The Text Message Deduplication Dilemma - Part 1
Let’s play "Fill in the Text Message Gaps"
Text messages offer digital investigators a world of valuable evidence beyond what email communications provide. Today’s society is mobile, and the use of email for corporate communications is on the decline. Where email offers a fairly straightforward means of identifying duplicates, that same simplicity does not exist in the world of messaging apps. This difference between email and text messages calls for a much different approach to deduplication. In this series, we are going to take a look at the issues the eDiscovery industry faces when it comes to taming the text message deduplication dilemma.
The Difference Between Email and Text Messaging
Email is centralized on a server and is delivered to various devices directly from that server. We will avoid the super-technical weeds here, but in a nutshell, that server is the storage system and relay point that ensures each message is delivered to each device. If you connect a new device, it will download those emails, and actions performed on them are synchronized across all of the connected devices. Of course there are exceptions, but for the most part this is how email works.
Today’s text message communications (SMS, MMS, iMessage, WhatsApp, LINE, etc.) are not quite the same. Each device enabled to receive text messages receives its own copy, independent of all the other devices. There is no single server that stores all of these messages. (Of course there are exceptions, depending on which form of text communication you use.) Some apps, like Slack, are server-based, but in this article we are talking about direct device-to-device text-based communications.
The Dilemma of Text Message Deduplication
Without a central server, if a user deletes a specific text message from their iPhone but the same messages were also delivered to their iPad, the message is gone from the iPhone while the iPad retains a copy. When we combine the extract from the iPhone with the one from the iPad, every message that exists on both devices is “doubled up.” Because discovery is often about filling in the gaps, we want to keep the message that was deleted from one device but survives on the other, without keeping the duplicate copies that combining the extracts creates.
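As a minimal sketch of the iPhone and iPad scenario above, the following Python combines two hypothetical extracts and removes duplicates message by message. The `Message` fields, the `message_key` hash, and the sample data are all assumptions made for illustration, not any platform's actual method. In particular, the sketch assumes both devices record identical sender, timestamp, and body values for the same message, which real extracts may not.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    device: str     # which extract the copy came from (hypothetical field)
    sender: str
    timestamp: str  # assumed already normalized to UTC in ISO 8601 form
    body: str


def message_key(msg: Message) -> str:
    """Hash the fields expected to match across devices; the device is excluded."""
    raw = "\x1f".join([msg.sender.strip().lower(), msg.timestamp, msg.body.strip()])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def merge_extracts(*extracts: list[Message]) -> list[Message]:
    """Combine extracts, keeping the first copy seen of each distinct message."""
    seen: dict[str, Message] = {}
    for extract in extracts:
        for msg in extract:
            seen.setdefault(message_key(msg), msg)
    # Same-format ISO 8601 strings sort chronologically.
    return sorted(seen.values(), key=lambda m: m.timestamp)


# Sample data: the 14:02 message was deleted from the iPhone but remains on the iPad.
iphone = [
    Message("iPhone", "+15550100", "2023-01-05T14:00:00Z", "Are we still on for Friday?"),
    Message("iPhone", "+15550199", "2023-01-05T14:05:00Z", "Call me instead."),
]
ipad = [
    Message("iPad", "+15550100", "2023-01-05T14:00:00Z", "Are we still on for Friday?"),
    Message("iPad", "+15550199", "2023-01-05T14:02:00Z", "Yes, bring the contract."),
    Message("iPad", "+15550199", "2023-01-05T14:05:00Z", "Call me instead."),
]

merged = merge_extracts(iphone, ipad)
```

Here `merged` holds three messages: the two found on both devices appear once each, and the message deleted from the iPhone is filled in from the iPad copy.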
Winning this “fill in the text message gap” game requires us to deduplicate these messages, and therein lies the problem with the vast majority of today’s electronic discovery tool sets. Why, one might ask? Because many electronic discovery platforms force you to make chat-based conversations into a document first, and then deduplicate that “document”. (Does RSMF ring a bell for you, Relativity users?)
We strongly recommend against the “convert to document first” method: it is neither a consistent nor a practical approach to text message deduplication and should never be your go-to solution. It can also be an expensive process, as you need to convert everything first, and once converted, you lose the message-by-message integrity required for meaningful and reliable deduplication.
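To illustrate why hashing a converted “document” loses message-by-message integrity, here is a small hedged sketch in Python. The helper names `document_hash` and `message_hashes` and the sample threads are hypothetical, and joining messages with newlines merely stands in for whatever conversion a real platform performs.

```python
import hashlib


def document_hash(messages: list[str]) -> str:
    """Render the conversation as one text 'document' and hash the whole thing."""
    return hashlib.sha256("\n".join(messages).encode("utf-8")).hexdigest()


def message_hashes(messages: list[str]) -> set[str]:
    """Hash every message on its own."""
    return {hashlib.sha256(m.encode("utf-8")).hexdigest() for m in messages}


# The same conversation from two devices; one message was deleted from the iPhone.
ipad_thread = [
    "Are we still on for Friday?",
    "Yes, bring the contract.",
    "Call me instead.",
]
iphone_thread = [
    "Are we still on for Friday?",
    "Call me instead.",
]

# Document-level comparison: one missing message changes the entire hash,
# so the two "documents" look unrelated and neither is deduplicated.
documents_match = document_hash(ipad_thread) == document_hash(iphone_thread)

# Message-level comparison: the overlap and the gap are both visible.
shared = message_hashes(ipad_thread) & message_hashes(iphone_thread)
only_on_ipad = message_hashes(ipad_thread) - message_hashes(iphone_thread)
```

In this sketch the document hashes differ even though two of the three messages are identical, while the message-level comparison identifies the two shared messages and the one that exists only on the iPad.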
In the next article in this short series, we will continue to dive into the text message deduplication dilemma. We will also explore the litany of issues text messages present when it comes to consistent deduplication across varied sources spanning multiple custodians and devices. While there is no perfect solution, ESI Analyst provides options for deduplication that many investigation platforms do not. We may be a bit partial, so we will leave it up to you to decide as we progress through this industry-wide problem.