The Text Message Deduplication Dilemma - Part 3
Two People Walk Into a Bar, How Many Text Messages Come Out?
In part two of our series, we explored the conundrum of deduplicating text messages from multiple custodians with multiple devices. To further complicate the scenario, we added the fact that each device could have settings enabled or disabled that prevent the preservation of multimedia such as images and video.
In our example, the duplicates could multiply across our body of evidence with every added custodian and device, because each of these differences would keep our text message sources from deduplicating against each other, especially when each source is converted to a document format before the deduplication process.
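To make the problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two device exports, the phone numbers, the timestamps, and the rendering format are invented for illustration and do not reflect how any particular tool works. It shows that hashing the rendered "document" fails to match when one device did not preserve an image, while hashing the core fields of each message still matches.

```python
import hashlib

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical exports of the same conversation from two devices.
# Each row: (timestamp, sender, body, attachment filename or None).
# Device B had media preservation disabled, so the image is missing.
device_a = [
    ("2020-01-01T22:14:05Z", "+15550001", "Are you still there?", None),
    ("2020-01-01T22:15:40Z", "+15550002", "Yes, look at this", "IMG_0042.jpg"),
]
device_b = [
    ("2020-01-01T22:14:05Z", "+15550001", "Are you still there?", None),
    ("2020-01-01T22:15:40Z", "+15550002", "Yes, look at this", None),
]

def render_document(thread):
    """Flatten a thread into one text 'document' (convert-first approach)."""
    lines = []
    for ts, sender, body, attachment in thread:
        line = f"[{ts}] {sender}: {body}"
        if attachment:
            line += f" <attachment: {attachment}>"
        lines.append(line)
    return "\n".join(lines)

def message_key(msg):
    """Hash only the fields assumed to be stable across devices."""
    ts, sender, body, _attachment = msg
    return sha256("\x1f".join((ts, sender, body)))

# Document-level hashes differ because of the one missing attachment.
doc_hashes_match = sha256(render_document(device_a)) == sha256(render_document(device_b))
# Message-level keys still line up one for one.
msg_keys_match = [message_key(m) for m in device_a] == [message_key(m) for m in device_b]

print(doc_hashes_match, msg_keys_match)
```

Which fields belong in the key is a judgment call. Timestamps, for instance, may need normalizing before two devices agree on them.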
Text Message Review Scenarios
The problem is finding a practical way to do one (or more) of the following without having to review a myriad of duplicate text messages:
1. Create unified chat threads of all of the messages across all devices and custodians (global deduplication)
2. Isolate chat threads to a particular custodian (custodial level deduplication)
3. Isolate chat threads to a particular device for a particular custodian (device level deduplication)
One might wonder: would there ever be a matter in which scenario 2 or scenario 3 would be desirable in conjunction with scenario 1, so that all three are needed at once? The answer is a definitive “yes”!
A Spacey Case Study in Text Message Discovery
If you recall, recent headlines declared that the Kevin Spacey sexual assault case was dismissed. The details of the matter revolved around the accuser’s text messages, which the defense wanted to review at a more in-depth level. Without diving into too many specifics, it was alleged that various screen captures of the text messages had been altered and other key messages were suspected to be missing from conversations. The allegations placed the two at a bar, where the accuser sent multiple text messages to his friends telling them he was being groped against his will by Mr. Spacey.
So now we have two people in a bar and one texting, but how many text messages came out? The accuser sent texts to another individual, which means that individual would have a copy of the messages on their own device, along with their responses. Additionally (and this is just supposition at this point), that individual may have sent messages to other friends telling them the story (hey, news travels fast via text). Those friends may have texted others, and so on and so on.
In this scenario, the receiver of the accuser’s text messages was never subpoenaed (at least we cannot find a reference that states otherwise). This means that, had the defense obtained that phone, they could in theory have assembled a complete conversation using the first scenario in our list above. Definitely something we would want to do in this case to see the full and complete conversation!
However, we would also want to see all of the devices from our accuser and deduplicate those to see what messages were unique to that individual. This is equivalent to scenario 2 in our list. But in this case the mother of the accuser was known to have selectively deleted messages from her son’s phone to remove remnants of what she described as his “fratboy activities”. Sound suspect?
If you think so, then scenario 3 in our list above would apply as well, as we would want to know what specific text messages were deleted from that one device. Of course this last item requires we have other devices or backups to compare, but nonetheless it is definitely relevant in this scenario. (And why didn’t they ask for backups?)
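As a rough sketch of that comparison, assume we have the messages from the phone and from a backup of the same account as simple rows. The data, the phone numbers, and the helper names `message_key` and `missing_from` are all hypothetical, made up for illustration. The messages that appear in the backup but not on the phone are the candidates for deletion.

```python
import hashlib

def message_key(sent_at, sender, body):
    """Build a per-message fingerprint from fields assumed to match across sources."""
    raw = "\x1f".join((sent_at, sender, body))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def missing_from(device_msgs, reference_msgs):
    """Return reference messages whose key does not appear on the device."""
    present = {message_key(*m) for m in device_msgs}
    return [m for m in reference_msgs if message_key(*m) not in present]

# Hypothetical backup containing three messages.
backup = [
    ("2020-01-01T22:14:05Z", "+15550001", "first message"),
    ("2020-01-01T22:15:40Z", "+15550002", "second message"),
    ("2020-01-01T22:17:02Z", "+15550001", "third message"),
]
# Hypothetical phone export where the second message is gone.
phone = [backup[0], backup[2]]

deleted = missing_from(phone, backup)
print(deleted)
```

A message missing from one source is not proof of deletion on its own (retention settings can also remove messages), so this only narrows down where to look.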
In the end, if any of scenarios 1, 2, or 3 above is desired, the “convert to document first” methodology for deduplication is not optimal, and in some cases it will not work as desired. Deduplication would be time consuming, and isolating the specific chat threads would be cumbersome or would require a data analytics team and all devices to be present at once. Each would then need to be matched back to the original matter actor and... well, you get the picture.
The good news is there is a way to perform all three of our listed scenarios and have them all in one system at the same time, loaded over time. The trick is not to think of text messages as “documents”, because text messages were never a document in the first place! Text messages are just rows and rows of data in a self-contained database located on the device (or in the cloud). All you need is a system that can dynamically render the desired views of those chat threads.
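To illustrate the idea of treating messages as rows, here is a minimal sketch. It is not how ESI Analyst or any other product is implemented. The `MessageStore` class, its methods, the custodian names, and the sample rows are all hypothetical. Each unique message is stored once, every (custodian, device) source it was seen on is recorded, and the global, custodian, and device views are just filters over that same data.

```python
import hashlib
from collections import defaultdict

def message_key(sent_at, sender, body):
    """Build a per-message fingerprint from fields assumed to match across sources."""
    raw = "\x1f".join((sent_at, sender, body))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

class MessageStore:
    """Hypothetical store: one row per unique message, plus where it was seen."""

    def __init__(self):
        self.messages = {}               # key -> (sent_at, sender, body)
        self.sources = defaultdict(set)  # key -> {(custodian, device), ...}

    def load(self, custodian, device, rows):
        """Add a device export; can be called again as new devices arrive."""
        for sent_at, sender, body in rows:
            key = message_key(sent_at, sender, body)
            self.messages.setdefault(key, (sent_at, sender, body))
            self.sources[key].add((custodian, device))

    def view(self, custodian=None, device=None):
        """Return deduplicated messages, optionally limited to a custodian or device."""
        selected = []
        for key, msg in self.messages.items():
            if any(
                (custodian is None or c == custodian)
                and (device is None or d == device)
                for c, d in self.sources[key]
            ):
                selected.append(msg)
        return sorted(selected)  # ISO timestamps sort chronologically

m1 = ("2020-01-01T22:14:05Z", "+15550001", "first message")
m2 = ("2020-01-01T22:15:40Z", "+15550002", "second message")
m3 = ("2020-01-01T22:17:02Z", "+15550001", "third message")

store = MessageStore()
store.load("Alice", "phone", [m1, m2])
store.load("Alice", "tablet", [m1])
store.load("Bob", "phone", [m1, m2, m3])

global_view = store.view()                                     # scenario 1
alice_view = store.view(custodian="Alice")                     # scenario 2
alice_tablet = store.view(custodian="Alice", device="tablet")  # scenario 3
print(len(global_view), len(alice_view), len(alice_tablet))
```

Because provenance is stored alongside each message rather than baked into a rendered document, adding another device later does not require reprocessing what is already loaded.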
The key to proper electronic discovery of text messages is to have the right software to deduplicate each message individually first, then convert the resulting responsive threads to a document for production. And yes, that is our shameless plug. Best thing to do now is to take a look at ESI Analyst. It can handle text messages for all of our scenarios and a whole lot more.