Research Output
Utilising Reduced File Representations to Facilitate Fast Contraband Detection
Digital forensics practitioners can be tasked with analysing digital data, in all its forms, for legal proceedings. In law enforcement, this largely involves searching for contraband media, such as illegal images and videos, on a wide array of electronic devices. Unfortunately, law enforcement agencies are often under-resourced and under-staffed, while the volume of digital evidence and the number of investigations continue to rise each year, contributing to large investigative backlogs.

A primary bottleneck in forensic processing can be the speed at which data is acquired from a disk or network, which can be mitigated with data reduction techniques. The data reduction approach in this thesis uses reduced representations for individual images which can be used in lieu of cryptographic hashes for the automatic detection of illegal media. These approaches can facilitate reduced forensic processing times, faster investigation turnaround, and a reduction in the investigative backlog.

Reduced file representations are achieved in two ways. The first approach is to generate signatures from partial files, where highly discriminative features are analysed while reading as little of the file as possible. Such signatures can be generated either from header features of a particular file format or by reading logical data blocks, which works best when reading from the end of the file. These sub-file signatures are particularly effective on solid state drives and networked drives, reducing processing times by up to 70× compared to full file cryptographic hashing. Overall, the thesis shows that these signatures are highly discriminative, or unique, at the million image scale, and are thus suitable for the forensic context. This approach is effectively a starting point for developing forensic techniques which leverage the performance characteristics of non-mechanical media, allowing evidence on flash-based devices to be processed more efficiently.
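
As a rough illustration of the sub-file idea, the sketch below hashes a single logical block taken from the end of a file, together with the file size, instead of hashing the whole file. It is a minimal sketch under stated assumptions, not the scheme evaluated in the thesis: the function name subfile_signature, the 4096-byte block size, the inclusion of the file size, and the choice of SHA-256 are all illustrative.

```python
import hashlib
import os


def subfile_signature(path, block_size=4096, from_end=True):
    """Hash one block of a file plus its size (hypothetical scheme).

    Only `block_size` bytes are read, regardless of how large the file is.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if from_end and size > block_size:
            # Jump straight to the final block; cheap on non-mechanical media.
            f.seek(size - block_size)
        block = f.read(block_size)

    h = hashlib.sha256()
    h.update(size.to_bytes(8, "big"))  # assumption: file size is part of the signature
    h.update(block)
    return h.hexdigest()
```

A signature produced this way could be looked up in a set of known signatures in the same way a full file hash would be, at the cost of reading only one block per file.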

The second approach makes use of thumbnails, particularly those stored in the Windows thumbnail cache database. A method was developed which allows the image previews for an entire computer to be parsed in less than 20 seconds using cryptographic hashes, enabling rapid triage. The use of perceptual hashing allows variations between operating systems to be accounted for, while also allowing small image modifications to be captured in an analysis. This approach is not computationally expensive, yet has the potential to flag illegal media in seconds rather than the hour taken by traditional triage, making it a good starting point for investigations of illegal media.
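
To show why a perceptual hash tolerates small modifications where a cryptographic hash does not, the sketch below implements a generic average hash over an already decoded grayscale thumbnail and compares hashes by Hamming distance. This is an assumption-laden illustration: the abstract does not name the perceptual hash used, and the helpers average_hash and hamming, along with the 8×8 input grid, are hypothetical.

```python
def average_hash(pixels):
    """Return an integer whose bits mark pixels brighter than the mean.

    `pixels` is a 2D list of grayscale values, e.g. a thumbnail already
    decoded and downscaled to 8x8 (decoding is outside this sketch).
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# A simple gradient image and a uniformly brightened copy of it.
original = [[r * 8 + c for c in range(8)] for r in range(8)]
brighter = [[v + 10 for v in row] for row in original]

distance = hamming(average_hash(original), average_hash(brighter))
```

In this toy case the brightened copy yields the same bit pattern as the original, so the distance is zero, whereas any change to the bytes would alter a cryptographic hash completely. In practice a small distance threshold would be chosen to decide whether two thumbnails match.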

  • Type:


  • Date:

    30 October 2019

  • Publication Status:


  • Library of Congress:

    QA75 Electronic computers. Computer science

  • Dewey Decimal Classification:

    004 Data processing & computer science

  • Funders:

    Edinburgh Napier Funded


McKeown, S. Utilising Reduced File Representations to Facilitate Fast Contraband Detection. (Thesis). Edinburgh Napier University. Retrieved from



digital forensics; triage; image comparison; image processing; known file analysis; image thumbnails; cryptographic hashing; perceptual hashing
