Forensic Linguistic Databank



One of the main aims of the Aston Institute for Forensic Linguistics is to create a pioneering forensic linguistics resource, the Forensic Linguistic Databank (FoLD). The databank will set a global precedent by providing a permanent, controlled-access online repository for malicious communication data, investigative interview data, and forensic evidence validation data for both speech and text. FoLD will thus comprise a wide range of data sets relevant to forensic linguistics, including commercial extortion letters, investigative interviews in police and other contexts, legal documents, forum posts from far-right online groups, and comment threads from political blogs.

The databank is intended not only to further academic research in forensic linguistics by developing new methods and approaches, but also to contribute directly to the delivery of justice. Research projects using this data will therefore validate methods for forensic analysis, improve the effectiveness of interviewing techniques used by British police, and help tackle internet crime and abuse on behalf of law enforcement beneficiaries such as the National Crime Agency.

Our People

Directors, Academic and Research Staff
Dr Sarah Atkins
Research Fellow in Forensic Linguistics
Professor Tim Grant
Director of the Aston Institute for Forensic Linguistics
Dr Lucia Busso
Research Fellow in Forensic Linguistics
Dr Márton Petykó
Research Associate in Forensic Linguistics


Creation of FoLD Databank

Prof. Tim Grant, Dr Sarah Atkins, Dr Lucia Busso and Dr Márton Petykó are developing a secure repository for forensic linguistic research data, which will be made available to internal and external research groups. This innovative data-sharing development aims to enable and encourage further research in forensic linguistics, thus building capacity in the discipline.


Prof. Tim Grant, Dr Sarah Atkins, Dr Lucia Busso and Dr Márton Petykó are developing an online corpus of malicious communications called TextCrimes, which will be made available to researchers and students in the field of forensic linguistics. Through TextCrimes, we are providing tagged data sets of malicious communications that can be downloaded or examined using an integrated set of search and analysis tools.

EXCROW - the Extortion Corpus of Writings

Prof. Tim Grant, Dr Sarah Atkins, Dr Lucia Busso and Dr Márton Petykó are working on the storage and textual analysis of ~300 historic commercial extortion letters and emails provided by law enforcement partners. The aim is to provide a resource for developing investigative techniques, particularly in genre analysis, topic modelling, authorship analysis, and the potential linking of texts.

Constructional complexity of lay legal language

Dr Lucia Busso is focusing on linguistic complexity in a specially collected corpus of English and Italian lay legal language. The project encompasses both lexical and syntactic complexity, and examines how these relate to comprehensibility. Using mainly Construction Grammar as a framework, she investigates complexity using both ecological corpus-based data and psycholinguistic experimental settings.

Trolling accusations and online hate speech

Dr Márton Petykó is running a corpus study examining trolling accusations as a potential indicator of hate speech and harassment in British and Hungarian online political discussions. This project draws on two existing corpora and involves a metapragmatic analysis of the trolling accusation comments and the alleged trolls’ comments.

Applied linguistic research ethics in professional settings

Applied linguistics has seen a rise in interprofessional research with external stakeholders, not least in forensic linguistics. Through a series of case studies, Dr Sarah Atkins is researching the practical and ethical challenges encountered by applied linguists working in these contexts. Her overview of current practices and challenges contributes to FoLD policies at AIFL.

Operation Heron

For this pilot project, Prof. Tim Grant, Dr Sarah Atkins, Dr Lucia Busso and Dr Márton Petykó are analysing a collection of historic abuse letters sent to individuals in the public eye from schools, hospitals, universities, and mosques, and also to individuals’ private homes, between 2007 and 2009. By combining the relatively recent technique of structural topic modelling with corpus linguistic analysis and the coding of themes, we are mapping distinctions between topics and changes in the themes of the letters over time.