- AIFL Annual Symposium 2020
Date: Thursday 24 September – Friday 25 September 2020
The Aston Institute for Forensic Linguistics (AIFL) hosted its Annual Symposium in September 2020. Due to current restrictions, the event was hosted virtually, yet still attracted a vast number of delegates.
The five centres of AIFL showcased the diversity in cutting-edge research taking place. Each of the centres had the opportunity to introduce themselves, their speakers and the exciting research projects their teams are currently working on.
The two-day online event featured numerous presentations and poster sessions from the Institute’s researchers. Throughout, AIFL facilitated rich discussions and created exciting interactions between delegates from law enforcement, legal and academic backgrounds. The poster sessions, delivered digitally through a new online environment, Welcome.me, allowed attendees to experience the ‘social spaces’ whilst interacting over coffee (albeit homemade!) and discussing the topics raised.
The event was exceptionally well attended: over six hundred delegates from across five continents signed up, and around three hundred attended the busiest session.
- Linguistic Approaches to Online Sexual Crime
Date: 6 Mar 2020
Prof Tim Grant and Dr Nicci MacLeod launched their new book 'Language and Online Identities: The Undercover Policing of Internet Sexual Crime.'
Forensic linguistics is at the cutting edge of the undercover policing of child sexual abuse on the open internet and dark web, and language and identity is a fundamental part of this. The authors have drawn on their extensive experience in training undercover officers to develop innovative methods in identifying the creation and performance of online personas, crucial in detecting identity disguise online.
Incorporating the book launch: Grant, T. & MacLeod, N. (2020) Language and Online Identities: The Undercover Policing of Internet Sexual Crime. CUP
10:00 - 11:00 Dr Nicci MacLeod, Northumbria University. Where it all Began: Linguistic Training for Online Identity Assumption
11:00 - 11:30 Dr Andrea Nini, University of Manchester. Authorship clustering for the dark web: Methodological and theoretical remarks.
11:30 - 12:00 Daniela Schneevogt - AIFL, Aston University. “we were just really in love”: referentiality and clusivity of the pronoun ‘we’ in a Dark Web community of child sex abusers.
12:00 - 1:00 Prof Nuria Lorenzo-Dus, Swansea University. Developing Resistance against Online Grooming: From Linguistic Analysis to Practice-based Interventions.
2:00 - 2:45 Matt Sutton, Senior Manager, National Intelligence Hub (CEOP), National Crime Agency. The linguistic contribution to the investigation of online sexual crime – Operation CACAM, the investigation into Matthew Falder.
2:45 - 3:15 Dr Emily Chiang, AIFL, Aston University, Dr Dong Nguyen, Alan Turing Institute and Prof Jack Grieve, University of Birmingham. Rhetorical analysis of suspected child sexual offenders’ interactions in a dark web image exchange chatroom
3:45 - 4:45 Prof Tim Grant, AIFL, Aston University. Linguistic identities: theory and practice in dark web child abuse fora.
4:45 Book launch celebration – with light refreshments.
Abstracts for academic talks
Dr Nicci MacLeod - Northumbria University. Where it all Began: Linguistic Training for Online Identity Assumption
The monograph launched during this event is the end product of almost ten years of involvement in the training of online undercover operatives (UCOs) in the linguistic aspects of identity disguise – a task that is required as part of a wide range of types of investigation, including those into the online sexual abuse and exploitation of children.
This talk tracked the development of linguistic input into this kind of training from the initial discussions and back-of-an-envelope thoughts on what might be the most relevant theories and ideas for trainees all the way through to the theoretically sophisticated approach to identity that has arisen from these research projects. Heeding Robert’s (2003) plea that “the design and implementation of [applied linguistics] research needs to be negotiated from the start with those who may be affected by it”, Nicci described how she worked in close partnership with practitioners in order to ensure the collaborative research was maximally impactful. Drawing on observations of trainee UCOs preparing for a live operation and a series of trials she and her team had run in which trainees had their performances assessed prior to and after linguistic training, Nicci demonstrated the measurable changes that her research and input have made to professional practice.
As well as addressing some stereotyped beliefs about the way particular groups of people use language online, linguistic input based on Nicci and colleagues’ research also raised trainees’ awareness of higher levels of linguistic analysis, such as pragmatics and interactional patterns. Nicci showed this enhanced knowledge being put into practice in a simulated operation carried out by an experienced UCO as part of the project. She concluded with some thoughts on how the work influenced her team’s thinking around language and identity, a theme picked up by Professor Grant in the afternoon session.
Dr Andrea Nini – University of Manchester. Authorship clustering for the dark web: Methodological and theoretical remarks
An important problem in dark web investigations is how to link usernames that belong to the same person in web forums. Combining data that belongs to the same offender can significantly help investigations, but often the only evidence available to link usernames is linguistic. In the field of computational authorship analysis, the task of grouping texts in a corpus by author is called ‘author clustering’ and it relies on cluster analysis techniques using the frequency of linguistic items as features. This task is related to ‘authorship verification’, or the task of confirming that a certain text was written by a specific suspect, which is one of the most difficult tasks in authorship analysis. This talk covered the methodological problems in applying these techniques to dark web forum data and proposed some theoretical solutions. Andrea included remarks on how studying this problem can shed more light on our understanding of linguistic individuality for forensic linguistics.
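As a rough illustration of author clustering as described above, the sketch below groups usernames by the similarity of their character trigram frequencies. It is a minimal example under stated assumptions, not the method presented in the talk: the choice of character trigrams, cosine distance, average-linkage merging and the `threshold` parameter are all illustrative, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def char_ngram_profile(text, n=3):
    """Relative frequencies of character n-grams (an assumed feature set)."""
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()} if total else {}


def cosine_distance(p, q):
    """1 minus cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_p * norm_q)


def cluster_usernames(posts, threshold):
    """Average-linkage agglomerative clustering of usernames.

    posts: dict mapping username -> concatenated text of that user's posts.
    Clusters are merged until the closest pair is farther apart than threshold.
    """
    profiles = {user: char_ngram_profile(text) for user, text in posts.items()}
    clusters = [[user] for user in posts]

    def linkage(a, b):
        pairs = [(x, y) for x in a for y in b]
        return sum(cosine_distance(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters
```

In practice the threshold would have to be set on data with known authorship, and very short posts give unreliable profiles, which is one of the methodological problems a forum setting raises.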
Prof Nuria Lorenzo-Dus, Swansea University. Developing Resistance against Online Grooming: From Linguistic Analysis to Practice-based Interventions.
The internet enriches children’s lives, providing learning, creative, entertainment and social opportunities. Yet it has a dark side, too, potentially exposing them to abuse and harm. This includes sexual grooming, known instances of which are increasing rapidly and with many more cases going unreported. Children who have been / are being groomed via the internet may not tell anyone because they feel ashamed or guilty; some may not even realise that they are being groomed given offenders’ manipulative tactics.
How can we better tackle the problem of online child sexual grooming? In this talk, Nuria advocated the importance of understanding both offenders’ linguistic modus operandi and child victims’ discourse within what is essentially a communicative process of entrapment. Firstly, she introduced the data (pseudo- and real- online grooming conversations) and methods (primarily Corpus Linguistics) that over a series of inter-connected studies have enabled the identification of complex communicative patterns within online child sexual grooming. Secondly, she focused on key results regarding offenders’ strategic use of ‘vague language’ and children’s attempts to resist grooming. Finally, Nuria discussed two interventions geared towards combating online child sexual grooming: a hybrid Artificial Intelligence - Corpus Linguistics tool for detecting groomer language and a prevention-oriented training resource for professionals designed to raise their awareness of groomers’ communicative tactics and children’s discourse in response to them. Both interventions are based upon multi-disciplinary academic work, integrating Linguistics, Computer Sciences, Criminology and Public Policy, and are being developed in collaboration with stakeholders, including child protection and law enforcement agencies.
Daniela Schneevogt, Aston Institute for Forensic Linguistics. “we were just really in love”: referentiality and clusivity of the pronoun ‘we’ in a Dark Web community of child sex abusers
Criminals use the Dark Web to build networks for conversation and support (Holt et al. 2015). For those with a sexual interest in children, the internet facilitates the abuse of children, the distribution and consumption of illicit imagery and the exchange of ideas and advice (Durkin et al. 2006; Cohen-Almagor 2013; Holt et al. 2015). Such communities create dense linguistic layers of meaning which are difficult to penetrate by persons outside the community. Drawing on Bell’s (1984) notion of audience design, Van Leeuwen’s (2013) social actor framework and Scheibman’s (2004) concept of clusivity, Daniela’s study aimed to investigate how users of a Dark Web child sex abuse forum use the first person plural pronoun ‘we’ by carrying out a two-fold annotation for semantic referents and clusivity. In these texts, first person pronouns are used in a much wider array of contexts than first anticipated. In addition to the well-studied variation in clusivity – that is, differences between exclusive and inclusive referents – large variation across two further axes was identified: group and function. For example, abusers normalise their actions when referring to both a child and a forum user together as ‘we’, portraying children as active and equal partners in those pseudo-intimate relationships. Scheibman’s (2004) clusivity categories are therefore not sufficient in explaining the different pragmatic functions of the pronoun ‘we’ in child abuse forum communication. Applications of these findings include online undercover policing, such as infiltration of crime-related fora, as discussed by Grant and MacLeod (2017, 2020).
Dr Emily Chiang, Aston Institute for Forensic Linguistics, Dr Dong Nguyen, Alan Turing Institute, Prof Jack Grieve, University of Birmingham. Rhetorical analysis of suspected child sexual offenders’ interactions in a dark web image exchange chatroom
Child sexual offenders regularly convene in online spaces to exchange illicit imagery and advice about abusive practices (Davidson & Gottschalk, 2011; Westlake & Bouchard, 2016). In response, law enforcement agencies around the world are increasingly deploying undercover officers who pose as offenders to gather intelligence and evidence on offending communities. Currently, however, little is known about how offenders interact online, raising significant questions around how undercover officers should ‘authentically’ portray the child sexual offender. Emily presented a linguistic description of authentic offender-offender interactions taking place on a dark web image exchange chatroom. She analysed the rhetorical moves and strategies of chatroom users and visualised users’ move structures using Markov chains, enabling comparison of the linguistic behaviours of specific user ‘types’. Emily and colleagues found that the predominant moves characterising this chatroom were Offering Indecent Images, Greetings, Image Appreciation, General Rapport and Image Discussion, and that these moves (and others) were employed differently by users of seemingly greater and lesser offending experience. Based on their findings, Emily suggested some practical take-home messages for undercover agents working in this domain.
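To make the Markov chain idea concrete, the sketch below estimates first-order transition probabilities between rhetorical moves from annotated move sequences. It is a minimal illustration, not the authors' implementation: the function name is hypothetical, the example sequences are invented, and the ‘Closing’ label is a placeholder rather than a move reported in the study.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities between moves.

    sequences: list of move sequences, each a list of move labels in the
    order they occur in one user's contribution.
    Returns a nested dict: probs[current_move][next_move] -> probability.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each move with the move that immediately follows it.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        move: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for move, followers in counts.items()
    }


# Invented toy sequences; 'Closing' is a hypothetical placeholder label.
toy_sequences = [
    ["Greetings", "General Rapport", "Closing"],
    ["Greetings", "Closing"],
]
toy_probs = transition_probabilities(toy_sequences)
```

Estimating one such table per user ‘type’ gives a simple basis for comparing how different groups sequence their moves.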
Prof Tim Grant, Aston Institute for Forensic Linguistics. Linguistic identities: theory and practice in dark web child abuse fora
Tim addressed the idea of a linguistic individual, and how as individuals we draw on an array of resources to perform a variety of online identities. In a theoretical aspect of this discussion, Tim explored how the resources we draw on enable but also constrain our identity performances, and he showed in practical terms how this has two implications for online undercover officers (UCOs). The first implication is that in attempting to perform as another person the most convincing route will be to acquire the resources that a target individual draws on in their identity performances; and that these resources can be identified through a detailed linguistic analysis of chat logs (as demonstrated by Dr MacLeod in the first session). The second implication is that undercover officers need to learn to suppress those resources which they commonly use to perform their everyday identities where these resources are not also shared by the targeted individual. Failure to achieve this suppression of identity resources can lead to the performance of hybrid identities somewhere between the UCO and their target identity. Tim illustrated these points with a series of examples from the dataset used in the book he was launching that day, and he concluded by considering next steps in linguistic research in assisting police in the investigation of online sexual crime.
- Forensic Linguistics and New Urban Varieties
This event explored the characteristics of linguistic varieties in urban contexts, and their implications for research in Forensics.
Reflections from the event below are by Natascha Rohde, PhD student, Aston Institute for Forensic Linguistics.
As one of the big themes within sociolinguistics, (new) urban varieties of language(s) have been observed and studied by many linguists in various contexts. However, in forensic linguistics, linguistic variation, especially in multicultural urban contexts, brings about new and different questions, into which this event offered an insight.
The day started with lexicographer and UBE (Urban British English) expert Tony Thorne from King’s College London giving an insight into his extensive experience working with police in interpreting drill lyrics and other texts written in UBE. Under the title 'Translating the language of violence: gang slang and Drill lyrics', he gave an overview of the various aspects to consider when dealing with UBE in a forensic or legal context and shared examples from his long-standing career as an expert in urban slang and rap lyrics.
The presentation of the following speaker, Yaron Matras from the University of Manchester, focused on the 'Structural and social aspects of cryptolects' and illustrated the many functions cryptolects can take on in their respective communities of practice. Matras emphasized the multidimensionality and complexity of cryptolects, highlighting their role in making everyday conversation inaccessible to people outside the group in order to coordinate the logistics of transactions, but also how they overlap with ethnolects, illustrated by the example of Shelta, the language of the Irish Traveller Community. While marginalised ethnic minorities use their own variety to flag solidarity and to speak in the presence of bystanders without being understood, their language also serves to aid group bonding, perform identity and talk about taboos in a respectful way, accepted by their community.
He also illustrated the wide range of linguistic features of various cryptolects, highlighting that they can be seen primarily as lexicons rather than fully fledged languages, and as symbiotic in nature, meaning that they exist in symbiosis with another language. Using the languages around them, most cryptolects work by manipulating part of the lexicon to make meaning inaccessible, which can be done by semantic extension, by adding or swapping syllables or by using a heritage language, like Irish in the case of Shelta.
The final talk of the day, delivered by Eithne Quinn and Latoya Reisner, also from the University of Manchester, was titled 'Procedural unfairness and racially loaded misunderstandings in the use of rap lyrics in UK criminal cases' and gave a remarkable insight into how the judicial system utilises drill rap lyrics in criminal proceedings. They provided a rich account of their experience of how drill lyrics have been used as evidence within the criminal justice system, often without the expert knowledge needed to contextualise and understand them. Showcasing examples of how drill lyrics have been plucked out of their initial textual context, leading to fundamentally changed meaning, Quinn made a convincing argument for how linguistic expertise can help address the inequality of arms and tackle injustice. They concluded with a call to combat racial injustice within the system by providing expert knowledge to defence counsel. They argued that this would address the inequality of resources when police and prosecution use drill lyrics as evidence in court to secure a conviction, often using inadequate approaches that lack linguistic expertise, fail to contextualise the lyrics appropriately and possibly prejudice juries.
The event concluded with a vivid and inspiring roundtable discussion about the role(s) (forensic) linguists can and should play in ensuring fair(er) trials and thereby making a well-founded “attempt to improve the delivery of justice” (Tim Grant) using linguistic methods and expertise.
- Researcher and Practitioner Ethics in Forensic Linguistics
Reflections from Dawn Knight, ‘Ethical considerations for corpus construction: a Welsh language case study’ below are by Fiona Klecher, PhD student, Aston Institute for Forensic Linguistics.
Dawn Knight discussed the ethical considerations, from data collection through to publication, which affected each stage of creating the open-source CorCenCC corpus, the first national corpus of contemporary spoken, written and digital Welsh. One consideration was explaining the concept of a corpus and gaining informed consent, particularly with children. They were helped in this task by “Cor-pws-the-cat”, the project’s mascot, who was used to explain the idea of ‘sharing words’.
Some contributors were reluctant to be recorded because of concerns about being identified. Anonymisation is a complicated issue, compounded by the relatively small numbers of Welsh speakers. Even when identifying features such as names and addresses were removed, there were still concerns that people could be ‘re-identified’ through accent, dialect or recognisable situations. Knight questioned whether it is possible to be truly anonymous in a minority language context. This led to an interesting discussion about finding the right balance between protecting contributors’ anonymity, and retaining as much linguistic detail as possible.
- Perspectives on Transcription in Criminal Justice
Date: 11th March 2021
An excellent event summary written by Debbie Loakes, University of Melbourne, is available from the Research Hub for Language in Forensic Evidence.
Transcription is almost always an institutional practice (Park & Bucholtz 2009). Across a range of institutional settings, ‘practitioners’ are eliciting and capturing spoken talk from ‘clients’ (Sarangi 1998), transcribing that talk, and later repurposing the transcripts in place of the original interaction. The transcription provides a written record of the spoken interaction, to be used by another party at a later date, in another setting or context.
Our point of departure for this event was that written records, and hence transcripts, are certainly necessary. However, we acknowledge that no transcript of spoken interaction can be exact. Over the three sessions we highlight how the transcripts are only ever a representation of the spoken talk and never direct copies, and they inevitably result in a loss of detail. While these ideas are well established in branches of linguistics that deal with transcription, they are not always clearly understood within the law. This has implications for the administration of justice, as our speakers will demonstrate in relation to transcription of police interviews, and of indistinct forensic recordings. In organizing this event we invite and encourage further linguistic input into this area of professional practice.
Dr Martha Komter, Netherlands Institute for the Study of Crime and Law Enforcement (NSCR)
In the Netherlands, police reports are drawn up by the interrogators in the course of the interrogations. These reports eventually serve to be quoted or summarised as pieces of evidence in court. Thus, the reports are removed from the context of their production, and inserted into the context of the court proceedings. These de- and recontextualisations inevitably entail changes of meaning. A more detailed inspection of relevant contexts reveals that de- and recontextualisations also occur in the process of transforming talk into text, in transforming this text into an official document, and in inserting that document into the case file.
Changes of meaning are a result of selective reporting and of the transformation of the interaction in the police interrogation. However, legal practitioners appear to rely on the assumption that what is reported represents 'the suspect's own words'. This can be associated with language ideologies that are deeply engrained both in the law books and in legal practice.
Dr Kate Haworth, Dr Felicity Deamer & Dr Emma Richardson, Centre for Spoken Interaction in Legal Contexts (SILC), Aston Institute for Forensic Linguistics, Aston University
In our presentation we introduce the ‘For the Record: applying linguistics to improve evidential consistency in police investigative interview records’ project, an examination of evidential consistency in investigative interview records which asks: do the records serve as an accurate representation of the spoken interaction? Investigative interviews with suspects in England and Wales are audio recorded as standard procedure and a Record of Taped Interview, or ROTI, is produced. The original spoken data are (necessarily) substantially altered through the process of being converted into written format, yet little attention is paid to this. The extent to which the ROTI is an accurate representation of the audio recording is worthy of examination as the ROTI is routinely presented in court as part of the prosecution case and heavily relied upon in place of the original audio recording. We share findings from an experimental study exploring to what extent we find variation in interpretations of police interviews when we manipulate the medium in which subjects (potential jurors) are exposed to the interview (i.e. as a written transcript or as an original audio recording). We also consider why records are not routinely standardised, considering a set of influencing factors which collectively result in varied records of spoken interaction.
Professor Helen Fraser, Research Hub for Language in Forensic Evidence, The University of Melbourne
Transcription of indistinct forensic audio – and a framework for understanding factors affecting the creation and evaluation of transcripts
This talk discusses transcription of indistinct covert recordings – conversation captured without the knowledge of the speaker(s), and used as forensic evidence in a criminal trial. This type of transcription is unusual in several ways. First, the audio is often of extremely poor quality, to the extent it is hard for independent listeners to make out what is said. Second, the content, and often the context, of the recording is unknown or contested. Third, the transcript is used, not as a record of what was said, but as assistance to the trier of fact (judge or jury) in hearing what is said, and thus in reaching a verdict.
These and other factors make forensic transcription even more difficult and problematic than other forms of transcription discussed in this symposium. Paradoxically, however, the transcripts are often produced and evaluated by personnel lacking specialised expertise in transcription (police and lawyers). Unsurprisingly, this causes significant problems (forensictranscription.net.au).
Of course, linguistic scientists are keen to help create a better process – but how should that process work? Answering that question well requires a broad understanding of transcription in general. I present (as a starting point for discussion) a framework within which different types of transcript can be located, and suggest how this might form a useful tool for understanding factors that affect the creation and evaluation of transcripts.
The event was followed by a panel discussion.
- Idiolectal variation across discourse types
Date: 6 May 2021
This event was hosted by the Centre for Forensic Text Analysis and provided a platform for scholars interested in the linguistic individual and how empirically-based findings can inform and improve methods of forensic authorship analysis.
Lars Bülow (University of Vienna): Systematic and non-systematic idiolectal variation from a variationist perspective
This talk introduces not only systematic but also non-systematic idiolectal variation in spoken language from a variationist perspective. Whereas in variationist sociolinguistics, attention has always been given to those cases in which individuals systematically vary across different discourse types or styles, i.e. intra-speaker variation (cf. e.g. Labov 1966, 1972; Bell 1984; Coupland 2001; Hernández-Campoy 2016), very few studies have focussed specifically on idiolectal variation which occurs in the same style of speech irrespective of the context, the situation, or the communication partner (cf. Bülow et al. 2019: 98; Bülow and Pfenninger 2021). It will be argued that both types of idiolectal variation need to be considered in relation to the dimension of time. In addition to the theoretical background, this talk will also present a panel study that spans over 40 years dealing with idiolectal variation across two discourse types (formal and informal speech) in Austria.
Neus Alberich, Andrea Batel, Krzysztof Kredens and Piotr Pezik (Aston University): Idiolectal variation in Spanish across four discourse types
The idea of authorship attribution is based on two assumptions: that every language user has a unique linguistic style, or 'idiolect', and that features characteristic of that style will recur with a relatively stable frequency. Hundreds of style markers and a variety of attribution techniques have been proposed over the years with some recent studies reporting very high attribution success rates for the less complex closed-set tasks. However, one problem with such studies has been their tendency to use sociolinguistically homogeneous data, whereas a forensically useful author identification system needs to be able to capture stylistic similarities between texts created in different genres and contexts, and for different purposes and audiences.
This paper reports on a study involving nine participants providing linguistic input in Spanish in four discourse types (interview, all-group meeting, email, WhatsApp messages). Using word n-grams as the basic classification tool, we have measured within-author and between-author variability, and identified features that appear to be stable in some of the idiolects across all four discourse types. We ask why this should be the case and discuss the potential of those features to be used in authorship attribution tasks beyond our study.
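The comparison of within-author and between-author variability can be sketched as follows. This is an illustrative reading of the abstract rather than the study's actual procedure: the use of Jaccard similarity over word bigram types, the function names and the tuple layout of `samples` are all assumptions.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def word_ngrams(text, n=2):
    """Count word n-grams in a whitespace-tokenised, lower-cased text."""
    tokens = text.lower().split()
    return Counter(zip(*(tokens[i:] for i in range(n))))


def jaccard(a, b):
    """Jaccard similarity between the sets of n-gram types in two samples."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def within_between(samples, n=2):
    """Mean similarity of same-author pairs and of different-author pairs.

    samples: list of (author, discourse_type, text) tuples (assumed layout).
    Returns (mean_within, mean_between); either is None if no such pair exists.
    """
    profiles = [(author, word_ngrams(text, n)) for author, _, text in samples]
    within, between = [], []
    for (a1, p1), (a2, p2) in combinations(profiles, 2):
        (within if a1 == a2 else between).append(jaccard(p1, p2))
    return (mean(within) if within else None, mean(between) if between else None)


def stable_ngrams(samples, author, n=2):
    """N-grams that occur in every sample by the given author."""
    sets = [set(word_ngrams(text, n)) for a, _, text in samples if a == author]
    return set.intersection(*sets) if sets else set()
```

A feature is a candidate idiolectal marker in this sketch when it appears in `stable_ngrams` for one author across all discourse types while remaining rare in other authors' samples.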
Malvina Nissim (University of Groningen): Do author traits survive variation? Profiling across genres and languages
Author profiling is the task of predicting some of the author’s traits, like gender or age, disclosed through writing. To perform profiling automatically we develop systems that from existing data learn to make predictions over new, unseen examples. How similar should existing and new examples be in order for systems to be successful? This obviously depends on a more general question: what's the persistence of author traits across different texts, different genres, and even different languages? I will unpack this core point through the presentation of a series of cross-genre and cross-lingual profiling experiments. By discussing not only results but also experimental choices and settings, I will end the talk with reflections on what the optimal experiment to answer our question should look like.
Tatiana Litvinova (Voronezh State Pedagogical University): Idiolect identification in cross-genre and multi-genre scenarios using an approach from bioinformatics
Despite a century and a half of research efforts, identification of an idiolect based on quantifiable linguistic features remains a challenging task in practice. The complexity of the task increases when training documents and the documents in question differ in topic and/or genre, although this scenario is not uncommon in forensic settings where small training corpora are typical (Kredens and Coulthard 2012). To address this type of idiolect identification problem, a corpus of multiple texts per author is needed. The texts should represent the author’s idiolect in various ways, i.e. should differ in topic, genre, mode (written/oral), type, way of production (hand-written, typed on physical or touchscreen keyboard), etc. The authorship of all the texts in such a corpus should be unquestionable.
A team at the Corpus Idiolectology Lab has collected the first freely available resource of this type, RusIdioStyle, which is now a part of the RusIdiolect database (Litvinova 2021). RusIdiolect has metadata related to both text and author.
Three datasets were derived from RusIdioStyle with each of them containing texts by four different authors. Each author’s idiolect was represented by four genres: picture description, essay, narrative, description of the day. Two scenarios of idiolect identification were constructed: 1) multi-genre, i.e. both training and test sets were compiled from texts in four genres; 2) cross-genre, i.e. the classifier was trained on picture descriptions, stories, essays and tested on descriptions of the day. A range of stylometric markers were used: most frequent word forms and punctuation marks, most frequent lemmas with and without punctuation marks, character n-grams, POS n-grams, full morphological tags, indices of lexical diversity, etc. They were used separately to test the efficiency of each type of feature. As analytical tools, methods for multivariate data analysis as implemented in the R package mixOmics (Rohart et al. 2017) were used, namely PCA for assessing the main source of variation and its supervised version, PLS-DA, used as a classifier.
Using the above methodology, it was shown that genre was the major source of variation for most types of feature, despite the general claim about their context-independent nature. Nevertheless, for all the datasets and for both scenarios, idiolect identification accuracies higher than the baselines were obtained (the significance of the results was tested), although the performance of the classifiers, as well as the most efficient features, differed across the datasets.
A possible explanation of the results is discussed, and directions of further research are outlined.
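The difference between the two evaluation scenarios described in this abstract can be sketched in a few lines. The example below is a simplified stand-in, not the study's pipeline: it swaps PLS-DA from mixOmics for a plain nearest-centroid classifier over word-form frequencies, and the function names and the dict layout of each document (`author`, `genre`, `text`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def word_profile(text):
    """Relative frequencies of word forms (one assumed feature type)."""
    tokens = text.lower().split()
    total = len(tokens)
    return {w: c / total for w, c in Counter(tokens).items()} if total else {}


def cross_genre_split(docs, held_out_genre):
    """Train on every genre except one; test only on the held-out genre."""
    train = [d for d in docs if d["genre"] != held_out_genre]
    test = [d for d in docs if d["genre"] == held_out_genre]
    return train, test


def multi_genre_split(docs):
    """Hold out the last text of each (author, genre) pair, so that the
    training and test sets both cover all genres."""
    groups = defaultdict(list)
    for d in docs:
        groups[(d["author"], d["genre"])].append(d)
    train, test = [], []
    for members in groups.values():
        train.extend(members[:-1])
        test.append(members[-1])
    return train, test


def train_centroids(train):
    """Average the word profiles of each author's training texts."""
    sums, n_texts = defaultdict(Counter), Counter()
    for d in train:
        for word, freq in word_profile(d["text"]).items():
            sums[d["author"]][word] += freq
        n_texts[d["author"]] += 1
    return {a: {w: s / n_texts[a] for w, s in sums[a].items()} for a in sums}


def predict(centroids, text):
    """Assign the text to the author with the nearest centroid (Euclidean)."""
    profile = word_profile(text)

    def distance(centroid):
        keys = set(profile) | set(centroid)
        return math.sqrt(sum((profile.get(k, 0.0) - centroid.get(k, 0.0)) ** 2 for k in keys))

    return min(centroids, key=lambda author: distance(centroids[author]))


def accuracy(train, test):
    """Share of test texts attributed to their true author."""
    centroids = train_centroids(train)
    return sum(predict(centroids, d["text"]) == d["author"] for d in test) / len(test)
```

Running `accuracy` on both splits of the same corpus shows how much performance depends on the test genre having been seen in training.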
Kredens K. and Coulthard M. (2012). Corpus Linguistics in Authorship Identification. In: The Oxford Handbook of Language and Law. Edited by Lawrence M. Solan and Peter M. Tiersma.
Litvinova T. (2021). RusIdiolect: A New Resource for Authorship Studies. In: Antipova T. (eds) Comprehensible Science. ICCS 2020. Lecture Notes in Networks and Systems, vol 186. Springer, Cham.
Rohart F, Gautier B, Singh A, Lê Cao K-A (2017). mixOmics: An R package for ‘omics feature selection and multiple data integration. PLoS Comput Biol 13(11): e1005752.
- Calibration in Forensic Science
Date: 3 June 2021
Abstract for symposium
In the first decade of the 2000s, procedures and statistical models were developed for calibrating the likelihood-ratio output of automatic-speaker-recognition systems. These procedures and models were quickly adopted for calibrating the likelihood-ratio output of human-supervised-automatic forensic-voice-comparison systems. Since at least the early 2010s, recommendations have been made to use the same calibration procedures and models in other branches of forensic science. Interest in doing this is now growing. Published examples can be found in the context of multiple branches of forensic science, including fingerprints, DNA, mRNA, glass fragments, and mobile telephone colocation. There are also published examples of the use of these procedures and models to calibrate human judgements. The 2021 Consensus on validation of forensic voice comparison and the Forensic Science Regulator of England & Wales’s 2021 Development of evaluative opinions both recommend/require the use of calibration.
This symposium brings together some of the leading researchers in the calibration of the likelihood-ratio output of automatic-speaker-recognition systems and of forensic-evaluation systems. They explain what calibration is and why it is important. They present algorithms used for calibrating likelihood-ratio systems, and metrics used for assessing the degree of calibration of likelihood-ratio systems. They discuss aspects of calibration on which there is consensus, aspects on which there is disagreement, and aspects requiring additional research. They also discuss how to encourage wider adoption of calibration of likelihood-ratio systems in forensic practice.
You can view the presentation slides here.
Forensic Data Science Laboratory, Department of Computer Science & Aston Institute for Forensic Linguistics, Aston University
Calibration in forensic science
You can view the presentation slides here.
Geoffrey Stewart Morrison
Forensic Data Science Laboratory & Forensic Speech Science Laboratory, Department of Computer Science & Aston Institute for Forensic Linguistics, Aston University
In the first decade of the 2000s, procedures and statistical models were developed for calibrating the likelihood-ratio output of automatic-speaker-recognition systems. These calibration procedures and models were quickly adopted for calibrating the likelihood-ratio output of human-supervised-automatic forensic-voice-comparison systems. They were adopted in both research and casework. The 2021 Consensus on validation of forensic voice comparison recommended that “In order for the forensic-voice-comparison system to answer the specific question formed by the propositions in the case, the output of the system should be well calibrated” and that “forensic-voice-comparison system should be calibrated using a statistical model that forms the final stage of the system”. Since at least the early 2010s, recommendations have been made to use the same calibration procedures and models in other branches of forensic science. Interest in doing this is now growing. Published examples can be found in the context of multiple branches of forensic science, including fingerprints, DNA, mRNA, glass fragments, and mobile telephone colocation. There are also published examples of the use of these procedures and models to calibrate human judgements. In this presentation I answer the questions: What is calibration? Why is it important? and How is it performed? I also discuss how this approach to calibration relates to the calibration requirements in the Forensic Science Regulator of England & Wales’s 2021 appendix to the Codes of Practice and Conduct: Development of evaluative opinions.
Dr Morrison is Director of Aston University’s Forensic Data Science Laboratory & Forensic Speech Science Laboratory. Since 2008, he has published multiple papers related to calibration of forensic-evaluation systems, including a 2013 tutorial paper on the topic. He was lead author of the 2021 Consensus on validation of forensic voice comparison.
Calibration in automatic speaker recognition
You can view the presentation slides here.
Instituto de Ciencias de la Computación, Universidad de Buenos Aires – CONICET
Most modern speaker verification systems produce uncalibrated scores at their output. Although these scores contain valuable information to separate same-speaker from different-speaker trials, their values cannot be interpreted in absolute terms – they can only be interpreted in relative terms. A calibration stage is usually applied to convert scores to useful absolute measures that can be interpreted, and that can be reliably thresholded to make decisions. In this presentation, I review the definition of calibration and explain its relationship with Bayes decision theory. I then present ways to measure quality of calibration, discuss when and why we should care about it, and show different methods that can be used to fix calibration when necessary.
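As an illustration only, one widely used way of performing such a calibration stage is logistic regression, which learns a linear mapping from raw scores to log likelihood ratios. The sketch below is not taken from the presentation: the function names, the toy scores, and the gradient-descent settings are all assumptions made for the example.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_calibration(ss_scores, ds_scores, iters=5000, step=0.1):
    """Fit log-odds = a * score + b by gradient descent on cross-entropy.

    Each class is given a total weight of 0.5, so the effective prior odds
    are 1 and the fitted log-odds can be read as a natural-log likelihood ratio.
    """
    a, b = 1.0, 0.0
    w_ss = 0.5 / len(ss_scores)
    w_ds = 0.5 / len(ds_scores)
    data = [(s, 1.0, w_ss) for s in ss_scores] + [(s, 0.0, w_ds) for s in ds_scores]
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for s, y, w in data:
            p = _sigmoid(a * s + b)
            grad_a += w * (p - y) * s
            grad_b += w * (p - y)
        a -= step * grad_a
        b -= step * grad_b
    return a, b


def score_to_log_lr(score, a, b):
    """Convert an uncalibrated score to a calibrated natural-log likelihood ratio."""
    return a * score + b


# Hypothetical uncalibrated scores from same-speaker and different-speaker trials.
same_speaker_scores = [2.1, 3.0, 1.2, 2.7, 0.4]
different_speaker_scores = [-1.5, 0.3, -0.8, -2.2, 0.9]

a, b = fit_logistic_calibration(same_speaker_scores, different_speaker_scores)
print(score_to_log_lr(1.5, a, b))
```

In practice the calibration model would be trained on data separate from the data it is later applied to, and on far more trials than this toy set.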
Dr Ferrer is a researcher at the Computer Science Institute, affiliated with the University of Buenos Aires and with the National Scientific and Technical Research Council of Argentina (CONICET). She received her PhD in Electronic Engineering from Stanford University in 2009. Her primary research focus is machine learning applied to speech processing tasks.
Calibration in forensic voice comparison
You can view the slides from the presentation here.
AUDIAS Lab, Escuela Politécnica Superior, Universidad Autónoma de Madrid
In this presentation, I describe the role of calibration in forensic voice comparison, focusing on the use of automatic systems in a Bayesian decision framework. I describe computation of calibrated likelihood ratios in the context of scenarios and recording conditions typically encountered in forensic casework. I present algorithms commonly used for calibration. I also discuss the importance of calibration in the process of validating forensic-voice-comparison systems, and discuss recommendations and guidelines published by the European Network of Forensic Science Institutes (ENFSI).
Dr Ramos is an Associate Professor at the Audio, Data, Intelligence and Speech (AUDIAS) Laboratory of the Autonomous University of Madrid. He is author of numerous publications on applying and measuring calibration, especially in the context of forensic problems. He has served on scientific committees, and has often been invited to present on the role of calibration in forensic science.
Measuring calibration of likelihood-ratio systems
You can view the slides from the presentation here.
Netherlands Forensic Institute
In this presentation, I explain the concepts of what constitutes well-calibrated probabilities and well-calibrated likelihood ratios. I briefly describe graphical representations for assessing degree of calibration. I then focus on several metrics designed to assess degree of calibration, and present the results of a study comparing the performance of different metrics. Three metrics are taken from the existing literature, and one is a novel metric. One existing metric is based on the expected value of different-source likelihood-ratio values and the expected value of the inverse of same-source likelihood-ratio values (after Good, 1985), another is based on the proportion of different-source likelihood ratios above 2 and the proportion of same-source likelihood ratios below 0.5 (after Royall, 1997), and the third is Cllrcal (Brümmer & du Preez, 2006). The novel metric is devPAV (Vergeer et al., 2021).
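The quantities behind these metrics can be sketched in a few lines. The code below is an illustrative assumption rather than the study's implementation: the function names are hypothetical, and it computes only plain Cllr, since Cllrcal is usually obtained by subtracting a PAV-optimised minimum Cllr, which is not implemented here. devPAV is likewise omitted.

```python
import math


def cllr(ss_lrs, ds_lrs):
    """Log-likelihood-ratio cost (Cllr) from same-source and different-source LRs."""
    ss_cost = sum(math.log2(1 + 1 / lr) for lr in ss_lrs) / len(ss_lrs)
    ds_cost = sum(math.log2(1 + lr) for lr in ds_lrs) / len(ds_lrs)
    return 0.5 * (ss_cost + ds_cost)


def misleading_rates(ss_lrs, ds_lrs, upper=2.0, lower=0.5):
    """Proportion of different-source LRs above `upper` and of
    same-source LRs below `lower` (the thresholds named in the abstract)."""
    ds_above = sum(lr > upper for lr in ds_lrs) / len(ds_lrs)
    ss_below = sum(lr < lower for lr in ss_lrs) / len(ss_lrs)
    return ds_above, ss_below


def expected_values(ss_lrs, ds_lrs):
    """Mean different-source LR and mean inverse same-source LR.

    For a well-calibrated system both are expected to be close to 1.
    """
    mean_ds = sum(ds_lrs) / len(ds_lrs)
    mean_inv_ss = sum(1 / lr for lr in ss_lrs) / len(ss_lrs)
    return mean_ds, mean_inv_ss


# Hypothetical likelihood-ratio values, for illustration only.
ss = [8.0, 20.0, 0.4, 5.0]
ds = [0.1, 0.05, 3.0, 0.5]
print(cllr(ss, ds), misleading_rates(ss, ds), expected_values(ss, ds))
```

A system that always outputs a likelihood ratio of 1 gives a Cllr of exactly 1, which is why values below 1 are read as the system providing useful information.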
Dr Vergeer is a research scientist in forensic statistics at the Netherlands Forensic Institute. His research focuses on computer-based methods for evaluation of strength of evidence, and on measuring and improving the performance of human experts. He has published multiple research papers on calibration of likelihood-ratio systems and on measuring the degree of calibration of likelihood-ratio systems.
Moderator: Rolf J.F. Ypma
Principal Scientist, Netherlands Forensic Institute
Forensic Data Science Laboratory, Department of Computer Science & Aston Institute for Forensic Linguistics, Aston University
The presenters will discuss aspects of calibration on which there is consensus, aspects on which there is disagreement, and aspects requiring additional research. They will also discuss how to encourage wider adoption of calibration of likelihood-ratio systems in forensic practice.
- ‘I'm not talking about guess. Do you know?’ Interactional management of knowledge claims in expert testimony
Date: 24 June 2021
Speaker: Magdalena Szczyrbak
In this talk I will be looking at some of the interactional mechanisms which underlie the construction of “legal truth” in jury trials and which form the discursive processes of turning facts and expert opinions into evidence. Using a case study, I will examine the interactional behaviour of expert witnesses and counsel acting within the constraints of the Anglo-American adversarial system. Adopting a discourse-analytic perspective, I will demonstrate what stances they adopt and what interactional resources they employ to position themselves vis-à-vis their interactants and their knowledge claims. Building on such linguistic concepts as stance, speaker commitment, epistemicity and evidentiality, I will show how expert knowledge is claimed, disclaimed, attributed and contested. To this end, I will consider the interplay of the pronouns I, you and we with verbal markers of experiential, cognitive and communicative stance (Marín-Arrese, 2009), demonstrating a correlation between the participants’ roles and communicative goals and the type of stance they adopt during testimony.
- Morality and entitlement in a digital context: How police chat-handlers manage citizens’ conversational projects on digital 101
Date: 17 June 2021
Speakers: Joanne Meredith, Alexandra Kent and Magnus Hamann
During this presentation, we will explore interactions on a non-emergency online police chat (Digital 101). Specifically, we are interested in the sequential difference between Digital 101 and telephone calls to 101. This presentation is part of a larger project that investigates how online chat can be (better) used as the medium for non-emergency ‘calls’ to the police. The data for this project was collected over a two-year period from a UK police force. Half of the data was collected in 2019 before the Covid pandemic and the other half was collected in 2020, during the periods of lockdown and other restrictions.
In face-to-face interaction, interlocutors have resources to show how they align with projects as they are being produced (Stivers, 2008). Because conversations in chat are more asynchronous, they do not offer this option. Instead, we observe that first turns are constructed with a range of relevant projects that chat-handlers can pick up on. In this presentation, we specifically explore how chat-handlers deal with displays of morality and entitlement.
Stivers, T. (2008). Stance, alignment, and affiliation during storytelling: When nodding is a token of affiliation. Research on language and social interaction, 41(1), 31-57.
- Using mixed methods to do discourse analysis of legal language: The case of suspect identification decisions in the Brazilian High Court (“Superior Tribunal de Justiça”)
Date: 10 June 2021
Speaker: Joao Pedro Padua
Professor of Law, “Fluminense” Federal University, Brazil. PhD (Applied linguistics), 2013, LLM., (Constitutional law), 2008, and LLB, 2004, Pontifical Catholic University, Brazil. Visiting Scholar, Center for Law, Language and Cognition, Brooklyn Law School, U.S. (2018). Practicing attorney in Rio de Janeiro, Brazil.
Forensic linguistics as a field has been divided by Coulthard and Johnson (2007, pp. 8–10) into two (or, more recently, three; cf. Coulthard and Johnson, 2013, p. 7) sub-fields. The first deals mostly with the language of legal contexts and settings, dubbed “The language of the legal process”. The second deals mostly with language data that becomes relevant as evidence in legal proceedings, dubbed “Language as evidence”. The methods traditionally associated with the former are mostly qualitative, drawing on the concepts, analytical procedures and ontological assumptions about language stemming from linguistic and discourse-analytic subfields such as pragmatics, interactional sociolinguistics, conversation analysis, critical discourse analysis, and so on. Specifically, concepts such as conversational implicatures, sequential organization of face-to-face interaction, textual/discursive genres/registers, information organization of utterances and the like form the bread-and-butter of the (forensic) linguistic approach to the language as used in legal settings (Coulthard and Johnson, 2007, chap. 1; Shuy, 2015).
In this talk, I want to draw upon this tradition of consolidated qualitative analytical concepts, procedures and methods, but also propose to expand it to incorporate quantitative methods that might fill in the gaps that qualitative methods leave, especially on the issue of generalizability and of dealing with substantial amounts of linguistic data. To do this, I will present the findings of a recent pilot study I did on data from judicial decisions from the Brazilian High Court (“Superior Tribunal de Justiça”) on the legal issue of the validity of suspect identification. Drawing on qualitative concepts and methods stemming from ethnomethodology and pragmatics (Pádua, 2019) and on quantitative concepts stemming from corpus linguistics and natural language processing (in this case, N-gram language models; Jurafsky and Martin, 2019, chap. 3), I proposed that the Court performed what I called a “deontic transformation” of the legal norms relevant to the issue. This deontic transformation differs from a more general interpretive formulation of meaning, in that it negotiates the illocutionary force of the norms. I propose, further, that the linguistic data allow us to formulate a strong hypothesis that this transformation was carried out in order to artificially loosen the legal requirements of suspect identification and, because of that, validate convictions that might otherwise be annulled.
I discuss the relevance and usefulness of mixed methods, which are already pervasive on the Language as evidence side, for the Language of the legal process side as well. I also discuss the implications that this type of research can have for both the linguistic and the legal analyses of legal interpretation.
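For readers unfamiliar with the N-gram language models mentioned in the abstract, the following is a minimal sketch of a bigram model with add-one smoothing, in the spirit of Jurafsky and Martin (2019, chap. 3). It is not the author's method or data: the function names and the toy English sentences are hypothetical and chosen only to show how such a model assigns probabilities to competing modal verbs.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left-hand contexts over tokenised sentences."""
    contexts = Counter()
    bigrams = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])             # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))  # adjacent token pairs
    # Words that can follow a context: all observed tokens plus the end marker.
    vocab = {w for tokens in sentences for w in tokens} | {"</s>"}
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


# Hypothetical toy corpus, not drawn from any court decision.
corpus = [
    ["the", "court", "may", "identify"],
    ["the", "court", "must", "identify"],
]
contexts, bigrams, vocab = train_bigram_model(corpus)
print(bigram_prob("court", "may", contexts, bigrams, vocab))
print(bigram_prob("court", "must", contexts, bigrams, vocab))
```

Comparing such conditional probabilities across subsets of a corpus is one way a quantitative model can complement close qualitative reading of individual decisions.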
Coulthard, M., Johnson, A., 2013. Introduction: Current debates in forensic linguistics, in: Coulthard, M., Johnson, A. (Eds.), The Routledge Handbook of Forensic Linguistics. Routledge, London, pp. 1–15.
Coulthard, M., Johnson, A., 2007. An introduction to forensic linguistics: Language in evidence. Routledge, London.
Jurafsky, D., Martin, J.H., 2019. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition [WWW Document]. URL https://web.stanford.edu/~jurafsky/slp3/edbook_oct162019.pdf
Pádua, J.P., 2019. Discursive devices for inserting morality into law: initial exploration from the analysis of a Brazilian Supreme Court decision. Lang. Law=Linguagem e Direito 6, 11–29. https://doi.org/10.21747/21833745/lanlaw/6_1a1
Shuy, R.W., 2015. Discourse analysis in the legal context, in: Tannen, D., Hamilton, H.E., Schiffrin, D. (Eds.), The Handbook of Discourse Analysis. Wiley Blackwell, Oxford, UK, pp. 822–840.
- For the Record: Exploring variability in interpretations of police investigative interviews
Date: 27 May 2021
Speakers: Felicity Deamer, Emma Richardson, Nabanita Basu and Kate Haworth, Spoken Interaction in Legal Contexts, Aston Institute for Forensic Linguistics
We present findings from an experiment which forms part of a wider project focusing on the treatment of police interview data as evidence, specifically the formats into which it is converted. Recent research (Haworth 2018) has demonstrated how interview data are (unintentionally) distorted as they pass through the criminal justice system, and the experiment we present here has been designed to test our hypothesis that various aspects of the processing of police-suspect interview data have a negative impact on the quality of the official evidential document produced. The findings (both quantitative and qualitative) shed light on, and provide a sound evidence base for, this claim, rather than leaving it as an untested assumption. Participants were presented with a police-suspect interview from a murder enquiry, either in the form of a transcript or the original audio recording. Participants who read the transcript of the interview drew significantly different conclusions to those who had heard the original audio recording with respect to the interviewee's emotions and behaviour, as well as the degree of truth in their version of events. Open text box responses provide a rich insight into features of the interviewee's language and/or delivery that influenced participants' perceptions and interpretations of the interview.
- Exploring and analysing web and CMC corpora: a discursive tool-based approach
Date: 20 May 2021
Speaker: Julien Longhi, CY Cergy Paris University
In this talk, Julien presented a linguistic and discursive analysis model for analyzing meaning, which is based on a methodology that falls within the wider framework of digital humanities and is equipped with digital tools. Julien illustrated this approach with various corpora, composed of political discourse extracted from the digital social network Twitter or YouTube, or from the web. He also presented a case study which illustrates the contribution of digital humanities to forensic science.
- Legal advice as intercultural communication: Thinking beyond the legal-lay dichotomy in immigration legal advice practice
Date: 22 April 2021
Speaker: Judith Reynolds, Cardiff University
In this talk, I presented an overview of key findings from a linguistic ethnographic case study exploration of legal advice communication about UK refugee and asylum law. Taking as my starting point the conceptualisation of the lawyer in legal advice consultations as translating and transposing between ‘two competing world views and two associated and competing discourses’ (Maley et al. 1995: 42), I showed how in the context of my research site, such mediatory activity extended well beyond the legal-lay dimensions usually discussed in the literature (e.g. Conley and O’Barr 1990), into linguistic, cultural and other institutionally-connected practices of communicative mediation.
The study examined one lawyer’s communication in face-to-face legal advice meetings with a diverse range of clients seeking immigration advice. Through analysis of data from this rarely-accessed legal communicative context, I illustrated how at the interactional level, lawyer and client negotiate understanding and build rapport through second language use and sometimes with the support of interpreters; whilst at the level of discourse, the genre of legal advice interaction supports a meaningful dialogic exchange of information and perspectives that is fundamental to successful advice-giving. I then drew on the study to critically consider how legal advisors function as mediating professionals at a structural level, both supporting and regulating clients’ interactions with legal actors and institutions. I ended with some of the study’s implications for research and for legal practice.
- Examining the use and disguise of persuasion, threat and isolation in romance fraud
Date: 25 March 2021
Speaker: Elisabeth Carter, University of Roehampton
In this seminar we were guided through real romance fraud communications and we explored how these interactions progress from innocuous beginnings to the financial devastation of the victims, without causing alarm. Revealing how the use of language can be akin to the tactics of coercive control and domestic violence and abuse, this session also showed how traditional approaches to prevention and awareness-raising could lull victims into a false sense of security in the fraudulent relationship in which they are unknowingly engaged. This research is being used to inform police protect and prevent strategies, financial institutions’ practices in stopping the harm sooner, and dating service approaches to user protection.
- More Bullshit and Lies? Untruthfulness in the Legal Process
Date: 18 March 2021
Speaker: Chris Heffer, Cardiff University
In Heffer 2020, I outline the TRUST framework for analyzing untruthfulness in everyday life. The key planks of that framework are, firstly, that the concept of untruthfulness needs to include not just insincerity (typified by lying) but also epistemic irresponsibility (typified by bullshit) and, secondly, that it is possible to systematically analyse putative cases of untruthfulness through a simple heuristic. Though simple, that heuristic brings out the categorical and ethical complexity of untruthfulness in situated context. I exemplify the framework primarily through examples from the media and politics, partly because the book was written during the years of Brexit and Trump but also because the forensic contexts merit another book.
In this talk, then, I explain how the TRUST framework can be applied to forensic contexts and consider whether the framework needs to be adapted to these contexts. On the one hand, we can easily see at work the major categories of insincerity e.g. the withholding of information in police interrogation; misleading in cross-examination; and the lying enshrined in perjury. On the other, the legal process, like science, is meant to be a bullshit-free zone where the ultimate aim is to achieve evidential accuracy. To some extent this is achieved in court because the highly strategic nature of trial discourse means that untruthfulness is usually deliberate. Yet rape cases flounder due to a first level of epistemic irresponsibility: dogma in the form of entrenched rape myths, which are sincerely but irresponsibly held. And police forces have fallen victim all too often to bullshit techniques and technologies proposed by ‘experts’ who might sincerely believe their inventions.
Heffer, C. (2020) All Bullshit and Lies? Insincerity, Irresponsibility, and the Judgment of Untruthfulness. New York: Oxford University Press.
- Disinformation in the news media — Can news be analysed based on its communicative purpose?
Date: 4 March 2021
Speaker: Helena Woodfield
News is understood to be a way for individuals to inform themselves of current important events, a way to gain information upon which we form our global outlook and opinions (Gelfert 2018). What if that information is false? Or worse: What if you can’t tell if that information is false? Researchers are attempting to tackle fake news from different angles but it’s possible we are talking at cross purposes (Markines et al. 2009; Horne & Adali 2017; Yang et al. 2017). Using the term “fake news” doesn’t allow for a distinction between disinformation and misinformation, that is, intentionally vs factually false information respectively (Lewandowsky et al. 2017). Depending on the data, the results will either encompass misinformation and coincidentally include disinformation or only apply to misinformation. The issue is that misinformation occurs without there necessarily being intent: mistakes happen. With fact-checked corpora used increasingly as an easy data source, the research has been pigeonholed and we are only able to address the known side of the problem, misinformation (Markines et al. 2009; Horne & Adali 2017; Yang et al. 2017; Tacchini et al. 2017; Conroy et al. 2015).
In this presentation, Helena presented a case study to address the issue of disinformation, exploring whether communicative intent (in terms of deception but also journalism) can be measured through the assessment of the linguistic choices made by the author. The study analyses a single author, Jayson Blair, from a single news source, the New York Times, producing a consistent linguistic style and news-type. These controls are used to explore whether it is possible to identify a deceptive style within a single author. We analyse his communicative purpose through the application of a register analysis (Biber 1988) and a focussed corpus linguistic approach. The results demonstrate that where his communicative purpose varies (intent to deceive or tell the truth), his linguistic style also varies. This shows a way forward for the analysis of fake news. Next steps? To apply this to more individuals to see if the results are transferable and subsequently answer more fully the question posed above.
Biber, D., 1988. Variation across speech and writing. Cambridge University Press.
Conroy, N.J., Rubin, V.L. and Chen, Y. 2015. Automatic deception detection: Methods for finding fake news. Proceedings of the Association for Information Science and Technology, 52(1), pp.1-4.
Gelfert, A., 2018. Fake news: A definition. Informal Logic, 38(1), pp.84-117.
Horne, B.D. and Adali, S. 2017. This just in: fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. arXiv preprint arXiv: 1703.09398.
Lewandowsky, S., Ecker, U.K. and Cook, J., 2017. Beyond misinformation: Understanding and coping with the “post-truth” era. Journal of Applied Research in Memory and Cognition, 6(4), pp.353-369.
Markines, B., Cattuto, C. and Menczer, F., 2009, April. Social spam detection. In Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web (pp. 41-48). ACM.
Yang, F., Mukherjee, A. and Dragut, E. 2017. Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features. arXiv preprint arXiv:1709.01189.
Tacchini, E., Ballarin, G., Della Vedova, M.L., Moret, S. and de Alfaro, L. 2017. Some like it hoax: Automated fake news detection in social networks. arXiv preprint arXiv:1704.07506.
- How speech alters norms
Date: 25 February 2021
Speaker: Mihaela Popa-Wyatt, ZAS Berlin
In this talk, I explained how social norms are changed by particular kinds of speech, notably, but not only, oppressive speech. I take slurs as a case study and then argue that the mechanism can be extended to other data. The starting point for the mechanism is a speech-act model of slurs (Popa-Wyatt & Wyatt, 2017). This model says that a slur is a move in a conversational game that assigns a low-power role to the target. The new idea in this paper is that we can use game theory to explain how the slur alters the conversational dynamics in a way that also alters social norms. Specifically, I describe the mechanisms by which the low-power conversational role leaks out into the larger social game. As an example, consider a case where a slur is used against a person and this slurring use alters the social norms that are subsequently applied to them by other audience members. My model assumes that a conversational game is embedded in a social game, and argues that a move in the former can change the norms in the latter. Key to this is the idea of a two-way inheritance rule. This rule has an import component and an export component. Conversational roles are typically imported from the social game. Once a low-power role is accommodated in the conversation, it may be exported to the social game, thus changing the norms associated with the corresponding target group. There are several candidate mechanisms for this export rule. I consider these and suggest the mechanism of inferential presupposition has the most explanatory power. I argue that this inferential presupposition shifts norms by changing the social roles associated with the target.
Popa-Wyatt, Mihaela, Wyatt, Jeremy L. (2017), Slurs, Roles and Power, Philosophical Studies 175: 2879-2906.
- The domino effect of speakers’ misidentification in transcriptions: critical remarks from a working case
Date: 18 February 2021
Speakers: Chiara Meluzzi and Sonia Cenceschi
In this seminar the speakers presented the results of a technical consultancy carried out for the defense in a case of misidentification of the speaker during preliminary investigations. Using a real case consultancy, the speakers addressed the issues connected with the use of noisy audio of limited duration in forensic settings, and considered the possibilities of cross-disciplinary work by matching linguistic and engineering skills.
- Pledging to Harm: A Linguistic Analysis of Violent Intent in Threatening Language
Date: 11 February 2021
Speaker: Marlon Hurt
Legal systems around the world assume that violent intent is not only real, but that it is also detectable in threatening language. However, empirical studies examining how, or even whether, violent intent is encoded in language are rare, and tend to explore the issue primarily through psychological theory. This linguistic analysis hypothesizes that authorial intent is indeed detectable in the language of threats, if only obliquely, because the functional aim of a threat issued with true violent intent is different than one issued for other communicative purposes, e.g., to cause fear. A novel combination of frameworks is employed to test this hypothesis on a dataset of six realized and eight non-realized threats. First, Audience Design Theory and Speech Act Theory delimit the investigation to the most common kind of threatening language, called ‘leakage’ in the threat assessment literature and a ‘pledge to harm’ in Speech Act Theory. Next, the Folk Concept of Intentionality and Biological Naturalism theorize which cognitive elements of intent may be expressed by pledges to harm. Finally, Systemic Functional Linguistics, and the discourse semantic method of Appraisal in particular, identify the various attitudinal and interpersonal meanings in the pledge dataset. Non-realized pledges are discovered to contain significantly more violent ideation, creating a prosody of heightened menace, while the realized pledges are more concerned with ethical evaluations. Hypothetically, these patterns of stance taking show that the non-realized and realized texts are engaged in divergent ‘fields of activity’, that of announcing and explaining respectively. Different communicative purposes point to different psychological intentions spurring the production of each pledge type, potential evidence that violent intent is indeed detectable in the language of pledges to harm.
- The Ayia Napa Statements
Date: 4 February 2021
Speaker: Andrea Nini, University of Manchester
- Using grammar as a shield in threatening, hateful online communications
Date: 28 January 2021
Speaker: Tanya Karoli-Christensen, University of Copenhagen
Reflections from the event by Felicity Deamer, Aston Institute for Forensic Linguistics.
Tanya’s presentation allowed us to take a step back, and gain some conceptual clarity over harmful forms of speech online. Although the primary focus of the talk was to foreground and tease apart three distinct forms of illegal speech online, Tanya shone the light a little wider, and drew our attention to the psychology of online communication, and how hateful and harmful communication is born out of and allowed to proliferate in the virtual world.
Tanya introduced us to threats, hate speech, and incitement, and showed us how they are closely connected, but also how they come apart, and as such need to be considered separately. Referencing a unique data set of reports of hate speech in Denmark (provided by the Center for Prevention of Exclusion), Tanya picked apart the grammatical profile of a large set of hateful and threatening online communication to reveal specific features, which Tanya argues are suggestive of specific devices being employed by authors of online hate speech, in order to distance themselves from the harm they wish upon their victims (and others by association).
Tanya’s presentation shed light on the form and function of online harmful communications by mapping linguistic analysis onto contemporary thinking surrounding the motivations behind the damaging use of social media platforms.
- “The use of corpora in many types of forensic linguistic analysis is becoming increasingly commonplace”: A decade on from Cotterill (2010).
Date: 10 December 2020
Speaker: David Wright
In her excellent chapter on forensic linguistics in O’Keeffe and McCarthy’s Routledge Handbook of Corpus Linguistics, Janet Cotterill (2010) outlines the various ways in which corpora and corpus linguistic methodologies have been (and can be) applied in forensic linguistics. She includes in her discussion the use of already existing general reference corpora by forensic linguists, the building of specialised corpora in forensic contexts and the use of web-as-corpus. The chapter concludes by detailing some challenges of using corpora for forensic purposes and predicting future challenges. This talk revisits Cotterill’s chapter and takes stock of the work that has been done in the ten years since its publication. It considers the potential that has been fulfilled, the opportunities that remain and the new challenges that have emerged in the application of corpus methods in forensic linguistics.
- Changing Landscape of Advice Provision: The language of legal advice on social media run by lay advisers
Date: 26 Nov 2020
Speaker: Tatiana Tkacukova, Birmingham City University
In her presentation, Dr Tkacukova presented a linguistic and socio-legal analysis of online forums for Litigants in Person (LIPs: people representing themselves in court). Her study focuses on forums and social media groups run by McKenzie Friends (MFs: litigation friends who help people represent themselves in court on a voluntary basis or for a fee). She uses Corpus Linguistics (Sketch Engine) and a qualitative approach (content analysis) to explore the corpora and LIPs’ concerns and advice needs. Her study highlighted the functions performed by MFs on social media and the quality of MFs’ advice.
Her interesting quali-quantitative analysis reveals many characteristics that shed light on the positioning of both LIPs and MFs. For example, the LIP subcorpus shows a preponderance of N-grams expressing negation of abilities or wishes, lack of knowledge… Conversely, MFs use expressions of support, certainty, advice etc. The MF subcorpus also shows that MFs highlight LIPs’ difficulties with legal discourse and construct their professional image by advertising their services and positioning themselves as trusted experts.
Reflections from the event below are by Leigh Harrington, Research Associate, Aston Institute for Forensic Linguistics.
This interdisciplinary project focussed on issues of access to justice for the general public, namely legal advice provision on social media. Recent cuts to legal aid in civil and family cases have led to an increase in people representing themselves in court, known as Litigants in Person (LiPs). Consequently, there has also been an increased use of alternative online sources of DIY law which provide legal information and advice, including McKenzie Friends (MFs): lay advisers or litigation friends (mostly without a formal legal background) who provide LiPs with help and support on a voluntary or paid basis.
This talk explored the forums and social media groups run by MFs where these lay advisers interact with LiPs seeking help. Dr Tkacukova’s data for this study comprised a corpus of exchanges between these different users on these online platforms. Using a combination of corpus analysis followed by content analysis, the talk explored the concerns and advice needs of LiPs, the functions of MFs, and the quality of their advice.
Initial corpus analysis of the LiP subcorpus revealed N-grams which negated abilities or wishes, such as lack of knowledge. These highlighted the reduced legal capabilities of LiPs, their potential vulnerability, and need for expert advice. Corpus techniques were also used to define the roles and functions of MFs, which mainly entailed explaining court procedures and the role of social services, setting expectations for LiPs, and advising them. The analysis found that MFs perform these functions from a closer socially and linguistically defined position than would be expected from legal professionals. Dr Tkacukova showed that whilst MFs construct their professional image as trusted experts and often provide useful procedural information and clarifications, their legal advice is often problematic in terms of its substantive content (it can obstruct justice, mislead, or be technically correct whilst giving unrealistic advice) and linguistic framing (for instance, through defamatory comments).
Dr Tkacukova commented that whilst these lay advice platforms have their advantages, they can also give rise to unfounded trust between users, which can be to the detriment of LiPs who are socially and emotionally vulnerable. She recommended an increase in public awareness of the advantages and limitations/potential risks of MFs, the roles and functions of MFs, and strategies for how to identify biased advice.
- Disinformation online: social media users’ motivations for sharing false content
Date: 19 Nov 2020
Speaker: William Dance, Lancaster University
In his presentation, William Dance from Lancaster University discussed social media users’ motivations for sharing false content online. William first explained the difference between disinformation, misinformation, and fake news, arguing that the concept of disinformation can be best defined as intentionally factually incorrect news that is published to deceive and misinform its reader. The second part of the talk introduced the corpus that William built from Tweets containing URLs of disinforming news. Finally, William gave an overview of the semantic fields identified in the corpus and discussed some strategies social media users utilise when sharing disinformation, including concession, dramatic amplification, hypotheticals, presupposition violation, and omission.
- Everybody does things differently: Evidential consistency and the production of official police interview records
Date: 12 Nov 2020
Speaker: Kate Haworth, Aston Institute for Forensic Linguistics
This presentation was given to the Preston Linguistics Circle, hosted by UCLAN, at the invitation of Dr Dominik Vajn.
The full abstract for the presentation:
This presentation considers a vitally important but generally undervalued aspect of investigative interviewing, namely the process of converting the spoken interview interaction into an institutionally approved, written evidential document. Formal interview records have significant legal standing in the criminal justice system. Around the world, various different approaches are taken to producing them, some significantly more reliable than others. The UK system of routinely audio-recording and transcribing all police-suspect interviews is often regarded as an example of best practice. However, this paper demonstrates that even such an apparently robust method of processing linguistic evidence is still problematic, and argues that contamination and bias are currently institutionally embedded in the system.
I will present the key findings of my research into the production of police interview records in England & Wales, grounded in both academic linguistic theory and professional practice. Uniquely, it includes interviews with interview transcribers at a major English police force, offering a perspective which has hitherto received scant attention, despite the enormous practical impact of their work. Indeed, a key part of this research project is to give voice and recognition to this much under-valued group of workers, whose very existence is often entirely overlooked, yet whose work holds the key to fairer representation of interviewees’ voices in the criminal justice process.
I will show how transcribers deal with aspects known from linguistic research to convey substantial meaning, but for which there are currently no standards regarding their representation in official transcripts. These include pauses, discourse markers, ‘no comment’ interviews, and transcription of video-recorded data. This is combined with linguistic analysis of authentic interviews and their official transcripts, and legal analysis of potential consequences in court of the representational choices which transcribers are tasked with making on a daily basis.
The presentation will conclude with practical recommendations as to how to improve evidential consistency in investigative interview data, thereby setting out a manifesto for the new centre for Spoken Interaction in Legal Contexts (SILC) within Aston’s Institute for Forensic Linguistics.
- How will your emergency be interpreted?
Date: 13 Nov 2020
Speaker: Joanne Traynor, Anglia Ruskin University
In this presentation, Joanne Traynor, a PhD student at Anglia Ruskin University, presented her work exploring the factors which may influence communications officers as they code and interpret police incident logs. Her experience of working in this context, combined with her mixed-methods approach, made for an extremely insightful presentation.
Joanne seeks to bring voice to the communications officers, highlighting how linguistic focus in this area – often undertaken by Conversation Analysts – does not examine how communications officers perform and interpret their role outside of the telephone calls in the police control room.
- Discursive orientations towards age in police interviews with 17- and 18-year-old suspects
Date: 8 Oct 2020
Speaker: Dr Annina Heini
This presentation was based on Annie’s recently completed PhD research, which investigated discursive manifestations of the statutory child-adult divide in police interviews with 17- and 18-year-old suspects. In the context of her police interview research, Annie is particularly interested in language ideologies in connection with age, the administration of the police caution, and the discursive role of appropriate adults in interviews with vulnerable suspects.
- Police interviewing between guidance, training, and 'as it happens': Insights from conversation analysis
Date: 27 Feb 2020
Speaker: Liz Stokoe & Charles Antaki, Loughborough University
In this presentation, Liz described her collaborative research that, across multiple institutional settings, shows how written guidance, standardized scripts and protocols are actually turned into talk. She analysed multiple datasets including police interviews with alleged suspects and victims, crisis negotiations, doctor-patient interaction, and simulated client/mystery shopper encounters. She showed that a) the empirical reality of apparently scripted talk often differs wildly, and consequentially, from the script or guidance and b) effective practice that can be identified by conversation analysis seldom finds its way into guidance. She discussed the ethical and epistemic implications of simply not knowing how apparently guided, scripted, standardized practice is manifested in real talk.
- ‘I could kill them, I said. I didn’t say I would’: Threats on trial – the role of intent in cases involving threatening communications
Date: 13 Feb 2020
Speaker: Marie Bojsen-Møller, PhD fellow, Department of Nordic Studies and Linguistics, University of Copenhagen, Denmark
Threats constitute what may be termed an illicit genre (Bojsen-Møller, Auken, Devitt & Christensen, 2019), since they are often socially and sometimes legally proscribed (Fraser 1998; Gales 2010; Muschalik 2018). The (il)legality of a threat is dependent on legislation (Solan & Tiersma 2005), and, notably, on the emphasis legislation and precedent have placed on threateners’ intent.
However, being ultimately a psychological state, intent is notoriously difficult to assess (cf. Hurt & Grant 2018), and in court, defendants may claim that they never intended to threaten. Furthermore, they can use more or less persuasive linguistic strategies to distance themselves from the language crime they are accused of committing, particularly if the wording of the threat was indirect. Indirect threats are particularly difficult to prosecute and penalize, since reasonable doubt may be raised regarding their intended meaning, possibly allowing the sender recourse to ‘plausible deniability’ (Solan & Tiersma 2005).
In this seminar, Marie Bojsen-Møller discusses her new paper, which takes as its starting point a comparison of the role of ‘intent’ (mens rea) in Danish, UK and US legislation or case law on threats:
- Danish Criminal Code, § 266
- British Offences Against the Person Act 1861, Section 16
- US ‘true threat’ case law (e.g. Watts v. United States 1969; Elonis v. United States 2015; Perez v. Florida 2017).
Bojsen-Møller further examines several Danish court cases which have threatening messages at the heart of the cases and focuses on the appeals to defendants’ intent as argued by prosecutors, defence lawyers, defendants and judges.
Q&A with Marie Bojsen-Møller
1. Is there a specific threat letter or message you came across during your research that really stood out to you? And why?
I would say the teenage girl who on Instagram wrote “I’ll be the next school shooter LMAO watch out”. That was one of the threats that stuck with me the most because there is an aspect of youth, just trying to test the limits of discourse. It’s difficult to say what their purpose really was, if it was just to be a part of a group that has a very extreme way of talking to each other or if it was the beginning of thoughts about actually doing a school shooting. We don’t know, and that really interests me because a lot is at stake. It’s the difference between ignoring someone who’s going to kill people or over-reacting and maybe creating people who are angrier because she must be angrier now that she has been in jail for a long time and she was kicked out of her high school and so on. So yeah, there’s a lot at stake. There’s a lot on the line.
2. In your opinion, what is one of the most common misconceptions laypeople (in DK) have about threats and threatening communications?
People misconceive threats, believing that the only force they have is the potential of something else happening, which is the thing they are threatening to do. Threats in themselves are also an act of violence, a linguistic act of violence. I think that’s very important to understand because it underlines a very basic thought in linguistics, that languages act. It’s not something abstract where you don’t do anything. That’s why I like to show these examples where people say “I didn’t do anything” and they did. I’m not saying they were going to do something else, but they did try to intimidate someone and that’s a crime in itself in many countries.
3. Are lawyers in DK aware of the work forensic linguists do and its benefits for the legal system?
No, not at all. Lawyers know a lot about language, we just know other aspects of it, and it’s difficult for them to understand. It’s not the same as saying that we know more about how to do a proper cross-examination or something; it’s more something about analysing the ways that language works.
- Devoted Users – the 2019 EU elections and gamification on Twitter
Date: 6 Feb 2020
Speaker: Francesco Grisolia, University of Pisa
Hybrid political campaigns can be influenced by gamification strategies, that is, the use of video game elements in non-gaming domains. In recent years, gamification has been applied to the realm of politics with the intended goal of increasing voter engagement and citizens’ participation.
In Francesco’s talk, he presented the results of collaborative research on the last EU elections campaign in the Italian Twittersphere. He focused on the online activity of Matteo Salvini, former Italian Interior Minister and leader of the League (Lega), and a specific social media contest, Vinci Salvini! (“Win Salvini!”), whose second edition was launched three weeks before the 2019 EU Elections.
He discussed the impact of this gamification strategy, clarifying how it increased the volume of Salvini’s retweeted tweets and, in turn, the visibility of his message. Francesco identified a small but particularly active group of suspect users, which he labelled devotees for their commitment and intense retweeting activity in the run-up to the elections. They share some characteristics (account creation date, type and number of followers and friends, etc.) and, most fundamentally, reveal an ambivalent nature. On the one hand, they show quasi-automated behaviour, systematically liking and retweeting any content shared by Salvini. On the other hand, they manifest distinctive human features (type of replies). Being driven by political affinity and the potential reward of a social media contest, we think they represent a peculiar type of crowdsourced political agent.
- Rhetorical moves, online grooming and the performance of offenderness
Date: 30 Jan 2020
Speaker: Emily Chiang, Aston University
One of the most unfortunate consequences of the internet is that child sexual abusers have various online channels through which to communicate with victims and other offenders. Emily’s talk demonstrated the use of move analysis (Swales, 1981; 1990) as applied to two types of online child abuse interaction and how it might assist in online police investigations. Emily first considered offenders' moves in 'grooming' conversations, revealing patterns in move use, which pointed to individual ‘styles’ of grooming. Secondly, she focused on interactions between suspected offenders and undercover police officers posing as offenders, showing how interactants' moves work towards the performance of the offender identity, and how this performance compares across the two groups.
- Non-Violent Direct Action and the Courts: Necessity, Remorse, and the Right to Protest
Date: 23 Jan 2020
Speaker: Graeme Hayes
The criminal trials of direct action protesters are in many ways extraordinary episodes within the criminal justice process. Here, the protection of philosophical belief required by the ECHR and Human Rights Act 1998 vouchsafes the sincerity of the defendants’ commitment: as a result, remorse displays are not central to mitigation decisions, while defendants may seek to justify their actions through pleas of necessity.
Yet socio-legal discussion of these issues is often absent, and discussions of necessity are characteristically highly normative, whilst practical questions of the protection of philosophical belief at trial typically focus on sentencing, at the expense of a focus on the conduct of the trials themselves. Graeme applied a forensic sociology lens to the trials of activists, to discuss how justification and excuse, remorse and recidivism are constructed, and how processes of separation and remediation are performed and imposed in the courtroom space. He applied this lens to two recent high-profile activist trials in the English courts: the trial of the Stansted 15 and the appeal trial of the Frack Free Three. In both cases, the charges brought against the activists were much more severe than those typically experienced by non-violent direct action protesters. Yet both ultimately resulted in lenient sentencing, thus apparently upholding ECHR-protected rights to freedom of speech and assembly. Through ethnographic observation and discussion of legal decisions, he argued, however, that the conditions and conduct of these trials, particularly concerning questions of remorse, necessity, duress, and good character, effectively serve to narrow rather than uphold the expression of Convention freedoms. Specifically, he argued that the emphasis on remorse in the Frack Free Three ruling, and the interpretation of necessity as duress (and its subsequent effective unavailability) in the Stansted trial, effectively force activists to divest their political beliefs as a cost of securing their liberty. As such, he argued that these prosecutions and the terms of their outcomes should be considered serious acts of the chilling of dissent.