1 Introduction

Modal auxiliaries have long been central to the study of language change and variation, yet nonstandard forms of core modals have mostly been overlooked by researchers. While there have been studies on the use of informal semi-modals (cf. Krug 2010; Mair 2015), no comparably extensive research has been done on informal core modals. For this reason, I chose to analyze the use of the standard forms should have, would have and could have as well as the corresponding informal forms shoulda, woulda and coulda. Furthermore, this study examines these modals across two varieties, namely British and American English. Considering that the ‘standard’ language of these varieties is rather well established, it is particularly interesting to analyze the use of relatively unstable nonstandard forms. Informal modals are mostly found in colloquial speech and texts, whereas standard forms are predominantly used in more formal contexts:

(1) I shoulda seen it coming.
(2) I coulda helped you.
(3) She woulda gone to the movies with us, but she is busy tonight.
(4) He should have listened to you better; then you could have avoided a fight.
(5) We would have invited you, but we thought you were on vacation.

In Section 2, I will provide background information on this topic and describe previous studies on modal auxiliaries and nonstandard language in Great Britain and the United States. Details about the selected corpus as well as an overview of the data, its collection and statistics will be given in Section 3. The analysis of the collected data and its results are discussed in Section 4. Section 5 concludes this paper with a summary of the findings and closing remarks. The aim of this study is to determine to what extent US-Americans prefer formal or informal verb forms (such as should have or shoulda) and to compare this preference with that found in Great Britain, in order to assess the degree of acceptance of informal language use in both countries.

2 Background

Modal auxiliaries are of central importance in the English language. In a previous study, Krug (2010: 23f.) observes that half of the 30 most frequently used verbs in British English are modal auxiliaries. This study focuses on the commonly used modals could have, should have and would have, all of which appear among these 30 most frequent verbs (although without the nonfinite have). According to Freudinger (2017: 320), shoulda, woulda and coulda are “shortened forms of the modal auxiliaries should, would, and could together with the verb have in the auxiliary function”. Unfortunately, there is no uniform definition of modal auxiliaries. Merriam-Webster.com broadly characterizes a modal auxiliary as “an auxiliary verb [...] that is characteristically used with a verb of predication and expresses a modal modification and that in English differs formally from other verbs in lacking -s and -ing forms”. Quirk et al. (1985: 137) classify could, should and would as central modals, which are therefore regarded as ‘full’ auxiliaries. Economy is a highly significant principle in language change; the contracted forms shoulda, woulda and coulda thus follow the sub-principles of minimal effort and easy articulation (cf. Krug 2001: 313).

The process by which frequently occurring sequences become single units, such as could have being reduced to coulda, is called chunking (cf. Krug 2001: 321), which can in turn be considered part of the simplification process in language change. Especially with modals, there is a tendency in spoken English to delete consonant chunks around word boundaries and replace them with an -a, which results from the general preference for weak forms in speech (cf. Bloomer 1998: 223).

A study by Leech (2012: 71ff.) determined that the use of core modals in general is declining. This applies to both American and British English, where the decline is occurring at a similar rate. The decreasing numbers may be due to semi-modals becoming more popular than core modals, especially in American English, which is considered highly influential for language change (cf. Leech 2012: 79). For this reason, these changes have not yet affected British English to the same extent, as colloquialization, particularly an increased use of semi-modals, is most prevalent in AmE. Leech (2012: 239) describes colloquialization as “writing becoming more like speech” and as a trend in both written and spoken English. Furthermore, the ongoing process of grammaticalization, in which elements of vocabulary gradually become classified as grammatical forms, can be observed in the form of nonstandard spellings and ellipsis (cf. Svartvik and Leech 2010: 206). Yet it is more visible in spoken language, as written English takes longer to adopt new grammatical elements (cf. Leech 2012: 237). The current development of Americanization is especially notable in the follow-my-leader pattern, where “one variety, moving in the same direction as the other, takes the lead, which the other follows” (Leech 2012: 253). This pattern can be applied to the use of shortened forms such as coulda, which are more frequent in American English than in British English, and it indicates a leadership position of the former. Other varieties, particularly BrE, follow changes in the English language that first occur in the American variety (cf. Leech 2012: 253f.). The influence of American English is not restricted to language but can rather be described as sociolinguistic, as the dominance of Americanism is prevalent in various aspects of another country’s culture (cf. Schneider 2013: 51f.). Baker (2017: 237) describes the process of Americanization as the “dominance of American English in terms of being at the forefront of change at the grammatical level” and points out that the term ‘leader’ implies that someone is following that leadership. This is a valuable notion, as the US is known for presenting itself as a forerunner, but it also raises the question of how far the UK wants to follow this lead.

Since this study focuses on standard and nonstandard forms, it should be mentioned that a “standard” American English comparable to British RP does not exist, because the formality of a situation and the speaker’s educational status are linked to regional origin (cf. Schneider 2013: 81). However, there is a general understanding regarding modal auxiliaries that woulda, shoulda and coulda are considered informal. In contrast, nonstandard grammar is usually socially marked and does not rely on regional markers (cf. Schneider 2013: 82f.).

When comparing nonstandard forms of modals, it is worth mentioning a study by Freudinger (2017), which also examined shoulda, coulda and woulda in British and American English. Freudinger (2017: 336) concluded that these nonstandard forms have been stable since the 1980s and have reached a certain level of acceptance. Additionally, Freudinger (2017: 336) distinguishes between canonical and non-canonical forms on the basis of frequency: the higher the frequency of a form, the more canonical it is (cf. Freudinger 2017: 322f.). In that study, the frequency of standard and nonstandard modals was tested across several corpora, including GloWbE. Overall, standard forms were significantly more frequent and thus more canonical. Furthermore, Freudinger (2017) found that nonstandard forms of these modal auxiliaries are more common in American English. Regarding register, shoulda, coulda and woulda are mostly used in fictional writing, particularly in fictional dialogue, which mirrors spoken language, whereas the standard forms predominantly appear in narrative sequences (cf. Freudinger 2017: 329f.). When examining the context, Freudinger (2017: 331f.) also observed that the nonstandard expressions are sometimes used non-canonically without subjects or lexical verbs. In contrast, full forms are generally used with a subject and are followed by a lexical verb since they function as auxiliaries. Additionally, the word class membership of the informal forms was analyzed. Freudinger (2017: 333f.) concluded that the expression ‘shoulda, coulda, woulda’ occasionally functions as a noun, interjection or adjective. This is particularly relevant as this paper will analyze the collocations of these nonstandard forms. An idiomatic use may offer valuable clues as to the instances in which speakers prefer informal forms and whether there is an underlying meaning beyond the forms’ general denotation.

Overall, modal auxiliaries are a complex field of study, which has been heavily researched over time and offers extensive insight into language change and variation. Yet nonstandard forms have not been at the center of research; this study therefore aims to shed more light on their role in the English language. This paper follows up on several studies that have also used British and American English as subjects of their research, since there is a substantial amount of data available and both varieties have a dominant status in the English language.

3 Methods and Data

For this study, the two-billion-word Corpus of Global Web-based English (GloWbE) was selected. This corpus consists of written texts only, 60% of which were extracted from informal blogs, while the remaining 40% come from other, more formal web-based materials. Altogether, data for 20 different varieties, with six Inner Circle and 14 Outer Circle countries, is available for research. The US-American and British sub-corpora each contain around 387 million words and are therefore comparable in size. GloWbE is considered one of the largest corpora available for these varieties; the well-known British National Corpus, for example, contains only around 100 million words, and the Corpus of Contemporary American English comprises 560 million words (cf. Davies). Because only the US and British sub-corpora contain enough data to yield a meaningful number of results for each variable, they were chosen for this study. Furthermore, GloWbE was released in 2013 and consequently contains recent data. This is particularly useful for this study, as the informal modal verb forms are considered a modern phenomenon. Unfortunately, GloWbE does not provide any further information on the demographic characteristics of the writers. In addition, there is a potential risk that writers assigned to a specific variety may not be native speakers of this variety, although the creators state that they ensured the texts were produced by speakers of the respective varieties. Even though this corpus contains written content only, the data from more informal websites, such as blogs, is not a disadvantage since this study focuses on informal language. In fact, “GloWbE is the ideal corpus for an analysis of regional lexical features” (Loureiro-Porto 2017: 460). Considering that the goal of the study is to find out to what extent Americanization is occurring in Great Britain, it is particularly valuable that this corpus is more likely to contain Americanisms, as it features recent data from the web, which is known for adopting new trends fairly quickly (cf. Loureiro-Porto 2017: 463).

Since it is not possible to examine all three variables at the same time, each one was analyzed individually. First, the variable was entered into the standard query and the results were filtered by distribution. Afterwards, each variable was evaluated by number of hits and frequency per million words in the respective category (Great Britain or US) and then compared to the frequency per million words of its standard form (e.g. shoulda and should have). Following this step, the hits in each category were analyzed in greater detail by evaluating the collocations of each variable. The collocation span was set to five words to the left and right, with the MI-Score selected as the association measure. The Mutual Information value indicates effect size, which is preferable in this case because it captures how strongly two words are attracted to each other and highlights more idiomatic co-occurrences. Since idiomatic speech is a major part of colloquial, nonstandard language, the MI-Score can be regarded as a valid choice for this study. One downside of this measure, however, is its tendency to produce misleadingly high scores for low-frequency words because it ignores how much evidence there is for a collocation (cf. Hunston 2002). As a result, the minimum collocation frequency was set to five and only results with a score ≥ 3 were taken into account, in order to exclude inflated scores for low-frequency collocates.
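
The two calculations described above can be illustrated with a minimal Python sketch. It assumes the textbook definition of Mutual Information (the base-2 logarithm of observed over expected co-occurrences within the span); the exact formula used by the corpus interface may differ. The function names and the collocate figures are hypothetical, while the hit counts (94 and 259) and the sub-corpus size of roughly 387 million words are the approximate values reported in this section.

```python
import math


def per_million(hits: int, corpus_size: int) -> float:
    """Normalize a raw hit count to a frequency per million words."""
    return hits / corpus_size * 1_000_000


def mi_score(cooccurrences: int, node_freq: int, collocate_freq: int,
             corpus_size: int, span: int) -> float:
    """Mutual Information as log2(observed / expected).

    Assumption: the expected count is the chance co-occurrence of node
    and collocate within a window of `span` words.
    """
    expected = node_freq * collocate_freq * span / corpus_size
    return math.log2(cooccurrences / expected)


def keep_collocate(cooccurrences: int, score: float,
                   min_freq: int = 5, min_mi: float = 3.0) -> bool:
    """Apply the two thresholds: collocation frequency >= 5 and MI >= 3."""
    return cooccurrences >= min_freq and score >= min_mi


SUBCORPUS_SIZE = 387_000_000  # approximate size of each sub-corpus
SPAN = 10                     # five words to the left plus five to the right

# Average hit counts of the informal forms (GB: 94, US: 259)
print(per_million(94, SUBCORPUS_SIZE))
print(per_million(259, SUBCORPUS_SIZE))

# Hypothetical collocate: 8 co-occurrences, 50,000 occurrences overall
score = mi_score(8, 259, 50_000, SUBCORPUS_SIZE, SPAN)
print(score, keep_collocate(8, score))
```

The frequency threshold is checked separately from the MI threshold because a collocate with very few co-occurrences can still reach a high MI value, which is the weakness of the measure noted above.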

The downloaded text files were then coded into tables and edited to exclude punctuation marks and other symbols. In addition, the register and lexical category of each collocation were added as further comparison measures. The latter should provide a more linguistic analysis of the results, while the MI-Score provides a more mathematical viewpoint. The number of collocations per table ranges from four to 18. This low number is due to the total hits for each variable, which are very low for the British variety (94 on average), whereas the number of hits for the US-American variety is noticeably higher (259 on average). Compared to the overall size of the respective sub-corpora, however, these numbers are rather small. This supports the aforementioned point that only GloWbE is large enough for a meaningful analysis of these variables.
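
As a rough illustration of the cleaning step, the following Python sketch removes collocates that consist only of punctuation marks or other symbols. The row layout (collocate, frequency, MI score) and the sample values are hypothetical and do not reproduce the tables used in this study.

```python
def is_word(token: str) -> bool:
    """Return True if the token contains at least one letter or digit."""
    return any(ch.isalnum() for ch in token)


def clean_collocates(rows):
    """Keep only rows whose collocate is a word rather than a symbol."""
    return [row for row in rows if is_word(row[0])]


# Hypothetical rows: (collocate, frequency, MI score)
raw_rows = [
    ("seen", 12, 4.1),
    (",", 40, 3.2),
    ("--", 6, 3.5),
    ("known", 7, 5.0),
]

cleaned = clean_collocates(raw_rows)
print(cleaned)
```

Register and lexical category would still have to be coded manually for each remaining row, since they depend on the context of the individual hits.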



