
Modeling Lexical Diversity Across Language Sampling and Estimation Techniques

ProQuest Dissertations and Theses, 2011
Dissertation
Author: Gerasimos Fergadiotis

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER

1 LITERATURE REVIEW
    Lexical Diversity Research in Communication Disorders
    Defining Lexical Diversity
    Estimating Lexical Diversity in Language Samples
        Type Token Ratio
    Quantifying Lexical Diversity Using Sophisticated Measures
        D
        Measure of Textual Lexical Diversity
        Maas
        Moving Average Type Token Ratio
    Eliciting Language to Measure Lexical Diversity
        Effects of Language Sampling Techniques
        Context and Discourse Production
    Validity
    Statement of the Problem
        Goals of the Study

2 METHOD
    Participants
    Discourse Elicitation
        Stimuli and Instructions
        Transcription
        Estimating Lexical Diversity
    Modeling Approach
        Multi-Trait Multi-Method Approaches
            Correlated Traits – Correlated Methods
            Correlated Traits – Correlated Uniquenesses
        Modeling Level 1
        Hierarchical Factor Analysis
        Addressing Aim 2

3 RESULTS
    Preliminary Analyses
    Main Analyses
        Level 1
        Level 2
    Post Hoc Analyses

4 DISCUSSION
    Level 1
        MATTR and MTLD
        D
        Maas
    The Nature of the Method Factors
    Clinical and Research Implications: Level 1
    Level 2
        Storytelling and Eventcasts
        Procedures
        Recounts
    Clinical and Research Implications: Level 2
    Conclusions and Future Directions

REFERENCES

LIST OF TABLES

1. Transformations of TTR
2. Participants’ Demographic Information
3. Descriptive Statistics of the Untransformed Major Study Variables before the Removal of Outliers
4. Descriptive Statistics of the Number of Types, Tokens, and Type-Token Ratios for Each Type of Discourse
5. Patterns of Missing Data Before and After the Removal of Outliers for Each Type of Discourse
6. Descriptive Statistics of the Untransformed Major Study Variables after the Removal of Outliers
7. Variance-Covariance Matrix of Lexical Diversity Variables
8. Covariance Coverage
9. Solution for Model 1
10. Unstandardized Solution for Model 2
11. Standardized Solution for Model 2
12. Unstandardized Factor Loadings for Model 3
13. Standardized Factor Loadings for Model 3
14. Intercorrelations Among Discourse-Specific Factors in Model 3
15. Standardized Factor Loadings and Residual Variances for Model 4
16. Intercorrelations Among Discourse-Specific Factors in Model 4
17. Unstandardized Factor Loadings and Residual Variances for Model 4a
18. Standardized Factor Loadings and Residual Variances for Model 4a
19. Intercorrelations Among Discourse-Specific Factors in Model 4a
20. Unstandardized Factor Loadings and Residual Variances for Model 5
21. Standardized Factor Loadings and Residual Variances for Model 5
22. Intercorrelations Among Discourse-Specific Factors in Model 5
23. Unstandardized Factor Loadings and Residual Variances for Model 6
24. Standardized Factor Loadings and Residual Variances for Model 6
25. Intercorrelations Among Discourse-Specific Factors in Model 6
26. Variance Decomposition for Model 6
27. Unstandardized Factor Loadings and Residual Variances for Model 7
28. Standardized Factor Loadings and Residual Variances for Model 7
29. Variance Decomposition for Model 7
30. Model Fit for Models 3 and 6 Applied to Complete and Truncated Language Samples
31. Standardized Factor Loadings and Residual Variances for Model 6(2)
32. Intercorrelations Among Factors in Model 6(2)
33. Unstandardized Factor Loadings for Model 3(2)
34. Standardized Factor Loadings for Model 3(2)
35. Intercorrelations Among Discourse-Specific Factors in Model 3(2)
36. Standardized Factor Loadings and Residual Variances for Model 6e

LIST OF FIGURES

1. Estimating D
2. Measure of textual lexical diversity flow chart
3. Empirical type-token ratio and fitted logarithmic curve
4. The moving average type-token ratio in action
5. Toulmin’s argument structure
6. Latent Variable Modeling
7. A unidimensional measurement model of lexical diversity
8. An argument structure for LD
9. Model 1
10. Model 2
11. Model 3
12. Model 4a
13. Model 5
14. Model 6
15. Model 7
16. Model 3(2)
17. Toulmin’s argument structure for the best combination of language sampling and estimation technique

Chapter 1

Literature Review

Discourse is a naturally occurring form of communication that entails the activation and interaction of multiple interconnected cognitive and linguistic subsystems. Because of this, discourse analysis offers an opportunity to observe complex cognitive/linguistic behaviors. Further, it has the potential to allow clinicians and researchers to conduct a wide variety of analyses to understand the nature of cognitive-communicative deficits and age-related changes.

It is not surprising, then, that eliciting and analyzing language samples has been gaining prominence among clinicians and researchers. Language sample analysis has been used as a clinical tool for differential diagnosis (e.g., Fleming & Harris, 2008; Murray, 2009), as a key indicator for determining the efficacy of treatment approaches for individuals with aphasia (e.g., Cameron, Wambaugh, Wright, & Nessler, 2006; del Toro, Altmann, Raymer, Leon, Blonder, & Rothi, 2008; Rider, Wright, Marshall, & Page, 2008), and as an indicator of social validity (e.g., Ballard & Thompson, 1999).

Various content analyses are often used to evaluate the microlinguistic processes that give rise to specific discourse features. Examples include assessments of the informativeness and efficiency of a speaker’s production. The focus of this paper is on one of the most illuminative predictors of oral performance, lexical diversity (LD). LD has been defined broadly as ‘…something [related to] the range of vocabulary displayed’ in different instantiations of discourse (Durán, Malvern, Richards, & Chipere, 2004, p. 220). LD has been linked to a wide variety of variables, such as vocabulary knowledge, writing quality, school success, and general characteristics of verbal competence (Avent & Austermann, 2003; Carrell & Monroe, 1993; Grela, 2002; Ransdell & Wengelin, 2003; Verhallen & Scoonen, 1998).

Within the domain of speech-language pathology specifically, LD has been used to ask a wide range of questions in various populations. In the following section, I provide some illustrative examples of how LD has been used and why. Emphasis is placed on applications that focus on the study of language samples for research or clinical purposes within the field of communication disorders. Then, LD will be more formally defined for the purposes of this paper.

Lexical Diversity Research in Communication Disorders

Several studies have focused on whether LD can be used to help differentiate typically developing (TD) children from children with specific language impairment (SLI) (e.g., Kapantzoglou, Fergadiotis, & Restrepo, 2010; Klee, 1992; Klee, Stokes, Wong, Fletcher, & Gavin, 2004; Owen & Leonard, 2002; Thordardottir & Namazi, 2007; Watkins, Kelly, Harbers, & Hollis, 1995). For example, Owen and Leonard (2002) analyzed spontaneous language samples from play interactions and found that younger and older children with SLI differed from their age-matched peers in terms of how lexically diverse their language samples were. Klee et al. (2004) found similar results when they assessed whether Cantonese-speaking children (27-68 months old) with and without SLI differed in terms of LD. Based on their findings, they concluded that LD could be used to accurately differentiate the two groups. This finding was replicated by Klee, Gavin, and Stokes (2007).

Kapantzoglou et al. (2010) used two different tasks to elicit language samples in predominately Spanish-speaking children with and without SLI and explored the classification accuracy based on LD. They compared performance on spontaneous and retell story tasks and found that the type of language elicitation procedure influenced LD scores, which in turn may influence classification accuracy. Also, children with SLI demonstrated low LD scores regardless of the type of task, but TD children performed significantly better than the SLI group on the retell story task only.

In research with children with hearing impairment, LD has often been used as a criterion for evaluating the development of expressive language skills and their improvement after cochlear implantation. Ertmer, Strong, and Sadagopan (2002) examined the language progress of a young, profoundly hearing-impaired girl who had been fitted with a cochlear implant when she was 20 months old. Ertmer et al. used LD measures to quantify the participant’s vocabulary growth and compared her spoken output with that of normally developing children. Ertmer et al. were able to document to some extent the developmental trend of vocabulary growth after activating the cochlear implant and pointed out the need for “…longitudinal studies and age-at-implantation comparisons […] to increase understanding of the effects of early implantation on oral language development” (p. 338).

Dillon and Pisoni (2003) explored whether lexicon size, as reflected in LD scores, mediates the relationship between non-word repetition tasks and reading skills in children with cochlear implants. Their study was based on the hypothesis that children’s ability to represent phonological units separately from words emerges as a consequence of vocabulary growth. In a sample of 76 children with cochlear implants, non-word repetition significantly correlated with several reading outcome measures (partial correlations ranged from .41 to .55, controlling for age and IQ). However, when LD scores were introduced in the model, the partial correlations were reduced substantially in magnitude (.15 to .32) and were no longer significant. Based on these findings, Dillon and Pisoni argued that as children’s LD increases, the robustness of phonological representations is strengthened, which in turn influences the development of reading skills.

Further, Maner-Idrissi et al. (2009) investigated how several variables, such as age at implantation, communication mode before implantation, and school integration level, influenced the development of language skills, including LD. By videotaping and analyzing language samples from a one-year period, Maner-Idrissi et al. found that in a sample of 38 children (averaging 3.66 years) only school integration in a hearing environment impacted LD, a finding that was attributed to “peer pressure” to use spoken language. The authors concluded that certain school environments might be more conducive to language development than others.

In addition, Geers, Spehar, and Sedey (2002) investigated the development of speech and language skills of children who were enrolled in total communication programs, which make use of multiple modes of communication, after receiving cochlear implants. Emphasis was placed on identifying predictors of spoken language proficiency because they have been related to children’s educational placement in mainstream classes after cochlear implantation. Language samples were obtained that included both sign communication and spoken language. Participants who were identified as using more spoken language than sign language were found to have significantly higher LD than children who used more sign language during their interactions. Further, the former group was more likely to be placed in mainstream classrooms.

Researchers who study aphasia in adults have used measures of LD both as an index of general discourse ability and as a tool for hypothesis testing. First, LD has been used to differentiate individuals with aphasia (IWA) from neurologically intact adults (NIA). For example, Holmes and Singh (1996) analyzed conversational language samples from 100 participants, 70 IWA and 30 NIA, in terms of eight linguistic variables, including indices of LD. Their goal was to create a statistical method of assessing an individual’s lexical ability that could differentiate the two groups. The results from a discriminant analysis showed that, using these variables, 88% of the subjects were classified accurately. Further, the analysis showed that LD was one of the most important variables in terms of discriminating power. Lind et al. (2009) also noted the significance of measuring selected aspects of semi-spontaneous discourse in IWA (i.e., not conversation) and developed a battery of tools to capture clinically relevant aspects of noun and verb production.

Wright, Silverman, and Newhoff (2003) examined whether LD differed across adults with fluent and nonfluent aphasia. Wright et al. analyzed language samples from picture descriptions and manipulated length and LD estimation technique. When language sample length was not controlled, the participants with fluent aphasia yielded significantly higher LD for two out of three LD indices. When samples were truncated to be equal in length, groups differed significantly for all measures.

LD has also been used as an external criterion to investigate the validity of linguistic indices derived from different elicitation techniques. For example, McNeil et al. (2007) included measures that reflect LD and verbal productivity to explore the concurrent validity of the Story Retell Procedure (SRP; Doyle et al., 1998), a method designed to elicit language samples in IWA.

LD has also been used as a crucible for testing hypotheses. For example, Gordon (2008) explored the productive vocabulary of individuals with fluent and non-fluent aphasia in the context of the “division of labor” hypothesis (Gordon & Dell, 2003). According to this hypothesis, words vary in the extent to which they rely on semantic or syntactic contributions for production. This gives rise to the different speech patterns that fluent and non-fluent IWA demonstrate with regard to the number of function and content words they use. Based on the observed diversity of individual word classes, Gordon concluded that the results lent at least partial support to the division of labor hypothesis.

Further, Crepaldi et al. (2011) studied the disproportionate impairment of nouns and verbs in the spontaneous speech of seven IWA to examine the functional damage underlying their grammatical-class-specific impairment. Using an approach similar to Gordon’s, Crepaldi et al. concluded that their data were consistent with the idea that the noun-verb dissociation might not be as evident in spontaneous speech as it is in picture naming tasks. This finding reinforces the hypothesis that lexical access and retrieval during picture naming and during discourse production might be based on different underlying processes with little in common.

Also, LD has been used to measure the efficacy of treatments and their generalization to discourse. For example, Rider, Wright, Marshall, and Page (2008) evaluated whether training lexical items using semantic feature analysis would improve the verbal output of individuals with non-fluent aphasia. Using a multiple probes approach, they found that even though their three participants improved in terms of their confrontational naming skills, LD did not increase from pre- to post-treatment sessions.

Bucks, Singh, Cuerden, and Wilcock (2000) explored whether the lexical retrieval deficits in dementia are reflected in measures of overall range of vocabulary and whether these measures can be used to discriminate between people with a diagnosis of probable dementia and age-matched healthy adults. In their study they used several linguistic variables (including indices of LD) to assess conversational language samples from 24 participants (16 healthy adults). Based on the results, Bucks et al. concluded that it was possible to measure lexical differences between the groups that could be used to reliably differentiate them.

Building on previous work from Garrard, Maloney, Hodges, and Patterson (2005), Velzen and Garrard (2008) tracked the gradual decline in LD in three books by Gerard Reve (1923-2006), an acclaimed Dutch author who wrote his last book shortly before being diagnosed with Alzheimer’s disease. They split each book into first and second halves and estimated LD for each half. Then, using univariate analyses of variance, they found a clear drop in LD that coincided chronologically with the first reports of forgetfulness.

Defining Lexical Diversity

LD was defined earlier as ‘…something [related to] the range of vocabulary displayed’ in different instantiations of discourse (Durán, Malvern, Richards, & Chipere, 2004, p. 220). Durán et al. resorted to this definition in an attempt to reconcile decades of disagreement and confusion regarding the nomenclature and nature of LD. Part of this confusion stems from the fact that the term LD has been used in a wide range of scientific areas in which it has been conceptualized differently (e.g., forensic linguistics, stylometry, bilingualism, aphasia, assessment of first language speaking and writing; see Malvern et al., 2004, pp. 5-15 for a review). The picture is further distorted because researchers have used tools to quantify LD that focus on different aspects of it. As a result, analysis, synthesis, and generalization of findings across studies that could shed light on the nature of LD may often be problematic. Further, according to Yu (2009), confusion arises because LD has been used to characterize a person’s knowledge of vocabulary in some research areas, whereas LD has been treated as a quality of a verbal or written product in other areas. Yu pointed out that the two could be related, in the sense that a product (e.g., a book) reflects its producer’s (e.g., a writer’s) vocabulary breadth.

For the purposes of this paper, LD will be defined within Chapelle’s (1994) model of vocabulary knowledge. Chapelle, drawing from the area of applied linguistics and the work of Bachman (1990), proposed a model of vocabulary ability that consists of four dimensions. The first dimension is vocabulary size and denotes one’s breadth of lexicon that is exhibited in a specific context. The second dimension is knowledge of word characteristics, i.e., aspects of a person’s word knowledge in terms of its phonology, semantics, syntactic properties, etc. The third dimension is related to how lexical items are organized in the mental lexicon and how rich the semantic network is. The fourth dimension relates to the processes that are involved in lexical access and retrieval. These dimensions are meant to be neither orthogonal nor static; they can vary as individuals develop or as a result of events such as a cerebrovascular accident.

Within this four-dimensional space, LD aligns more closely with vocabulary size and, under certain conditions, the state of the cognitive-linguistic mechanisms that support the access and retrieval of lexical items. The definition of LD offered by Durán et al. (2004) highlights the quantitative aspect of lexical knowledge. Indeed, most researchers agree that LD reflects one’s breadth of vocabulary and thus it is more indicative of the lexical knowledge in terms of vocabulary size. LD does not reflect primarily the depth or the complexity of vocabulary knowledge expressed in Chapelle’s second and third dimensions, respectively.¹ In terms of the fourth dimension, exhibiting a range of vocabulary is contingent upon the fundamental processes associated with lexical processing. This is true with respect to both neurologically intact and neurologically impaired adults. Particularly in the latter case, LD would also reflect the extent to which the cognitive system can support access and retrieval of target lexical items in a given context.

Following Chapelle’s perspective, then, language performance is assumed to depend on both the implicit knowledge one possesses (e.g., size of lexicon) and the mechanisms that allow her/him to process it (e.g., access and retrieval). This is in agreement with the idea that knowledge of vocabulary and the capacity to demonstrate that knowledge cannot be equated (Chomsky, 1980). It is also consistent with clinical neuropsychological performance definitions of language disorders. For example, McNeil and Pratt (2001) de-emphasize the loss of language knowledge as the primary deficit in stroke-induced aphasia and instead recast it as an access deficit.

Based on these premises and for the purposes of this paper, LD_i will be defined as an individual’s capacity to deploy a diverse vocabulary by accessing and retrieving lexical items from a relatively intact knowledge base (i.e., lexicon) for the construction of higher linguistic units. The subscript i in LD_i stands for “individual”. This definition remains consistent with Chapelle’s, according to whom vocabulary ability reflects “both knowledge of language and the ability to put language to use in context” (Chapelle, 1994, p. 163; see also Nation, 2007, p. 42, for a similar view). However, it is different in two ways. First, it is tailored to LD rather than vocabulary ability (the latter is considered a superordinate construct that subsumes the former). Second, it recasts LD as a characteristic of the individual and differentiates it from LD_S, which for the remainder of this paper will refer to the LD of a given language sample that might take the form of a book, an essay, or telling Cinderella to a child (the subscript S denotes “sample”). Explicitly distinguishing LD_i from LD_S alleviates the confusion identified by Yu (2009) and allows LD_i to be conceptualized as an unobserved trait that characterizes individuals, whereas LD_S is considered a quality of a sample. Henceforth, LD with no subscript will be used to denote either one when the distinction is not critical or when an argument applies to both.

¹ It could be argued, however, that individuals who have extended vocabularies are also likely to exhibit greater word sophistication and denser semantic networks.

Estimating Lexical Diversity in Language Samples

“A review of the literature on quantifying vocabulary richness gives the sense of a quest for the Holy Grail” (Malvern et al., 2004, p. 3). This section explains why identifying a robust approach to measuring LD_S has been challenging. I will begin by considering some of the major limitations of the most commonly used measure of LD_S, the type-token ratio (TTR). First, I discuss Heaps’ law (Heaps, 1978) as it applies to linguistics and, more specifically, to the study of LD_S. According to this law, the more a speaker talks, the less probable it is that he/she will produce new words. Holding everything else constant, shorter language samples often appear to be more lexically diverse when using measures such as TTR, rendering comparisons across speakers and language samples problematic. I will discuss the assumptions that underlie truncation, one of the most widely used approaches to “salvage” TTR, and why results from this technique might be misleading. Subsequently, examples will be provided from the field of communication disorders that illustrate why interpretations based on the TTR might be biased and inconclusive. Finally, I will present four measures from the field of computational linguistics that claim to produce valid and reliable scores for LD_S and that (i) control for length effects at least to some degree, (ii) use the whole language sample to estimate a score without discarding any data, and (iii) are accompanied by some evidence for their psychometric properties.

Type-token ratio. The most obvious way to measure LD_S would be to count the number of different words (i.e., types) in a language sample. Types are the unique lexical items that are used in a language sample. For example, the sentence “The birds are playing on the branch” contains the types the, birds, are, playing, on, and branch. If samples have the same number of total words (i.e., tokens), then their LD_S could be inferred based on their respective number of types. However, when the number of tokens is not kept constant, conclusions based strictly on comparisons of the number of types might be misleading or not meaningful. Is a sample of 50 tokens that contains 40 types less diverse than a sample of 400 tokens that contains 60 types? It quickly becomes evident that unless the number of tokens is equal, the number of types reflects both LD_S and the contribution of length. That is, longer language samples would be credited with higher LD_S.
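The counting of types and tokens described above can be sketched as follows. This is an illustration only: the function name `count_types_and_tokens` is hypothetical, and splitting on whitespace and ignoring case are simplifying assumptions that a real transcription protocol would replace with explicit rules.

```python
def count_types_and_tokens(text):
    """Return (number of types, number of tokens) for a text.

    Assumption: tokens are whitespace-separated words compared
    case-insensitively; real transcripts need a proper tokenizer.
    """
    tokens = text.lower().split()
    return len(set(tokens)), len(tokens)


# The example sentence from the text: "the" occurs twice.
n_types, n_tokens = count_types_and_tokens("The birds are playing on the branch")
print(n_types, n_tokens)  # 6 types, 7 tokens

# The two hypothetical samples from the text, given as (types, tokens).
ratios = {}
for label, (types, tokens) in {"short": (40, 50), "long": (60, 400)}.items():
    ratios[label] = types / tokens
    print(label, types, tokens, ratios[label])
```

The longer sample has more types in absolute terms but a much lower proportion of types to tokens, which is why a raw type count cannot be compared across samples of unequal length.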

To overcome this obstacle, one could consider the ratio of the types divided by the tokens (TTR) to control for length. TTR has been the traditional method for measuring LD_S
