# The spatial and temporal variability of atmospheric teleconnection patterns: A continuum perspective

TABLE OF CONTENTS

List of Tables
List of Figures
Acknowledgements

Chapter 1. INTRODUCTION
  1.1 The regime versus continuum perspective of teleconnection patterns
  1.2 Conceptual framework for the continuum perspective of teleconnection patterns
  1.3 Dissertation outline

Chapter 2. CLUSTERING METHODS
  2.1 Description of the sequential training process of the SOM method
  2.2 The batch version of the SOM algorithm
  2.3 The combination of k-means cluster analysis with LUS

Chapter 3. THE NAO CONTINUUM AND THE EASTWARD SHIFT
  3.1 The SOM of Northern Hemisphere teleconnection patterns
    3.1.1 The P1 and P2 periods
    3.1.2 The P3 period
  3.2 Application of the SOM to Changes in the NAO
    3.2.1 Frequency distribution of NAO events
    3.2.2 Distinguishability between P1 and P2 composites
  3.3 Chapter summary and discussion

Chapter 4. INTRASEASONAL, INTERANNUAL, AND INTERDECADAL VARIABILITY OF THE PNA CONTINUUM
  4.1 Data and analysis methods
    4.1.1 K-means cluster analysis and linear unidimensional scaling
    4.1.2 SOM analysis of coupled OLR-SLP variability
  4.2 The continuum of North Pacific SLP patterns
    4.2.1 Intraseasonal variability of the North Pacific continuum
    4.2.2 Coupling between the North Pacific SLP patterns and tropical convection
    4.2.3 Relationship between the MJO and the North Pacific continuum
    4.2.4 Interannual variability of the North Pacific continuum
    4.2.5 Interdecadal variability of the North Pacific continuum
  4.3 Chapter summary and discussion

Chapter 5. SUMMARY AND CONCLUSIONS

REFERENCES


LIST OF TABLES

Table 3.1: Pattern correlations and Euclidean distances (hPa) between the SOM-derived hemispheric SLP anomaly field and the actual hemispheric SLP anomaly fields for each of the three periods considered. In each cell, the pattern correlation lies above the Euclidean distance.

Table 4.1: Monthly mean composite values of the PNA, WP, EP/NP, TNH, and the NP indices corresponding to the SLP patterns in Fig. 4.1. Bold values are statistically significant above the 95% confidence level for a two-sided t-test.

Table 4.2: Timescale (days) of each of the 24 North Pacific SLP patterns in Fig. 4.1.

Table 4.3: Frequencies of occurrence (%) of the North Pacific SLP patterns in Fig. 4.1 for all winter (DJFM) months, and for La Niña, neutral ENSO, and El Niño months in the winters of 1958-2005. Values in bold are significantly different from climatology above the 95% confidence level.


LIST OF FIGURES

Figure 1.1: Positive phase of the wintertime (December – February) NAO pattern of the Climate Prediction Center, where the pattern has been obtained by a rotated empirical orthogonal function analysis of the Northern Hemisphere 500-hPa geopotential height field. Regions shaded in blue (red) correspond with negative (positive) geopotential height anomalies during the positive phase, whereas during the negative phase, the sign of the anomalies is reversed. This image was obtained from the Climate Prediction Center website at http://www.cpc.ncep.noaa.gov.

Figure 1.2: As in Fig. 1.1, but for the PNA pattern. This image also has been obtained from the Climate Prediction Center website at http://www.cpc.ncep.noaa.gov.

Figure 2.1: SOM classification of 10,000 random two-dimensional data vectors with components uniformly distributed between 0 and 1. (a) Schematic depiction of nodes in a 10 x 10 SOM, with the first 3 and last 3 nodes numbered for reference. (b) Initial distribution of reference vectors corresponding to each node. (c) Broadly ordered reference vectors after 500 training iterations. (d) Fine-tuned reference vectors after an additional 50,000 iterations of training.

Figure 2.2: Fine-tuned reference vectors of a 10 x 15 SOM. The analyzed dataset consists of 10,000 random two-dimensional data vectors; the x-components are normally distributed with a mean of 0 and a standard deviation of 1, and the y-components are uniformly distributed between 0 and 1.

Figure 3.1: The 4 x 5 SOM of SLP anomaly maps, contoured at intervals of 2 hPa. Solid (dashed) lines depict positive (negative) values, with the zero contour omitted. Percentages on the bottom right of each map describe the pattern frequency for the entire period. Percentages on the top left of each map describe the pattern frequency for 1958-1977 (P1; top), for 1978-1997 (P2; middle), and for 1998-2005 (P3; bottom).

Figure 3.2: Sammon map corresponding to the patterns in Fig. 3.1. The black circles, which describe the positions of the SOM patterns, have the same general orientation as in Fig. 3.1.


Figure 3.3: SOM-derived composite SLP anomalies for (a) P1 and for (b) P2, and the actual SLP anomaly fields for (c) P1 and (d) P2. The contour interval is 0.2 hPa for (a) and (b) and 0.5 hPa for (c) and (d), with the same contouring conventions as in Fig. 3.1.

Figure 3.4: The (a) SOM-derived composite SLP anomalies and (b) actual SLP anomalies for P3, with the contour interval and contouring conventions as in Fig. 3.3.

Figure 3.5: The leading EOF of daily sea level pressure, which defines the NAO pattern. Contouring conventions are the same as in Fig. 3.1, and the contour interval is arbitrary.

Figure 3.6: The SLP anomaly composites for positive NAO events during (a) P1 and (b) P2 and for negative NAO events during (d) P1 and (e) P2. The difference between composites (P2 – P1) for positive events is shown in (c) and for negative events in (f). The contour interval is 3 hPa, and the zero contour has been omitted in all plots.

Figure 3.7: The pattern frequencies (%) for NAO events during (a) P1 and (b) P2, and (c) the difference in pattern frequencies between the two periods. The thick white line separates positive events (right) from negative events (left). Light (dark) shades within gray scale denote relatively large (small) percentages.

Figure 3.8: The SOM-derived composite SLP anomaly differences (P2 – P1) for (a) positive NAO events and (b) for negative NAO events, as constructed by (3.2), and the actual composite SLP anomaly differences for (c) positive and for (d) negative NAO events. The contour interval is 1 hPa in (a) and (b) and 2 hPa in (c) and (d). All other contouring conventions are identical to those of Fig. 3.1.

Figure 4.1: The SLP anomaly patterns obtained by k-means cluster analysis, contoured at intervals of 2 hPa. Solid (dashed) lines depict positive (negative) values, with the zero contour omitted. The pattern number is displayed in bold below each pattern, and the percentages at the bottom right of each map describe the pattern frequency for the period of 1958-2005.


Figure 4.2: The 4 x 6 SOM of coupled OLR-SLP anomaly patterns, where the OLR anomaly pattern lies below the associated SLP anomaly pattern. The contouring convention is the same as in Fig. 4.1, but the contour interval is 1 hPa for the SLP anomaly plots and 5 W m⁻² for the OLR anomaly plots. In addition, OLR anomalies below -10, -25, and -50 W m⁻² are shaded in light, medium and dark blue, and OLR anomalies above 10, 25, and 50 W m⁻² are shaded in yellow, orange, and red. (Note that the longitude ranges are 120°E to 120°W for the SLP plots and 20°E to 60°W for the OLR plots.) In the SLP anomaly plots, land is shaded in gray. The percentage at the bottom right of each OLR pattern describes the frequency of occurrence of the coupled pattern for the period of 1974-2005.

Figure 4.3: December-March composite OLR anomalies for each phase of the MJO. The contouring convention is the same as in Fig. 4.1, but the contour interval is 5 W m⁻². Shading levels denote OLR anomalies less than -5, -15, and -25 W m⁻², respectively, and stippling levels denote OLR anomalies greater than 5 and 15 W m⁻², respectively.

Figure 4.4: The anomalous frequencies of occurrence for the SLP patterns in Fig. 4.1 for each of the 8 phases of the MJO. In each panel, the colored plot corresponds to the anomalous frequency of occurrence (%) for each pattern (numbered as in Fig. 4.1 on the ordinate) as a function of lag (days) with respect to the onset day. The plot on the right side of each panel depicts the patterns and lags for which the frequency anomalies are statistically significant above the 95% (gray) and 99% (black) significance levels.

Figure 4.5: The wintertime frequency of occurrence time series for each pattern depicted in Fig. 4.1. The colors describe the frequency of occurrence of each pattern for each winter season between 1958 and 2005, where a five-year moving average has been applied to the frequency time series of each pattern. Five separate periods, P1 – P5, are identified by the thin vertical black lines. The dashed white line corresponds to the Pacific regime shift at the beginning of 1977.

Figure 4.6: The analysis-derived SLP anomalies (a, c, e, g, i; see text for details) and the actual SLP anomalies (b, d, f, h, j) for the periods P1 (a, b), P2 (c, d), P3 (e, f), P4 (g, h), and P5 (i, j). The contour interval is 0.1 hPa for the analysis-derived composites on the left side and 0.2 hPa for the actual composites on the right side. The zero contour has been omitted in each plot.


ACKNOWLEDGEMENTS

I offer my sincerest thanks to my Ph.D. committee, Steven Feldstein, Hans Verlinde, Jerry Harrington, Eugene Clothiaux, Nels Shirer, and David Pollard. I especially appreciate the mentorship from Steven Feldstein over the past few years, and I am so grateful for the various forms of guidance from the rest of the committee throughout my time as a graduate student. I greatly appreciate the time I have spent with my officemates, Steve Greenberg, Victor Yannuzzi, Lindsay Sheridan, Zach Lebo, Ethan Davis, Eric Wertz, and Alex Avramov. I am indebted particularly to Alex for the generous offering of his time and effort to guide me as I began my research. I thank Chad Bahrmann for the technical support and particularly for the good-natured banter that I have enjoyed on a daily basis. I also offer my sincere appreciation to the professors in the Department of Meteorology, particularly Yvette Richardson, Sukyoung Lee, Paul Markowski, and several members of my Ph.D. committee, for giving so much of their time and effort through their teaching to benefit my education as a graduate student. I will greatly miss all of you when I leave Penn State University, although I am thankful that I will keep such wonderful connections. I appreciate the many contributions that Steven Feldstein has made to the research that I present in this dissertation, particularly in Chapters 3 and 4. I also thank two anonymous reviewers who offered valuable input to the discussion presented in Chapter 3. Finally, I offer my gratitude for support through the Office of Science (BER), U.S. Department of Energy, Grant No. DE-FG02-05ER64058 and through the National Science Foundation, Grant No. ATM-0649512.


Chapter 1

Introduction

Low-frequency variability in the Northern Hemisphere geopotential height field is often described in terms of recurrent, persistent teleconnection patterns (e.g., Wallace and Gutzler 1981; Barnston and Livezey 1987), where a teleconnection may be defined as a “simultaneous significant temporal correlation” in the geopotential height field between two widely separated locations (van den Dool 2007). The North Atlantic Oscillation (NAO) and the Pacific/North American (PNA) patterns are considered to be the two dominant teleconnection patterns in the Northern Hemisphere. The dominance of these two patterns extends from weekly (Feldstein 2000) to interannual timescales (e.g., Barnston and Livezey 1987; Kushnir and Wallace 1989). In addition, both patterns occur in all seasons, although each is most prominent during the winter months (Barnston and Livezey 1987). The NAO features meridional fluctuations in atmospheric mass in the North Atlantic and Arctic regions, with centers of action over Iceland and the subtropical Atlantic that extend from the surface through the depth of the troposphere (e.g. Hurrell 1995). During the positive phase of the NAO (Fig. 1.1), a stronger-than-average Icelandic low lies to the north of a stronger-than-average subtropical Atlantic high, resulting in above average westerly winds in the North Atlantic basin and a more northerly Atlantic storm track. In contrast, the negative phase of the NAO features a weaker-than-average Icelandic low that lies to the north of a weaker-than-average subtropical Atlantic high, which results in below average westerly winds in the North


Figure 1.1: Positive phase of the wintertime (December – February) NAO pattern of the Climate Prediction Center, where the pattern has been obtained by a rotated empirical orthogonal function analysis of the Northern Hemisphere 500-hPa geopotential height field. Regions shaded in blue (red) correspond with negative (positive) geopotential height anomalies during the positive phase, whereas during the negative phase, the sign of the anomalies is reversed. This image was obtained from the Climate Prediction Center website at http://www.cpc.ncep.noaa.gov.

Atlantic basin, and a more southerly Atlantic storm track. Both phases of the NAO have significant weather and climate impacts over North America, the North Atlantic, Europe, and Asia (Hurrell 1995). Whereas the NAO dominates the geopotential height variability in the North Atlantic basin, the PNA dominates in the North Pacific and North American sector. The PNA (Fig. 1.2) encompasses four centers of action in the mid- and upper-tropospheric height field, two centers of the same sign over the North Pacific and southeast United States and two centers of the opposite sign near Hawaii and along the west coast of North America (Wallace and Gutzler 1981). The surface signature of the PNA, however, is largely confined to the North Pacific in the vicinity of the Aleutian Islands (Wallace and Gutzler 1981). The more regional confinement of the surface PNA contrasts with the NAO, for which both centers of action are prominent at the surface. Like the NAO, the


Figure 1.2: As in Fig. 1.1, but for the PNA pattern. This image also has been obtained from the Climate Prediction Center website at http://www.cpc.ncep.noaa.gov.

PNA has significant weather and climate impacts over much of the Northern Hemisphere, particularly over East Asia, the North Pacific, and North America.

1.1 The regime versus continuum perspective of teleconnection patterns

Previous works have suggested at least two interpretations for Northern Hemisphere teleconnection patterns. Traditionally, investigators have treated such patterns as discrete, recurrent regimes or modes of variability. This interpretation has resulted in their description as “standing oscillations with geographically fixed nodes and antinodes” (Wallace and Gutzler 1981). Alternatively, a few studies have suggested the existence of a continuum of teleconnection patterns rather than a small number of geographically fixed standing oscillations. In support of the continuum perspective, Kushnir and Wallace (1989) find that teleconnection patterns obtained in rotated empirical orthogonal function (EOF) analysis often occur in orthogonal pairs with


comparable amplitudes. This finding suggests that, rather than interpreting the loading vectors as discrete modes of variability, we instead may interpret the loading vectors as basis functions that describe a continuum of patterns. Franzke and Feldstein (2005) elaborate on this perspective by investigating the dynamical processes associated with a continuum of teleconnection patterns defined by a set of nonorthogonal basis functions. In providing additional support for the continuum perspective, they find that most members of the continuum have variance, autocorrelation time scales, and dynamical characteristics that are intermediate between those of the constituent basis functions. Although the continuum perspective provides an attractive framework for describing teleconnection patterns, such a framework also introduces constraints on the interpretation of patterns obtained by conventional means such as EOF analysis. For example, if we adopt the perspective suggested by Kushnir and Wallace (1989), then we must consider linear combinations of loading vectors to describe teleconnection patterns most accurately. A compact method of illustrating the continuum of patterns is the use of joint probability distribution functions (e.g., Kimoto and Ghil 1993; Franzke and Feldstein 2005). The drawbacks of this method, however, include the limitation to only two basis functions in a single plot, which means that the two-dimensional phase space may explain only a relatively small fraction of the total variance. In addition, physical interpretation becomes challenging when visualizing probability distributions in phase space rather than patterns in physical space. In order to circumvent these challenges, this dissertation presents two alternative methods, self-organizing maps (SOMs; Kohonen 2001) and k-means clustering (e.g. Anderberg 1973; Michelangeli et al. 1995) combined with linear unidimensional scaling (LUS; Hubert and Arabie 1986), for the purpose of


visualizing the continuum of Northern Hemisphere teleconnection patterns. These clustering approaches, which are described more thoroughly in the following section and in Chapter 2, provide a means for visualizing the distribution of large-scale circulation patterns, yet treat the data as a continuum (Hewitson and Crane 2002; Reusch et al. 2007).

1.2 Conceptual framework for the continuum perspective of teleconnection patterns

If we adopt a continuum perspective of teleconnection patterns, then each dominant pattern actually comprises a continuum of similar patterns, each with a unique probability of occurrence. For example, if we focus on the NAO for the purpose of illustration, then we might consider the NAO pattern as a weighted mean of all NAO-like patterns. If we describe each NAO-like sea level pressure (SLP) pattern as an n-dimensional vector m, where n describes the number of grid points in the spatial domain, then we may express the mean NAO pattern as

$$\overline{\mathbf{m}} = \int p(\mathbf{m})\,\mathbf{m}\,d\mathbf{m}, \tag{1.1}$$

where p(m) is the probability density function of m, dm is a shorthand notation for the n-dimensional volume differential in phase space, and the integral is taken over the complete phase space of m. Because we cannot examine all members of the continuum, we may choose a finite number of representative patterns to describe this distribution of NAO-like patterns. If we choose K representative patterns, where K is sufficiently large, and denote each of these patterns as $\mathbf{m}_c^*$, then we may choose each $\mathbf{m}_c^*$ in a way that allows us to express a discretized form of (1.1) by

$$\overline{\mathbf{m}} = \sum_{c=1}^{K} \mathbf{m}_c^*\, p(\mathbf{m}_c^*). \tag{1.2}$$

We may interpret $p(\mathbf{m}_c^*)$ as the probability of occurrence of the NAO-like pattern $\mathbf{m}_c^*$. If we extend (1.2) from phase space to physical space, then we may express the mean, two-dimensional NAO pattern as

$$\overline{m}(x,y) = \sum_{c=1}^{K} m_c^*(x,y)\, p(m_c^*), \tag{1.3}$$

where x and y denote positions in the physical domain. Both the SOM and combined k-means/LUS methods provide a means for describing a continuum of patterns, as in (1.1), by a discrete number of representative patterns such that (1.2) and (1.3) may hold approximately (the method for determining the $\mathbf{m}_c^*$ is presented in Chapter 2). Of course, the use of (1.1), (1.2), and (1.3) may apply more broadly to other teleconnection patterns and to other meteorological fields.

In practical applications, we must rely on a set of discrete data samples to determine the representative patterns and the corresponding probability or frequency of occurrence. Both the SOM and k-means clustering algorithms determine the representative patterns by attempting to maximize the similarity between $\mathbf{m}_c^*$ and the nearby samples in the phase space, as explained below. In k-means cluster analysis, each $\mathbf{m}_c^*$ equals the mean of the nearby data samples, thus defining a group centroid, whereas in SOM analysis the use of the neighborhood kernel (see Chapter 2) may result in each $\mathbf{m}_c^*$ only approximating the mean of the nearest data samples in phase space.

In addition, both the SOM and k-means methods capture the distribution of the data in that the algorithms tend to minimize the average differences between the samples


in the dataset, i.e., the daily SLP fields, and the nearest representative patterns in phase space. If we describe each SLP field in the dataset as an n-dimensional vector z, then we may express the difference between z and the best-matching representative pattern $\mathbf{m}_c^*$ in terms of Euclidean distance, $\lVert \mathbf{z} - \mathbf{m}_c^* \rVert$.¹ Thus, we may determine the best-matching representative pattern $\mathbf{m}_c^*$ for a particular sample z with

$$\lVert \mathbf{z} - \mathbf{m}_c^* \rVert = \min_i \left\{ \lVert \mathbf{z} - \mathbf{m}_i^* \rVert \right\}, \tag{1.4}$$

where i ∈ {1, …, K}, K is the number of representative patterns, and the subscript c denotes the pattern index for which the Euclidean distance $\lVert \mathbf{z} - \mathbf{m}_i^* \rVert$ attains a minimum. The k-means and SOM algorithms tend to minimize the average distance between each sample in the dataset and the best-matching representative pattern. We describe this optimization as a minimization of

$$E = \frac{1}{S} \left( \sum_{j=1}^{S} \lVert \mathbf{z}_j - \mathbf{m}_c^* \rVert \right), \tag{1.5}$$

where E is termed the average quantization error (Kohonen 2001), and S describes the number of samples in the dataset.² The use of (1.4) and (1.5) indicates that each sample in the dataset becomes associated with a similar representative pattern in phase space. Therefore, we may classify each sample to a group defined by $\mathbf{m}_c^*$, as is typical in cluster analysis. In

¹ If the grid point density within the SLP fields is variable, then each entry in z must be weighted in proportion to the square root of the grid box area to achieve the correct distance calculation. For the data in this dissertation, where the area of each grid box is proportional to the cosine of latitude, we may express z as $\mathbf{z} = \mathbf{A}\mathbf{z}_{uw}$, where $\mathbf{z}_{uw}$ is the unweighted SLP field, and A is an n x n diagonal matrix containing the square root of the cosine of latitude at the n grid points.

² The optimization described here holds precisely for k-means cluster analysis, as discussed in Chapter 2, but only approximately for SOM analysis. A precise optimization criterion for the SOM is difficult to define (Kohonen 2001).


addition, we may determine the frequency of occurrence of $\mathbf{m}_c^*$, or $f(\mathbf{m}_c^*)$, by calculating the ratio of samples associated with $\mathbf{m}_c^*$ to S, the total number of samples in the dataset. We then may use $f(\mathbf{m}_c^*)$ as an estimate for $p(\mathbf{m}_c^*)$ in (1.2) and (1.3).

The above discussion indicates that, given a specified number of representative patterns, the SOM and k-means cluster analysis methods tend to maximize the similarity between the representative patterns and the daily SLP fields in the dataset. In addition, the SOM and k-means patterns at least approximate the mean of all samples within a cluster, so that we can view each SOM or k-means clustering pattern as a composite of relatively similar daily SLP fields within a local cluster. These characteristics of SOM and k-means cluster analysis provide reasonable assurance that each representative pattern corresponds with a physical pattern to some degree. This characteristic provides an advantage of SOM and k-means cluster analysis over EOF analysis, because EOFs may or may not represent physical patterns. The degree of similarity between the representative patterns and the daily SLP fields depends on the number of representative patterns: as we increase the number of representative patterns, the daily and representative patterns attain a greater degree of similarity. Correspondingly, the average quantization error decreases.

A significant advantage of SOM and k-means cluster analysis over other methods is that with the two methods described here, we assume that the data are continuous, and so these methods allow the representative patterns to span the phase space even when the data are not highly clustered. Chapter 2 provides two examples of SOM analyses with artificial, unclustered data.

The SOM method most significantly distinguishes itself from other clustering methods, including k-means clustering, through its topological ordering of the


representative patterns. The patterns become organized on a regular, two-dimensional grid such that the SOM assigns similar representative patterns to nearby locations on the grid and dissimilar patterns to widely separated locations on the grid. Ultimately, the SOM method allows us to visualize the full set of representatives that describe patterns in physical space on a two-dimensional display that describes the complete phase space of the system.

Because the k-means clustering algorithm, unlike SOM analysis, does not organize the representative patterns in any particular manner, we may combine k-means clustering with LUS to achieve an ordering whereby similar patterns tend to be grouped together and dissimilar patterns tend to be widely separated. In contrast with SOM analysis, however, the combination of k-means clustering with LUS organizes the patterns along a line rather than in a two-dimensional grid. Chapter 2 provides a more thorough description of the combined k-means clustering/LUS approach. In Chapter 4, we combine k-means clustering with LUS to illustrate the continuum of North Pacific SLP patterns. We choose the k-means clustering/LUS approach instead of SOM analysis in order to describe the temporal evolution of the frequency of occurrence of each North Pacific pattern, as is described more thoroughly in Chapter 4.
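As an illustration of the quantities defined in this section, the following Python sketch applies a basic k-means (Lloyd) iteration to synthetic fields and evaluates the best-matching pattern of (1.4), the average quantization error of (1.5), the frequencies of occurrence, and the discretized mean of (1.2), together with the square-root-of-cosine-latitude weighting described in the first footnote. It is a minimal sketch under stated assumptions, not the code used in this dissertation: the synthetic data, the 4 x 5 grid, every function name, and the choice of K = 9 are hypothetical, and the LUS ordering step is not included.

```python
import numpy as np


def area_weights(lats_deg):
    """sqrt(cos(latitude)) weights, i.e. the diagonal of A in z = A z_uw."""
    return np.sqrt(np.cos(np.deg2rad(lats_deg)))


def best_matching(z, patterns):
    """Eq. (1.4): index c and distance ||z - m_c*|| of the nearest pattern."""
    # dist[j, i] is the Euclidean distance between sample j and pattern i
    dist = np.linalg.norm(z[:, None, :] - patterns[None, :, :], axis=2)
    return dist.argmin(axis=1), dist.min(axis=1)


def kmeans(z, K, max_iter=200, seed=0):
    """Basic Lloyd iteration (illustrative stand-in, assumed for this sketch)."""
    rng = np.random.default_rng(seed)
    patterns = z[rng.choice(len(z), size=K, replace=False)].copy()
    labels = None
    for _ in range(max_iter):
        new_labels, _ = best_matching(z, patterns)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for i in range(K):
            members = z[labels == i]
            if len(members) > 0:
                patterns[i] = members.mean(axis=0)  # group centroid
    return patterns, labels


def quantization_error(z, patterns, labels):
    """Eq. (1.5): average distance between samples and their patterns."""
    return np.linalg.norm(z - patterns[labels], axis=1).mean()


def frequencies(labels, K):
    """f(m_c*): fraction of the S samples assigned to each pattern."""
    return np.bincount(labels, minlength=K) / len(labels)


def synthetic_fields(S=500, seed=1):
    """Hypothetical 'daily SLP anomalies' on a 4 x 5 grid: a continuum
    spanned by two smooth patterns plus weak noise (made-up data)."""
    rng = np.random.default_rng(seed)
    lats = np.repeat([30.0, 45.0, 60.0, 75.0], 5)  # latitude of each grid point
    lons = np.tile(np.linspace(0.0, 80.0, 5), 4)   # longitude of each grid point
    p1 = np.sin(np.deg2rad(2 * lats))
    p2 = np.cos(np.deg2rad(4 * lons))
    amps = rng.normal(size=(S, 2))
    z_uw = amps[:, :1] * p1 + amps[:, 1:] * p2 + 0.1 * rng.normal(size=(S, 20))
    return z_uw, lats


z_uw, lats = synthetic_fields()
z = z_uw * area_weights(lats)      # same as applying the diagonal matrix A
K = 9
patterns, labels = kmeans(z, K)
f = frequencies(labels, K)
mean_pattern = f @ patterns        # Eq. (1.2) with f used in place of p
print("E =", quantization_error(z, patterns, labels))
print("max |weighted mean - sample mean| =",
      np.abs(mean_pattern - z.mean(axis=0)).max())
```

Because each k-means pattern in this sketch is the centroid of its assigned samples, the frequency-weighted sum of the patterns should reproduce the sample mean to within rounding error, which is the discrete counterpart of (1.1). For a SOM, where each pattern only approximates the mean of its nearest samples, the same sum would be expected to match only approximately.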

1.3 Dissertation outline

The main purpose of this dissertation is to demonstrate how the continuum perspective, as illustrated with SOM and k-means cluster analyses, offers a much simpler framework for understanding the spatial and temporal variability of Northern Hemisphere


teleconnection patterns than the more commonly used discrete modal approach. In particular, the approach adopted here shows that both the spatial and temporal variability of the NAO and PNA occur through changes in the frequency distribution within the continuum of NAO-like and PNA-like patterns. In the case of the NAO, we use this perspective in Chapter 3 to illustrate the changes associated with the eastward shift of the NAO (Hilmer and Jung 2000) that began in the 1970s. For the PNA, we use this perspective in Chapter 4 to investigate the intraseasonal, interannual, and interdecadal variability of the continuum of North Pacific SLP patterns, which, as is demonstrated, primarily consists of the continuum of PNA-like patterns.

The remainder of this dissertation is organized as follows. Chapter 2 offers a thorough description of the SOM, k-means clustering, and LUS methods. Chapter 3 discusses the aforementioned eastward shift of the NAO and illustrates how changes in the frequency distribution of NAO-like patterns within a SOM may capture this eastward shift. As mentioned above, Chapter 4 investigates the intraseasonal, interannual, and interdecadal variability of the PNA continuum. In addition, with the use of a SOM analysis of coupled SLP and outgoing longwave radiation data, Chapter 4 explores the relationship of these North Pacific SLP patterns with convection in the tropical Indo-Pacific region. Finally, Chapter 5 provides a summary and conclusions.