
Credit card fraud detection using artificial neural networks tuned by genetic algorithms

Dissertation
Author: Carsten A. W. Paasch
Abstract:
Credit card fraud is a major problem in the financial industry. It is responsible for billions of dollars in losses per annum globally. This work develops a methodology and resulting system prototype for fraud detection on credit card transaction data. The detection engine is based on Artificial Neural Networks (ANNs). The ANNs are tuned in three aspects by Genetic Algorithms (GAs), namely in the determination of the optimum set of input factors to the ANN, the determination of the optimum topology of the ANN, and the determination of the optimum weights connecting the ANN neurons. The purpose of this research is to determine whether any specific choice of application of one or more GAs to certain aspects of the ANN dominates other choices. Care was taken to deal with the different cost structure of false positive and false negative results. The detection engine prototype was trained on subsets of a labeled data set from a major financial institution that covers all transactions that were made in a period of 13 months on cards issued by that institution. The results of our investigations are encouraging in that GAs applied to ANNs for credit card fraud detection can improve detection engine performance as expressed through minimization of the objective function value when applied to weight optimization and input feature selection, while there are only mixed results for topology optimization.

TABLE OF CONTENTS

Title Page
Copyright
Authorization Page
Signature Page
Acknowledgements
Table of Contents
List of Figures
List of Tables
Abstract

CHAPTER 1 INTRODUCTION

CHAPTER 2 FRAUD AND CREDIT CARD FRAUD
  2.1 An Introduction to Fraud
  2.2 Credit Card Fraud
    2.2.1 Credit Card Transactions
    2.2.2 Credit Card Fraud Types
    2.2.3 Size of the Problem
    2.2.4 Other Fraud Types Specific to the Financial Industry
  2.3 Fraud Detection Methodologies
    2.3.1 Supervised versus Unsupervised Approaches
    2.3.2 Surveys of Fraud Detection Techniques in the Literature
    2.3.3 Basic Techniques
    2.3.4 Statistical Techniques
    2.3.5 Expert Systems and Fuzzy Logic
    2.3.6 Neural Networks
    2.3.7 Parallels to Intrusion Detection Methodologies
    2.3.8 Other Approaches
  2.4 Credit Card Fraud Detection
    2.4.1 Overview
    2.4.2 Challenges
    2.4.3 Approaches with Artificial Neural Networks
    2.4.4 Other Approaches

CHAPTER 3 DETECTION TECHNOLOGY TO BE DEPLOYED
  3.1 Overview
  3.2 Artificial Neural Networks
    3.2.1 Overview
    3.2.2 Supervised Learning of ANNs
    3.2.3 Criticism of ANNs
  3.3 Genetic Algorithms
    3.3.1 Overview
    3.3.2 Genomes and Genome Operations
    3.3.3 Types of Genetic Algorithms and Operators
    3.3.4 Applications of Genetic Algorithms
    3.3.5 Genetic Algorithms in Fraud Detection
  3.4 Genetic Algorithms Modifying Neural Networks
    3.4.1 Background
    3.4.2 Connection Weights
    3.4.3 Architecture
    3.4.4 Input Attribute Selection
    3.4.5 Learning Rules and Other Areas
    3.4.6 Applications in Business and Finance

CHAPTER 4 MODEL DESIGN AND IMPLEMENTATION
  4.1 Motivation
  4.2 Research Question and Hypotheses
  4.3 The Data Set
    4.3.1 Attributes
    4.3.2 Derived Attributes
    4.3.3 Data Sets Created for Trial Runs
  4.4 The Detection Model - Conceptual Design
    4.4.1 Genome Encoding
    4.4.2 Objective Function
    4.4.3 GA Types and Other Parameters
  4.5 Implementation
    4.5.1 Overview
    4.5.2 Implemented Code – Some Key Points

CHAPTER 5 OPTIMIZING NEURON WEIGHTS
  5.1 Problem Description
  5.2 Test Cases
  5.3 Determining the Initial ANN Parameters
  5.4 Determining the Initial GA Parameters
  5.5 Results
  5.6 Sensitivity Analysis
  5.7 Cross Validation
  5.8 Issues and Additional Work Required

CHAPTER 6 OPTIMIZING INPUT ATTRIBUTES
  6.1 Problem Description
  6.2 Test Cases
  6.3 Determining the Initial GA Parameters
  6.4 Results
  6.5 Sensitivity Analysis
  6.6 Issues and Additional Work Required

CHAPTER 7 OPTIMIZING NETWORK TOPOLOGY
  7.1 Problem Description
  7.2 Test Cases
  7.3 Determining the Initial GA Parameters
  7.4 Results
  7.5 Sensitivity Analysis
  7.6 Penalizing Complexity
  7.7 Issues and Additional Work Required

CHAPTER 8 DISCUSSION, OUTLOOK AND CONCLUSIONS
  8.1 Discussion
  8.2 Outlook – Tasks for Continuation of Research
    8.2.1 Source Data
    8.2.2 ANN Design
    8.2.3 GA Design
    8.2.4 Model Implementation
    8.2.5 Other Aspects for Future Research
  8.3 Conclusion

BIBLIOGRAPHY AND REFERENCES
APPENDIX 1 SCREEN SHOTS
APPENDIX 2 PROTOTYPE SOURCECODE

LIST OF FIGURES

Figure 1: Fraud Triangle
Figure 2: A typical credit card transaction
Figure 3: An example ANN
Figure 4: Typical GA
Figure 5: Model diagram
Figure 6: Execution flow in prototype
Figure 7: Prototype implementation
Figure 8: TC0.1 - Determining initial number of network nodes
Figure 9: TC0.2 - Determining initial number of network nodes
Figure 10: TC0.6 - Determining initial number of network nodes
Figure 11: Determining the appropriate learning rate
Figure 12: TC0.6 - Determining training epochs
Figure 13: Varying momentum on all 6 test cases
Figure 14: TC0.6 - Setting weight GA population size and number of generations using roulette wheel selection
Figure 15: TC0.6 - Setting GA population size and number of generations, using rank order selection
Figure 16: TC0.1 - Finding the appropriate number of weight GA generations
Figure 17: TC0.1 - Varying crossover probability and mutation probability
Figure 18: TC0.1 - Varying crossover probability between 0 and 1
Figure 19: TC0.1 - Varying number of generations with different population sizes
Figure 20: TC0.1 - Varying number of generations for different population sizes
Figure 21: TC0.1-TC0.5 - Varying population sizes
Figure 22: TC0.6 – Varying population sizes
Figure 23: TC0.1-TC0.6 - Comparing selection methods
Figure 24: Prototype Main Menu
Figure 25: Data Selection Module
Figure 26: ANN Configuration Screen
Figure 27: GA Configuration Screen

LIST OF TABLES

Table 1: Attributes in transaction file (Dataset A)
Table 2: Derived attributes
Table 3: Data sets used for training and testing
Table 4: Manually chosen input attributes for the ANN
Table 5: Test cases to determine initial back propagation and GA parameters
Table 6: Test cases to compare weight optimization scenarios
Table 7: Finalized ANN parameters for subsequent work
Table 8: Initial weight GA parameters
Table 9: Finalized parameters for weight optimization GA
Table 10: Weight optimization – Scenario comparison
Table 11: BP False positive and negative results and runtimes
Table 12: Weight GAs - False positive and negative results and runtimes
Table 13: Sensitivity Analysis of Error Function
Table 14: Cross validation results
Table 15: Test cases to determine initial GA parameters
Table 16: Test cases to compare input attribute optimization scenarios
Table 17: Initial input attribute selection GA parameters
Table 18: First trial run results for input attribute selection GA
Table 19: Input attribute selection GA - Modifying population size
Table 20: Input attribute GA - Modifying number of generations
Table 21: Input attribute GA: Modifying crossover probability
Table 22: Input attribute selection GA: Modifying mutation probability
Table 23: Input attribute selection GA - Finalized parameters
Table 24: Input attribute selection – Scenario comparison with modified objective function
Table 25: Input GA - results with bespoke optimum input attribute sets
Table 26: Input GA - Suggested genomes by test case
Table 27: Re-running TC1.1-TC1.9 with optimum input set for TC1.7
Table 28: Test cases to determine initial topology GA parameters
Table 29: Test cases to compare topology optimization scenarios
Table 30: Initial topology GA parameters
Table 31: First trial run results for topology GA
Table 32: Topology GA - Modifying population size
Table 33: Topology GA - Modifying number of generations
Table 34: Topology GA: Modifying crossover probability
Table 35: Topology GA: Modifying mutation probability
Table 36: Topology GA - Finalized parameters
Table 37: Determining topology with GA - Comparison of results
Table 38: Topology Optimization - False positive and negative results
Table 39: Topology GA - Suggested topologies by test case
Table 40: Topology GA - Sensitivity Analysis
Table 41: Determining topology with GA – Penalizing Complexity - Comparison of results
Table 42: Topology Optimization - Penalizing Complexity – False positive and false negative results
Table 43: Topology GA – Penalizing Complexity - Suggested topologies by test case


Credit Card Fraud Detection using Artificial Neural Networks Tuned by Genetic Algorithms

by Carsten Paasch

Information and Systems Management, Business School The Hong Kong University of Science and Technology


CHAPTER 1 INTRODUCTION

Credit card fraud is a major problem for financial institutions globally. The annual cost due to it is in the billions of dollars [1]. While this should in theory be incentive enough to give rise to a great deal of research activity, little can be found in the academic literature that deals with this specific subject. At the same time, combinations of Artificial Neural Networks (ANNs) and Genetic Algorithms (GAs) have been applied very successfully to a variety of other domains [2]. The suspected reason for this scarcity of research in the field of credit card fraud detection is that suitable data sets are very difficult to obtain: card issuing companies consider this type of data very sensitive and do not part with it easily. We¹ therefore reckoned that this is a fruitful area for research, as we are in the fortunate situation of having a data set of 50 million labeled credit card transactions that the card issuing institution authorized us to use. We thus decided to research the ability of GA and ANN combinations to detect credit card fraud.

¹ The term “we” here and in subsequent places in this document refers primarily to the author, and secondarily to Professor Christopher Westland, with whom the author had exchanges throughout the duration of this research project.

An ANN is a mathematical or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. In more practical terms, neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.

A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics. They are a particular class of evolutionary algorithms (also known as evolutionary computation) that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination), and they model some of the features that can be found in natural genetics, such as selection along the lines of survival of the fittest.

The combination of these two techniques inspired by nature has the potential to create powerful methods for learning and classification in various domains. One of them is credit card fraud detection.

This document is structured as follows. Chapter 2 provides a comprehensive overview of the subject of fraud, and specifically credit card fraud, and of the detection techniques that have been devised to deal with it. Other types of fraud are also covered in chapter 2 in some detail because the methodologies we apply to credit card fraud detection are likely to be applicable to general fraud detection in future research efforts, so it makes sense to keep this possibility in mind early on. Chapter 3 introduces the technologies we intend to deploy, ANNs and GAs, and shows areas where combinations of the two have been applied successfully in the past. Chapter 4 is where our original work commences; it introduces the motivation for this work, the research question and hypotheses, the data set we are using, the detection model we have devised, and a description of its implementation. Chapters 5, 6 and 7 show the results we have obtained through the use of our model. Chapter 8 discusses our findings, points out the limitations we have encountered during this research work and the next steps that should be taken in this research stream, and concludes.
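
To make the combination of ANNs and GAs described in this chapter more concrete, the following is a minimal Python sketch of a GA that searches for the connection weights of a very small neural network. It is an illustration only and not the prototype described in this work: the network size, the toy data, the cost weights `c_fp` and `c_fn`, and the choice of tournament selection with Gaussian mutation are all assumptions made for the example. The objective function charges more for a missed fraud (false negative) than for a false alarm (false positive), in the spirit of the different cost structure mentioned in the abstract.

```python
import math
import random


def sigmoid(s):
    # Clamp the input so math.exp cannot overflow for extreme weights.
    s = max(-60.0, min(60.0, s))
    return 1.0 / (1.0 + math.exp(-s))


def forward(genome, x, n_hidden):
    """Fraud score in (0, 1) from a one-hidden-layer network stored as a flat genome."""
    n_in = len(x)
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        s = genome[idx + n_in]  # bias of this hidden neuron
        for j in range(n_in):
            s += genome[idx + j] * x[j]
        hidden.append(sigmoid(s))
        idx += n_in + 1
    s = genome[idx + n_hidden]  # bias of the output neuron
    for j in range(n_hidden):
        s += genome[idx + j] * hidden[j]
    return sigmoid(s)


def cost(genome, data, n_hidden, c_fp=1.0, c_fn=5.0):
    """Objective to minimize; the cost weights are illustrative assumptions."""
    total = 0.0
    for x, label in data:
        pred = 1 if forward(genome, x, n_hidden) >= 0.5 else 0
        if pred == 1 and label == 0:
            total += c_fp  # legitimate transaction flagged as fraud
        elif pred == 0 and label == 1:
            total += c_fn  # fraudulent transaction missed
    return total


def make_toy_data(n=80, seed=1):
    """Synthetic stand-in data: (normalized amount, foreign flag) -> fraud label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        amount = rng.random()
        foreign = 1.0 if rng.random() < 0.3 else 0.0
        label = 1 if (amount > 0.7 and foreign == 1.0) else 0
        data.append(((amount, foreign), label))
    return data


def evolve(data, n_in, n_hidden=3, pop_size=20, generations=25,
           p_cross=0.8, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n_genes = n_hidden * (n_in + 1) + n_hidden + 1
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    history = []
    for gen in range(generations + 1):
        scored = sorted(((cost(g, data, n_hidden), g) for g in pop),
                        key=lambda t: t[0])
        best_cost, best = scored[0]
        history.append(best_cost)
        if gen == generations:
            break
        nxt = [best[:]]  # elitism: the best genome always survives
        while len(nxt) < pop_size:
            # Tournament selection of two parents (three candidates each).
            a = min(rng.sample(scored, 3), key=lambda t: t[0])[1]
            b = min(rng.sample(scored, 3), key=lambda t: t[0])[1]
            child = a[:]
            if rng.random() < p_cross:  # one-point crossover
                cut = rng.randrange(1, n_genes)
                child = a[:cut] + b[cut:]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0.0, 0.5) if rng.random() < p_mut else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    return best, best_cost, history


if __name__ == "__main__":
    toy = make_toy_data()
    _, final_cost, trace = evolve(toy, n_in=2)
    print("best cost per generation:", trace)
    print("final cost:", final_cost)
```

Because the best genome is copied unchanged into every new generation, the best cost recorded per generation can never increase. Whether the search reaches a good classifier on real transaction data depends on the population size, the number of generations, and the crossover and mutation probabilities.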


CHAPTER 2 FRAUD AND CREDIT CARD FRAUD

2.1 An Introduction to Fraud

Fraud is a major problem for a multitude of industries, including, but not limited to, the insurance industry [3-6], the telecommunications industry [7-10], the finance industry [11-14], and even the public sector in the form of benefits fraud [15], tax fraud [16, 17], or procurement fraud [18], to name but a few. According to Black’s Law Dictionary [19], fraud is defined as follows.

Definition 1:

“[Fraud is] an intentional perversion of the truth for the purpose of inducing another in reliance upon it to part with some valuable thing or to surrender a legal right; a false representation of a matter of fact, whether by words or conduct, by false or misleading allegations, or by concealment of that which should have been disclosed, which deceives and is intended to deceive another so that he shall act upon it to his legal injury; anything calculated to deceive, whether by a single act or combination, or by suppression of truth, or suggestion of what is false, whether it be by direct falsehood or innuendo, by speech or silence, word of mouth, or look or gesture; fraud comprises all acts, omissions, and concealments involving a breach of legal or equitable duty and resulting in damage to another.”

A somewhat simpler definition given in the same dictionary is the following.

Definition 2:

“[Fraud is] a generic term, embracing all multifarious means which human ingenuity can devise, and which are resorted to by one individual to get advantage over another by false suggestions or by suppression of truth, and includes all surprise, trick, cunning, dissembling, and any unfair way by which another is cheated.”

Black’s Law Dictionary relates mainly to the US legal system. The specific legal definition of fraud varies slightly by legal jurisdiction; a very good overview of this aspect can be found in [20]. A slightly less legalistic definition of fraud, more useful for our understanding of the subject matter within the scope of our work, is given in a very recent report on the status of fraudulent crime in the United Kingdom issued by a number of researchers in the UK to the Association of Chief Police Officers’ Economic Crime Portfolio. It is the following [21].

Definition 3:

“Fraud is the obtaining of financial advantage or causing of loss by implicit or explicit deception; it is the mechanism through which the fraudster gains an unlawful advantage or causes unlawful loss.”

The key elements here are deception (pretending something which is not true) and the aspect of gaining an unlawful advantage. These aspects also apply to the subject of credit card fraud which we will investigate in this work, so we shall use this definition as our working definition.

Fraud is of course generally considered a crime in most jurisdictions (as reviewed thoroughly in [20]). For such a specific type of crime, it is useful to use a model in order to be better able to explain the specific attributes of the crime type in their context. Just as Westland has proposed in [22] a rational choice model of computer and network crime to enable better analysis of how this type of crime comes about, fraud as well has its specialized model to explain its unique features. In the earliest approaches, the causes of fraud are summarized in a model known as the Fraud Triangle, developed and described by Donald Cressey in [23] and shown in Figure 1 below.

[Figure: a triangle labeled FRAUD with the corners Motive, Opportunity, and Rationalization]

Figure 1: Fraud Triangle

The three elements of the Fraud Triangle are: Motive (or Pressure), Opportunity, and Rationalization. Generally, fraud tends to occur when someone with a financial need or pressure (motive) gains improper access to funds (opportunity) and is able to justify the act to himself/herself and possibly others (rationalization). The Fraud Triangle suggests there are at least three general ways of preventing fraud: by altering the motives of individuals; by limiting the opportunities for secretly gaining access to funds; and by undermining common rationalizations, e.g. through general education or the interrogation and public punishment of individuals.

In early definitions, fraud was typically classified as a “white-collar crime”. The term “white-collar crime” was first coined in 1939 during a speech given by Edwin Sutherland to the American Sociological Society to distinguish it from the more traditional forms of crime such as physical robbery and theft. In his subsequent writings [24], Sutherland defined the term as “crime committed by a person of respectability and high social status in the course of his occupation.” Sutherland clearly had the type of management fraud in mind where senior company members defraud the company of its assets. This type of fraud of course still exists today. However, it would be wrong to see fraud purely as a white-collar crime, as this notion does not necessarily apply to all the various types of fraud that exist today.

One specific aspect of all types of fraud is that it is generally a non-violent crime. This aspect has a huge impact on the rationalization component of the Fraud Triangle. Since there is no obvious victim in fraud as compared to violent crimes such as robbery, rape or murder, fraud is also often described as a “victimless crime”. The victim of fraud is often a large corporation which in the eyes of the fraudster “can cope easily” with the damage resulting from the fraudulent activity. Hence there is the aspect of rationalization through selective perception of benign consequences, and the absence of any real victim dominates the fraudster’s thinking. Of course this is fallacious thinking. One only needs to look at Enron and Arthur Andersen to see that in reality even fraud directed at a company and not an individual (as is the case typically in credit card fraud) can have its victims, in this specific case the thousands of employees and investors of Enron [25] and Arthur Andersen [26] that ended up bankrupt and/or unemployed, with their retirement savings in shambles.

IT facilitates committing fraud, as IT systems allow the fraudster to remain relatively anonymous, while at the same time the fraudster is able to access a wide variety of services electronically, e.g. over the internet, without a need for physical presence. Plvsic et al discuss some of the fraud possibilities facilitated by IT in [27], which also include specific types of credit card fraud as we shall discuss in the next subsection.

To summarize, while there are many slightly different technical definitions, fraud in general is a type of crime that implies deception of the victim, mainly for the financial advantage of the perpetrator. Fraud is a non-violent crime and, for a number of reasons (e.g. the victim is a large corporation, or perpetrator and victim are physically distant from each other and/or not acquainted), is perceived by the perpetrator as if it were a victimless crime. This fact is relevant to the rationalization aspect of the Fraud Triangle model. While fraud was initially described as a white-collar crime due to the work of Sutherland, with the implied image of white-collar employees of a certain elevated social status defrauding their own company, that definition is too narrow, and the working definition of fraud today includes essentially any crime where the perpetrator tricks the victim for the purpose of financial gain.

2.2 Credit Card Fraud

2.2.1 Credit Card Transactions

Credit card fraud falls neatly under the general definition of fraud as given in our working definition (Definition 3 in the previous subsection). However, as already discussed, it does not necessarily share all the aspects of the traditional notion of fraud as a “white-collar” crime as Sutherland has described it in [24]. Credit card fraud is not even necessarily always a victimless crime, even though in some cases it could be perceived as such by the fraudster. Some variations of credit card fraud such as “card not present” fraud (as will be described further below) do provide a sense of anonymity and detachment of the perpetrator from the victim, and hence could be perceived by the criminal to be more on the “clean” side of the crime spectrum. Other variations of credit card fraud that involve the physical theft of the card from its victim have elements of traditional crime in them, at least as a precursor (i.e. the theft or robbery of the card). Hence credit card fraud could be considered a white-collar crime only in some scenarios, while in others it more closely resembles the traditional types of crime [28].

Credit card fraud is the most widely recognized form of retail financial fraud and has been around for the past 40 years. It had its first peak in the 1970s due to over-zealous marketing campaigns by financial institutions, which sent pre-approved credit cards to consumers who had not actually requested them [29], with these cards falling into the hands of people they were not intended for.

In the context of this work we shall focus on such credit card purchase transactions in which the customer deceives the merchant by pretending to be the legal owner of the credit card (either physically, or by means of having possession of the relevant credit card credentials) and hence gains an unlawful advantage over the credit card owner, the merchant and/or the card issuing bank. In order to provide some background for the subsequent discussion, Figure 2 below gives a schematic overview of a typical conventional credit card transaction in a retail store.

Figure 2: A typical credit card transaction

In step 1, the customer hands over his credit card to the merchant. The merchant swipes the card through the reader of the Point-of-Sale (POS) terminal, which has been provided to the merchant by the acquiring bank, and enters the transaction amount into the POS terminal as well. In step 2, the transaction details, merchant details and credit card details (as stored on the card’s magnetic stripe) are sent to the acquiring bank. In step 3, the acquiring bank sends this information on behalf of the merchant to the issuing bank, which is the bank that originally issued the card to the customer, in order to obtain authorization from the issuing bank to proceed with the transaction. Depending on the type of credit card, this step occurs over the VISA network if this is a VISA branded card, the MasterCard network if this is a MasterCard branded card, or other networks as appropriate. The authorization process on the side of the issuing bank includes checks on whether the card has possibly been overdrawn already, whether the transaction may be considered fraudulent based on the issuing bank’s fraud detection mechanisms, or whether there are any other reasons not to authorize the transaction. The transaction authorization (or rejection) is sent to the acquiring bank in step 4 and passed back to the merchant in step 5. If authorization has been obtained, the customer confirms the transaction with a signature in step 6, which concludes the transaction at the point of sale. Eventually, the acquiring and issuing banks exchange settlement and payment information in step 7. Funds are transferred from the issuing bank to the acquiring bank, which in turn will eventually credit the account of the merchant to complete the transaction.
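
The seven steps above can be sketched in a few lines of Python. This is a simplified model for illustration only: the class and function names (`Card`, `IssuingBank`, `AcquiringBank`, `purchase`) are hypothetical, the card network between the two banks is omitted, and the fixed `fraud_threshold` is merely a placeholder for whatever fraud detection mechanism an issuing bank actually runs during authorization.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    number: str
    credit_limit: float
    balance: float = 0.0
    blocked: bool = False


@dataclass
class IssuingBank:
    cards: dict
    # Hypothetical placeholder for the issuer's fraud detection engine.
    fraud_threshold: float = 5000.0

    def authorize(self, card_number, amount):
        """Authorization checks performed by the issuer (steps 3 and 4)."""
        card = self.cards.get(card_number)
        if card is None or card.blocked:
            return False, "card unknown or blocked"
        if card.balance + amount > card.credit_limit:
            return False, "credit limit exceeded"
        if amount > self.fraud_threshold:
            return False, "flagged as possibly fraudulent"
        return True, "approved"

    def settle(self, card_number, amount):
        """Step 7, issuer side: charge the card and release the funds."""
        self.cards[card_number].balance += amount
        return amount


@dataclass
class AcquiringBank:
    issuer: IssuingBank
    merchant_accounts: dict = field(default_factory=dict)

    def request_authorization(self, card_number, amount):
        # Steps 2 to 5: relay the merchant's request and the issuer's answer.
        return self.issuer.authorize(card_number, amount)

    def settle(self, merchant_id, card_number, amount):
        # Step 7, acquirer side: credit the merchant's account.
        funds = self.issuer.settle(card_number, amount)
        self.merchant_accounts[merchant_id] = (
            self.merchant_accounts.get(merchant_id, 0.0) + funds
        )


def purchase(acquirer, merchant_id, card_number, amount, signature_ok=True):
    """Steps 1 to 7 of a card-present purchase, seen from the merchant."""
    approved, reason = acquirer.request_authorization(card_number, amount)
    if not approved:
        return "rejected: " + reason
    if not signature_ok:  # step 6: the customer's signature is checked
        return "rejected: signature mismatch"
    acquirer.settle(merchant_id, card_number, amount)
    return "completed"


if __name__ == "__main__":
    issuer = IssuingBank(cards={"4000": Card("4000", credit_limit=1000.0)})
    acquirer = AcquiringBank(issuer=issuer)
    print(purchase(acquirer, "shop-1", "4000", 120.0))
    print(purchase(acquirer, "shop-1", "4000", 950.0))
```

In this sketch settlement happens immediately after the signature check, whereas in the flow described above the banks exchange settlement information only eventually. The fraud types discussed in the following subsection can be read against this model: each of them defeats either the signature check or the issuer's ability to tell the legitimate cardholder from an imposter.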

2.2.2 Credit Card Fraud Types

A credit card fraud begins with either the theft of the physical card or the compromise of the card information and/or cardholder information. The compromise can occur through many common routes, including something as simple as a merchant store clerk copying sales receipts. The rapid growth of credit card use internationally, and particularly over the Internet, means that security lapses on databases containing credit card information can be particularly large and costly because of the geographic reach that a lapse may entail. In one case in 2005, 40 million credit card customer accounts were stolen through a single compromise of a large database containing credit card credentials [30]. This had global repercussions: new cards had to be issued to millions of cardholders worldwide.

Stolen cards can be reported quickly by cardholders, but a compromised account could be hoarded by a thief for weeks or months before any fraudulent use occurs, making it difficult to identify the source of the compromise. The cardholder may not discover fraudulent use until receiving a billing statement, which is typically delivered only once per month.

There are a number of different types of credit card fraud. They are described in the following subsections. For further details please refer to [31].

2.2.2.1 Counterfeiting

The credentials of the credit card, most prominently the card number, expiry date, cardholder name and card verification value (CVV), have fallen into the hands of a fraudster and are used to create a duplicate of the original card. This could occur, for example, if a merchant with criminal intent has siphoned off the information from the card’s magnetic stripe as it was provided to him by the customer, as shown in step 1 in Figure 2 above.

2.2.2.2 Card not Present

The normal signature verification step that occurs during a credit card transaction, as shown in step 6 above, does not take place in a so-called card-not-present transaction, which could occur, e.g., over the Internet or over the telephone. The customer and merchant are in different physical locations, and hence the merchant has no ability to verify whether the customer providing the credit card credentials is indeed the authorized cardholder or an imposter.

2.2.2.3 Card Theft

The most straightforward credit card fraud occurs when the card is stolen. The fraudster typically attempts to clock up as many transactions as possible in the shortest possible amount of time, potentially forging the signature of the legal cardholder, before the theft has been detected and the card is black-listed and blocked.

2.2.2.4 Delivery Intercept

In this type of fraud, a newly issued card, typically sent by regular mail, is intercepted and used before it reaches its real owner. A newly issued card typically requires an activation step for which the user (or fraudster) is required to provide certain authentication credentials, such as the cardholder’s date of birth, to the card issuing institution. Hence, for delivery intercept to work, the fraudster typically also needs to perform at least a limited form of identity theft to be able to activate the card and use it.
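The burst of transactions described under Card Theft (Section 2.2.2.3) can be captured by a simple transaction-velocity count. The following Python sketch is illustrative only and is not taken from this work: the function name `velocity_counts`, the one-hour window, and the example timestamps are all assumptions made for the illustration.

```python
from collections import deque
from datetime import datetime, timedelta


def velocity_counts(timestamps, window=timedelta(hours=1)):
    """Count, for each transaction on one card, how many transactions
    (including itself) fall within the preceding `window`.

    Hypothetical helper for illustration; the window length is an assumption.
    """
    recent = deque()   # timestamps still inside the sliding window
    counts = []
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop transactions strictly older than the window; a transaction
        # exactly `window` ago is still counted.
        while ts - recent[0] > window:
            recent.popleft()
        counts.append(len(recent))
    return counts


# Hypothetical card history: three purchases within ten minutes,
# then one purchase three hours after the first.
base = datetime(2020, 1, 1, 12, 0)
example = [
    base,
    base + timedelta(minutes=5),
    base + timedelta(minutes=10),
    base + timedelta(hours=3),
]
example_counts = velocity_counts(example)
```

A count that rises sharply within a short window is consistent with the behaviour described for stolen cards, but a high count alone does not establish fraud. A value of this kind would be one candidate input among others to a detection model.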

2.2.2.5 Card ID Theft

Identity theft occurs when a fraudster uses a consumer’s name, personal identification number (tax ID/social security number), phone PIN or other personal information to apply for a credit card, to make unauthorized purchases, or to gain access to bank accounts or obtain loans and even mortgages [32], thereby taking on the identity of the victim. In the context of credit card transactions, there are two ways for the fraudster to take on the identity of the cardholder. The first is to submit a fraudulent application under a stolen identity and then use the card under this identity. The second is to take over the identity of an existing cardholder, e.g. by forging a request for a change of mailing address to divert the credit card statements, and then to execute card transactions under the victim’s identity.

Identity theft poses a great risk to banks and their customers. The liabilities can quickly add up and reach into the hundreds of thousands of US dollars for each victim. Indeed, identity theft victims may spend years, and large sums of money, restoring their credit histories and their good names. Some consumers have been denied jobs or insurance, or been arrested for crimes they did not commit [33]. Potentially even more costly to financial institutions are identity theft schemes where fraudsters use a stolen employee identification number and

Full document contains 574 pages