
Signature verification based on a feature extraction technique

Master's Thesis, 2012, 59 pages

Technology

Excerpt

Table Of Contents.

List Of Figures

List Of Tables.

List Of Acronyms

Chapter 1 Introduction
1. Introduction
1.2 Problem Motivation
1.3 Biometrics Introduction
1.3.1 Past
1.3.2 Present
1.3.3 Future
1.4 Personal Biometric Criteria
1.5 Biometric System-Level Criteria
1.6 Performance parameters
1.7 Thesis outline

CHAPTER 2 Introduction to Signature Verification
2. Introduction
2.1 Pattern recognition
2.2 Feature extraction
2.3 Handwritten signatures
2.3.1 On-line and off-line signatures
2.4 Forgery types
2.4.1 Random forgeries
2.4.2 Simple forgeries
2.4.3 Skilled forgeries
2.5 Writer-dependent and writer-independent verification
2.6 Objectives

Chapter 3 Literature survey
3.1 Texture Analysis
3.1.1 Inspection
3.1.2 Medical Image Analysis
3.2 Signature verification
3.3 Gray level run length encoding

Chapter 4 Problem Definition and Methodology
4. Introduction
4.1 Problem Definition
4.2 Steps Involved
4.2.1 Signature enrolment
4.2.2 Obtaining region of interest
4.2.3 Feature extraction
4.2.4 Definition of the Run-Length Matrices
4.2.5 Calculation of Euclidean Distances
4.2.6 Creation of the known signature template.
4.2.7 Signature Verification
4.3 Measurement of the Signature Verifier Accuracy
4.4 Proposed Algorithm using Euclidean distance

Chapter 5 Experiment and Results
5. Introduction: The simple statistical approach
5.1.1 Examples of verified signatures:
5.2 Euclidean distance model
5.2.1 Threshold

Chapter 6 Conclusion
6.1 Conclusion

Chapter 7 Future work
7.1 Future work

References

Figure References

List Of Figures

1.1 Performance parameters comparison.

2.1 Pattern recognition process.

2.2 Categorization of popular features associated with off-line signatures.

2.3 Types of forgeries.

4.1 Genuine Signature of enroller

4.2 Cropped signature

4.3 Matrix of image

4.4 GLRLM of Image

4.5 Run directions

4.6 Training signature of a signer.

4.7 Test for accuracy

5.1 Training signatures

5.2 Test signatures

5.3 Forged signatures

5.4 FAR-FRR Graph

5.5 ROC graph

List Of Tables.

5.1 Performance statistics for simple classifier.

List Of Acronyms

illustration not visible in this excerpt

Acknowledgement

Above all, I would like to express my sincere gratitude to Almighty Allah, who is full of mercy and compassion, for giving me strength and good health during the whole period of my study.

I wish to extend sincere thanks to my supervisor Dr. Ajaz Hussain Mir (H.O.D., Electronics and Communication) for his time, support, suggestions, criticism and ideas that shaped this work.

I acknowledge the entire faculty of the Department of Electronics & Communication Engineering, NIT Srinagar, for their guidance and knowledge.

Special thanks go to Mrs. Farida Khurshid for her invaluable advice as a friend and a mentor.

I am highly indebted to my family for their love, blessings, support and encouragement during the days of research.

Lastly, I thank my friends and course mates for their openness and availability to discuss diverse social and academic issues; some of them contributed to this study by providing constructive criticism and sample signatures.

(Saba Mushtaq)

Abstract

In this research we evaluate the use of GLRLM features in off-line handwritten signature verification. For each known writer we take a sample of fifteen genuine signatures and extract their GLRLM descriptors. We also use forged signatures to test the efficiency of our system.

We calculate simple statistical measures as well as intra-class Euclidean distances (a measure of variability within the same author) among the GLRLM descriptors of the known signatures. The Euclidean distances, the image distances and the intra-class thresholds are stored as templates. We evaluate the use of various intra-class distance thresholds such as the mean, standard deviation and range. For each signature claimed to belong to a known writer, we extract its GLRLM descriptors and calculate the inter-class distances, that is, the Euclidean distances between each of its GLRLM descriptors and those of the known template, as well as the image distances between the test signature and members of the genuine sample. The inter-class distances are compared to the stored intra-class threshold to decide whether the claimed signature is a forgery. A database of 525 genuine signatures and 30 forged signatures, split into a training set and a test set, is used.

Chapter 1 Introduction

1. Introduction

Information security is concerned with assuring the confidentiality, integrity and availability of information in all its forms. Many tools and techniques can support the management of information security, but systems based on biometrics have evolved to support several of its aspects (Bhattacharyya et al., 2009). Biometric authentication supports identification, authentication and non-repudiation in information security, and has grown in popularity as a way to provide personal identification. Personal identification is crucially significant in many applications, and the rise in credit card fraud and identity theft in recent years indicates that this is an issue of major concern in wider society (Fleming, 2005). Individual passwords, PIN identification and even token-based arrangements all have deficiencies that restrict their applicability in a widely networked society. A biometric is used to verify the identity of an input sample when compared to a template, and in some cases to identify specific people by certain characteristics. The traditional alternatives are possession-based authentication, using a specific "token" such as a security tag or a card, and knowledge-based authentication, using a code or password (Cattin and Claude, 2002). Standard validation systems often use multiple inputs of samples for sufficient validation, such as particular characteristics of the sample. This is intended to enhance security, as multiple different samples are required, such as security tags, codes and sample dimensions. The advantage claimed for biometric authentication is therefore that it can establish an unbreakable one-to-one correspondence between an individual and a piece of data.

1.2 Problem Motivation

Amongst the different biometric authentication schemes used for security verification, including voice detection, retina scanning and fingerprint verification, handwritten signature verification, both on-line and off-line, is becoming increasingly popular for monetary transactions and other security policies. Our motivation behind this project is to implement a simple texture analysis approach to handwritten signature verification, avoiding the complexity of handling a huge database of monochrome pictures corresponding to the signatures of each individual. To avoid complex image processing methods such as thinning, scaling and other morphological schemes, the signatures, taken in the form of monochrome TIFF images, are first converted into 2D data arrays. Texture features are then calculated, and statistical formulas for the mean and standard deviation are applied. The recognition scheme is based on texture analysis of signature images.

The purpose of such schemes is to ensure that the rendered services are accessed only by a legitimate user, and not anyone else. Signatures are composed of special characters and flourishes and are therefore often unreadable. Intrapersonal variations also make it necessary to analyze them as complete images rather than as letters and words put together. As signatures are the primary mechanism both for authentication and authorization in legal transactions, the need for research into efficient automated solutions for signature recognition and verification has increased in recent years. Various methods have already been introduced in this field, but to the best of our knowledge the texture-based feature extraction method we use has not previously been applied to signatures.

1.3 Biometrics Introduction

Biometrics (from the ancient Greek bios = "life" and metron = "measure") refers to two very different fields of study and application (Abi-Char et al., 2011). The first, which is the older and is used in biological studies, including forestry, is the collection, synthesis, analysis and management of quantitative data on biological communities such as forests. Biometrics in this sense has been studied and applied for several generations and is somewhat simply viewed as "biological statistics". Authentication is the act of establishing or confirming something (or someone) as authentic, that is, that claims made by or about the thing are true. A short overview of this field can be divided into three parts: past, present and future (Bhattacharyya et al., 2009).

1.3.1 Past

The European explorer Joao de Barros recorded the first known example of fingerprinting, a form of biometrics, in China during the 14th century: Chinese merchants used ink to take children's fingerprints for identification purposes. In 1890, Alphonse Bertillon studied body mechanics and measurements to help identify criminals. The police used his method, the Bertillonage method, until it falsely identified some subjects, after which it was quickly abandoned in favor of fingerprinting, brought back into use by Richard Edward Henry of Scotland Yard. Karl Pearson, an applied mathematician, carried out early biometric research at University College London in the early 20th century (Nigeria, 2012). He made important discoveries in the field of biometrics through studying statistical history and correlation, which he applied to animal evolution. His historical work included the method of moments, the Pearson system of curves, correlation and the chi-squared test. In the 1960s and '70s, signature biometric authentication procedures were developed, but the biometric field remained largely static until the military and security agencies researched and developed biometric technology beyond fingerprinting.

1.3.2 Present

Biometric authentication is a growing and controversial field in which civil liberties groups express concern over privacy and identity issues. Today, biometric laws and regulations are in process and biometric industry standards are being tested. Face recognition biometrics has not reached the prevalence of fingerprinting, but with constant technological pushes and with the threat of terrorism, researchers and biometric developers will continue to advance this security technology for the twenty-first century (Farouk et al., 2012). In the modern approach, biometric characteristics can be divided into two main classes:

a. Physiological characteristics are related to the shape of the body and thus vary from person to person. Fingerprints, face recognition, hand geometry and iris recognition are some examples of this type of biometric.

b. Behavioral characteristics are related to the behavior of a person. Some examples in this case are signature, keystroke dynamics and voice. Voice is sometimes also considered a physiological biometric, as it varies from person to person.

Recently, a new trend has developed that merges human perception with computer databases in a brain-machine interface. This approach has been referred to as cognitive biometrics. Cognitive biometrics is based on specific responses of the brain to stimuli, which can be used to trigger a computer database search.

1.3.3 Future

A biometric system can provide two functions, verification and authentication, so the techniques used for biometric authentication have to be stringent enough to support both functionalities simultaneously. Currently, cognitive biometric systems are being developed that use brain responses to odor stimuli, facial perception and mental performance for searches at ports and high-security areas. Other biometric strategies are being developed, such as those based on gait (way of walking), retina, hand veins, ear canal, facial thermogram, DNA, odor and scent, and palm prints (Bhattacharyya et al., 2009a). In the near future, these biometric techniques may provide solutions to current threats in the world of information security. Recent research suggests that approaches offering simultaneous authentication and verification are most promising for iris, fingerprint and palm vein modalities. Whatever method is chosen, the main constraint will be its performance in real-life situations, so the application of artificial systems can be a solution in such cases (Barral, 2010). We have placed emphasis on signature verification. In our approach, after a signature pattern is detected, the distance between two signature samples can be computed. This metric can be used for verification purposes because this feature remains unique for each individual. Furthermore, an artificial system can be designed to update the stored metric, as the proposed feature may vary for a particular person over time.

1.4 Personal Biometric Criteria

Any human biological or behavioral characteristic can become a biometric identifier, provided the following properties are met:

- Universality: Every person should have the characteristic. There are always exceptions to this rule: mute people, people without fingers, or those with injured eyes. These exceptions must be taken into account through “work-around” such as conventional non-biometric authentication processes. Most biometric devices have a secure override if a physical property is not available, such as a finger, hand, or eye. In these cases, the person is assigned a special access device, such as a password, PIN, or secure token(Yadav et al., 2010). This special access code or token is entered into the biometric device to allow access.

- Distinctiveness: No two people should have identical biometric characteristics. Monozygotic twins, for example, cannot be easily distinguished by face recognition and DNA-analysis systems, although they can be distinguished by fingerprints or iris patterns (Sharma and Singh, 2010).

- Permanence: The characteristics should not vary or change with time. A person’s face changes significantly with aging and a person’s signature and its dynamics may change as well, sometimes requiring periodic re-enrollment (Wilson, 2010).

- Collectability: Obtaining and measuring the biometric feature(s) should be easy, non-intrusive, reliable, and robust, as well as cost effective for the application.

1.5 Biometric System-Level Criteria

The preceding personal biometric criteria may be used for evaluating the general viability of the chosen biometric identifier. Once incorporated into a system design, the following criteria are key to assessing a given biometric system for a specific application (Jain et al., 1997):

- Performance refers to the accuracy, resources, and environmental conditions required to achieve the desired results.

- Circumvention refers to how difficult it is to fool the system by fraudulent means. An automated access control system that can be easily fooled with a fingerprint prosthetic or a photograph of a user’s face does not provide much security—particularly in an unattended environment.

- Acceptability indicates to what extent people are willing to accept the biometric system. Face recognition systems are not personally intrusive, but there are countries where taking photos or images of people is not viable. Systems that are uncomfortable to the user, appear threatening, require contact that raises hygienic issues, or are basically non-intuitive in practical use will probably not find wide acceptance.

1.6 Performance parameters

The following are used as performance metrics for biometric systems (an illustrative computation follows the list):

- False accept rate or false match rate (FAR or FMR) – the probability that the system incorrectly matches the input pattern to a non-matching template in the database. It measures the percent of invalid inputs which are incorrectly accepted.

- False reject rate or false non-match rate (FRR or FNMR) – the probability that the system fails to detect a match between the input pattern and a matching template in the database. It measures the percent of valid inputs which are incorrectly rejected.

- Receiver operating characteristic or relative operating characteristic (ROC) – The ROC plot is a visual characterization of the trade-off between the FAR and the FRR. In general, the matching algorithm performs a decision based on a threshold which determines how close to a template the input needs to be for it to be considered a match. If the threshold is reduced, there will be fewer false non-matches but more false accepts. Correspondingly, a higher threshold will reduce the FAR but increase the FRR. A common variation is the detection error trade-off (DET), which is obtained using normal deviate scales on both axes. This more linear graph illuminates the differences for higher performances (rarer errors).

Illustration not visible in this excerpt

Figure 1.1: Performance parameters comparison

- Equal error rate or crossover error rate (EER or CER) – the rate at which both accept and reject errors are equal. The value of the EER can be easily obtained from the ROC curve. The EER is a quick way to compare the accuracy of devices with different ROC curves. In general, the device with the lowest EER is most accurate.

- Failure to enroll rate (FTE or FER) – the rate at which attempts to create a template from an input are unsuccessful. This is most commonly caused by low-quality inputs.

- Failure to capture rate (FTC) – Within automatic systems, the probability that the system fails to detect a biometric input when presented correctly.

- Template capacity – the maximum number of sets of data which can be stored in the system.
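To make these definitions concrete, the short sketch below (not part of the thesis; the scores and the assumption that a higher similarity score means "more genuine" are purely illustrative) estimates the FAR, FRR and EER from two sets of matcher scores:

import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    # A sample is accepted when its similarity score is >= threshold.
    far = np.mean(impostor_scores >= threshold)  # invalid inputs incorrectly accepted
    frr = np.mean(genuine_scores < threshold)    # valid inputs incorrectly rejected
    return far, frr

def equal_error_rate(genuine_scores, impostor_scores):
    # Sweep all observed scores as candidate thresholds and return the
    # threshold where FAR and FRR are closest, together with the EER.
    thresholds = np.unique(np.concatenate([genuine_scores, impostor_scores]))
    best_t, best_gap, eer = None, np.inf, None
    for t in thresholds:
        far, frr = far_frr(genuine_scores, impostor_scores, t)
        if abs(far - frr) < best_gap:
            best_t, best_gap, eer = t, abs(far - frr), (far + frr) / 2
    return best_t, eer

# Illustrative (invented) similarity scores; higher means "more genuine".
genuine = np.array([0.91, 0.87, 0.95, 0.78, 0.88])
impostor = np.array([0.40, 0.55, 0.61, 0.72, 0.35])
print(far_frr(genuine, impostor, threshold=0.75))  # (FAR, FRR) at this threshold
print(equal_error_rate(genuine, impostor))         # (threshold, EER)

Sweeping the threshold in this way also yields the operating points that make up the ROC curve described above.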

Each biometric technology has its set of strengths and weaknesses, depending upon its application. It is therefore imperative that there is a clear understanding of the final application(s) and their operational requirements before any purchase and implementation decisions are made. Although the use of each biometric is clearly different, some striking similarities can emerge when considering various applications. Most biometric applications can be divided into the following categories:

- Overt or covert systems—Will the user proactively and knowingly be identified by the system or will it be designed to covertly scan the secured area? Either way, a person must have a biometric template on file for him/her to be recognized.
- Voluntary or involuntary systems—Will system users be required to participate in the system to receive access or benefits, or are there opt-out or work-around options?
- Attended or non-attended systems—Will the system be designed for people to use in a remote location, without assistance? Or will users always have technical assistance and/or attendants available? Involuntary and/or covert systems usually require supervision or attendance to monitor system use. Voluntary and/or overt systems may be “unattended.”
- Standard or non-standard operating environments—How much customization will be required for the readers to operate appropriately and for the network to communicate and function properly? Will the system be used outdoors or indoors? Outdoor environments typically fall into "non-standard" operating environments.
- Public or private systems—Is the use of the biometric system for a public program or access to a public facility, or for access to a private company or information? Cooperation with the biometric system can often be directly attributed to whether a system is public or private (i.e., employees).
- Physical security and access control—Are users trying to gain access to a facility or area?
- Cyber and computer/network security—Are users trying to gain access to a computer or protected information on a computer or the Internet?
- Identification—Is the biometric being used for identification purposes for access to benefits, information, border crossing, licensing, etc.?

1.7 Thesis outline

Chapter 2: Introduction to Signature Verification discusses the basic definitions related to signature verification.

Chapter 3: Literature Study discusses selected previous works pertaining to off-line signature verification, thereby providing the reader with a contextual perspective regarding the range of available techniques and corresponding levels of success achieved.

Chapter 4: Problem Definition and Methodology explains how a problem was identified at the end of the literature survey and how the GLRLM converts raw signature images into robust feature vector representations. The GLRLM method itself is also discussed.

Chapter 5: Results discusses the data and experimental protocol considered during training. Results yielded by the GLRLM are discussed. In addition, the contributions made by the novel concepts proposed in this study are verified experimentally.

Chapter 6: Conclusion presents concluding remarks regarding the complexity and effectiveness of the systems developed in this study.

Chapter 7: Future Work discusses selected additional topics deemed to be potentially beneficial to the systems developed during this study as possible future work.

CHAPTER 2 Introduction to Signature Verification

2. Introduction

The field of automatic signature verification has intrigued researchers the world over during recent decades, as it not only serves as an exciting platform for the development of innovative mathematical modeling techniques, but also holds undeniable economic potential. As a result, signature verification systems have experienced quantum leaps regarding both complexity and efficiency at a continuous and relentless pace. As the world population continues to increase, so too does the potential for ill-intentioned individuals to perpetrate identity fraud. Such efforts are further supported by the relatively recent paradigm shifts regarding point-of-sale payment options. The use of cheques and especially credit cards has quickly become the preferred method of payment for most individuals, particularly in the developed world. Even though this monetary evolution holds obvious benefits, as it all but eradicates the need for individuals to carry large amounts of cash on their person, it is entirely based on the notion that these tokens would be of no use whatsoever to anyone other than the owner, as a transaction cannot be completed without a valid signature (Swanepoel, 2009).

This is simply not the case, as both cheque and credit card fraud cost financial institutions an unfathomable amount of money on an annual basis. Reports by the American Bankers Association (2007) suggest that annual attempted cheque fraud in the United States increased from $5.5 billion to $12.2 billion during the period 2003–2007, whilst actual losses increased from $677 million to $969 million during the same period. Also, the Association for Payment Clearing Services (2008) report that during the first semester of 2008, losses due to cheque fraud in the United Kingdom reached £20.4 million, whilst losses due to point-of-sale credit card fraud reached £47.4 million. These levels constitute increases of 35% and 26%, respectively, when compared to the same period in 2007.

All the aforementioned factors suggest that effective automatic handwritten signature verification systems are no longer a technological luxury as in years past, but have in fact become a true necessity in the modern document processing environment.

2.1 Pattern recognition

The process of pattern recognition constitutes the intelligent foundation of any decision-making process. In order to perform anything from menial tasks to complex data analysis, human beings rely greatly on the ability of the brain to perform pattern recognition on a daily basis (Parasher Mayank, et al., 2011). Consider, for example, attempting to drive a vehicle without the ability to recognize and interpret traffic signals. Such real-time pattern recognition processes govern nearly every scenario in modern society. From a mathematical perspective, pattern recognition involves classifying a pattern, represented by an observation sequence X, as belonging to one of N pattern classes {ω1, ω2, . . . , ωN}. An observation sequence is constructed from a set of T d-dimensional feature vectors {x1, x2, . . . , xT}, where each element of xi denotes a measurement of arbitrary origin, referred to as a feature.
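As a toy illustration of this notation (not drawn from the thesis; the values and class prototypes are invented), an observation sequence can be stored as a T × d array and assigned to the nearest pattern class:

import numpy as np

# Observation sequence X: T = 4 feature vectors, each d = 3 dimensional (invented values).
X = np.array([[0.2, 1.1, 0.7],
              [0.3, 1.0, 0.8],
              [0.1, 1.2, 0.6],
              [0.2, 0.9, 0.7]])

# Hypothetical prototypes for two pattern classes omega_1 and omega_2.
prototypes = {"omega_1": np.array([0.2, 1.0, 0.7]),
              "omega_2": np.array([1.5, 0.1, 2.0])}

# A very simple classifier: average the T feature vectors and assign the
# pattern to the class whose prototype is closest in Euclidean distance.
x_mean = X.mean(axis=0)
label = min(prototypes, key=lambda w: np.linalg.norm(x_mean - prototypes[w]))
print(label)  # "omega_1" for these invented values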

The pattern recognition process, as illustrated in Figure 2.1, consists primarily of two phases, namely feature extraction and classification. In some cases, depending on the nature of the data being modelled and the classification technique utilised, certain pre-processing and/or post-processing of the system data may be required (Patel et al.).

Illustration not visible in this excerpt

Figure 2.1: The pattern recognition process.

2.2 Feature extraction

During the feature extraction phase, the system analyses a given pattern and records certain features, in order to yield structured data in the form of an observation sequence. Any measurable quantity may constitute a feature. However, since the ultimate aim is to classify a test pattern based solely on such features, it is advisable to select a feature set such that patterns belonging to different pattern classes are maximally separated in the feature space (Sahoo et al., 2005). A selection of popular feature types, in the context of signature verification, is categorized in Figure 2.2 below.

Illustration not visible in this excerpt

Figure 2.2: Categorization of popular features associated with off-line signatures.

2.3 Handwritten signatures

Handwritten signatures, henceforth referred to only as signatures, have been considered valid proof of identity and consent for centuries. Even in our present day and age, dominated by advanced technological systems and protocols, signatures remain the preferred method for identity verification, as they are both non-intrusive and easily collectable.

According to Schmidt (1994), an individual's signature is usually composed of stroke sequences much unlike those used in ordinary handwriting and, in addition, tends to evolve towards a single, unique design. This is not only a result of repetition, but also of the innate desire of each person to create a unique signature. Signatures are therefore able to reflect a writer's subtle idiosyncrasies to a much greater extent than ordinary handwriting.

2.3.1 On-line and off-line signatures

The field of automatic signature verification may currently be divided into two distinct sub-categories, namely those systems concerned with on-line signature verification and those concerned with off-line signature verification. In the on-line scenario, signature data is captured in real time by means of an electronic pen and digitizing tablet, yielding not only pen stroke coordinates, but also dynamic signature data such as pen pressure, velocity and acceleration. On-line signatures are therefore also commonly referred to as dynamic signatures. In the off-line scenario, ink-signed documents require digitization by means of a scanning device. The obtained signature image therefore only provides the coordinates of pixels representative of pen strokes. For this reason, off-line signatures are also referred to as static signatures.

2.4 Forgery types

In the context of off-line signatures, forgeries may generally be categorized as either random, simple or skilled, in increasing order of quality (Swanepoel, 2009). Furthermore, skilled forgeries may be sub-categorised as either amateur or professional, as illustrated in Figure 2.3.

Illustration not visible in this excerpt

Figure 2.3: Types of forgeries

In this section we discuss the key requirements for forgery categorization. Each discussion also provides a typical example of when such a forgery type may be encountered in practice, within the context of cheque fraud.

2.4.1 Random forgeries

Random forgeries encompass any arbitrary attempt at forging a signature, generally without prior knowledge of the owner’s name. This type of forgery may constitute random pen strokes and is usually easy to detect. For experimental purposes, genuine signatures from writers other than the legitimate owner are commonly used to represent random forgeries. A random forgery is typically expected when a cheque book is registered to a company or institution, rather than a specific individual. The forger therefore has no information regarding the name of an authorized signer.

2.4.2 Simple forgeries

In the case of simple forgeries, the forger's knowledge is restricted to the name of the signature's owner. Due to the arbitrary nature of signature design, simple forgeries may in some cases bear an alarming resemblance to the writer's genuine signature. In such cases, more sophisticated systems, capable of detecting subtle stylistic differences, are required in order to distinguish between genuine signatures and forgeries of this type. Simple forgeries usually result from forging a cheque, lost or stolen, registered to an unknown individual. As the name of the legitimate owner is printed on the cheque itself, an effort can be made to produce a realistic representation of the genuine signature, although no writer-specific stylistic information can be incorporated. This type of forgery is generally associated with a brief period of forged cheques, each with a relatively small value, since a simple forger generally attempts to avoid the attention associated with processing exceedingly large cheques or the use of a cheque book reported as lost or stolen.

2.4.3 Skilled forgeries

In some instances, the forger is not only familiar with the writer’s name, but also has access to samples of genuine signatures. Given ample time to practice signature reproduction, he is able to produce so-called skilled forgeries. The vast majority of skilled forgeries may be categorised as amateur, as this type of forgery may be produced by any given individual. In contrast, to produce a professional skilled forgery, the forger typically requires a certain amount of knowledge regarding forensic document analysis. This enables the forger to mimic subtle writer-specific idiosyncrasies, thereby producing a forgery far beyond the capabilities of the average individual. Skilled forgeries are undoubtedly the most difficult to detect, especially by untrained humans. As the production of a skilled forgery involves both planning and effort, similar effort is required to enforce sufficient countermeasures - typically a sophisticated automatic signature verification system. The ability to produce skilled forgeries constitutes the greatest threat to legitimate cheque processing, as an unacceptable number of forged cheques go undetected. Furthermore, the involvement of professional skilled forgers may facilitate large-scale corporate fraud, potentially causing crippling losses to high-profile businesses.

2.5 Writer-dependent and writer-independent verification

In a writer-dependent verification scenario, there exists a unique, trained model Mω for each writer ω enrolled into the system database. When the system receives a questioned signature pattern X and claim of ownership ω, the pattern is matched with Mω, subsequently yielding a score reflecting the (dis)similarity between X and a typical signature pattern used to train Mω. It should be made clear, however, that a global decision threshold τ is used for verification purposes.

The writer-independent approach, on the other hand, performs verification using a single model M, regardless of the number of writers enrolled in the system database. This is achieved by attempting to model the difference between genuine signatures and forgeries in general. Any classifier employing the writer-independent approach is therefore trained using a set of modified feature vectors, known as difference vectors. In order to construct such difference vectors, each writer ω provides a genuine signature pattern Xk(ω) as reference. Any pattern X(ω) belonging to, or claimed to belong to, writer ω that is subsequently presented to the system is converted to the difference vector Z(ω) by computing

Z(ω) = D(X(ω), Xk(ω)),

where D(·) denotes any suitable distance measure.
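A minimal sketch of this difference-vector construction is given below; it assumes, purely for illustration, that D(·) is the element-wise absolute difference and uses invented four-dimensional feature vectors:

import numpy as np

def difference_vector(x_questioned, x_reference):
    # Z = D(X, X_k); here D is taken to be the element-wise absolute
    # difference, one of several reasonable choices of distance measure.
    return np.abs(x_questioned - x_reference)

# Invented 4-dimensional feature vectors for a writer omega.
x_reference = np.array([12.0, 3.5, 0.8, 40.0])  # genuine reference X_k
x_genuine   = np.array([11.6, 3.4, 0.9, 41.2])  # another genuine sample
x_forgery   = np.array([18.3, 2.1, 0.4, 55.0])  # a forgery attempt

# Genuine samples tend to yield small difference vectors, forgeries larger ones.
print(difference_vector(x_genuine, x_reference))
print(difference_vector(x_forgery, x_reference))

A writer-independent classifier would then be trained on such difference vectors computed from both genuine signatures and forgeries, as noted below.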

In order to effectively model the difference between genuine signatures and forgeries by using the writer-independent approach, though, one typically requires the efforts of a discriminative classifier such as a neural network (NN) or support vector machine (SVM), as both genuine signatures and forgeries are used during model training.

2.6 Objectives

During the course of this study, we aim to achieve two primary objectives, namely the successful design and implementation of:

- A novel feature extraction technique, utilizing the GLRLM method
- A robust off-line signature verification system, utilizing either a score-based classifier or a distance-based classifier.

Chapter 3 Literature survey

3.1 Texture Analysis

In many machine vision and image processing algorithms, simplifying assumptions are made about the uniformity of intensities in local image regions. However, images of real objects often do not exhibit regions of uniform intensities. For example, the image of a wooden surface is not uniform but contains variations of intensities which form certain repeated patterns called visual texture. The patterns can be the result of physical surface properties such as roughness or oriented strands which often have a tactile quality, or they could be the result of reflectance differences such as the color on a surface. “We may regard texture as what constitutes a macroscopic region. Its structure is simply attributed to the repetitive patterns in which elements or primitives are arranged according to a placement rule.”

Julesz (1975) studied texture perception extensively in the context of texture discrimination. The question he posed was: "When is a texture pair discriminable, given that they have the same brightness, contrast, and color?" Julesz concentrated on the spatial statistics of the image gray levels that are inherent in the definition of texture, keeping other illumination-related properties the same. To discuss Julesz's pioneering work, we need to define the concepts of first- and second-order spatial statistics.

(i) First-order statistics measure the likelihood of observing a gray value at a randomly chosen location in the image. They can be computed from the histogram of pixel intensities in the image and depend only on individual pixel values, not on the interaction or co-occurrence of neighboring pixel values. The average intensity of an image is an example of a first-order statistic.

(ii) Second-order statistics are defined as the likelihood of observing a pair of gray values occurring at the endpoints of a dipole (or needle) of random length placed in the image at a random location and orientation. These are properties of pairs of pixel values.
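The distinction can be illustrated with a small sketch (not from the thesis; the 4×4 image is invented): the mean gray level is a first-order statistic computed from the histogram alone, whereas counting co-occurring gray-level pairs at horizontally adjacent positions (a dipole of length one at 0°) is a second-order statistic:

import numpy as np

# A tiny 4x4 image with gray levels 0..3 (invented values).
img = np.array([[0, 0, 1, 1],
                [0, 2, 2, 1],
                [3, 2, 2, 3],
                [3, 3, 1, 0]])

# First-order statistic: depends only on individual pixel values.
hist = np.bincount(img.ravel(), minlength=4)          # gray-level histogram
mean_gray = (np.arange(4) * hist).sum() / hist.sum()  # average intensity (1.5 here)

# Second-order statistic: depends on pairs of pixel values. Count how often
# gray levels (i, j) occur at horizontally adjacent positions.
cooc = np.zeros((4, 4), dtype=int)
for r in range(img.shape[0]):
    for c in range(img.shape[1] - 1):
        cooc[img[r, c], img[r, c + 1]] += 1

print(mean_gray)
print(cooc)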

Julesz proposed the “theory of textons” to explain the preattentive discrimination of texture pairs. Textons are visual events (such as collinearity, terminations, closure, etc.) whose presence is detected and used in texture discrimination. Terminations are endpoints of line segments or corners.

Texture analysis methods have been utilized in a variety of application domains. In some of the mature domains (such as remote sensing) texture has already played a major role, while in other disciplines (such as surface inspection) new applications of texture are being found. In remote sensing, texture is defined as the local scene heterogeneity, and this property is used for the classification of land-use categories such as water, agricultural areas, etc. In ultrasound images of the heart, texture is defined as the amount of randomness, which has a lower value in the vicinity of the border between the heart cavity and the inner wall than in the blood-filled cavity. This fact can be used to perform segmentation and boundary detection using texture analysis methods.

3.1.1 Inspection

There has been a limited number of applications of texture processing to automated inspection problems. These applications include defect detection in images of textiles and automated inspection of carpet wear and automobile paints. In the detection of defects in texture images, most applications have been in the domain of textile inspection. Dewaele et al. (1988) used signal processing methods to detect point defects and line defects in texture images. They use sparse convolution masks in which the bank of filters is adaptively selected depending upon the image to be analyzed. Texture features are computed from the filtered images, and a Mahalanobis distance classifier is used to classify the defective areas.

Chetverikov (1988) applied a simple window differencing operator to the texture features obtained from simple filtering operations. This allows one to detect the boundaries of defects in the texture.

Chen and Jain (1988) used a structural approach to defect detection in textured images. They extract a skeletal structure from images, and by detecting anomalies in certain statistical features of these skeletons, defects in the texture are identified.

Conners et al. utilized texture analysis methods to detect defects in lumber wood automatically. The defect detection is performed by dividing the image into subwindows and classifying each subwindow into one of the defect categories, such as knot, decay, mineral streak, etc. The features used to perform this classification are based on tonal features, such as the mean, variance, skewness, and kurtosis of gray levels, along with texture features computed from gray level co-occurrence matrices of the wood images. The combination of tonal and textural features improves the correct classification rates over using either type of feature alone. In the area of quality control of textured images, Siew et al. proposed a method for the assessment of carpet wear. They used simple texture features computed from second-order gray level dependency statistics and from first-order gray level difference statistics, and showed that the numerical texture features obtained from these techniques can characterize carpet wear successfully.

Jain et al. (2002) used texture features computed from a bank of Gabor filters to automatically classify the quality of painted metallic surfaces. In a pair of automotive paint finish images, one has a uniform coating of paint while the other has a "mottled" or "blotchy" appearance.

3.1.2 Medical Image Analysis

Image analysis techniques have played an important role in several medical applications. In general, the applications involve the automatic extraction of features from the image which are then used for a variety of classification tasks, such as distinguishing normal tissue from abnormal tissue. Depending upon the particular classification task, the extracted features capture morphological properties, color properties, or certain textural properties of the image. The textural properties computed are closely related to the application domain to be used.

Mir et al. (1995) utilized texture for the extraction of diagnostic information from CT images, from which a number of features can be obtained. They established the use of texture for the detection of abnormalities in CT images that are beyond human appreciation and otherwise difficult to determine by other classical methods of image processing. As CT scanning is invaluable in abdominal investigations, scans with the liver as the central organ were used in that study.

Sutton and Hall (1972) discuss the classification of pulmonary disease using texture features. Some diseases, such as interstitial fibrosis, affect the lungs in such a manner that the resulting changes in the X-ray images are texture changes, as opposed to clearly delineated lesions. In such applications, texture analysis methods are ideally suited for these images. Sutton and Hall propose the use of three types of texture features to distinguish normal lungs from diseased lungs. These features are computed based on an isotropic contrast measure, a directional contrast measure, and Fourier-domain energy sampling. In their classification experiments, the best classification results were obtained using the directional contrast measure.

Harms et al. (1986) used image texture in combination with color features to diagnose leukemic malignancy in samples of stained blood cells. They extracted texture micro-edges and "textons" between these micro-edges. The textons were regions with almost uniform color. They extracted a number of texture features from the textons, including the total number of pixels in the textons of a specific color, the mean texton radius and texton size for each color, and various texton shape features. In combination with color, the texture features significantly improved the correct classification rate of blood cell types compared to using color features alone.

Landeweerd and Gelsema extracted various first-order statistics (such as mean gray level in a region) as well as second-order statistics (such as gray level co-occurrence matrices) to differentiate different types of white blood cells.

Insana et al. used textural features in ultrasound images to estimate tissue scattering parameters. They made significant use of the knowledge about the physics of the ultrasound imaging process and tissue characteristics to design the texture model.

Fractal texture features have also been used to classify ultrasound images of livers and to perform edge enhancement in chest X-rays.

Lundervold used fractal texture features in combination with other features (such as the response to edge detector operators) to analyze ultrasound images of the heart. The ultrasound images in this study are time-sequence images of the left ventricle of the heart. Texture is represented as an index at each pixel, namely the local fractal dimension within a window, estimated according to the fractal Brownian motion model proposed by Chen et al. The texture feature is used in addition to a number of other traditional features, including the response to a Kirsch edge operator, the gray level, and the result of temporal operations. The fractal dimension is expected to be higher on average in blood than in tissue, due to the noise and backscatter characteristics of blood, which is more disordered than solid tissue. In addition, the fractal dimension is low at non-random blood/tissue interfaces representing edge information. Texture analysis has also been used extensively to classify remotely sensed images. Land-use classification, where homogeneous regions with different types of terrain (such as wheat, bodies of water, urban regions, etc.) need to be identified, is an important application.

Haralick et al. used gray level co-occurrence features to analyze remotely sensed images. They computed gray level co-occurrence matrices for a distance of one with four directions. For a seven-class classification problem, they obtained approximately 80% classification accuracy using texture features.

3.2 Signature verification

The field of off-line signature verification has enjoyed a great deal of attention over the past few decades. In this chapter we present a collection of verification systems proposed over the years. Although some of these systems may seem dated, they represent noteworthy efforts in the field and also provide the reader with a historical perspective regarding advances made in recent years.

The systems presented in this chapter are based on a wide variety of pattern recognition techniques, namely simple distance classifiers, dynamic time warping, hidden Markov models, fuzzy logic and support vector machines.

Ozgunduz et al. proposed, in their paper entitled "Off-line Signature Verification and Recognition by Support Vector Machine", an off-line signature verification and recognition system using the global, directional and grid features of signatures. A Support Vector Machine (SVM) was used to verify and classify the signatures, and a classification ratio of 0.95 was obtained. As the recognition of signatures represents a multi-class problem, SVM's one-against-all method was used. They also compared their method's performance with the back-propagation method of Artificial Neural Networks (ANN).

Buddhika Jayasekara et al. proposed, in their paper entitled "An Evolving Signature Recognition System", a signature recognition method based on fuzzy logic and genetic algorithm (GA) methodologies. It consists of two phases: training of the fuzzy inference system using the GA, and signature recognition. A sample of signatures is used to represent a particular person. Selective pre-processing is followed by feature extraction, whose output is passed to the fuzzy inference system. The projection profiles, contour profiles, geometric centre, actual dimensions, signature area, local features, and the baseline shift are considered as the feature set in the study. The input feature set is divided into five sections, and five separate fuzzy subsystems are used to produce intermediate results, which are combined using a second-stage fuzzy system. The fuzzy membership functions are optimized using the GA. A set of signatures consisting of genuine signatures, random forgeries, skilled forgeries of a particular signature and different signatures was used as the training set; the resulting optimized recognition system can then be used to identify the particular signature identity. The system achieved a signature recognition rate of about 90%, and handled random forgeries with 77% accuracy and skilled forgeries with 70% accuracy.

Banshider Majhi et al., in their paper "Novel Features for Off-line Signature Verification" (2006), implement a novel feature extraction method based on geometric centres. Features are obtained by recursively dividing a signature image into sub-images along horizontal and vertical axes located at the geometric centre of the parent image. The geometric centres of the final sub-images subsequently form the feature vector. A Euclidean distance model is used for classification on a database containing 30 genuine signatures, 10 random forgeries, 10 simple forgeries and 10 skilled forgeries per writer. The number of writers considered during testing is not disclosed. The authors reportedly achieve FARs of 2.08% (random forgeries), 9.75% (simple forgeries) and 16.36% (skilled forgeries), associated with an FRR of 14.58%.

Chaudri Bhupendra M. et al. proposed a system in which the input image is processed to extract information through data acquisition, after which a fuzzy min-max algorithm is applied to classify the signature pattern; this algorithm fits naturally into a neural network framework. The middle layer of the neural network works as fuzzified neurons, which allows the output to be classified correctly. The use of fuzzy membership functions increases the accuracy of signature pattern classification, because the decision boundaries are fuzzy rather than crisp. The neural network designed for this work performs category learning, which increases recognition speed: the fuzzy min-max neural network uses a supervised learning algorithm, can learn nonlinear class boundaries in a single pass through the data, and provides the ability to incorporate new classes and refine existing ones without retraining. The accuracy of their system in recognizing signatures is nearly 53% for a single signature pattern per class, and increases to 92% when the number of signature patterns per class is increased.

Larkins and Mayo proposed Adaptive Feature Thresholding (AFT), a novel method of person-dependent off-line signature verification. AFT enhances how a simple image feature of a signature is converted to a binary feature vector by significantly improving its representation in relation to the training signatures. The similarity between signatures is then easily computed from their corresponding binary feature vectors. AFT was tested on the CEDAR and GPDS benchmark datasets, with classification using either a manual or an automatic variant. On the CEDAR dataset they achieved a classification accuracy of 92% for the manual variant and 90% for the automatic variant, while on the GPDS dataset they achieved over 87% and 85%, respectively. For both datasets AFT is less complex and requires fewer image features than existing state-of-the-art methods.

Mihai Costin Manolescu, in his paper "Signature Recognition Project", presented a signature recognition algorithm based on a new feature extraction method. This low-complexity algorithm can be run effectively on low-end hardware (8-bit microprocessors) and requires under 400 bytes for each signature representation. The recognition algorithm is based on a neural network with about 350 nodes, which can be trained with 4–10 signature samples. Experimental results indicate a score above 0.9 for genuine signatures and below 0.7 for forgeries.

Debnath Bhattacharyya et al. (2008) proposed an algorithmic approach to the verification of handwritten signatures by applying statistical methods. The work is based on collecting a set of signatures, computing an average signature with their algorithm, and then accepting or rejecting a sample signature after analyzing the correlation between it and the average signature.

Reza Ebrahimpour et al. introduced a new and robust model for signature recognition based on features inspired by the human visual ventral stream. A feature set is extracted by means of a feed-forward model that computes illumination- and view-invariant C2 features from all images in the dataset. They used Linear Discriminant Analysis (LDA) to reduce the dimension of the C2 feature vectors, which are derived from a cortex-like mechanism, and employed a standard K-Nearest Neighbor (KNN) classifier. The effectiveness of the approach is evaluated on an experimental signature database, and the reported signature recognition rate is significantly higher than that of other models.

3.3 Gray level run length encoding

Xu Dong-Hui et al. present a new approach for volumetric texture analysis using a run-length encoding matrix and its texture descriptors. They experiment with their approach on volumetric data generated from two normal Computed Tomography (CT) studies of the chest and abdomen. Their preliminary results show that run-length features calculated from the volumetric run-length matrix are capable of capturing the properties of texture primitives for different structures in 3D image data, such as the homogeneous texture structure of the liver.

Ultrasound imaging is one of the promising techniques for early detection of prostate cancer. Processing the ultrasound images involves five steps: pre-processing, segmentation, feature extraction, feature selection, and classification. The Transrectal Ultrasound (TRUS) images are pre-processed with an M3 filter and then segmented using DBSCAN clustering after applying morphological operators, in order to extract the prostate region.

R. Manavalan et al. proposed extracting features using the Gray Level Run Length Matrix (GLRLM) for different directions from the segmented region. To classify the images as benign or malignant, a Support Vector Machine (SVM) is adopted to evaluate the performance of the proposed method through classification. Over 5500 digitized TRUS images of the prostate are used for the experimental analysis. The classification results showed that texture features based on the GLRLM using combined directions (θ = {0°, 45°, 90°, 135°}) distinguish between malignant and benign TRUS images, with a highest accuracy of 85%, and sensitivity and specificity of 82% and 100%, respectively.

Xiaoou Tang developed a new run-length texture feature extraction algorithm that preserves much of the texture information in run-length matrices and significantly improves image classification accuracy over traditional run-length techniques. The advantage of this approach is demonstrated experimentally by the classification of two texture data sets. Comparisons with other methods demonstrate that the run-length matrices contain great discriminatory information and that a good method of extracting such information is of paramount importance to successful classification.

Chapter 4 Problem Definition and Methodology

4. Introduction

The process of feature extraction constitutes one of the fundamental components of the pattern recognition process, as it enables a verification system to represent signature patterns in an intelligent and robust manner. The feature vector representation generated by this process may subsequently be used to train a suitable classification model.

In this chapter we explain how the GLRLM method developed in this work converts a raw signature image into a suitable feature vector representation.

4.1 Problem Definition

The aim of the project is to implement a GLRLM-based algorithm for signature verification and identification, using the Long Run Emphasis feature and the Euclidean distance to compare individual signatures and to determine the minimum distance threshold at which two signatures are considered a match. A minimal illustrative sketch of this pipeline is given below.
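As a hedged illustration of the intended pipeline (this is not the thesis implementation; the tiny binarized images, the restriction to horizontal runs, the single Long Run Emphasis feature and the acceptance threshold are all assumptions made for the example), a gray level run length matrix can be built and compared as follows:

import numpy as np

def glrlm_horizontal(img, num_levels):
    # Gray level run length matrix for runs along the 0-degree direction.
    # Entry (g, r - 1) counts runs of gray level g having length r.
    max_run = img.shape[1]
    m = np.zeros((num_levels, max_run), dtype=int)
    for row in img:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1
            else:
                m[run_val, run_len - 1] += 1
                run_val, run_len = v, 1
        m[run_val, run_len - 1] += 1
    return m

def long_run_emphasis(m):
    # LRE = sum over (g, r) of r^2 * count(g, r), divided by the total number of runs.
    run_lengths = np.arange(1, m.shape[1] + 1)
    return (m * run_lengths ** 2).sum() / m.sum()

# Two tiny binarized "signature" images (0 = ink, 1 = background); invented data.
sig_a = np.array([[1, 1, 0, 0, 1],
                  [1, 0, 0, 0, 1],
                  [1, 1, 1, 0, 1]])
sig_b = np.array([[1, 0, 1, 0, 1],
                  [0, 1, 0, 1, 0],
                  [1, 0, 1, 0, 1]])

# Feature vectors: here a single LRE value each; a full system would use
# several run-length features computed over multiple run directions.
f_a = np.array([long_run_emphasis(glrlm_horizontal(sig_a, 2))])
f_b = np.array([long_run_emphasis(glrlm_horizontal(sig_b, 2))])

# Verification decision: accept if the Euclidean distance between the feature
# vectors falls below a stored threshold (the value 1.0 is purely illustrative).
distance = np.linalg.norm(f_a - f_b)
print(distance, "match" if distance < 1.0 else "no match")

In the actual system, the threshold would be derived from the intra-class distances of the enrolled genuine signatures rather than fixed by hand.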

[...]

Details

Pages: 59
Year: 2012
ISBN (eBook): 9783668541535
ISBN (Book): 9783668541542
File size: 972 KB
Language: English
Catalog Number: v376136
Grade: 10
Tags: signature, offline signature verification, texture-based verification, handwritten signatures
