Manual for the implementation of neural networks in MATLAB

Bachelor Thesis, 2005, 42 pages

Excerpt

Summary

List of figures and tables

1. Introduction

2. Neural Networks
2.1 What are neural networks?
2.2 Biological background
2.3 General structure of neural networks
2.4 Properties of neural networks
2.5 Learning in neural networks and weighting
2.6 Historical overview
2.7 Different network models
2.7.1 Single-layer feed-forward networks
2.7.2 Multi-layer feed-forward networks
2.7.3 Recurrent networks

3. MATLAB
3.1 General overview
3.2 MATLAB
3.3 Matrices in MATLAB

4. Realization of neural networks in MATLAB
4.1 Creation of neural networks with the Network/Data Manager
4.2 Import of data
4.3 The implementation of an ADALINE network
4.4 The implementation of a back-propagation network
4.5 Self-optimizing neural networks
4.6 Hopfield network
4.7 Summary

5. Conclusion

6. Table of sources

Books

Scripts and other sources

Internet sources with date and time of access

7. Table of abbreviations of networks

Summary

This bachelor thesis presents a manual for the implementation of neural networks in the software environment MATLAB. The thesis can be divided into four parts. After an introduction to the thesis, the theoretical background of neural networks and of MATLAB is explained in two chapters. The final part describes how to implement networks, both in general and with examples. The manual was created for the “Master Course of Computer Studies” at the University of Applied Science Zittau/Görlitz. Because this manual is a bachelor thesis, it can give only a small theoretical and practical overview of neural networks.

List of figures and tables

Figure 2.1: Neural networks and their connections to other scientific areas

Figure 2.2: Drawing of a biological and an artificial neuron

Figure 2.3: Layer model with feed-back communication

Figure 2.4: Neural networks categorized by their architecture

Figure 2.5: An Adaline-processor element

Figure 2.6: An error area of a neural network with back-propagation

Figure 3.1: MATLAB windows with three main elements:

Table 3.2: Overview of some simple arithmetic functions in MATLAB

Figure 3.3: Plotted graph of the function x^3 from -2 to 2

Figure 4.1: The Network/Data Manager

Figure 4.2: Create New Data

Figure 4.3: Create New Network

Figure 4.4: Import Wizard

Figure 4.5: Adaline network structures with different transfer functions

Figure 4.6: Graph after the Adaline network has been trained

Figure 4.7: Creating a back-propagation network

Figure 4.8: Structure of a back-propagation network

Figure 4.9: Comparison of training curves for the TANSIG and PURELIN functions

Figure 4.10: Self-optimizing network

Figure 4.11: Creating a Hopfield network

Figure 4.12: Hopfield network state space

1. Introduction

Almost every day, newspapers and professional journals on computer technology report that the performance and speed of processors, memory and other essential computer components have increased once again. Today computers are able to solve very difficult tasks in almost all areas of everyday life. One can find them in every office, in industry, in medicine and elsewhere. They work far more precisely than a human being ever could. Still, computers are only as intelligent as their engineers programmed them to be. Faced with tasks they were not programmed for, computers are as helpless as a baby. One major aim of research is therefore the creation of systems with artificial intelligence that work in a way similar to the human brain. There is still a lot to do before this aim is reached, but the first basic steps have been taken. One of them is the field of artificial neural networks, which try to copy the functionality of biological neural systems. The idea of artificial networks is as old as the idea of computers. One can speak of artificial neural networks since the 1940s, when McCulloch and Pitts published their paper "A logical calculus of the ideas immanent in nervous activity" [McCulloc43]. Such networks promote the idea of parallelism and stand in contrast to the idea of centralization, as it is used in today’s computers with one processor.

In my bachelor thesis I explain the creation of artificial neural networks in the MATLAB software. Neither neural networks nor MATLAB were part of my studies. In the next chapter I therefore give a general overview of neural networks and explain some network models in more depth. In the third chapter I describe the software MATLAB, which is the environment for the neural networks in my bachelor thesis. As I show later, MATLAB was not primarily designed for the creation of neural networks. The original and still most important idea of this program is the combination of two fields, matrix manipulation and development and extension; the name MATLAB is short for MATrix LABoratory. The theoretical part presented in the chapters about neural networks and MATLAB is the basis for understanding the implementation of different kinds of networks in this software environment. In the final part of my thesis I draw a conclusion about how well the implementation of neural networks in MATLAB works.

My bachelor thesis as a whole is meant to be a manual for a Master course at the University of Applied Science in Zittau/Görlitz, as a supplement to a course about neural networks. I treat the biological and historical background of neural networks only roughly, give a general overview of their structures and explain some examples in more detail. I do not discuss learning theories or the mathematics of neural networks, because this would fill several books and go far beyond the scope of my Bachelor thesis.

2. Neural Networks

This chapter provides a general overview of neural networks. This is necessary for a better understanding of their implementation in the later parts of my thesis. This theoretical knowledge cannot be assumed as given, because neural networks were not part of my studies. After a short general overview, some network types are discussed in more detail.

2.1 What are neural networks?

Neural networks are not an invention of computer science. They have existed for millions of years, as they can be found in animals and human beings. Biological neural networks are therefore the archetypes for artificial neural networks, which are the main subject of my bachelor thesis.

There are different definitions of neural networks. Wikipedia defines them as follows: “A neural network is an interconnected group of artificial or biological neurons” [Wiki1]. Prof. Dr. Andreas Zell describes them as “information processing systems, consisting of a huge amount of simple units (cells, neurons), which send information to each other by activating cells through directed connections” [Zell97, p. 23]. As a last definition I refer to Gerhard Rigoll, who identifies neural networks as “a system consisting of interlinked elements, which are able to process data and can be named as neurons” [Rigoll94, p. 1]. These three definitions give a small impression of the complexity of neural networks. Terms like information or data processing are normally associated with computer science. By contrast, the notions of cell and neuron are mostly associated with biology. This connection is not a contradiction.

As figure 2.1 shows, the field of neural networks is not confined to computer science but concerns and connects many different scientific areas, for example biology, mathematics and physics. Each of these research areas brings its own interests and research aims to neural networks (e.g. understanding the human brain, psychological phenomena, artificial intelligence).

The application areas of neural networks are very widespread. Possible uses include data compression, pattern and speech recognition, generalisation, robotics, stock exchange prediction, behaviour prediction for dynamic systems, and adaptive software such as software agents and autonomous robots. This is just a small selection of possible application fields. The example networks implemented in chapter 4 are not on this application level; they are simple networks that can solve simple “AND” and “XOR” functions or recognize patterns.

Neural networks can be divided into biological and artificial neural networks. In chapter 2.2 I briefly describe biological neural networks, as they are the archetype for the artificial ones, which are the main focus of my thesis. Wherever I use the expression “neural network” or “network”, I therefore refer to artificial neural networks.

[Figure not included in this excerpt]

Figure 2.1: Neural networks and their connections to other scientific areas. Source: [Zell97, p. 24], translated into English by the author

2.2 Biological background

Before I explain artificial neural networks in more depth in the following subchapters, I outline the biological foundations of networks as they can be found in nature. The more highly developed a creature is, the more complex its nervous system. Human beings therefore have the most complex nervous system, which can solve many tasks much faster than the best computers existing today. The human brain, for example, is able to process millions of input impulses, although its communication is based on electrochemical processes, which are much slower than the digital operations in computers. The main reason for this big advantage is the parallelism of the brain, which enables it to handle a large number of processes at once rather than one after another, as is mostly the case in computers with von Neumann architecture.

The basic components of nervous systems are neurons, which have a lot in common with normal cells but differ in the cell shape and the kind of cell membrane. The human brain consists of approximately 10^11 neurons. A special property of these cells is the ability to form swellings at the ends of their extensions, which are used for communication with other cells. These extensions can be divided into the axon, which is the main extension with a length between a few millimetres and almost one metre, and the dendrites, the smaller branches. Depending on the number of dendrites, nerve cells are called unipolar, bipolar or multipolar cells. The left cell in figure 2.2 shows a model of a nerve cell with synapses, which are the swellings at the ends of axons or dendrites. Normally human nerve cells have between 1000 and 10000 of these synapses. They form the contact points between the axons and dendrites of neurons, across which nerve signals pass. Communication between neurons is chemical, whereas communication inside a neuron is electrical. At their synapses the dendrites receive an input impulse, which is sent to the cell body and increases the internal electrical potential of the neuron. When the potential exceeds a certain threshold value, the neuron emits a nervous impulse, which is sent via the axon to other neurons. If the potential does not exceed this limit, it returns to its initial value. After the emission the neuron remains in a resting state for a while.

When synapses transmit nervous impulses between neurons, they are able to filter the signal in a stimulating or inhibiting way, depending on their efficiency. The efficiency of synapses is changeable. The more impulses are transmitted through a synapse, the more effective it becomes. Scientists assume that this change is the most important activity in nervous systems and that it is also responsible for the “learning” process.

2.3 General structure of neural networks

Neural networks consist of simple processing elements, which are interlinked to a high degree. These units are highly idealized neurons, consisting, like their biological originals, of three parts: a cell body, dendrites and an axon. Figure 2.2 shows a part of a biological and an artificial neuron in comparison.

[Figure not included in this excerpt]

Figure 2.2: Drawing of a biological and an artificial neuron. Translated into English by the author

Neurons “communicate” with each other via simple scalar messages, and the connections between single units can adapt. More generally, one can say that neural networks are a simplification of the parallel structural design of animal brains.

Every element of a network can be interlinked with many other elements. This is why network structures can become very intricate.

In a simple model of a neural network all neurons can be separated into three layers, the input, the hidden and the output layer, as figure 2.3 shows. The hidden layer can itself consist of any number of layers, from 0 to n.

[Figure not included in this excerpt]

Figure 2.3: Layer model with feed-back communication. Compare: [Kratzer91, p. 27]

The communication between the elements in the layers can be feed-forward, which means that information can only be sent to an element on a higher level, or feed-back (recurrent), where the output signal of an element can also be sent back, directly or indirectly, to the input of the original element. For the classification of networks, different methods of counting the layers can be used. My differentiation into single-layer and multi-layer networks refers to the layers that can be trained and not to the overall number of layers.
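
As a minimal sketch of feed-forward communication, the following Python fragment passes an input pattern through one hidden layer and one output layer. The weights are chosen by hand (an assumption made for illustration, not a trained result) so that the network computes the “XOR” function mentioned in chapter 2.1. All function and variable names are hypothetical, and the threshold is assumed to be zero.

```python
def step(net):
    """Binary threshold transfer function with an assumed threshold of 0."""
    return 1 if net > 0 else 0


def layer_output(weight_rows, inputs, transfer):
    """One layer: each row holds the weights of one neuron's inputs."""
    return [transfer(sum(w * x for w, x in zip(row, inputs)))
            for row in weight_rows]


def feed_forward(layers, inputs, transfer):
    """Send the signal strictly forward, layer by layer, to the output."""
    signal = inputs
    for weight_rows in layers:
        signal = layer_output(weight_rows, signal, transfer)
    return signal


# Hand-picked weights (illustrative assumption): two hidden neurons
# followed by one output neuron.
XOR_LAYERS = [
    [[1, -1], [-1, 1]],  # hidden layer
    [[1, 1]],            # output layer
]

for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pattern, feed_forward(XOR_LAYERS, pattern, step))
```

In this sketch no signal is ever sent back to an earlier layer; a recurrent network would additionally feed outputs back into earlier inputs.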

The appropriate size of a network depends very much on the problem that has to be solved. Oversized neural networks do not learn the structure of the training inputs but memorize the training examples, which makes them useless for other, unknown inputs.

2.4 Properties of neural networks

One special property of networks, parallelism, was already mentioned in chapter 2.2. The main difference between neural networks (biological and artificial) and computers with von Neumann architecture is the centralism of “normal” computers versus the parallelism of networks. Networks can process many tasks at once, in parallel, whereas centralised computers can only carry out one task after another. This parallelism is one of the reasons why a human being can solve certain tasks (e.g. recognition of faces or shapes) much faster than a computer, although the brain works at a lower speed.

Other positive properties of neural networks are:

1) Learning aptitude: Networks are able to learn; normally they are trained with examples instead of being programmed.

2) Parallelism: The parallel structure of neural networks enables them to process a large number of tasks at once.

3) Distributed representation of knowledge: Knowledge is stored in the whole network in the form of weights, which means that data is not stored in one place but distributed everywhere. This makes a network more robust and error resistant.

4) Associative storage of data: Data is stored by content and not by address, as is normally done in von Neumann computers. This property is very important for pattern recognition.

5) Higher error tolerance: The distribution of knowledge gives neural networks a higher error tolerance in case single components fail.

6) Robustness against disturbances and noisy data: Neural networks can handle disturbed or noisy data if they have been properly trained.

7) Default values and spontaneous classification: Neural networks can classify input data spontaneously, depending on their structure and training.

8) Active representation: Data is not handled by a second program; the weights, which represent the data, process it as well.

Besides these positive properties there are negative ones, too. Knowledge acquisition is only possible by learning, introspection is impossible, logical conclusions are very difficult, learning systems need a lot of time, and there is no guarantee that the network will be successful (the problem of global versus local minima). Neural networks can be overfitted if they get too many learning examples or if their architecture is oversized. This problem is known as the “bias-variance trade-off” and can be found in other statistical methods, too. The learning speed depends strongly on the way the data is coded and presented to the network. The better a problem is pre-processed, the more successfully an artificial network can process it.

2.5 Learning in neural networks and weighting

As mentioned before, neural networks have to be trained instead of being programmed. Broadly speaking, learning is the change of those parameters of the network that influence the “transformation” of the input into the output. This “transformation” takes place between the moment the data is fed into the network and the moment the result is presented at the output. As explained in chapter 2.3 on the structure of neural networks, neurons are interlinked with each other. Neurons after the input layer therefore receive many inputs from other neurons, each of which can have a different weight. The weight works similarly to a filter that strengthens or weakens the input signal.

Neurons communicate via directed messages; therefore the output of each neuron can be seen as an n-dimensional vector. The vector formulation helps to understand the following formula, which represents the sum of all weighted inputs:

[Formula not included in this excerpt]

Here net_i is the sum of all inputs x_j, each multiplied by the weight w_ij of the connection between cell i and cell j. The input that the neuron receives from the function net_i is added to the existing activation level of the neuron. The neuron transforms the new activation level via the output function into an output, which is sent to all connected neurons in the next layer. This procedure is repeated until the output layer of the network is reached. Depending on the network, the neuron contains a threshold value that the activation level has to exceed before the output function becomes active. The weight vector has an enormous influence on the propagation function net_i. The structure of the output function of each neuron cannot be changed; the only way to adjust parameters in a network is therefore to adapt the weights of the connections. By adapting the weights, the network minimizes the error that leads to a wrong output. This continues until the error is so small that the output is accepted as correct. This procedure is called learning or training of neural networks.
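
The weighted sum and the threshold decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formula net_i = sum over j of w_ij * x_j is reconstructed from the description in the text, the previous activation level is assumed to be zero, and the function names, example weights and the threshold of 0.7 are hypothetical.

```python
def net_input(weights, inputs):
    """Propagation function: net_i = sum_j w_ij * x_j (assumed form)."""
    return sum(w * x for w, x in zip(weights, inputs))


def threshold_output(activation, threshold):
    """Output function: the neuron becomes active only above the threshold."""
    return 1 if activation > threshold else 0


# Illustrative values, not taken from the thesis.
weights = [0.5, 0.5]
threshold = 0.7

for inputs in ([0, 0], [0, 1], [1, 0], [1, 1]):
    activation = net_input(weights, inputs)  # previous activation assumed 0
    print(inputs, activation, threshold_output(activation, threshold))
```

With these particular values only the pattern [1, 1] exceeds the threshold, so this single neuron behaves like a logical “AND”. Changing the weights changes the behaviour, which is exactly the lever that learning uses.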

Learning can be divided into two main groups: supervised and unsupervised. For several networks both learning strategies can be used; the choice mostly depends on the type of problem. Supervised learning takes place with an external “teacher”. The network is always given an example of the input together with the corresponding output. The “teacher” adjusts the parameters until the network produces the expected output. In many cases the teacher is a program that uses a pattern file with the anticipated outcome of the output layer. Normally several different patterns are learned, depending on the network used. Supervised learning is used in the back-propagation algorithm, the delta rule, the Hebbian method and the perceptron rule. Networks that use this learning method are Boltzmann, Adaline, Hopfield, Learning Vector Quantization and back-propagation networks.
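
To illustrate supervised learning, the following sketch trains a single threshold neuron on the “AND” function with the perceptron rule, one of the rules named above. The “teacher” is the list of input patterns with their expected outputs. The learning rate of 1.0, the zero start values, the bias term and the number of passes are assumptions chosen for this small example; the function names are hypothetical.

```python
def step(net):
    """Binary threshold output with an assumed threshold of 0."""
    return 1 if net > 0 else 0


def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, learning_rate=1.0, epochs=20):
    """Perceptron rule: change each weight by rate * error * input."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# The "teacher": input patterns together with the expected output.
AND_SAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train_perceptron(AND_SAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES]
print(predictions)
```

The weights change only when the output is wrong, so training stops having any effect once all four patterns are classified correctly.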

Unsupervised learning is learning without a teacher and can be divided into competitive and non-competitive learning. The keyword for this learning method is self-organizing. The neural network receives only input examples, but no output examples. It therefore has to sort the received patterns into classes of similar patterns or to identify them as similar. This is based on the theory that the network reaches a stable state after the input and that the weights are shifted from inactive connections to active ones. Depending on whether the learning method is competitive or non-competitive, either only the winner learns or distributed learning takes place. Unsupervised learning is the biologically most plausible technique, as in nature there is normally nobody who actively supervises the course of learning. In neural networks unsupervised learning can be found in Self-Organizing Maps, Learning Vector Quantization, Hopfield, Neocognitron and Adaptive Resonance Theory nets.
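
The competitive case, in which only the winner learns, can be sketched as follows. The network receives input patterns without any expected output; the unit whose weight vector is closest to the pattern wins and moves its weights towards that pattern. The distance measure, the learning rate of 0.5, the start weights and the example patterns are assumptions made for illustration, and all names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def winner_index(weight_vectors, pattern):
    """The unit whose weight vector is closest to the pattern wins."""
    distances = [squared_distance(w, pattern) for w in weight_vectors]
    return distances.index(min(distances))


def competitive_step(weight_vectors, pattern, learning_rate=0.5):
    """Only the winning unit moves its weights towards the pattern."""
    winner = winner_index(weight_vectors, pattern)
    weight_vectors[winner] = [
        w + learning_rate * (x - w)
        for w, x in zip(weight_vectors[winner], pattern)
    ]
    return winner


# Two units and four unlabeled input patterns (illustrative values).
units = [[0.0, 0.0], [1.0, 1.0]]
patterns = [[0.1, 0.2], [0.9, 0.8], [0.0, 0.1], [1.0, 0.9]]

for _ in range(10):
    for pattern in patterns:
        competitive_step(units, pattern)

groups = [winner_index(units, pattern) for pattern in patterns]
print(groups)
```

After training, patterns that lie close together are assigned to the same unit, although the network was never told which class a pattern belongs to.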

There is also a third option, called “reinforcement learning”. It needs a teacher as well, but the correct answer is not presented to the network; instead, parameters are changed in order to reward or punish the neural network. This method will not be discussed further in this thesis.

2.6 Historical overview

The historical evolution of neural networks can be divided into several steps and time periods, as Andreas Zell shows in his book [Zell97, p. 28 ff.]. The years between 1942 and 1955 can be described as the “early beginnings” of neural networks. The birth year of neural networks is 1943, when W.S. McCulloch and W. Pitts published their article "A logical calculus of the ideas immanent in nervous activity" in a scientific journal. Their work was based on the idea of a neuron and described simple classes of neural networks that are able to compute all kinds of arithmetical or logical functions. This paper is widely recognized as the “base” for later research on neural networks. McCulloch and Pitts did not go into the details of learning theories for networks. The first real learning rule was formulated by Donald O. Hebb in 1949. This simple and universal learning concept is still the basis for most learning methods today. In 1958 and 1959 the first successfully working neurocomputer was developed at the Massachusetts Institute of Technology by Frank Rosenblatt and Charles Wightman and was used for pattern recognition. This event falls into the period of the “first heyday”, which lasted from 1955 until 1969. A predecessor of today’s associative memories was developed by Karl Steinbuch, who described a simple realisation of associative memories in his work “The learning matrix” [Steinbruch61]. Another highlight was the presentation of “Adaline” (Adaptive linear neuron) by Bernard Widrow and Marcian Hoff [Widrow60]. The Adaline network is able to learn precisely and quickly. Like the perceptron, Adaline used a binary threshold neuron.

A break in the evolution of neural networks was caused by Marvin Minsky and Seymour Papert, who proved in their book “Perceptrons” [Minsky69] by mathematical analysis that perceptrons cannot represent many important problems that they had been expected to solve. The period between 1969 and 1982 is therefore called the “quiet years”. As the name suggests, this period was rather calm, but during this time the theoretical basis was laid for the renaissance of neural networks, which lasts until today. Teuvo Kohonen, one of the originators of self-organizing maps, presented his model in the field of associative memories in 1972 [Kohonen72]. The back-propagation method, which later became very important, was developed by Paul Werbos in 1974 [Werbos74]. The Brain-State-in-a-Box model, in which unbounded growth is limited by the shape of a cube, was published by James Anderson [Anderson77]. In the 1980s John Hopfield developed the so-called Hopfield networks [Hopfield82], which contributed greatly to the renaissance of neural networks. The “Neocognitron: a model for a mechanism of visual pattern recognition” was presented in 1983 by Fukushima, Miyake and Ito [Fukushima83]. It describes neocognitrons as a construct of simple and complex cells ordered in layers, as they can also be found in biological visual systems.

2.7 Different network models

As already mentioned in subchapter 2.5 on learning methods, neural networks can be divided into networks with supervised or unsupervised learning rules. This differentiation gives no information about the structure or complexity of networks, so another distinction has to be used. In this subchapter I explain different network models, which are divided into three groups according to their structure. As figure 2.4 shows, networks can be constructed in a single-layer feed-forward, a multi-layer feed-forward or a recurrent (feed-back) way. As the word layer suggests, this differentiation depends on the number of layers. A second distinguishing feature, which appears in the names of the sub-types, is the direction of communication between neurons. The different kinds of neural networks also differ in complexity and intricacy. A single-layer network is simpler than a multi-layer one. Recurrent networks can consist of a single layer or many layers; the feed-back structure is therefore an additional complication. A combination of different networks is possible as well, but I will not explain such neural networks, because this would exceed the scope of my Bachelor thesis.

In the next subchapters the three different network architectures together with some examples will be explained more precisely.

[Figure not included in this excerpt]

Figure 2.4: Neural networks categorized by their architecture. Source: [Patterson99, p. 50], translated by the author

2.7.1 Single-layer feed-forward networks

The single-layer feed-forward network is one of the simplest and oldest network architectures. Just one layer is used for data processing. The input pattern is fed into the network, the information is transformed by a transfer function and the result is presented at the output. The data, in vector form, is passed only in the forward direction, as the name feed-forward implies. Single-layer neural networks can work in a synchronous

[...]

Details

Pages
42
Year
2005
ISBN (eBook)
9783638445511
ISBN (Book)
9783656451563
File size
1.1 MB
Language
English
Catalog Number
v47657
Institution / College
Neisse University Görlitz – Neisse University 