Linear Neural Networks

CS-449: Neural Networks, Fall 99
Instructor: Genevieve Orr
Willamette University
Lecture Notes prepared by Genevieve Orr, Nici Schraudolph, and Fred Cummins
Course content
Summary: Our goal is to introduce students to a powerful class of models, the Neural Network. In fact, this is a broad term which covers many diverse models and approaches. We will first motivate networks by analogy to the brain. The analogy is loose, but it serves to introduce the idea of parallel and distributed computation.
We then introduce one kind of network in detail: the feedforward network
trained by backpropagation of error. We discuss model architectures,
training methods and data representation issues. We hope to cover
everything you need to know to get backpropagation working for you. A range
of applications and extensions to the basic model will be presented in the
final section of the module.
Lecture 1: Introduction
. Questions
. Motivation and Applications
. Computation in the brain
. Artificial neuron models
. Linear regression
. Linear neural networks
. Multi-layer networks
. Error Backpropagation
Lecture 2: Classification
. Introduction
. Perceptron Learning
. Delta Learning
. Doing it Right
Lecture 3: Optimizing Linear Networks
. Weights and Learning Rates
. Summary
Lecture 4: The Backprop Toolbox
. 2-Layer Networks and Backprop
. Noise and Overtraining
. Momentum
. Delta-Bar-Delta
. Many-layer Networks and Backprop
. Backprop: an example
. Overfitting and regularization
. Growing and pruning networks
. Preconditioning the network
Lecture 5: Unsupervised Learning
. Introduction
. Linear Compression (PCA)
. Nonlinear Compression
. Competitive Learning
. Kohonen Self-Organizing Nets
Lecture 6: Reinforcement Learning
. Introduction
. Components of RL
. Terminology and Bellman's Equation
Lecture 7: Advanced Topics
. Learning rate adaptation
. Classification
. Unsupervised learning
. Time-Delay Neural Networks
. Recurrent neural networks
. Real-Time Recurrent Learning
. Dynamics of RNNs
. Long Short-Term Memory
Review for Midterm:
. Linear Nets
. Non-linear Nets

Links
Tutorials:
. The Nervous System - a very nice introduction, many pictures
. Neural Java - a neural network tutorial with Java applets
. Web Sim - A Java neural network simulator.
. a book chapter describing the Backpropagation Algorithm (Postscript)
. A short set of pages showing how a simple backprop net learns to
recognize the digits 0-9, with C code
. Reinforcement Learning - A Tutorial
Simulators and code:
. Web Sim: Java neural network simulator.
. Brainwave: a Java-based simulator.
. tlearn: Windows, Macintosh and Unix implementation of backprop and variants. Written in C.
. PDP++: C++ software with every conceivable bell and whistle. Unix only. The manual also makes a good tutorial.
Data Sources:
. UCI machine learning database
. ai-faq/neural-nets data source list
. Handwritten Digits
Related stuff of interest:
. A page of neural network links
. Tesauro's backgammon network
. Lego Lab at University of Aarhus
Questions
1. What tasks are machines good at doing that humans are not?
2. What tasks are humans good at doing that machines are not?
3. What tasks are both good at?
4. What does it mean to learn?
5. How is learning related to intelligence?
6. What does it mean to be intelligent? Do you believe a machine will
ever be built that exhibits intelligence?
7. Have the above definitions changed over time?
8. If a computer were intelligent, how would you know?
9. What does it mean to be conscious?
10. Can one be intelligent and not conscious or vice versa?
Work on neural networks began about 50 years ago. Their early abilities were exaggerated, casting doubt on the field as a whole. There has been a recent renewed interest in the field, however, because of new techniques and a better theoretical understanding of their capabilities.
Motivation for neural networks:
. Scientists are challenged to use machines more effectively for tasks currently solved by humans.
. Symbolic rules don't reflect the processes actually used by humans.
. Traditional computing excels in many areas, but not in others.
Types of Applications
Machine learning:
. Having a computer program itself from a set of examples so you don't
have to program it yourself. This will be a strong focus of this
course: neural networks that learn from a set of examples.
. Optimization: given a set of constraints and a cost function, how do
you find an optimal solution? E.g. traveling salesman problem.
. Classification: grouping patterns into classes, e.g. handwritten characters into letters.
. Associative memory: recalling a memory based on a partial match.
. Regression: function mapping
Cognitive science:
. Modelling higher level reasoning:
o language
o problem solving
. Modelling lower level reasoning:
o vision
o audition / speech recognition
o speech generation
Neurobiology: modelling how the brain works.
. neuron-level
. higher levels: vision, hearing, etc. This overlaps with cognitive science.
Mathematics:
. Nonparametric statistical analysis and regression.
Philosophy:
. Can human souls/behavior be explained in terms of symbols, or does it
require something lower level, like a neurally based model?
Where are neural networks being used?
. Signal processing: suppressing line noise, adaptive echo canceling, blind source separation
. Control: e.g. backing up a truck: cab position, rear position, and
match with the dock get converted to steering instructions.
Manufacturing plants for controlling automated machines.
. Siemens successfully uses neural networks for process automation in basic industries; e.g., in rolling mill control, more than 100 neural networks do their job 24 hours a day.
. Robotics - navigation, vision recognition
. Pattern recognition, e.g. recognizing handwritten characters: the current version of Apple's Newton uses a neural net
. Medicine, e.g. storing medical records based on case information
. Speech production: reading text aloud (NETtalk)
. Speech recognition
. Vision: face recognition, edge detection, visual search engines
. Business, e.g. rules for mortgage decisions are extracted from past
decisions made by experienced evaluators, resulting in a network that
has a high level of agreement with human experts.
. Financial Applications: time series analysis, stock market prediction
. Data Compression: speech signal, image, e.g. faces
. Game Playing: backgammon, chess, go, ...
Computation in the brain
The brain - that's my second most favourite organ! - Woody Allen
The Brain as an Information Processing System
The human brain contains about 10 billion nerve cells, or neurons. On average, each neuron is connected to other neurons through about 10 000 synapses. (The actual figures vary greatly, depending on the local neuroanatomy.) The brain's network of neurons forms a massively parallel information processing system. This contrasts with conventional computers, in which a single processor executes a single series of instructions.
Against this, consider the time taken for each elementary operation: neurons typically operate at a maximum rate of about 100 Hz, while a conventional CPU carries out several hundred million machine-level operations per second. Despite being built with very slow hardware, the brain has quite remarkable capabilities:
. its performance tends to degrade gracefully under partial damage. In
contrast, most programs and engineered systems are brittle: if you
remove some arbitrary parts, very likely the whole will cease to
function.
. it can learn (reorganize itself) from experience.
. this means that partial recovery from damage is possible if healthy
units can learn to take over the functions previously carried out by
the damaged areas.
. it performs massively parallel computations extremely efficiently. For example, complex visual perception occurs in less than 100 ms; at a firing rate of 100 Hz each neural step takes about 10 ms, so that amounts to only about 10 sequential processing steps!
. it supports our intelligence and self-awareness. (Nobody knows yet how
this occurs.)
[Table comparing the brain's and a conventional computer's processing elements omitted]
Linear regression
Equation 1, y = w1 x + w0, is a linear model: in an xy-plot it describes a straight line with slope w1 and intercept w0 on the y-axis, as shown in Fig. 2. (Note that we have rescaled the coordinate axes - this does not change the problem in any fundamental way.)
How do we choose the two parameters w0 and w1?
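A natural answer, made concrete below, is to choose the weights that minimize the summed squared error over a set of training examples. The following Python sketch is not part of the original notes: it fits the two parameters with the incremental LMS (Widrow-Hoff) rule, and the toy data, learning rate, and epoch count are all assumptions chosen only for the demonstration.

# Minimal sketch (not from the original notes): fitting y = w1*x + w0
# with the LMS / Widrow-Hoff rule, i.e. stochastic gradient descent
# on the squared error. Data and hyperparameters are made up.
import random

random.seed(0)
# Toy data sampled near the line y = 2x + 1 (an assumption for the demo).
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0 + random.gauss(0.0, 0.1))
        for i in range(20)]

w0, w1 = 0.0, 0.0   # bias and slope, initialized to zero
eta = 0.05          # learning rate, chosen by hand

for epoch in range(200):
    for x, t in data:
        y = w1 * x + w0        # the linear neuron's output
        err = t - y            # error between target and output
        # LMS update: each weight moves down the gradient of the
        # squared error; the bias w0 sees a constant input of 1.
        w0 += eta * err
        w1 += eta * err * x

print(f"learned line: y = {w1:.2f}*x + {w0:.2f}")  # expect roughly y = 2x + 1

For a single input the same fit could be obtained in closed form by ordinary least squares; the incremental rule is worth seeing because it extends unchanged to many inputs and, later in the course, to multi-layer networks trained by backpropagation.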