An Introduction to Support Vector Machines and Other Kernel-based Learning Methods
This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification and biosequence analysis, and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.
From inside the book
Results 1-5 of 47
Page xiii
... functions; $\langle x \cdot z \rangle$ inner product between $x$ and $z$; $\phi : X \to F$ mapping to feature space; $K(x, z)$ kernel $\langle \phi(x) \cdot \phi(z) \rangle$ ... $X'$ transpose of vector, matrix; $\mathbb{N}, \mathbb{R}$ natural, real numbers; $S$ training sample; $\ell$ training set size; $\eta$ learning rate; $\varepsilon$ ...
Page 19
Remark 2.11 The training data only enter the algorithm through the entries of the matrix $G = (\langle x_i \cdot x_j \rangle)_{i,j=1}^{\ell}$, known as the Gram matrix, whose properties are briefly discussed in Appendix B and which will be related to other similar ...
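The point of the remark can be illustrated with a short sketch: a perceptron written in dual form that never looks at the input vectors after the Gram matrix has been computed. The function names, the toy data and the bias update scaled by the squared data radius are assumptions made for this illustration, not text taken from the excerpt.

```python
import numpy as np

def gram_matrix(X):
    """Return G with G[i, j] = <x_i, x_j> for the rows x_i of X."""
    return X @ X.T

def dual_perceptron(G, y, epochs=100):
    """Dual-form perceptron: the data are accessed only through G."""
    n = len(y)
    alpha = np.zeros(n)          # one dual coefficient per training point
    b = 0.0
    radius_sq = np.max(np.diag(G))  # max_i ||x_i||^2, read off the diagonal
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            # Decision value for x_i uses column i of the Gram matrix only.
            if y[i] * (np.dot(alpha * y, G[:, i]) + b) <= 0:
                alpha[i] += 1.0
                b += y[i] * radius_sq
                mistakes += 1
        if mistakes == 0:        # a full pass without errors: stop
            break
    return alpha, b

# Hypothetical, linearly separable toy data.
X = np.array([[2.0, 2.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

G = gram_matrix(X)
alpha, b = dual_perceptron(G, y)
predictions = np.sign((alpha * y) @ G + b)
```

Because only `G` is passed to the training loop, replacing `gram_matrix` with any kernel evaluation would leave the rest of the sketch unchanged.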
Page 20
The hyperplane that optimises this criterion can be found by solving a system of linear equations with a symmetric matrix formed from the training data and right hand side the difference between the two class means.
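A minimal sketch of such a linear solve follows. It assumes the criterion is Fisher's discriminant and that the symmetric matrix is the within-class scatter matrix; the excerpt names neither. The midpoint threshold and the toy data are also choices made only for this example.

```python
import numpy as np

def fisher_direction(X_pos, X_neg):
    """Solve S_w w = m_pos - m_neg for the discriminant direction w."""
    m_pos = X_pos.mean(axis=0)
    m_neg = X_neg.mean(axis=0)
    # Within-class scatter: a symmetric matrix built from the training data.
    d_pos = X_pos - m_pos
    d_neg = X_neg - m_neg
    S_w = d_pos.T @ d_pos + d_neg.T @ d_neg
    # Right-hand side is the difference between the two class means.
    w = np.linalg.solve(S_w, m_pos - m_neg)
    # Assumed threshold: hyperplane through the midpoint of the class means.
    b = -0.5 * w @ (m_pos + m_neg)
    return w, b

# Hypothetical toy data.
X_pos = np.array([[2.0, 2.0], [3.0, 1.0], [3.0, 3.0]])
X_neg = np.array([[-2.0, -2.0], [-1.0, -3.0], [-3.0, -1.0]])

w, b = fisher_direction(X_pos, X_neg)
scores_pos = X_pos @ w + b
scores_neg = X_neg @ w + b
```

`np.linalg.solve` is used rather than an explicit inverse; it requires the scatter matrix to be non-singular, which holds for this toy data.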
Page 21
Both these algorithms require the inversion of a matrix, though a simple iterative procedure also exists (the Adaline algorithm developed by Widrow and Hoff in the 1960s). Note that these regression techniques can also be used for ...
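The iterative alternative can be sketched as the Widrow-Hoff (least mean squares) update, which adjusts the weights after each example in proportion to the prediction error and so avoids any matrix inversion. The learning rate, epoch count and toy data below are assumptions chosen so that this small example converges.

```python
import numpy as np

def widrow_hoff(X, y, eta=0.05, epochs=500):
    """Fit y ~ <w, x> + b by per-example gradient steps on the squared error."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            error = y_i - (w @ x_i + b)   # residual on this example
            w += eta * error * x_i        # move the weights along x_i
            b += eta * error              # same step for the bias
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0

w, b = widrow_hoff(X, y)
```

A learning rate that is too large for the scale of the inputs makes the updates diverge, so `eta` usually has to be tuned to the data.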
Page 22
We can minimise $L$ by differentiating with respect to the parameters $(w, b)$, and setting the resulting $n + 1$ linear expressions to zero. This is best expressed in matrix notation by setting $\hat{w} = (w', b)'$, and $\hat{X} = $ ...
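Setting those derivatives to zero leads to a linear system that can be solved directly. The sketch below assumes that the augmented data matrix stacks each input with an appended constant 1 (the excerpt is cut off before defining it) and that $L$ is the sum of squared errors. The resulting normal equations are $\hat{X}'\hat{X}\hat{w} = \hat{X}'y$.

```python
import numpy as np

def least_squares(X, y):
    """Solve the normal equations for the augmented weight vector (w, b)."""
    # Assumed augmentation: append a constant 1 to every input row.
    X_hat = np.hstack([X, np.ones((X.shape[0], 1))])
    # n + 1 linear equations in the n + 1 unknowns (w, b).
    w_hat = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)
    return w_hat[:-1], w_hat[-1]

# Hypothetical noise-free data generated from y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = least_squares(X, y)
```

When the matrix $\hat{X}'\hat{X}$ is singular or badly conditioned, `np.linalg.lstsq` or a ridge term is the safer choice.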
Contents
Kernel-Induced Feature Spaces | 26 |
Generalisation Theory | 52 |
Optimisation Theory | 79 |
Support Vector Machines | 93 |
Implementation Techniques | 125 |
Applications of Support Vector Machines | 149 |
Appendix A: Pseudocode for the SMO Algorithm | 162 |
References | 173 |
Index | 187 |
Common terms and phrases
1-norm soft margin algorithm analysis applied approach Bayesian bias bound Chapter choice classification computational consider constraints convergence convex corresponding datasets Definition described dual problem dual representation fat-shattering dimension feasibility gap feature mapping feature space finite Gaussian processes generalisation error geometric margin given Hence heuristics high dimensional Hilbert space hyperplane hypothesis inequality inner product space input space introduced iterative Karush-Kuhn-Tucker kernel function kernel matrix Lagrange multipliers Lagrangian learning algorithm linear functions linear learning machines loss function machine learning margin distribution margin slack vector maximal margin hyperplane maximise minimise norm objective function obtained on-line optimisation problem parameters perceptron perceptron algorithm performance positive semi-definite primal and dual quantity random examples real-valued function Remark result ridge regression Section sequence slack variables soft margin optimisation solution solve subset Support Vector Machines SVMs techniques Theorem training data training examples training points training set update Vapnik VC dimension weight vector zero