book
2004/5/7
page 85
Glossary
activation function a function used as the basis set for a neural network; historically
these were indicator functions, but now they are sigmoidal functions
artificial neural network a neural network used as a statistical model, as distinguished from a model of a biological process for thought
backpropagation a gradient descent algorithm for fitting neural network parameters
that takes advantage of the structure of the model
bias an intercept parameter for a neural network (of course, it can also mean the
difference between the expected value of an estimator and the true value of what it is
trying to estimate)
feature explanatory variable
feature selection variable selection
feed forward predicted values are an explicit function of the explanatory variables, as opposed to a network with cycles (feedback), which allows only an implicit representation
of predictions
Heaviside function an indicator function that takes value one when its argument is
true and value zero when its argument is false
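This definition can be sketched in a few lines of Python (illustrative code, not from the book):

```python
def heaviside(condition):
    # Indicator function: one when its argument is true, zero otherwise.
    return 1.0 if condition else 0.0

# A threshold unit applies the indicator to a comparison, e.g. x > 0:
print(heaviside(2.5 > 0))   # 1.0
print(heaviside(-1.0 > 0))  # 0.0
```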
hidden node logistic basis function
input explanatory variable
neural network a statistical model that uses logistic basis functions
output fitted value for a response variable
perceptron a hidden node in a neural network; historically these were indicator
functions used as basis functions, but the term is now sometimes applied to logistic
basis functions, usually in the phrase multilayer perceptron (which is just a standard
neural network)
radial basis network a mixture model, typically of normals with equal variance,
interpreted as a network structure analogous to a neural network
Downloaded 03/09/15 to 69.91.134.103. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
sigmoidal function a monotone function with lower and upper asymptotes and a
smooth rise between, which looks vaguely S-shaped; for a neural network, this is
typically either a logistic or a hyperbolic tangent
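The two common choices can be sketched as follows (illustrative code, not from the book; the tanh-to-logistic identity is standard):

```python
import math

def logistic(x):
    # Logistic sigmoid: monotone, asymptotes 0 and 1, smooth S-shaped rise.
    return 1.0 / (1.0 + math.exp(-x))

# The hyperbolic tangent has asymptotes -1 and 1 and is a rescaled
# logistic: tanh(x) = 2*logistic(2*x) - 1.
for x in (-5.0, 0.0, 5.0):
    print(x, logistic(x), math.tanh(x))
```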
softmax function transforms a set of real numbers to probabilities by exponentiating
each of them and then dividing by the sum of all of the exponentiated values
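A minimal sketch of this transformation (illustrative code, not from the book; subtracting the maximum is a standard numerical-stability trick that leaves the result unchanged):

```python
import math

def softmax(xs):
    # Exponentiate each value, then divide by the sum of the exponentials.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)        # three probabilities summing to one
print(sum(probs))
```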
supervised learning training a model or algorithm to fit one or more response
variables using a set of explanatory variables
test dataset a subset of the data not used during model fitting and then used to validate
predicted values, a sort of simplified cross-validation; cf. training dataset
threshold function indicator function; cf. Heaviside function
training dataset a subset of the data used to fit the parameters of the model; cf. test
dataset
unsupervised learning fitting a model or algorithm without an observed response
variable, for example clustering
weights the (nonvariance) parameters of a neural network
Bibliography
Abramowitz, M. and Stegun, I. A., eds. (1965). Handbook of Mathematical Functions. New
York: Dover Publications.
Akaike, H. (1974). A New Look at Statistical Model Identification. IEEE Transactions
on Automatic Control, AC-19, 716–722.
Anderson, J. A. (1982). Logistic Discrimination. In Classification, Pattern Recognition
and Reduction of Dimensionality, eds. P. R. Krishnaiah and L. N. Kanal, Vol. 2 of
Handbook of Statistics, 169–191. Amsterdam: North-Holland.
Andrieu, C., de Freitas, J. F. G., and Doucet, A. (2001). Robust Full Bayesian Learning for
Radial Basis Networks. Neural Computation, 13, 2359–2407.
Barbieri, M. M. and Berger, J. O. (2002). Optimal Predictive Model Selection. Tech.
Rep. 02-02, Duke University, ISDS.
Barron, A., Schervish, M. J., and Wasserman, L. (1999). The Consistency of Posterior
Distributions in Nonparametric Problems. Annals of Statistics, 27, 536–561.
Berger, J. O. and Bernardo, J. M. (1992). On the Development of Reference Priors. In
Bayesian Statistics 4, eds. J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M.
Smith, 35–60. New York: Oxford University Press.
Berger, J. O., De Oliveira, V., and Sansó, B. (2001). Objective Bayesian Analysis of
Spatially Correlated Data. Journal of the American Statistical Association, 96,
1361–1374.
Berger, J. O. and Pericchi, L. R. (1996). The Intrinsic Bayes Factor for Model Selection
and Prediction. Journal of the American Statistical Association, 91, 109–122.
Bernardo, J. M. (1979). Reference Posterior Distributions for Bayesian Inference (with
discussion). Journal of the Royal Statistical Society Series B, 41, 113–147.
Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. Chichester: John Wiley &
Sons.
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford: Clarendon Press.
Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (1995). Bayesian Data Analysis.
London: Chapman and Hall.
Geman, S. and Geman, D. (1984). Stochastic Relaxation, Gibbs Distributions and the
Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 6, 721–741.
Gentle, J. E. (2002). Elements of Computational Statistics. New York: Springer-Verlag.
Geweke, J. (1989). Bayesian Inference in Econometric Models Using Monte Carlo Integration. Econometrica, 57, 1317–1340.
Gibbs, M. N. and MacKay, D. J. C. (1997). Efficient Implementation of Gaussian Processes. Tech. Rep., Cavendish Laboratory, Cambridge.
Gilks, W. R., Richardson, S., and Spiegelhalter, D. J. (1996). Markov Chain Monte Carlo
in Practice. London: Chapman and Hall.
Gordon, A. D. (1999). Classification. Monogr. Statist. Appl. Probab. 82, 2nd ed. Boca
Raton, FL: Chapman & Hall/CRC.
Green, P. J. (1995). Reversible Jump Markov Chain Monte Carlo Computation and
Bayesian Model Determination. Biometrika, 82, 711–732.
Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized
Linear Models: A Roughness Penalty Approach. London: Chapman and Hall.
Grenander, U. (1981). Abstract Inference. New York: Wiley.
Härdle, W. (1990). Applied Nonparametric Regression. Cambridge, UK: Cambridge University Press.
Härdle, W. (1991). Smoothing Techniques with Implementation in S. New York: Springer-Verlag.
Hartigan, J. A. (1964). Invariant Prior Distributions. Annals of Mathematical Statistics,
35, 836–845.
Hartigan, J. A. (1983). Bayes Theory. New York: Springer-Verlag.
Hastie, T. and Tibshirani, R. (1984). Generalized Additive Models. Tech. Rep. 98,
Stanford University, Department of Statistics.
Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. London: Chapman and
Hall.
Hastie, T., Tibshirani, R., and Friedman, J. (2001). The Elements of Statistical Learning.
New York: Springer-Verlag.
Hastings, W. K. (1970). Monte Carlo Sampling Methods Using Markov Chains and Their
Applications. Biometrika, 57, 97–109.
Hawkins, D. (1989). Discussion of Flexible Parsimonious Smoothing and Additive Modelling by J. Friedman and B. Silverman. Technometrics, 31, 3–39.
Hjort, N. L. and Omre, H. (1994). Topics in Spatial Statistics. Scandinavian Journal of
Statistics, 21, 289–357.
Hoeting, J. A., Madigan, D., Raftery, A. E., and Volinsky, C. T. (1999). Bayesian Model
Averaging: A Tutorial (with discussion). Statistical Science, 14, 382–417.
Holmes, C. C. and Adams, N. M. (2002). A Probabilistic Nearest Neighbour Method for
Statistical Pattern Recognition. Journal of the Royal Statistical Society Series B, 64,
295–306.
Holmes, C. C. and Mallick, B. K. (1998). Bayesian Radial Basis Functions of Variable
Dimension. Neural Computation, 10, 1217–1233.
Holmes, C. C. and Mallick, B. K. (2000). Bayesian Wavelet Networks for Nonparametric
Regression. IEEE Transactions on Neural Networks, 11, 27–35.
Holmes, C. C. and Mallick, B. K. (2001). Bayesian Regression with Multivariate Linear
Splines. Journal of the Royal Statistical Society Series B, 63, 3–18.
Hornik, K., Stinchcombe, M., and White, H. (1989). Multilayer Feedforward Networks
are Universal Approximators. Neural Networks, 2, 359–366.
Jeffreys, H. (1946). An Invariant Form for the Prior Probability in Estimation Problems.
Proceedings of the Royal Society London A, 186, 453–461.
Jeffreys, H. (1961). Theory of Probability. 3rd ed. New York: Oxford University Press.
Kass, R. E. and Raftery, A. E. (1995). Bayes Factors. Journal of the American Statistical
Association, 90, 773–795.
Kass, R. E. and Wasserman, L. (1995). A Reference Bayesian Test for Nested Hypotheses
and Its Relationship to the Schwarz Criterion. Journal of the American Statistical
Association, 90, 928–934.
Kass, R. E. and Wasserman, L. (1996). The Selection of Prior Distributions by Formal
Rules. Journal of the American Statistical Association, 91, 1343–1370.
Keribin, C. (2000). Consistent Estimation of the Order of Mixture Models. Sankhyā, 62,
49–66.
Leamer, E. E. (1978). Specification Searches: Ad Hoc Inference with Nonexperimental
Data. New York: Wiley.
Lee, H. K. H. (2000). Consistency of Posterior Distributions for Neural Networks. Neural
Networks, 13, 629–642.
Lee, H. K. H. (2001). Model Selection for Neural Network Classification. Journal of
Classification, 18, 227–243.
Press, S. J. (1989). Bayesian Statistics: Principles, Models, and Applications. New York:
John Wiley & Sons.
Raftery, A. E. (1996). Approximate Bayes Factors and Accounting for Model Uncertainty
in Generalized Linear Models. Biometrika, 83, 251–266.
Raftery, A. E. and Madigan, D. (1994). Model Selection and Accounting for Model
Uncertainty in Graphical Models Using Occam's Window. Journal of the American
Statistical Association, 89, 1535–1546.
Raftery, A. E., Madigan, D., and Hoeting, J. A. (1997). Bayesian Model Averaging for
Linear Regression Models. Journal of the American Statistical Association, 92,
179–191.
Ríos Insua, D. and Müller, P. (1998). Feedforward Neural Networks for Nonparametric
Regression. In Practical Nonparametric and Semiparametric Bayesian Statistics,
eds. D. Dey, P. Müller, and D. Sinha, 181–193. New York: Springer-Verlag.
Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge, UK: Cambridge University Press.
Robert, C. P. (2001). The Bayesian Choice: From Decision-Theoretic Foundations to
Computational Implementation. New York: Springer-Verlag.
Robert, C. P. and Casella, G. (2000). Monte Carlo Statistical Methods. New York: Springer-Verlag.
Robinson, M. (2001a). Priors for Bayesian Neural Networks. Master's thesis, University
of British Columbia, Department of Statistics.
Robinson, M. (2001b). Priors for Bayesian Neural Networks. In Computing Science
and Statistics, eds. E. J. Wegman, A. Braverman, A. Goodman, and P. Smyth, Vol. 33,
122–127.
Rosenblatt, F. (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain
Mechanisms. Washington, D.C.: Spartan.
Rubin, D. B. (1981). The Bayesian Bootstrap. Annals of Statistics, 9, 130134.
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning Internal Representations by Error Propagation. In Parallel Distributed Processing: Explorations in
the Microstructure of Cognition, eds. D. E. Rumelhart, J. L. McClelland, and the PDP
Research Group, Vol. 1, 318–362. Cambridge, MA: MIT Press.
Sarle, W. S. (1994). Neural Network Implementation in SAS Software. In Proceedings
of the Nineteenth Annual SAS Users Group International Conference, 38–51. SAS
Institute, Cary, NC.
Schervish, M. J. (1995). Theory of Statistics. New York: Springer-Verlag.
Index
Fisher information, see information,
Fisher
Fourier series regression, 16, 18
full conditional distribution, see complete
conditional distribution
Laplace approximation, 61
loan data, 3–8, 25–27, 59, 73–77
locally weighted scatterplot smoothing
(LOWESS), 13, 18
logistic regression, 11, 21, 67, 76
Markov chain Monte Carlo model composition, 69–71
MARS, see multivariate adaptive regression splines
MC³, see Markov chain Monte Carlo
model composition
MCMC, see Markov chain Monte Carlo
median probability model, 61
Metropolis–Hastings sampling, 49–51,
70–71
mixture model, 16–18, 28, 32, 40, 62
model averaging, 3, 30, 62–63, 65, 66,
68, 70, 71
motorcycle data, 33
moving average estimator, 12–13, 18–20
multicollinearity, 3, 5, 7, 40, 58, 63, 71,
73, 74
multimodality, 48, 67
multinomial likelihood, 23–27, 65
multivariate adaptive regression splines,
15, 18
nearest neighbor classification, 19
network information criterion (NIC), 62
Occam's Window, 68–71
ordinal response, 25
orthogonal polynomial, see polynomial
approximator
overfitting, 3, 11, 30, 46, 57–59, 63–65,
71
ozone data, 3, 8, 13, 15, 51, 58, 59, 65,
71–73
partition model, 13–15, 18–20
perceptron, 22, 85
polynomial approximator, 16, 18
PPR, see projection pursuit regression
prior
asymptotic equivalence, 42
BergerBernardo, see prior, reference
conjugate, see conjugacy
flat, 38–42, 45, 46, 48–52, 58, 65,
71, 73
Jeffreys, 37, 43–46, 51–52
Müller and Ríos Insua, 34–35, 51–52
MacKay, 48
model space, 60–61, 71
Neal, 36–37, 51–52, 63, 77
reference, 46, 51–52, 81–83
Robinson, 47, 63
weight decay, see weight decay
projection pursuit regression, 15–16, 18,
19, 21
quadrature, 61
quasi-Newton methods, 48, 49
radial basis network, 17, 18, 28, 32, 85
reciprocal importance sampling, 61
recursive partitioning regression, 13, see
tree
reversible jump Markov chain Monte
Carlo (RJMCMC), 66, 70–71
RPR, see recursive partitioning regression
Shannon information, see information,
Shannon
sieve asymptotics, 53–54
sigmoidal function, 22, 27–28, 85, 86
simulated annealing, 48
softmax, 24, 86
spline, 14–16, 18–19, 72
stepwise searching, 67–73
tree, 13–14, 18, 19, 65, 70, 77, 80
TURBO, 72–75
variable selection, 3, 5, 30, 57–63, 65–74,
85
Voronoi tessellation, 14
wavelet, 17–18
weight decay, 46–47, 51–52, 59, 63