Minimum Divergence Methods in Statistical Machine Learning

Author: Shinto Eguchi, Osamu Komori
Publisher: Springer Nature
Total Pages: 224
Release: 2022-03-14
Genre: Mathematics
ISBN: 9784431569220

This book explores minimum divergence methods of statistical machine learning for estimation, regression, prediction, and so forth, using information geometry to elucidate the intrinsic properties of the corresponding loss functions, learning algorithms, and statistical models. One of the most elementary examples is Gauss's least squares estimator in a linear regression model, in which the estimator is given by minimizing the sum of squares between a response vector and a vector in the linear subspace spanned by the explanatory vectors. This is extended to Fisher's maximum likelihood estimator (MLE) for an exponential model, in which the estimator is given by minimizing the empirical analogue of the Kullback-Leibler (KL) divergence between a data distribution and a parametric distribution of the exponential model. Such minimization procedures admit a geometric interpretation in which a right triangle satisfies a Pythagorean identity in the sense of the KL divergence. This understanding reveals a dualistic interplay between statistical estimation and the statistical model, which requires dual geodesic paths, called m-geodesic and e-geodesic paths, in the framework of information geometry. We extend this dualistic structure of the MLE and the exponential model to that of the minimum divergence estimator and the maximum entropy model, which is applied to robust statistics, maximum entropy, density estimation, principal component analysis, independent component analysis, regression analysis, manifold learning, boosting algorithms, clustering, dynamic treatment regimes, and so forth. We consider a variety of information divergence measures, typically including the KL divergence, to express the departure of one probability distribution from another.
An information divergence is decomposed into the cross-entropy and the (diagonal) entropy: the entropy is associated with a generative model as a family of maximum entropy distributions, and the cross-entropy is associated with a statistical estimation method via minimization of its empirical analogue based on given data. Thus any statistical divergence includes an intrinsic object between the generative model and the estimation method. Typically, the KL divergence leads to the exponential model and maximum likelihood estimation. It is shown that any information divergence leads to a Riemannian metric and a pair of linear connections in the framework of information geometry. We focus on a class of information divergences generated by an increasing and convex function U, called U-divergence. It is shown that any generator function U generates the U-entropy and the U-divergence, in which there is a dualistic structure between the minimum U-divergence method and the maximum U-entropy model. We observe that a specific choice of U leads to a robust statistical procedure via the minimum U-divergence method. If U is selected as an exponential function, then the corresponding U-entropy and U-divergence reduce to the Boltzmann-Shannon entropy and the KL divergence, and the minimum U-divergence estimator is equivalent to the MLE. For robust supervised learning to predict a class label, we observe that the U-boosting algorithm performs well under contamination by mislabeled examples if U is appropriately selected. We present such maximum U-entropy and minimum U-divergence methods, in particular selecting a power function as U to provide flexible performance in statistical machine learning.
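To make the contrast between the MLE and a power-type minimum divergence estimator concrete, here is a minimal Python sketch. It is not taken from the book. It assumes a Gaussian location model N(mu, 1) with known unit variance and uses the density power (beta) divergence as a stand-in for a power-type U; the function names, the tuning value beta = 0.5, and the simulated contamination are all illustrative assumptions. Under that model the stationarity condition of the empirical divergence is a weighted mean, which the sketch solves by fixed-point iteration.

```python
import math
import random


def mle_mean(xs):
    """MLE of a Gaussian mean with known variance: the sample mean.

    This is the minimizer of the empirical KL divergence (cross-entropy).
    """
    return sum(xs) / len(xs)


def power_divergence_mean(xs, beta=0.5, n_iter=100):
    """Location estimate under N(mu, 1) from a density power divergence.

    Hypothetical helper for illustration. With the variance fixed, the
    stationarity condition is mu = sum(w_i * x_i) / sum(w_i), where
    w_i = exp(-beta * (x_i - mu)^2 / 2), so distant points get tiny weights.
    """
    mu = sorted(xs)[len(xs) // 2]  # start from (roughly) the median
    for _ in range(n_iter):
        weights = [math.exp(-0.5 * beta * (x - mu) ** 2) for x in xs]
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return mu


# Simulated data: 200 clean points around 0 plus 20 outliers around 10.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(200)]
outliers = [rng.gauss(10.0, 1.0) for _ in range(20)]
data = clean + outliers

mle = mle_mean(data)
robust = power_divergence_mean(data, beta=0.5)
print(f"MLE (sample mean):          {mle:.3f}")
print(f"power-divergence estimate:  {robust:.3f}")
```

In this sketch the sample mean is pulled toward the outliers, while the weighted estimate stays near the bulk of the data. As beta shrinks toward zero the weights approach 1 and the estimate approaches the sample mean, which mirrors the statement that the exponential choice of U recovers the MLE.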

Information Theory and Statistical Learning

Author: Frank Emmert-Streib, Matthias Dehmer
Publisher: Springer Science & Business Media
Total Pages: 443
Release: 2009
Genre: Computers
ISBN: 9780387848150

This interdisciplinary text offers theoretical and practical results of information theoretic methods used in statistical learning. It presents a comprehensive overview of the many different methods that have been developed in numerous contexts.

Geometric Science of Information

Author: Frank Nielsen, Frédéric Barbaresco
Publisher: Springer Nature
Total Pages: 641
Release: 2023-07-31
Genre: Computers
ISBN: 9783031382710

This book constitutes the proceedings of the 6th International Conference on Geometric Science of Information, GSI 2023, held in St. Malo, France, during August 30-September 1, 2023. The 125 full papers presented in this volume were carefully reviewed and selected from 161 submissions. They cover all the main topics and highlights in the domain of geometric science of information, including information geometry manifolds of structured data/information and their advanced applications. The papers are organized in the following topics: geometry and machine learning; divergences and computational information geometry; statistics, topology and shape spaces; geometry and mechanics; geometry, learning dynamics and thermodynamics; quantum information geometry; geometry and biological structures; geometry and applications.

Rank Based Methods for Shrinkage and Selection

Author: A. K. Md. Ehsanes Saleh, Mohammad Arashi, Resve A. Saleh, Mina Norouzirad
Publisher: John Wiley & Sons
Total Pages: 484
Release: 2022-03-22
Genre: Mathematics
ISBN: 9781119625391

A practical and hands-on guide to the theory and methodology of statistical estimation based on rank. Robust statistics is an important field in contemporary mathematics and applied statistical methods. Rank-Based Methods for Shrinkage and Selection: With Application to Machine Learning describes techniques to produce higher quality data analysis in shrinkage and subset selection to obtain parsimonious models with outlier-free prediction. This book is intended for statisticians, economists, biostatisticians, data scientists and graduate students. Rank-Based Methods for Shrinkage and Selection elaborates on rank-based theory and application in machine learning to robustify the least squares methodology. It also includes: development of rank theory and application of shrinkage and selection; methodology for robust data science using penalized rank estimators; theory and methods of penalized rank dispersion for ridge, LASSO and Enet; topics including Liu regression, high-dimension, and AR(p); novel rank-based logistic regression and neural networks; and problem sets with R code to demonstrate its use in machine learning.

Data Science and Machine Learning

Author: Dirk P. Kroese, Zdravko Botev, Thomas Taimre, Radislav Vaisman
Publisher: CRC Press
Total Pages: 538
Release: 2019-11-20
Genre: Business & Economics
ISBN: 9781000730777

Focuses on mathematical understanding. Presentation is self-contained, accessible, and comprehensive. Full color throughout. Extensive list of exercises and worked-out examples. Many concrete algorithms with actual code.

Statistical Foundations of Data Science

Author: Jianqing Fan, Runze Li, Cun-Hui Zhang, Hui Zou
Publisher: CRC Press
Total Pages: 942
Release: 2020-09-21
Genre: Mathematics
ISBN: 9780429527616

Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models and contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies and empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account of sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, and hazards regression, among others. High-dimensional inference is also thoroughly addressed, as is feature screening. The book also provides a comprehensive account of high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also thoroughly introduces statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.

Statistical Inference and Machine Learning for Big Data

Author: Mayer Alvo
Publisher: Springer Nature
Total Pages: 442
Release: 2022-11-30
Genre: Mathematics
ISBN: 9783031067846

This book presents a variety of advanced statistical methods at a level suitable for advanced undergraduate and graduate students as well as for others interested in familiarizing themselves with these important subjects. It proceeds to illustrate these methods in the context of real-life applications in a variety of areas such as genetics, medicine, and environmental problems. The book begins in Part I by outlining various data types and by indicating how these are normally represented graphically and subsequently analyzed. In Part II, the basic tools in probability and statistics are introduced with special reference to symbolic data analysis. The most useful and relevant results pertinent to this book are retained. In Part III, the focus is on the tools of machine learning whereas in Part IV the computational aspects of BIG DATA are presented. This book would serve as a handy desk reference for statistical methods at the undergraduate and graduate level as well as be useful in courses which aim to provide an overview of modern statistics and its applications.

Statistical Inference

Author: Ayanendranath Basu, Hiroyuki Shioya, Chanseok Park
Publisher: CRC Press
Total Pages: 424
Release: 2011-06-22
Genre: Computers
ISBN: 9781420099669

In many ways, estimation by an appropriate minimum distance method is one of the most natural ideas in statistics. However, there are many different ways of constructing an appropriate distance between the data and the model: the scope of study referred to by "Minimum Distance Estimation" is literally huge. Filling a statistical resource gap, Stati