Statistical Language Models for Information Retrieval

Author: Chengxiang Zhai
Publisher: Morgan & Claypool Publishers
Total Pages: 141
Release: 2009-01-08
Genre: Computers
ISBN: 9781598295917

As online information grows dramatically, search engines such as Google are playing a more and more important role in our lives. Critical to all search engines is the problem of designing an effective retrieval model that can rank documents accurately for a given query. This has been a central research problem in information retrieval for several decades. In the past ten years, a new generation of retrieval models, often referred to as statistical language models, has been successfully applied to solve many different information retrieval problems. Compared with the traditional models such as the vector space model, these new models have a more sound statistical foundation and can leverage statistical estimation to optimize retrieval parameters. They can also be more easily adapted to model non-traditional and complex retrieval problems. Empirically, they tend to achieve comparable or better performance than a traditional model with less effort on parameter tuning. This book systematically reviews the large body of literature on applying statistical language models to information retrieval with an emphasis on the underlying principles, empirically effective language models, and language models developed for non-traditional retrieval tasks. All the relevant literature has been synthesized to make it easy for a reader to digest the research progress achieved so far and see the frontier of research in this area. The book also offers practitioners an informative introduction to a set of practically useful language models that can effectively solve a variety of retrieval problems. No prior knowledge about information retrieval is required, but some basic knowledge about probability and statistics would be useful for fully digesting all the details. 
Table of Contents: Introduction / Overview of Information Retrieval Models / Simple Query Likelihood Retrieval Model / Complex Query Likelihood Model / Probabilistic Distance Retrieval Model / Language Models for Special Retrieval Tasks / Language Models for Latent Topic Analysis / Conclusions

Language Modeling for Information Retrieval

Author: W. Bruce Croft, John Lafferty
Publisher: Springer Science & Business Media
Total Pages: 253
Release: 2013-04-17
Genre: Computers
ISBN: 9789401701716

A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques to classify text into predefined categories. The first statistical language modeler was Claude Shannon. In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text. To do this, he estimated the entropy of English through experiments with human subjects, and also estimated the cross-entropy of the n-gram models on natural text. The ability of language models to be quantitatively evaluated in this way is one of their important virtues. Of course, estimating the true entropy of language is an elusive goal, aiming at many moving targets, since language is so varied and evolves so quickly. Yet fifty years after Shannon's study, language models remain, by all measures, far from the Shannon entropy limit in terms of their predictive power. However, this has not kept them from being useful for a variety of text processing tasks, and moreover can be viewed as encouragement that there is still great room for improvement in statistical language modeling.

Statistical Language Models for Information Retrieval

Author: Chengxiang Zhai
Publisher: Springer Nature
Total Pages: 132
Release: 2022-05-31
Genre: Computers
ISBN: 9783031021305

As online information grows dramatically, search engines such as Google are playing a more and more important role in our lives. Critical to all search engines is the problem of designing an effective retrieval model that can rank documents accurately for a given query. This has been a central research problem in information retrieval for several decades. In the past ten years, a new generation of retrieval models, often referred to as statistical language models, has been successfully applied to solve many different information retrieval problems. Compared with the traditional models such as the vector space model, these new models have a more sound statistical foundation and can leverage statistical estimation to optimize retrieval parameters. They can also be more easily adapted to model non-traditional and complex retrieval problems. Empirically, they tend to achieve comparable or better performance than a traditional model with less effort on parameter tuning. This book systematically reviews the large body of literature on applying statistical language models to information retrieval with an emphasis on the underlying principles, empirically effective language models, and language models developed for non-traditional retrieval tasks. All the relevant literature has been synthesized to make it easy for a reader to digest the research progress achieved so far and see the frontier of research in this area. The book also offers practitioners an informative introduction to a set of practically useful language models that can effectively solve a variety of retrieval problems. No prior knowledge about information retrieval is required, but some basic knowledge about probability and statistics would be useful for fully digesting all the details. 
Table of Contents: Introduction / Overview of Information Retrieval Models / Simple Query Likelihood Retrieval Model / Complex Query Likelihood Model / Probabilistic Distance Retrieval Model / Language Models for Special Retrieval Tasks / Language Models for Latent Topic Analysis / Conclusions

Foundations of Statistical Natural Language Processing

Author: Christopher Manning, Hinrich Schütze
Publisher: MIT Press
Total Pages: 719
Release: 1999-05-28
Genre: Language Arts & Disciplines
ISBN: 9780262303798

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.

Introduction to Information Retrieval

Author: Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze
Publisher: Cambridge University Press
Total Pages: 135
Release: 2008-07-07
Genre: Computers
ISBN: 9781139472104

Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Slides and additional exercises (with solutions for lecturers) are also available through the book's supporting website to help course instructors prepare their lectures.

Information Retrieval Technology

Author: Mohamed Vall Mohamed Salem, Khaled Shaalan, Farhad Oroumchian, Azadeh Shakery, Halim Khelalfa
Publisher: Springer
Total Pages: 626
Release: 2011-12-14
Genre: Computers
ISBN: 9783642256318

This book constitutes the refereed proceedings of the 7th Asia Information Retrieval Societies Conference, AIRS 2011, held in Dubai, United Arab Emirates, in December 2011. The 31 revised full papers and 25 revised poster papers presented were carefully reviewed and selected from 132 submissions. All current aspects of information retrieval, in theory and practice, are addressed. The papers are organized in topical sections on information retrieval models and theories; information retrieval applications and multimedia information retrieval; user study, information retrieval evaluation and interactive information retrieval; Web information retrieval, scalability and adversarial information retrieval; machine learning for information retrieval; natural language processing for information retrieval; and Arabic script text processing and retrieval.

Information Retrieval

Author: Stefan Büttcher, Charles L. A. Clarke, Gordon V. Cormack
Publisher: MIT Press
Total Pages: 633
Release: 2010-07-23
Genre: Computers
ISBN: 9780262026512

An introduction to information retrieval, the foundation for modern search engines, that emphasizes implementation and experimentation.

Annual Review of Information Science and Technology

Author: Blaise Cronin
Publisher: Information Today, Inc.
Total Pages: 712
Release: 2004
Genre: Computers
ISBN: 1573872091

ARIST, published annually since 1966, is a landmark publication within the information science community. It surveys the landscape of information science and technology, providing an analytical, authoritative, and accessible overview of recent trends and significant developments. The range of topics varies considerably, reflecting the dynamism of the discipline and the diversity of theoretical and applied perspectives. While ARIST continues to cover key topics associated with "classical" information science (e.g., bibliometrics, information retrieval), editor Blaise Cronin is selectively expanding its footprint in an effort to connect information science more tightly with cognate academic and professional communities.