Scalability Challenges in Web Search Engines

Author: B. Barla Cambazoglu, Ricardo Baeza-Yates
Publisher: Springer Nature
Total Pages: 122
Release: 2022-06-01
Genre: Computers
ISBN: 9783031022982

In this book, we aim to provide a fairly comprehensive overview of the scalability and efficiency challenges in large-scale web search engines. More specifically, we cover the issues involved in the design of three separate systems found in every web-scale search engine: web crawling, indexing, and query processing systems. We present the performance challenges encountered in these systems and review a wide range of design alternatives employed as solutions to these challenges, focusing specifically on algorithmic and architectural optimizations. We discuss the available optimizations at different computational granularities, ranging from a single computer node to a collection of data centers. We provide some hints to both the practitioners and theoreticians involved in the field about the way large-scale web search engines operate and the design choices they adopt. Moreover, we survey the efficiency literature, providing pointers to a large number of relatively important research papers. Finally, we discuss some open research problems in the context of search engine efficiency.

Advanced Topics in Information Retrieval

Author: Massimo Melucci, Ricardo Baeza-Yates
Publisher: Springer Science & Business Media
Total Pages: 295
Release: 2011-06-10
Genre: Computers
ISBN: 9783642209468

Information retrieval is the science concerned with the effective and efficient retrieval of documents starting from their semantic content. It is employed to fulfill some information need from a large number of digital documents. Given the ever-growing number of documents available and the heterogeneous data structures used for storage, information retrieval has recently faced and tackled novel applications. In this book, Melucci and Baeza-Yates present a wide-spectrum illustration of recent research results in advanced areas related to information retrieval. Readers will find chapters on topics such as aggregated search, digital advertising, digital libraries, discovery of spam and opinions, information retrieval in context, multimedia resource discovery, quantum mechanics applied to information retrieval, scalability challenges in web search engines, and interactive information retrieval evaluation. All chapters are written by well-known researchers, are completely self-contained and comprehensive, and are complemented by an integrated bibliography and subject index. With this selection, the editors provide the most up-to-date survey of topics usually not addressed in depth in traditional (text)books on information retrieval. The presentation is intended for a wide audience of people interested in information retrieval: undergraduate and graduate students, post-doctoral researchers, lecturers, and industrial researchers.

Business Information Systems

Author: Witold Abramowicz, Dalia Kriksciuniene, Virgilijus Sakalauskas
Publisher: Springer
Total Pages: 330
Release: 2012-05-17
Genre: Business & Economics
ISBN: 9783642303593

This book contains the refereed proceedings of the 15th International Conference on Business Information Systems, BIS 2012, held in Vilnius, Lithuania, in May 2012. The 26 revised full papers were carefully reviewed and selected from 70 submissions. They are grouped into nine sessions on business process discovery, business process verification, service architectures, collaborative BIS, data management, Web search applications, BIS in finance, decision support, and specific BIS issues. The volume is completed by an invited paper on "Information Systems and Business and Information Systems Engineering."

Semantic Keyword Based Search on Structured Data Sources

Author: Andrea Calì, Dorian Gorgan, Martín Ugarte
Publisher: Springer
Total Pages: 197
Release: 2017-02-13
Genre: Computers
ISBN: 9783319536408

This book constitutes the thoroughly refereed post-conference proceedings of the Second COST Action IC1302 International KEYSTONE Conference on Semantic Keyword-Based Search on Structured Data Sources, IKC 2016, held in Cluj-Napoca, Romania, in September 2016. The 15 revised full papers and 2 invited papers were reviewed and selected from 18 initial submissions and cover the areas of keyword extraction, natural language searches, graph databases, and information retrieval techniques for keyword search and document retrieval.

The Past Web

Author: Daniel Gomes, Elena Demidova, Jane Winters, Thomas Risse
Publisher: Springer Nature
Total Pages: 297
Release: 2021-06-30
Genre: Computers
ISBN: 9783030632915

This book provides practical information about web archives, offers inspiring examples for web archivists, raises new challenges, and shares recent research results about access methods to explore information from the past preserved by web archives. The book is structured in six parts. Part 1 advocates for the importance of web archives to preserve our collective memory in the digital era, demonstrates the problem of web ephemera and shows how web archiving activities have been trying to address this challenge. Part 2 then focuses on different strategies for selecting web content to be preserved and on the media types that different web archives host. It provides an overview of efforts to address the preservation of web content as well as smaller-scale but high-quality collections of social media or audiovisual content. Next, Part 3 presents examples of initiatives to improve access to archived web information and provides an overview of access mechanisms for web archives designed to be used by humans or automatically accessed by machines. Part 4 presents research use cases for web archives. It also discusses how to engage more researchers in exploiting web archives and provides inspiring research studies performed using the exploration of web archives. Subsequently, Part 5 demonstrates that web archives should become crucial infrastructures for modern connected societies. It makes the case for developing web archives as research infrastructures and presents several inspiring examples of added-value services built on web archives. Lastly, Part 6 reflects on the evolution of the web and the sustainability of web archiving activities. It debates the requirements and challenges for web archives if they are to assume the responsibility of being societal infrastructures that enable the preservation of memory. 
This book targets academics and advanced professionals in a broad range of research areas such as digital humanities, social sciences, history, media studies and information or computer science. It also aims to fill the need for a scholarly overview to support lecturers who would like to introduce web archiving into their courses by offering an initial reference for students.

Global Information Technologies: Concepts, Methodologies, Tools, and Applications

Author: Tan, Felix B.
Publisher: IGI Global
Total Pages: 4194
Release: 2007-10-31
Genre: Computers
ISBN: 9781599049403

"This collection compiles research in all areas of the global information domain. It examines culture in information systems, IT in developing countries, global e-business, and the worldwide information society, providing critical knowledge to fuel the future work of researchers, academicians and practitioners in fields such as information science, political science, international relations, sociology, and many more"--Provided by publisher.

LC21

Author: National Research Council,Commission on Physical Sciences, Mathematics, and Applications,Computer Science and Telecommunications Board,Committee on an Information Technology Strategy for the Library of Congress
Publisher: National Academies Press
Total Pages: 284
Release: 2001-01-23
Genre: Law
ISBN: 9780309171687

Digital information and networks challenge the core practices of libraries, archives, and all organizations with intensive information management needs in many respects, not only in terms of accommodating digital information and technology, but also through the need to develop new economic and organizational models for managing information. LC21: A Digital Strategy for the Library of Congress discusses these challenges and provides recommendations for moving forward at the Library of Congress, the world's largest library. Topics covered in LC21 include digital collections, digital preservation, digital cataloging (metadata), strategic planning, human resources, and general management and budgetary issues. The book identifies and elaborates upon a clear theme for the Library of Congress that is applicable more generally: the digital age calls for much more collaboration and cooperation than in the past. LC21 demonstrates that information-intensive organizations will have to change in fundamental ways to survive and prosper in the digital age.

String Processing and Information Retrieval

Author: Christina Boucher, Sharma V. Thankachan
Publisher: Springer Nature
Total Pages: 307
Release: 2020-10-18
Genre: Computers
ISBN: 9783030592127

This book constitutes the refereed proceedings of the 27th International Symposium on String Processing and Information Retrieval, SPIRE 2020, held in Orlando, FL, USA, in October 2020. The 17 full papers and 4 short papers presented in this volume were carefully reviewed and selected from 32 submissions. They cover topics such as: data structures; algorithms; information retrieval; compression; combinatorics on words; and computational biology.