Book details
-
Rafał Kuć
-
2013
-
English
-
2489
-
343
-
0
Apache Solr 4 Cookbook
Chapter 1: Apache Solr Configuration 5
Introduction 5
Running Solr on Jetty 6
Running Solr on Apache Tomcat 10
Installing a standalone ZooKeeper 14
Clustering your data 15
Choosing the right directory implementation 17
Configuring spellchecker to not use its own index 19
Solr cache configuration 22
How to fetch and index web pages 27
How to set up the extracting request handler 30
Changing the default similarity implementation 32
Chapter 2: Indexing Your Data 35
Introduction 35
Indexing PDF files 36
Generating unique fields automatically 38
Extracting metadata from binary files 40
How to properly configure Data Import Handler with JDBC 42
Indexing data from a database using Data Import Handler 45
How to import data using Data Import Handler and delta query 48
How to use Data Import Handler with the URL data source 50
How to modify data while importing with Data Import Handler 53
Updating a single field of your document 56
Handling multiple currencies 59
Detecting the document's language 62
Optimizing your primary key field indexing 67
Chapter 3: Analyzing Your Text Data 69
Introduction 70
Storing additional information using payloads 70
Eliminating XML and HTML tags from text 73
Copying the contents of one field to another 75
Changing words to other words 77
Splitting text by CamelCase 80
Splitting text by whitespace only 82
Making plural words singular without stemming 84
Lowercasing the whole string 87
Storing geographical points in the index 88
Stemming your data 91
Preparing text to perform an efficient trailing wildcard search 93
Splitting text by numbers and non-whitespace characters 96
Using Hunspell as a stemmer 99
Using your own stemming dictionary 101
Protecting words from being stemmed 103
Chapter 4: Querying Solr 107
Introduction 108
Asking for a particular field value 108
Sorting results by a field value 109
How to search for a phrase, not a single word 111
Boosting phrases over words 114
Positioning some documents over others in a query 117
Positioning documents with words closer to each other first 122
Sorting results by the distance from a point 125
Getting documents with only a partial match 128
Affecting scoring with functions 130
Nesting queries 134
Modifying returned documents 136
Using parent-child relationships 139
Ignoring typos in terms of performance 142
Detecting and omitting duplicate documents 145
Using field aliases 148
Returning a value of a function in the results 151
Chapter 5: Using the Faceting Mechanism 155
Introduction 155
Getting the number of documents with the same field value 156
Getting the number of documents with the same value range 158
Getting the number of documents matching the query and subquery 161
Removing filters from faceting results 164
Sorting faceting results in alphabetical order 168
Implementing the autosuggest feature using faceting 171
Getting the number of documents that don't have a value in the field 174
Having two different facet limits for two different fields in the same query 177
Using decision tree faceting 180
Calculating faceting for relevant documents in groups 183
Chapter 6: Improving Solr Performance 187
Introduction 187
Paging your results quickly 188
Configuring the document cache 189
Configuring the query result cache 190
Configuring the filter cache 192
Improving Solr performance right after the startup or commit operation 194
Caching whole result pages 197
Improving faceting performance for low cardinality fields 198
What to do when Solr slows down during indexing 200
Analyzing query performance 202
Avoiding filter caching 206
Controlling the order of execution of filter queries 207
Improving the performance of numerical range queries 208
Chapter 7: In the Cloud 211
Introduction 211
Creating a new SolrCloud cluster 211
Setting up two collections inside a single cluster 214
Managing your SolrCloud cluster 216
Understanding the SolrCloud cluster administration GUI 220
Distributed indexing and searching 223
Increasing the number of replicas on an already live cluster 227
Stopping automatic document distribution among shards 230
Chapter 8: Using Additional Solr Functionalities 235
Introduction 235
Getting more documents similar to those returned in the results list 236
Highlighting matched words 238
How to highlight long text fields and get good performance 241
Sorting results by a function value 243
Searching words by how they sound 246
Ignoring defined words 248
Computing statistics for the search results 250
Checking the user's spelling mistakes 253
Using field values to group results 257
Using queries to group results 260
Using function queries to group results 262
Chapter 9: Dealing with Problems 265
Introduction 265
How to deal with too many opened files 265
How to deal with out-of-memory problems 267
How to sort non-English languages properly 268
How to make your index smaller 272
Diagnosing Solr problems 274
How to avoid swapping 280
Appendix: Real-life Situations 283
Introduction 283
How to implement a product's autocomplete functionality 284
How to implement a category's autocomplete functionality 287
How to use different query parsers in a single query 290
How to get documents right after they were sent for indexation 292
How to search your data in a near real-time manner 294
How to get the documents with all the query words to the top of the results set 296
How to boost documents based on their publishing date 300
Index 305