7
votes
3 answers
4538 views

Handling + as a special character in Lucene search

How do I make sure Lucene returns relevant search results when my input string contains terms like c++? Lucene seems to ignore the ++ characters. Code details: When I execute this line, I get a blank search query. queryField = multiFieldQueryParser.Parse(inpKeywords); keywordsQuery.Add(query...

5
votes
1 answer
4291 views

Lucene.Net TermQuery wildcard search

I have a Lucene index on which I am trying to do a wildcard search. The index contains a value like '234Test2343', and I am trying to search for something like %Test%. My Lucene syntax looks like: string catalogNumber="test"; Term searchTerm = new Term("FIELD", "*"+catalogNumber+"*"); Query query = new TermQuery(s...

9
votes
3 answers
6116 views

Need to know pros and cons of using RAMDirectory

I need to improve the performance of my Lucene search query. Can I use RAMDirectory? Does it optimize performance? Is there any index size limit for this? I would appreciate it if someone could list the pros and cons of using a RAMDirectory. Thanks.

1
vote
2 answers
9160 views

Lucene crawler (it needs to build lucene index)

I am looking for an Apache Lucene web crawler written in Java if possible, or in any other language. The crawler must use Lucene and create a valid Lucene index and document files, which is why Nutch is eliminated, for example... Does anybody know whether such a web crawler exists and can If...

3
votes
2 answers
1245 views

How do I delete old documents from Lucene/Lucene.NET

What is the idiomatic way to delete old documents from a Lucene Index? I have a date field (YYYYMMddhhmmss) on all of the documents, and I'd like to remove anything more than a day old (for example). Should I perform a filtered search or enumerate through the IndexReader's documents? I'm sure ...

0
votes
2 answers
361 views

Why is the analyzer defined globally in Zend.Search.Lucene?

I just noticed that the Zend lucene implementation has a default analyzer that can be modified using Zend_Search_Lucene_Analysis_Analyzer::setDefault(), but I couldn't find a way to override that default when performing a query. Do I really need to reset the default analyzer if I'm working on mul...

3
votes
2 answers
723 views

Does Zend Lucene support MultiValued Fields?

I wanted to know if Zend Lucene supports multivalued fields. I tried passing an array to a field, and it doesn't give any errors during indexing, but it's not returning any results when I search. Any help is appreciated.

4
votes
2 answers
2463 views

Querying lucene tokens without indexing

I am using Lucene (or more specifically Compass), to log threads in a forum and I need a way to extract the keywords behind the discussion. That said, I don't want to index every entry someone makes, but rather I'd have a list of 'keywords' that are relevant to a certain context and if the entry ...

4
votes
2 answers
2395 views

Retrieving per keyword/field match position in Lucene Solr -- possible?

Is there any way to retrieve the match field/position for each keyword for each matching document from solr? For example, if the document has title "Retrieving per keyword/field match position in Lucene Solr -- possible?" and the query is "solr keyword", I'd like to get, in addition to the doc-i...

16
votes
2 answers
7456 views

Solr DIH -- How to handle deleted documents?

I'm playing around with a Solr-powered search for my webapp, and I figured it'd be best to use the DataImportHandler to handle syncing with the app via the database. I like the elegance of just checking the last_updated_date field. Good stuff. However, I don't know how to handle deleting docum...

5
votes
3 answers
775 views

Lucene query - "Match exactly one of x, y, z"

I have a Lucene index that contains documents that have a "type" field, this field can be one of three values "article", "forum" or "blog". I want the user to be able to search within these types (there is a checkbox for each document type) How do I create a Lucene query dependent on which types...

2
votes
1 answer
265 views

Is the Lucene 2.9 TokenStream API faster than the old one?

I have been looking at upgrading from 2.4 to 2.9 and noticed all the contrived code that handles attributes. Just wondering if anyone has any opinions if this will change given its a .9 and things will change when 3.0 is out. I am confused how creating attributes by reflection and stashing attri...

1
vote
1 answer
811 views

Lucene - Searching several terms in different fields

I have a Lucene index which populates from a database. I store/index some fields and then add a FullText field in which I index the contents of all the other fields, so I can do a general search. Now let's say I have a document with the following two fields: fld1 - "Samsung releases a new 22'' L...

19
votes
5 answers
22294 views

figuring out reason for maxClauseCount is set to 1024 error

I have two sets of search indexes: TestIndex (used in our test environment) and ProdIndex (used in the PRODUCTION environment). Lucene search query: +date:[20090410184806 TO 20091007184806] works fine for the test index but gives this error message for the Prod index: "maxClauseCount is set to 1024" If I ...

2
votes
2 answers
782 views

Lucene stop phrases filter

I'm trying to write a filter for Lucene, similar to StopWordsFilter (thus implementing TokenFilter), but I need to remove phrases (sequence of tokens) instead of words. The "stop phrases" are represented themselves as a sequence of tokens: punctuation is not considered. I think I need to do som...

7
votes
3 answers
4092 views

Lucene / Lucene.NET - Document.SetBoost() values?

I know it takes in a float, but what are some typical values for various levels of boosting within a result? For example: If I wanted to boost a document's weighting by 10% then I should set it 1.1? For 20% then 1.2? What happens if I start setting boosts to values like 75.0? or 500.0? Edit:...

0
votes
2 answers
1159 views

Pylucene in Python 2.6 + MacOs Snow Leopard

Greetings, I'm trying to install Pylucene on my 32-bit python running on Snow Leopard. I compiled JCC with success. But I get warnings while making pylucene: ld: warning: in build/temp.macosx-10.6-i386-2.6/build/_lucene/__init__.o, file is not of required architecture ld: warning: in build/temp....

2
votes
1 answer
1048 views

Zend_Search_Lucene query parsing problem

Here's the setup, I have a Lucene Index and it works well with the 2,000 documents I have indexed. I have been using Luke (Lucene Index Toolbox, v.0.9.2) to debug queries, and am using ZF 1.9. The layout for my Lucene Index is as follows: I = Indexed T = Tokenized S = Stored Fields: author - I...

2
votes
2 answers
537 views

lucene, or sql fulltext?

I want to create a search website to search docs (all kinds of formats including pdf), images, videos, and audio. I also want to be able to filter my search results based on some criteria like author name, date, etc. I'm doing this in .NET, so what's the easiest way to get up and running? SQ...

1
vote
1 answer
285 views

Help needed bubbling up relevant records with most recent date

I've got 5 records in a Lucene index. a. Record 1 contains -- tax analysis. Date field value is March 2009 b. Record 2 contains -- Senior tax analyst. Date field value is Aug 2009 c. Record 3 contains -- Senior tax analyst. Date field value is July 2009 d. Record 4 contains -- tax analyst. Date field value i...

1
vote
2 answers
273 views

Help needed ordering search results

I have 3 records in a Lucene index. Record 1 contains healthcare in the title field. Record 2 contains healthcare and insurance in the description field, but not together. Record 3 contains healthcare insurance in the company name field. When a user searches for healthcare insurance, I want to show records in t...

1
vote
1 answer
313 views

What is the best Field Type/Encoding to store a number in a Zend Lucene Search Index?

How would I index a price int field in a Zend Lucene Search Index? I am currently using: $doc->addField(Zend_Search_Lucene_Field::Keyword('price', $price, 'utf-8')); Is this the correct way? Or should I be storing it specifically as a number somehow?

1
vote
1 answer
939 views

How to get Zend Lucene Range Search working properly (or help me debug)

I have an implementation of the Zend Search (Lucene) framework on my website that contains an index of products with prices. I am trying to allow customers to search for something while constraining the prices. E.g. search for "dog food" between $5 and $10. My search index looks like this:...

1
vote
1 answer
2610 views

Lucene (Java) - How to specify default search field programmatically?

I have the following code and would appreciate your advice. QueryParser queryParser = new QueryParser(searchTerm, analyzer); Query query = queryParser.parse(searchTerm); My first question is: is this "doubled"? As I have the "String to search for (=searchTerm)" in the constructor as well as...

8
votes
2 answers
11758 views

SOLR - Boost function (bf) to increase score of documents whose date is closest to NOW

I have a solr instance containing documents which have a 'startTime' field ranging from last month to a year from now. I'd like to add a boost query/function to boost the scores of documents whose startTime field is close to the current time. So far I have seen a lot of examples which use rord ...

8
votes
1 answer
2340 views

Creating and updating Zend_Search_Lucene indexes

I'm using Zend_Search_Lucene to create an index of articles to allow them to be searched on my website. Whenever an administrator updates/creates/deletes an article in the admin area, the index is rebuilt: $config = Zend_Registry::get("config"); $cache = $config->lucene->cache; $path = $cac...

3
votes
2 answers
3079 views

lucene larger than

Does anybody know how to search for all the numbers larger than a specified one? For example: all documents with number > 65. I tried: documentNumber: [65 TO *] but I receive an exception, as Lucene expected to parse a number there, not a *. Thanks in advance!

0
votes
2 answers
298 views

SOLR choose index at search

I posted this in the Nabble group also, but figured I may get some advice here. Is there a way to get SOLR to search whatever index I tell it to at search time without using multiple cores? I don't build my indexes with SOLR, I build them with my own Java class, but I do use SOLR to search the...

4
votes
2 answers
1708 views

Java: from Lucene Hits to original objects

I'd like to implement a filter/search feature in my application using Lucene. Querying Lucene index gives me a Hits instance, which is nothing more than a list of Documents matching my criteria. Since I generate the indexed Documents from my objects, which is the best way to find the origin...

1
vote
1 answer
1251 views

How does Nutch's plug-in system work?

I am new to Nutch, but I know Nutch uses Lucene for indexing, which only understands text format. Nutch has many plug-ins that are used for crawling documents with a particular format. My question is: how does the Nutch plug-in system actually work? I have seen the Team wiki page for Nutch; I'd like some i...