Phrase query for a tokenized text field should do it. -- Jack Krupansky
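To make the difference concrete, here is a minimal plain-Java sketch (not Lucene code) of the three behaviours discussed in this thread: matching any term, matching the terms as an adjacent in-order phrase, and matching the phrase only at the start of the field. The class and method names are made up for illustration, and the whitespace/lowercase tokenizer is only a stand-in for a real analyzer. In Lucene itself the phrase case corresponds to a PhraseQuery; anchoring the phrase to the start of the field would probably need a span query such as SpanFirstQuery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; it does not use or imitate Lucene's API.
final class PhraseMatchSketch {

    // Stand-in analyzer (an assumption): split on whitespace and lowercase.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // OR semantics: the document matches if it contains any query term.
    static boolean matchesAnyTerm(String doc, String query) {
        List<String> docTokens = tokenize(doc);
        for (String term : tokenize(query)) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Phrase semantics: all query terms must be adjacent and in order,
    // starting at any token position in the document.
    static boolean matchesPhrase(String doc, String query) {
        List<String> d = tokenize(doc);
        List<String> q = tokenize(query);
        if (q.isEmpty()) {
            return false;
        }
        for (int start = 0; start + q.size() <= d.size(); start++) {
            if (d.subList(start, start + q.size()).equals(q)) {
                return true;
            }
        }
        return false;
    }

    // Anchored phrase: the phrase must start at the first token.
    static boolean startsWithPhrase(String doc, String query) {
        List<String> d = tokenize(doc);
        List<String> q = tokenize(query);
        return !q.isEmpty() && q.size() <= d.size()
                && d.subList(0, q.size()).equals(q);
    }

    public static void main(String[] args) {
        String query = "SD RAM Bhaskar";
        List<String> docs = Arrays.asList(
                "SD lib", "RAM hello", "hi Bhaskar ", "Bhaskar hai SD",
                "SD RAM Bhaskar", "SD RAM Bhaskar hello");
        for (String doc : docs) {
            System.out.printf("%-24s any=%b phrase=%b prefix=%b%n",
                    "\"" + doc + "\"",
                    matchesAnyTerm(doc, query),
                    matchesPhrase(doc, query),
                    startsWithPhrase(doc, query));
        }
    }
}
```

With the default OR operator all six example strings match, which is the behaviour Bhaskar reports. Only the last two satisfy the phrase check, and they are also the only ones that start with the phrase.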
On Thu, Oct 1, 2015 at 10:04 PM, Bhaskar <bhaskar1...@gmail.com> wrote:

> Hi Jack,
>
> My search is working like this.
>
> If I give the input "SD RAM Bhaskar", then every string containing "SD", "RAM", or "Bhaskar" is returned, i.e.
> "SD lib"
> "RAM hello"
> "hi Bhaskar "
> "Bhaskar hai SD"
>
> But I want the output below.
> "SD RAM Bhaskar"
> "SD RAM Bhaskar hello"
>
> That is, the string should begin with "SD RAM Bhaskar", and what follows can be anything.
>
> But my current application returns every string in which it finds "SD", "RAM", or "Bhaskar".
>
> Regards,
> Bhaskar
> On Oct 2, 2015 1:19 AM, "Jack Krupansky" <jack.krupan...@gmail.com> wrote:
>
> > Technically, there is no such thing as a "sentence search" in Lucene.
> > Please provide an example of how you wish to search, and then we can determine whether a phrase query or a span query might accomplish the task.
> >
> > -- Jack Krupansky
> >
> > On Thu, Oct 1, 2015 at 11:53 AM, Bhaskar <bhaskar1...@gmail.com> wrote:
> >
> > > Hi,
> > > I am looking for sentence search rather than word search.
> > > Regards,
> > > Bhaskar
> > > On Oct 1, 2015 7:07 PM, "Ian Lea" <ian....@gmail.com> wrote:
> > >
> > > > Take a look at
> > > > http://lucene.apache.org/core/5_3_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description
> > > > .
> > > > Sounds like you want an AND, or a +, or both. You may also want to take a look at phrase queries and/or span queries.
> > > >
> > > > --
> > > > Ian.
> > > >
> > > > On Thu, Oct 1, 2015 at 1:52 PM, Bhaskar <bhaskar1...@gmail.com> wrote:
> > > > > Hi Uwe,
> > > > > My search is working like this.
> > > > > If I give the input "SD RAM Bhaskar", then every string containing "SD", "RAM", or "Bhaskar" is returned, i.e.
> > > > > "SD lib"
> > > > > "RAM hello"
> > > > > "hi Bhaskar "
> > > > > "Bhaskar hai SD"
> > > > >
> > > > > But I want the output below.
> > > > > "SD RAM Bhaskar"
> > > > > "SD RAM Bhaskar hello"
> > > > > That is, the string should begin with "SD RAM Bhaskar", and what follows can be anything.
> > > > >
> > > > > But my current application returns every string in which it finds "SD", "RAM", or "Bhaskar".
> > > > >
> > > > > Can you please advise?
> > > > > Thanks a lot in advance.
> > > > >
> > > > > Regards,
> > > > > Bhaskar
> > > > >
> > > > > On Wed, Sep 30, 2015 at 12:23 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> > > > >
> > > > >> Hi Bhaskar,
> > > > >>
> > > > >> the answer is very simple: Your analysis is not useful for the type of queries and data you are using. You are using SimpleAnalyzer in your search/indexing code:
> > > > >>
> > > > >> https://lucene.apache.org/core/5_3_1/analyzers-common/org/apache/lucene/analysis/core/SimpleAnalyzer.html
> > > > >> "An Analyzer that filters LetterTokenizer with LowerCaseFilter"
> > > > >>
> > > > >> And LetterTokenizer does the following:
> > > > >>
> > > > >> https://lucene.apache.org/core/5_3_1/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html
> > > > >> "A LetterTokenizer is a tokenizer that divides text at non-letters. That's to say, it defines tokens as maximal strings of adjacent letters, as defined by java.lang.Character.isLetter() predicate."
> > > > >>
> > > > >> So it creates a new token at every non-letter boundary. All non-letters are discarded (because they are treated as token boundaries). So your queries can never match.
> > > > >>
> > > > >> I'd suggest first reading up on analysis and choosing a better analyzer that suits your underlying data and the queries you want to run. Maybe use WhitespaceAnalyzer or, better, StandardAnalyzer as a first step. Be sure to reindex your data before querying. The Analyzer used on the indexing side must be the same as on the query side. If you want to use wildcards, you have to take more care, because wildcards are not really natural for a "full text search engine" and may cause inconsistent results.
> > > > >>
> > > > >> Uwe
> > > > >>
> > > > >> -----
> > > > >> Uwe Schindler
> > > > >> H.-H.-Meier-Allee 63, D-28213 Bremen
> > > > >> http://www.thetaphi.de
> > > > >> eMail: u...@thetaphi.de
> > > > >>
> > > > >> > -----Original Message-----
> > > > >> > From: Bhaskar [mailto:bhaskar1...@gmail.com]
> > > > >> > Sent: Wednesday, September 30, 2015 4:28 AM
> > > > >> > To: java-user@lucene.apache.org
> > > > >> > Subject: Re: Need help in alphanumeric search
> > > > >> >
> > > > >> > Hi Uwe,
> > > > >> >
> > > > >> > Below is my indexing code:
> > > > >> >
> > > > >> > public static void main(String[] args) throws Exception {
> > > > >> >     //Path indexDir = new Path(INDEX_DIR);
> > > > >> >     public static final String INDEX_DIR = "c:/DBIndexAll/";
> > > > >> >     final Path indexDir = Paths.get(INDEX_DIR);
> > > > >> >     SimpleDBIndexer indexer = new SimpleDBIndexer();
> > > > >> >     try{
> > > > >> >         Class.forName(JDBC_DRIVER).newInstance();
> > > > >> >         Connection conn = DriverManager.getConnection(CONNECTION_URL + DBNAME, USER_NAME, PASSWORD);
> > > > >> >         SimpleAnalyzer analyzer = new SimpleAnalyzer();
> > > > >> >         IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
> > > > >> >         IndexWriter indexWriter = new IndexWriter(FSDirectory.open(indexDir),
> > > > >> >             indexWriterConfig);
> > > > >> >         System.out.println("Indexing to directory '" + indexDir + "'...");
> > > > >> >         int indexedDocumentCount = indexer.indexDocs(indexWriter, conn);
> > > > >> >         indexWriter.close();
> > > > >> >         System.out.println(indexedDocumentCount + " records have been indexed successfully");
> > > > >> >     } catch (Exception e) {
> > > > >> >         e.printStackTrace();
> > > > >> >     }
> > > > >> > }
> > > > >> >
> > > > >> > int indexDocs(IndexWriter writer, Connection conn) throws Exception {
> > > > >> >     String sql = QUERY1;
> > > > >> >     Statement stmt = conn.createStatement();
> > > > >> >     ResultSet rs = stmt.executeQuery(sql);
> > > > >> >     int i=0;
> > > > >> >     while (rs.next()) {
> > > > >> >         Document d = new Document();
> > > > >> >         d.add(new TextField("cpn", rs.getString("cpn"), Field.Store.YES));
> > > > >> >
> > > > >> >         writer.addDocument(d);
> > > > >> >         i++;
> > > > >> >     }
> > > > >> >     stmt.close();
> > > > >> >     rs.close();
> > > > >> >
> > > > >> >     return i;
> > > > >> > }
> > > > >> >
> > > > >> > Searching code:
> > > > >> >
> > > > >> > public class SimpleDBSearcher {
> > > > >> >     // PLASTRON
> > > > >> >     private static final String LUCENE_QUERY = "SD*";
> > > > >> >     private static final int MAX_HITS = 500;
> > > > >> >     private static final String INDEX_DIR = "C:/DBIndexAll/";
> > > > >> >
> > > > >> >     public static void main(String[] args) throws Exception {
> > > > >> >         // File indexDir = new File(SimpleDBIndexer.INDEX_DIR);
> > > > >> >         final Path indexDir = Paths.get(SimpleDBIndexer.INDEX_DIR);
> > > > >> >         String query = LUCENE_QUERY;
> > > > >> >         SimpleDBSearcher searcher = new SimpleDBSearcher();
> > > > >> >         searcher.searchIndex(indexDir, query);
> > > > >> >     }
> > > > >> >
> > > > >> >     private void searchIndex(Path indexDir, String queryStr) throws Exception {
> > > > >> >         Directory directory = FSDirectory.open(indexDir);
> > > > >> >         System.out.println("The query string is " + queryStr);
> > > > >> >         MultiFieldQueryParser queryParser = new MultiFieldQueryParser(new String[] { "cpn" }, new StandardAnalyzer());
> > > > >> >         IndexReader reader = DirectoryReader.open(directory);
> > > > >> >         IndexSearcher searcher = new IndexSearcher(reader);
> > > > >> >         queryParser.getAllowLeadingWildcard();
> > > > >> >
> > > > >> >         Query query = queryParser.parse(queryStr);
> > > > >> >         TopDocs topDocs = searcher.search(query, MAX_HITS);
> > > > >> >
> > > > >> >         ScoreDoc[] hits = topDocs.scoreDocs;
> > > > >> >         System.out.println(hits.length + " Record(s) Found");
> > > > >> >         for (int i = 0; i < hits.length; i++) {
> > > > >> >             int docId = hits[i].doc;
> > > > >> >             Document d = searcher.doc(docId);
> > > > >> >             System.out.println("\"cpn value is:\" " + d.get("cpn"));
> > > > >> >         }
> > > > >> >         if (hits.length == 0) {
> > > > >> >             System.out.println("No Data Founds ");
> > > > >> >         }
> > > > >> >
> > > > >> >     }
> > > > >> > }
> > > > >> >
> > > > >> > Please help here, thanks in advance.
> > > > >> >
> > > > >> > Regards,
> > > > >> > Bhaskar
> > > > >> >
> > > > >> > On Tue, Sep 29, 2015 at 3:47 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> > > > >> >
> > > > >> > > Hi Erick,
> > > > >> > >
> > > > >> > > This mail was in Lucene's user mailing list. This is not about Solr, so the user cannot provide his Solr config! :-) In any case, it would be good to get the Analyzer + code you use while indexing and also the code (+ Analyzer) that creates the query while searching.
> > > > >> > >
> > > > >> > > Uwe
> > > > >> > >
> > > > >> > > -----
> > > > >> > > Uwe Schindler
> > > > >> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > > >> > > http://www.thetaphi.de
> > > > >> > > eMail: u...@thetaphi.de
> > > > >> > >
> > > > >> > > > -----Original Message-----
> > > > >> > > > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > > > >> > > > Sent: Monday, September 28, 2015 6:01 PM
> > > > >> > > > To: java-user
> > > > >> > > > Subject: Re: Need help in alphanumeric search
> > > > >> > > >
> > > > >> > > > You need to supply the definitions of this field from your schema.xml file, both the <field> and <fieldType>
> > > > >> > > >
> > > > >> > > > Additionally, please provide the results of the query you're trying with &debug=true appended.
> > > > >> > > >
> > > > >> > > > The adminUI/analysis page is very helpful in these situations as well. Select the appropriate core from the drop-down on the left and you'll see an "analysis" section appear that shows you exactly what happens when the field is analyzed.
> > > > >> > > >
> > > > >> > > > Best,
> > > > >> > > > Erick
> > > > >> > > >
> > > > >> > > > On Mon, Sep 28, 2015 at 5:01 AM, Bhaskar <bhaskar1...@gmail.com> wrote:
> > > > >> > > > > Thanks, Ian, for the reply.
> > > > >> > > > >
> > > > >> > > > > cpn values are like 123-0049, 342-043, ab23-090, hedwsdg
> > > > >> > > > >
> > > > >> > > > > My application works when I search for the inputs below:
> > > > >> > > > > 1) ab*
> > > > >> > > > > 2) hedwsdg
> > > > >> > > > > 3) hed*
> > > > >> > > > >
> > > > >> > > > > but it does not work for
> > > > >> > > > > 1) 123*
> > > > >> > > > > 2) 123-0049
> > > > >> > > > > 3) ab23*
> > > > >> > > > >
> > > > >> > > > > Note: if the search input contains a number, it does not work.
> > > > >> > > > >
> > > > >> > > > > Thanks in advance.
> > > > >> > > > >
> > > > >> > > > > On Mon, Sep 28, 2015 at 3:49 PM, Ian Lea <ian....@gmail.com> wrote:
> > > > >> > > > >
> > > > >> > > > >> Hi
> > > > >> > > > >>
> > > > >> > > > >> Can you provide a few examples of values of cpn that a) are and b) are not being found, for indexing and searching.
> > > > >> > > > >>
> > > > >> > > > >> You may also find some of the tips at
> > > > >> > > > >>
> > > > >> > > > >> http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2F_incorrect_hits.3F
> > > > >> > > > >> useful.
> > > > >> > > > >>
> > > > >> > > > >> You haven't shown the code that created the IndexWriter so the tip about using the same analyzer at index and search time might be relevant.
> > > > >> > > > >>
> > > > >> > > > >> --
> > > > >> > > > >> Ian.
> > > > >> > > > >>
> > > > >> > > > >> On Mon, Sep 28, 2015 at 10:49 AM, Bhaskar <bhaskar1...@gmail.com> wrote:
> > > > >> > > > >> > Hi,
> > > > >> > > > >> > I am a beginner in Apache Lucene; I am using 5.3.1.
> > > > >> > > > >> > I have created the index on the database result.
> > > > >> > > > >> > The index contains both alphanumeric values and plain string values. I am able to search the strings but I am not able to search the alphanumeric values.
> > > > >> > > > >> >
> > > > >> > > > >> > Can someone help me here?
> > > > >> > > > >> >
> > > > >> > > > >> > Below is the indexing code...
> > > > >> > > > >> >
> > > > >> > > > >> > int indexDocs(IndexWriter writer, Connection conn) throws Exception {
> > > > >> > > > >> >     Statement stmt = conn.createStatement();
> > > > >> > > > >> >     ResultSet rs = stmt.executeQuery(sql);
> > > > >> > > > >> >     int i=0;
> > > > >> > > > >> >     while (rs.next()) {
> > > > >> > > > >> >         Document d = new Document();
> > > > >> > > > >> >         // System.out.println("cpn is" + rs.getString("cpn"));
> > > > >> > > > >> >         // System.out.println("mpn is" + rs.getString("mpn"));
> > > > >> > > > >> >
> > > > >> > > > >> >         d.add(new TextField("cpn", rs.getString("cpn"), Field.Store.YES));
> > > > >> > > > >> >
> > > > >> > > > >> >         writer.addDocument(d);
> > > > >> > > > >> >         i++;
> > > > >> > > > >> >     }
> > > > >> > > > >> > }
> > > > >> > > > >> >
> > > > >> > > > >> > Searching code:
> > > > >> > > > >> >
> > > > >> > > > >> > private void searchIndex(Path indexDir, String queryStr) throws Exception {
> > > > >> > > > >> >     Directory directory = FSDirectory.open(indexDir);
> > > > >> > > > >> >     System.out.println("The query string is " + queryStr);
> > > > >> > > > >> >     // MultiFieldQueryParser queryParser = new MultiFieldQueryParser(new
> > > > >> > > > >> >     // String[] {"mpn"}, new StandardAnalyzer());
> > > > >> > > > >> >     // IndexReader reader = IndexReader.open(directory);
> > > > >> > > > >> >     IndexReader reader = DirectoryReader.open(directory);
> > > > >> > > > >> >     IndexSearcher searcher = new IndexSearcher(reader);
> > > > >> > > > >> >     Analyzer analyzer = new StandardAnalyzer();
> > > > >> > > > >> >     analyzer.tokenStream("cpn", queryStr);
> > > > >> > > > >> >     QueryParser parser = new QueryParser("cpn", analyzer);
> > > > >> > > > >> >     parser.setDefaultOperator(Operator.OR);
> > > > >> > > > >> >     parser.getAllowLeadingWildcard();
> > > > >> > > > >> >     parser.setAutoGeneratePhraseQueries(true);
> > > > >> > > > >> >     Query query = parser.parse(queryStr);
> > > > >> > > > >> >     searcher.search(query, 100);
> > > > >> > > > >> >     TopDocs topDocs = searcher.search(query, MAX_HITS);
> > > > >> > > > >> >
> > > > >> > > > >> >     ScoreDoc[] hits = topDocs.scoreDocs;
> > > > >> > > > >> >     System.out.println(hits.length + " Record(s) Found");
> > > > >> > > > >> >     for (int i = 0; i < hits.length; i++) {
> > > > >> > > > >> >         int docId = hits[i].doc;
> > > > >> > > > >> >         Document d = searcher.doc(docId);
> > > > >> > > > >> >         System.out.println("\"value is:\" " + d.get("cpn"));
> > > > >> > > > >> >     }
> > > > >> > > > >> >     if (hits.length == 0) {
> > > > >> > > > >> >         System.out.println("No Data Founds ");
> > > > >> > > > >> >     }
> > > > >> > > > >> >
> > > > >> > > > >> > Thanks in advance.
> > > > >> > > > >> >
> > > > >> > > > >> > --
> > > > >> > > > >> > Keep Smiling....
> > > > >> > > > >> > Thanks & Regards
> > > > >> > > > >> > Bhaskar.
> > > > >> > > > >> > Mobile:9866724142
> > > > >> > > > >>
> > > > >> > > > >> ---------------------------------------------------------------------
> > > > >> > > > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > >> > > > >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> > > > >> > > > >
> > > > >> > > > > --
> > > > >> > > > > Keep Smiling....
> > > > >> > > > > Thanks & Regards
> > > > >> > > > > Bhaskar.
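
For the original alphanumeric question, the behaviour Uwe describes can be reproduced without Lucene. The sketch below imitates what the SimpleAnalyzer documentation quoted in this thread says: tokens are maximal runs of characters for which Character.isLetter() is true, lowercased. It is a simplification that works on chars rather than code points, and the class and method names are hypothetical. Run on the cpn values from the thread, it suggests why ab* and hed* found something while 123*, 123-0049 and ab23* did not: no indexed token contains a digit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; an approximation of the documented
// LetterTokenizer + LowerCaseFilter behaviour, not Lucene's implementation.
final class LetterTokenSketch {

    // Collect maximal runs of letters; every non-letter ends the current
    // token and is discarded.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // cpn values taken from the thread.
        String[] values = { "123-0049", "342-043", "ab23-090", "hedwsdg" };
        for (String value : values) {
            System.out.println(value + " -> " + letterTokens(value));
        }
    }
}
```

Under this tokenization "ab23-090" is indexed only as "ab" and "123-0049" yields no tokens at all, so a prefix such as ab23 or 123 has nothing to match. Indexing and querying with the same analyzer that keeps digits (Uwe suggests WhitespaceAnalyzer or StandardAnalyzer) and then reindexing should change that.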