Re: Need help in alphanumeric search

Bhaskar Thu, 01 Oct 2015 08:55:20 -0700

Hi,
I am looking for a sentence search rather than a word search.
Regards,
Bhaskar
On Oct 1, 2015 7:07 PM, "Ian Lea" <[email protected]> wrote:


> Take a look at
> http://lucene.apache.org/core/5_3_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description
> .
> Sounds like you want an AND, or a +, or both. You may also want to
> take a look at phrase queries and/or span queries.
>
>
> --
> Ian.
>
>
>
> On Thu, Oct 1, 2015 at 1:52 PM, Bhaskar <[email protected]> wrote:
> > Hi Uwe,
> > My search currently works like this: if I give the input "SD RAM Bhaskar",
> > every string that contains "SD", "RAM", or "Bhaskar" is returned,
> > e.g. "SD lib"
> >      "RAM hello"
> >      "hi Bhaskar "
> >      "Bhaskar hai SD"
> >
> >
> > But I want the output below:
> >        "SD RAM Bhaskar"
> >        "SD RAM Bhaskar hello"
> > i.e. the string must begin with "SD RAM Bhaskar"; whatever follows can be
> > anything.
> >
> >
> > My current application, however, returns every string in which it finds
> > "SD", "RAM", or "Bhaskar".
> >
> >
> > Can you please advise?
> > Thanks a lot in advance.
> >
> > Regards,
> > Bhaskar
> >
> >
> >
> >
> > On Wed, Sep 30, 2015 at 12:23 PM, Uwe Schindler <[email protected]> wrote:
> >
> >> Hi Bhaskar,
> >>
> >> The answer is very simple: your analysis is not suitable for the type of
> >> queries and data you are using. You are using SimpleAnalyzer in your
> >> search/indexing code:
> >>
> >> https://lucene.apache.org/core/5_3_1/analyzers-common/org/apache/lucene/analysis/core/SimpleAnalyzer.html
> >> "An Analyzer that filters LetterTokenizer with LowerCaseFilter"
> >>
> >> And LetterTokenizer does the following:
> >>
> >> https://lucene.apache.org/core/5_3_1/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html
> >> "A LetterTokenizer is a tokenizer that divides text at non-letters. That's
> >> to say, it defines tokens as maximal strings of adjacent letters, as
> >> defined by java.lang.Character.isLetter() predicate."
> >>
> >> So it creates a new token at every non-letter boundary. All non-letters
> >> are discarded (because they are treated as token boundaries), so digits
> >> never reach the index and your queries can never match.
> >>
> >> I'd suggest that you first read up on analysis and choose a better
> >> analyzer that suits your underlying data and the queries you want to run.
> >> Maybe use WhitespaceAnalyzer or, better, StandardAnalyzer as a first step.
> >> Be sure to reindex your data before querying. The Analyzer used on the
> >> indexing side must be the same as the one on the query side. If you want
> >> to use wildcards, you have to take more care, because wildcards are not
> >> really natural for a "full text search engine" and may cause inconsistent
> >> results.
> >>
> >> Uwe
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: [email protected]
> >>
> >> > -----Original Message-----
> >> > From: Bhaskar [mailto:[email protected]]
> >> > Sent: Wednesday, September 30, 2015 4:28 AM
> >> > To: [email protected]
> >> > Subject: Re: Need help in alphanumeric search
> >> >
> >> > Hi Uwe,
> >> >
> >> > Below is my indexing code:
> >> >
> >> > public static final String INDEX_DIR = "c:/DBIndexAll/";
> >> >
> >> > public static void main(String[] args) throws Exception {
> >> >    // Path indexDir = new Path(INDEX_DIR);
> >> >    final Path indexDir = Paths.get(INDEX_DIR);
> >> >    SimpleDBIndexer indexer = new SimpleDBIndexer();
> >> >    try {
> >> >       Class.forName(JDBC_DRIVER).newInstance();
> >> >       Connection conn = DriverManager.getConnection(CONNECTION_URL + DBNAME, USER_NAME, PASSWORD);
> >> >       SimpleAnalyzer analyzer = new SimpleAnalyzer();
> >> >       IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
> >> >       IndexWriter indexWriter = new IndexWriter(FSDirectory.open(indexDir), indexWriterConfig);
> >> >       System.out.println("Indexing to directory '" + indexDir + "'...");
> >> >       int indexedDocumentCount = indexer.indexDocs(indexWriter, conn);
> >> >       indexWriter.close();
> >> >       System.out.println(indexedDocumentCount + " records have been indexed successfully");
> >> >    } catch (Exception e) {
> >> >       e.printStackTrace();
> >> >    }
> >> > }
> >> >
> >> > int indexDocs(IndexWriter writer, Connection conn) throws Exception {
> >> >   String sql = QUERY1;
> >> >   Statement stmt = conn.createStatement();
> >> >   ResultSet rs = stmt.executeQuery(sql);
> >> >   int i=0;
> >> >   while (rs.next()) {
> >> >      Document d = new Document();
> >> >      d.add(new TextField("cpn", rs.getString("cpn"), Field.Store.YES));
> >> >
> >> >      writer.addDocument(d);
> >> >      i++;
> >> >  }
> >> >   stmt.close();
> >> >   rs.close();
> >> >
> >> >   return i;
> >> > }
> >> >
> >> >
> >> > Searching code:
> >> >
> >> > public class SimpleDBSearcher {
> >> >    // PLASTRON
> >> >    private static final String LUCENE_QUERY = "SD*";
> >> >    private static final int MAX_HITS = 500;
> >> >    private static final String INDEX_DIR = "C:/DBIndexAll/";
> >> >
> >> >    public static void main(String[] args) throws Exception {
> >> >       // File indexDir = new File(SimpleDBIndexer.INDEX_DIR);
> >> >       final Path indexDir = Paths.get(SimpleDBIndexer.INDEX_DIR);
> >> >       String query = LUCENE_QUERY;
> >> >       SimpleDBSearcher searcher = new SimpleDBSearcher();
> >> >       searcher.searchIndex(indexDir, query);
> >> >    }
> >> >
> >> >    private void searchIndex(Path indexDir, String queryStr) throws Exception {
> >> >       Directory directory = FSDirectory.open(indexDir);
> >> >       System.out.println("The query string is " + queryStr);
> >> >       MultiFieldQueryParser queryParser =
> >> >             new MultiFieldQueryParser(new String[] { "cpn" }, new StandardAnalyzer());
> >> >       IndexReader reader = DirectoryReader.open(directory);
> >> >       IndexSearcher searcher = new IndexSearcher(reader);
> >> >       queryParser.getAllowLeadingWildcard();
> >> >
> >> >       Query query = queryParser.parse(queryStr);
> >> >       TopDocs topDocs = searcher.search(query, MAX_HITS);
> >> >
> >> >       ScoreDoc[] hits = topDocs.scoreDocs;
> >> >       System.out.println(hits.length + " Record(s) Found");
> >> >       for (int i = 0; i < hits.length; i++) {
> >> >          int docId = hits[i].doc;
> >> >          Document d = searcher.doc(docId);
> >> >          System.out.println("\"cpn value is:\" " + d.get("cpn"));
> >> >       }
> >> >       if (hits.length == 0) {
> >> >          System.out.println("No Data Founds ");
> >> >       }
> >> >    }
> >> > }
> >> >
> >> >
> >> > Please help here, thanks in advance.
> >> >
> >> > Regards,
> >> > Bhaskar
> >> >
> >> > On Tue, Sep 29, 2015 at 3:47 AM, Uwe Schindler <[email protected]> wrote:
> >> >
> >> > > Hi Erick,
> >> > >
> >> > > This mail was on Lucene's user mailing list. This is not about Solr,
> >> > > so the user cannot provide a Solr config! :-) In any case, it would be
> >> > > good to get the Analyzer + code you use while indexing and also the
> >> > > code (+ Analyzer) that creates the query while searching.
> >> > >
> >> > > Uwe
> >> > >
> >> > > -----
> >> > > Uwe Schindler
> >> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> >> > > http://www.thetaphi.de
> >> > > eMail: [email protected]
> >> > >
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: Erick Erickson [mailto:[email protected]]
> >> > > > Sent: Monday, September 28, 2015 6:01 PM
> >> > > > To: java-user
> >> > > > Subject: Re: Need help in alphanumeric search
> >> > > >
> >> > > > You need to supply the definitions of this field from your
> >> > > > schema.xml file, both the <field> and <fieldType>.
> >> > > >
> >> > > > Additionally, please provide the results of the query you're trying
> >> > > > with &debug=true appended.
> >> > > >
> >> > > > The adminUI/analysis page is very helpful in these situations as
> >> > > > well. Select the appropriate core from the drop-down on the left and
> >> > > > you'll see an "analysis" section appear that shows you exactly what
> >> > > > happens when the field is analyzed.
> >> > > >
> >> > > > Best,
> >> > > > Erick
> >> > > >
> >> > > > On Mon, Sep 28, 2015 at 5:01 AM, Bhaskar <[email protected]> wrote:
> >> > > > > Thanks, Ian, for the reply.
> >> > > > >
> >> > > > > cpn values look like 123-0049, 342-043, ab23-090, hedwsdg.
> >> > > > >
> >> > > > > My application works when I search for the inputs below:
> >> > > > > 1) ab*
> >> > > > > 2) hedwsdg
> >> > > > > 3) hed*
> >> > > > >
> >> > > > > but it does not work for
> >> > > > > 1) 123*
> >> > > > > 2) 123-0049
> >> > > > > 3) ab23*
> >> > > > >
> >> > > > >
> >> > > > > Note: if the search input contains a number, it does not work.
> >> > > > >
> >> > > > > Thanks in advance.
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Sep 28, 2015 at 3:49 PM, Ian Lea <[email protected]> wrote:
> >> > > > >
> >> > > > >> Hi
> >> > > > >>
> >> > > > >>
> >> > > > >> Can you provide a few examples of values of cpn that a) are and
> >> > > > >> b) are not being found, for indexing and searching.
> >> > > > >>
> >> > > > >> You may also find some of the tips at
> >> > > > >>
> >> > > > >> http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2F_incorrect_hits.3F
> >> > > > >> useful.
> >> > > > >>
> >> > > > >> You haven't shown the code that created the IndexWriter so the
> >> > > > >> tip about using the same analyzer at index and search time might
> >> > > > >> be relevant.
> >> > > > >>
> >> > > > >>
> >> > > > >>
> >> > > > >> --
> >> > > > >> Ian.
> >> > > > >>
> >> > > > >>
> >> > > > >> On Mon, Sep 28, 2015 at 10:49 AM, Bhaskar <[email protected]> wrote:
> >> > > > >> > Hi,
> >> > > > >> > I am a beginner in Apache Lucene; I am using 5.3.1.
> >> > > > >> > I have created an index on a database result. The indexed
> >> > > > >> > values are a mix of alphanumeric values and plain strings. I am
> >> > > > >> > able to search the plain strings, but I am not able to search
> >> > > > >> > the alphanumeric values.
> >> > > > >> >
> >> > > > >> > Can someone help me here?
> >> > > > >> >
> >> > > > >> > Below is the indexing code...
> >> > > > >> >
> >> > > > >> > int indexDocs(IndexWriter writer, Connection conn) throws Exception {
> >> > > > >> >   Statement stmt = conn.createStatement();
> >> > > > >> >   ResultSet rs = stmt.executeQuery(sql);
> >> > > > >> >   int i=0;
> >> > > > >> >   while (rs.next()) {
> >> > > > >> >      Document d = new Document();
> >> > > > >> >      // System.out.println("cpn is" + rs.getString("cpn"));
> >> > > > >> >      // System.out.println("mpn is" + rs.getString("mpn"));
> >> > > > >> >
> >> > > > >> >      d.add(new TextField("cpn", rs.getString("cpn"), Field.Store.YES));
> >> > > > >> >
> >> > > > >> >      writer.addDocument(d);
> >> > > > >> >      i++;
> >> > > > >> >   }
> >> > > > >> >   return i;
> >> > > > >> > }
> >> > > > >> >
> >> > > > >> > Searching code:
> >> > > > >> >
> >> > > > >> >
> >> > > > >> > private void searchIndex(Path indexDir, String queryStr) throws Exception {
> >> > > > >> >    Directory directory = FSDirectory.open(indexDir);
> >> > > > >> >    System.out.println("The query string is " + queryStr);
> >> > > > >> >    // MultiFieldQueryParser queryParser = new MultiFieldQueryParser(new String[] {"mpn"}, new StandardAnalyzer());
> >> > > > >> >    // IndexReader reader = IndexReader.open(directory);
> >> > > > >> >    IndexReader reader = DirectoryReader.open(directory);
> >> > > > >> >    IndexSearcher searcher = new IndexSearcher(reader);
> >> > > > >> >    Analyzer analyzer = new StandardAnalyzer();
> >> > > > >> >    analyzer.tokenStream("cpn", queryStr);
> >> > > > >> >    QueryParser parser = new QueryParser("cpn", analyzer);
> >> > > > >> >    parser.setDefaultOperator(Operator.OR);
> >> > > > >> >    parser.getAllowLeadingWildcard();
> >> > > > >> >    parser.setAutoGeneratePhraseQueries(true);
> >> > > > >> >    Query query = parser.parse(queryStr);
> >> > > > >> >    searcher.search(query, 100);
> >> > > > >> >    TopDocs topDocs = searcher.search(query, MAX_HITS);
> >> > > > >> >
> >> > > > >> >    ScoreDoc[] hits = topDocs.scoreDocs;
> >> > > > >> >    System.out.println(hits.length + " Record(s) Found");
> >> > > > >> >    for (int i = 0; i < hits.length; i++) {
> >> > > > >> >       int docId = hits[i].doc;
> >> > > > >> >       Document d = searcher.doc(docId);
> >> > > > >> >       System.out.println("\"value is:\" " + d.get("cpn"));
> >> > > > >> >    }
> >> > > > >> >    if (hits.length == 0) {
> >> > > > >> >       System.out.println("No Data Founds ");
> >> > > > >> >    }
> >> > > > >> > }
> >> > > > >> >
> >> > > > >> >
> >> > > > >> > Thanks in advance.
> >> > > > >> >
> >> > > > >> > --
> >> > > > >> > Keep Smiling....
> >> > > > >> > Thanks & Regards
> >> > > > >> > Bhaskar.
> >> > > > >> > Mobile:9866724142
> >> > > > >>
> >> > > > >>
> >> > > > >> ---------------------------------------------------------------------
> >> > > > >> To unsubscribe, e-mail: [email protected]
> >> > > > >> For additional commands, e-mail: [email protected]
> >> > > > >>
> >> > > > >>
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Keep Smiling....
> >> > > > > Thanks & Regards
> >> > > > > Bhaskar.
> >> > > > > Mobile:9866724142
> >> > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Keep Smiling....
> >> > Thanks & Regards
> >> > Bhaskar.
> >> > Mobile:9866724142
> >>
> >>
> >>
> >>
> >
> >
> > --
> > Keep Smiling....
> > Thanks & Regards
> > Bhaskar.
> > Mobile:9866724142
>
>
>
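
A note on the tokenization Uwe describes above: the sketch below imitates the quoted LetterTokenizer rule (maximal runs of characters for which Character.isLetter() is true, then lowercased) in plain Java and applies it to the cpn values from the thread. It is not Lucene code, and the names LetterTokenSketch and tokenize are made up for illustration; it only shows which terms that rule would leave for the index.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of the letter-only tokenization quoted in the thread.
// NOT Lucene code: LetterTokenSketch and tokenize are hypothetical names.
final class LetterTokenSketch {

    // Split text into maximal runs of letters and lowercase each run.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any non-letter (digit, hyphen, space) ends the token and is dropped.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // cpn sample values taken from the thread.
        String[] values = {"123-0049", "342-043", "ab23-090", "hedwsdg"};
        for (String cpn : values) {
            System.out.println(cpn + " -> " + tokenize(cpn));
        }
    }
}
```

Under this rule "123-0049" and "342-043" produce no tokens at all, and "ab23-090" produces only "ab". That is consistent with the report that ab* and hed* find results while 123*, 123-0049 and ab23* do not.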

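On the "SD RAM Bhaskar" question: the difference between what the application returns now (any term matches) and what is wanted (the whole phrase at the start of the value) can be illustrated without Lucene. The sketch below is plain Java under simple assumptions (whitespace splitting, lowercasing); MatchSketch, matchesAny and startsWithPhrase are hypothetical names, and this is not how Lucene evaluates phrase or span queries internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration of two matching rules; NOT Lucene code.
// MatchSketch, matchesAny and startsWithPhrase are hypothetical names.
final class MatchSketch {

    // Lowercase and split on whitespace (a stand-in for a whitespace-style analyzer).
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // OR semantics: a value matches if it contains at least one query term.
    static boolean matchesAny(String doc, String query) {
        List<String> docTokens = tokens(doc);
        for (String term : tokens(query)) {
            if (docTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Phrase-at-start semantics: the query terms must be the value's first tokens, in order.
    static boolean startsWithPhrase(String doc, String query) {
        List<String> docTokens = tokens(doc);
        List<String> queryTokens = tokens(query);
        return docTokens.size() >= queryTokens.size()
                && docTokens.subList(0, queryTokens.size()).equals(queryTokens);
    }

    public static void main(String[] args) {
        String query = "SD RAM Bhaskar";
        // Sample values taken from the thread.
        String[] docs = {"SD lib", "RAM hello", "hi Bhaskar ", "Bhaskar hai SD",
                         "SD RAM Bhaskar", "SD RAM Bhaskar hello"};
        for (String doc : docs) {
            System.out.println("\"" + doc + "\" any=" + matchesAny(doc, query)
                    + " phraseAtStart=" + startsWithPhrase(doc, query));
        }
    }
}
```

All six sample values satisfy the OR rule, but only "SD RAM Bhaskar" and "SD RAM Bhaskar hello" satisfy the phrase-at-start rule, which is the output asked for in the thread.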