Changing wildcard characters

2008-02-23 Thread spring
Hi,

is it possible to change the wildcard characters which are used by
QueryParser?

Or do I have to replace them myself in the query string?

Thank you
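
If it turns out that QueryParser has no setting for this (an assumption here, since the thread has no reply), one option is to rewrite the query string before parsing it. The sketch below maps two caller-chosen wildcard characters onto the '*' and '?' that QueryParser understands, and backslash-escapes any literal '*' or '?' already in the input. The class name, method name, and the '%' and '_' characters are hypothetical, chosen only for illustration.

```java
class WildcardTranslator {
    /**
     * Rewrites custom wildcard characters into QueryParser's '*' and '?'.
     * 'many' and 'single' are whatever characters the application wants to
     * expose; they are illustrative assumptions, not Lucene settings.
     */
    static String toLuceneWildcards(String query, char many, char single) {
        StringBuilder out = new StringBuilder(query.length());
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Copy an already-escaped character through unchanged.
                out.append(c).append(query.charAt(++i));
            } else if (c == many) {
                out.append('*');
            } else if (c == single) {
                out.append('?');
            } else if (c == '*' || c == '?') {
                // A literal '*' or '?' must not become a wildcard by accident.
                out.append('\\').append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // SQL-style wildcards in, QueryParser-style wildcards out.
        System.out.println(toLuceneWildcards("jo%n_", '%', '_'));
    }
}
```

The translated string would then be handed to QueryParser as usual; this does no parsing of its own, so quoted phrases and field prefixes pass through untouched.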


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



VTD-XML 2.3 released

2008-02-23 Thread jimmy Zhang

VTD-XML 2.3 is now released. To download the latest version please visit
http://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172.

Below is a list of new features and enhancements in this version.

 a.. VTDException is now introduced as the root class for all other
VTD-XML's exception classes (per suggestion of Max Rahder).
 b.. Transcoding capability is now added for inter-document cut and paste.
You can cut a chunk of bytes in a UTF-8 encoded document and paste it into a
UTF-16 encoded document and the output document is still well-formed.
 c.. ISO-8859-10, ISO-8859-11, ISO-8859-12, ISO-8859-13, ISO-8859-14 and
ISO-8859-15 support has now been added.
 d.. Zero length Text node is now possible.
 e.. Ability to dump in-memory copy of text is added.
 f.. Various code cleanup, enhancement and bug fixes.
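
To illustrate what the transcoding in item b involves, here is a minimal sketch using only the Java standard library. It is not VTD-XML's API; the class and method names are hypothetical. It decodes a byte range from a UTF-8 document and re-encodes it as UTF-16LE so it could be spliced into a UTF-16LE document.

```java
import java.nio.charset.StandardCharsets;

class TranscodeSketch {
    /**
     * Re-encodes a slice of a UTF-8 byte array as UTF-16LE (no byte order mark).
     * The offset and length must fall on character boundaries of the source.
     */
    static byte[] utf8ToUtf16le(byte[] doc, int offset, int length) {
        String text = new String(doc, offset, length, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] source = "<a>caf\u00e9</a>".getBytes(StandardCharsets.UTF_8);
        // The element content starts after the 3-byte "<a>" and is 5 bytes long,
        // because the accented letter takes 2 bytes in UTF-8.
        byte[] chunk = utf8ToUtf16le(source, 3, 5);
        System.out.println(chunk.length + " bytes after transcoding");
    }
}
```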

Below are some new articles related to VTD-XML

 a.. Index XML documents with VTD-XML
http://xml.sys-con.com/read/453082.htm
 b.. Manipulate XML content the Ximple Way
http://www.devx.com/xml/Article/36379
 c.. VTD-XML: A new vision of XML
http://www.developer.com/xml/article.php/3714051
 d.. VTD-XML: XML Processing for the future
http://www.codeproject.com/KB/cs/vtd-xml_examples.aspx

If you (or someone you know) like the concept of VTD-XML, think that it can
help solve enterprises' XML processing related issues (particularly those
related to SOA), and would like to directly influence and contribute to the
development of the future of the Internet, please email me
([EMAIL PROTECTED]). We are looking for open source software developers
and project management people to take VTD-XML to the next level.






Re: Searching for multiple criteria (accross 2 tables)

2008-02-23 Thread Emmanuel Bernard
Not sure if it's too late for you, but here are my comments if you
want to stick with Hibernate and Hibernate Search.


Generally speaking, once you have the query to retrieve the data per  
id, you can map this query to an entity in Hibernate using either:

 - @Formula for simple cases
 - @Loader for more complex cases

Once mapped as an entity, the mapping to Lucene via Hibernate Search  
is business as usual.


Alternatively, you can use a class level @FieldBridge and map the  
data the way you want in Lucene from an entity object. Note that I  
don't think this strategy will suit your current needs.


Emmanuel

On  Feb 15, 2008, at 15:24, Chris Lu wrote:

Sorry, sent the previous draft email by mistake. Here is the  
correct one.


Sounds like a typical SQL pivot problem.

select IdCard.Id, IdCard.SIN, data.*
from IdCard, (SELECT
  ID,
  MAX(CASE WHEN name = 'Fname' THEN Value END) AS Fname,
  MAX(CASE WHEN name = 'Lname' THEN Value END) AS Lname,
  MAX(CASE WHEN name = 'Age' THEN Value END) AS Age,
  MAX(CASE WHEN name = 'Country' THEN Value END) AS Country
FROM
   DATA_Table
GROUP BY
  ID
) data
where IdCard.Id = data.ID

To speed things up, you can split the SQL into two queries.


This is how DBSight does it. You can write your own SQL, but
the method is generally the same.


--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

DBSight customer, a shopping comparison site, (anonymous per request)
got 2.6 Million Euro funding!


On Fri, Feb 15, 2008 at 11:27 AM, lmctndi <[EMAIL PROTECTED]> wrote:


 Thanks for your reply.

 Your idea prompts more questions:


 I understand what you are saying but don't know how to implement it. How do
 you go about joining all rows of all the tables belonging to one person and
 indexing them so that I can actually use
 "+Fname:john +Country:USA" as a query?


 Erick Erickson wrote:


To expand a bit on Chris's first point: Take off your DB hat and put on
your search hat. It sounds like you have simply moved your database
tables into Lucene and want to search across them. My rule is that
whenever you find yourself trying to make Lucene act like a DB, you
need to pause and reflect on your design.

So, from your example, you select all the data relating to id 1 from
*all* your tables, and index that as a single document in Lucene. Very
simplistically, your document for ID 1 has the fields
Fname, Lname, Age, Country, and SIN.

Your query is now very simple,
+Fname:john +Country:USA

and to get the related SIN, you iterate over your hits
and extract the SIN from each hit.

If I understand your problem, that is.

In general, the strategy is to de-normalize your information
when you build your index.
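
As a rough illustration of that de-normalization step, the sketch below folds name/value rows into one field map per id, which is the shape a single Lucene document per person would take. It uses only the Java standard library; the class name, the row layout, and the sample data are hypothetical, and reading the rows from the database and adding the fields to a Lucene Document are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Denormalizer {
    /**
     * Groups rows of the form {id, name, value} into one field map per id.
     * Each resulting map corresponds to one document to index.
     */
    static Map<String, Map<String, String>> pivot(String[][] rows) {
        Map<String, Map<String, String>> docs = new LinkedHashMap<>();
        for (String[] row : rows) {
            docs.computeIfAbsent(row[0], id -> new LinkedHashMap<>())
                .put(row[1], row[2]);
        }
        return docs;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for the name/value data table.
        String[][] dataTable = {
            {"1", "Fname", "john"},
            {"1", "Lname", "smith"},
            {"1", "Country", "USA"},
            {"2", "Fname", "jane"},
            {"2", "Country", "Canada"},
        };
        Map<String, Map<String, String>> docs = pivot(dataTable);
        // The SIN from the other table would be added as one more field per id.
        docs.get("1").put("SIN", "000-000-000");
        for (Map.Entry<String, Map<String, String>> doc : docs.entrySet()) {
            System.out.println("id " + doc.getKey() + " -> " + doc.getValue());
        }
    }
}
```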

Best
Erick



 --
 View this message in context: http://www.nabble.com/Searching-for-multiple-criteria-%28accross-2-tables%29-tp15502657p15508362.html
 Sent from the Lucene - Java Users mailing list archive at Nabble.com.



  



Re: Problem in Coding, to get the DOC ID from HITS

2008-02-23 Thread Erick Erickson
You have to set the Hits object to the results of a search. See
Searcher.search()
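
Both compiler errors quoted below come from plain Java rules rather than from Lucene: a local variable must be assigned before it is read, and a checked exception must be caught or declared. The sketch below shows both fixes. StubHits and search() are hypothetical stand-ins so the example compiles without Lucene on the classpath; in real code the Hits object would come from the searcher.

```java
import java.io.IOException;

class HitIds {
    /** Hypothetical stand-in for Lucene's Hits, used only for illustration. */
    static class StubHits {
        private final int[] ids;

        StubHits(int... ids) {
            this.ids = ids;
        }

        int length() {
            return ids.length;
        }

        int id(int n) throws IOException {
            if (n < 0 || n >= ids.length) {
                throw new IOException("no hit at position " + n);
            }
            return ids[n];
        }
    }

    /** Hypothetical stand-in for running a search. */
    static StubHits search() {
        return new StubHits(7, 42);
    }

    /** Declares IOException because id() can throw it. */
    static int[] collectIds() throws IOException {
        StubHits hits = search(); // assigned before first use
        int[] out = new int[hits.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = hits.id(i);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        for (int id : collectIds()) {
            System.out.println(id);
        }
    }
}
```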

On Fri, Feb 22, 2008 at 4:32 PM, sumittyagi <[EMAIL PROTECTED]> wrote:

>
> here is my code
> package db;
> import java.io.*;
> import java.util.*;
> import java.lang.*;
> import org.apache.lucene.search.Hits;
> import org.apache.lucene.search.Hit;
>
> public class comm{
>public static void main(String[] args)
>{
>System.out.println("hi");
>
>
> Hits hits;
> int hitCount = hits.length();
> for (int i=0;i<hitCount;i++) {
>   int docId = hits.id(i) ;
>
> }
> }
> }
>
> and the error i am getting is
>
> C:\Documents and Settings\Sumit\Desktop>javac db/comm.java
> db/comm.java:15: variable hits might not have been initialized
> int hitCount = hits.length();
>   ^
> db/comm.java:17: unreported exception java.io.IOException; must be caught
> or declared to be thrown
>   int docId = hits.id(i) ;
>  ^
> 2 errors
>
>
>
> any help please..
>
>
> --
> View this message in context:
> http://www.nabble.com/Problem-in-Coding%2C-to-get-the-DOC-ID-from-HITS-tp15641665p15641665.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
>
>


A regex search engine for what?

2008-02-23 Thread Abeba Tensai
Hi,

I would like the community's feedback on potentially disruptive
applications of a regular-expression-based search engine before I propose to my
professor that we start researching the subject with changes to the Lucene codebase.

Thanks.


HELP...compiling first program for lucene Indexer.java

2008-02-23 Thread sumittyagi

I am new to Lucene and have a problem compiling its first program, which is
Indexer.java

here is the source code..

*

import java.io.*;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import java.util.*;
import java.io.IOException;

public class Indexer {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            throw new Exception("Usage: java " + Indexer.class.getName()
                + " <index dir> <data dir>");
        }
        File indexDir = new File(args[0]);
        File dataDir = new File(args[1]);
        long start = new Date().getTime();
        int numIndexed = index(indexDir, dataDir);
        long end = new Date().getTime();
        System.out.println("Indexing " + numIndexed + " files took "
            + (end - start) + " milliseconds");
    }
    // open an index and start file directory traversal
    public static int index(File indexDir, File dataDir)
            throws IOException {
        if (!dataDir.exists() || !dataDir.isDirectory()) {
            throw new IOException(dataDir
                + " does not exist or is not a directory");
        }
        IndexWriter writer = new IndexWriter(indexDir,
            new StandardAnalyzer(), true);
        writer.setUseCompoundFile(false);
        indexDirectory(writer, dataDir);
        int numIndexed = writer.docCount();
        writer.optimize();
        writer.close();
        return numIndexed;
    }
    // recursive method that calls itself when it finds a directory
    private static void indexDirectory(IndexWriter writer, File dir)
            throws IOException {
        File[] files = dir.listFiles();
        for (int i = 0; i < files.length; i++) {
            File f = files[i];
            if (f.isDirectory()) {
                indexDirectory(writer, f);
            } else if (f.getName().endsWith(".txt")) {
                indexFile(writer, f);
            }
        }
    }
    // method to actually index a file using Lucene
    private static void indexFile(IndexWriter writer, File f)
            throws IOException {
        if (f.isHidden() || !f.exists() || !f.canRead()) {
            return;
        }
        System.out.println("Indexing " + f.getCanonicalPath());
        Document doc = new Document();
        doc.add(Field.Text("contents", new FileReader(f)));
        doc.add(Field.Keyword("filename", f.getCanonicalPath()));
        writer.addDocument(doc);
    }
}


**

and the errors which i am getting are

C:\Documents and Settings\Sumit\Desktop\db>javac Indexer.java
Indexer.java:60: cannot find symbol
symbol  : method Text(java.lang.String,java.io.FileReader)
location: class org.apache.lucene.document.Field
doc.add(Field.Text("contents", new FileReader(f)));
 ^
Indexer.java:61: cannot find symbol
symbol  : method Keyword(java.lang.String,java.lang.String)
location: class org.apache.lucene.document.Field
doc.add(Field.Keyword("filename", f.getCanonicalPath()));
 ^
2 errors





any suggestions ..please 
-- 
View this message in context: 
http://www.nabble.com/HELP...compiling-first-program-for-lucene--Indexer.java-tp15661169p15661169.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





Searcher.java ...problem in compiling

2008-02-23 Thread sumittyagi

import java.io.*;
import org.apache.lucene.document.*;
import org.apache.lucene.document.Field.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import java.util.*;
import java.io.IOException;
import org.apache.lucene.store.*;
import org.apache.lucene.search.*;
import org.apache.lucene.queryParser.*;


public class Searcher {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            throw new Exception("Usage: java " + Searcher.class.getName()
                + " <index dir> <query>");
        }
        File indexDir = new File(args[0]);
        String q = args[1];
        if (!indexDir.exists() || !indexDir.isDirectory()) {
            throw new Exception(indexDir +
                " does not exist or is not a directory.");
        }
        search(indexDir, q);
    }
    public static void search(File indexDir, String q)
            throws Exception {
        Directory fsDir = FSDirectory.getDirectory(indexDir, false);
        IndexSearcher is = new IndexSearcher(fsDir);
        Query query = QueryParser.parse(q, "contents",new StandardAnalyzer());
        long start = new Date().getTime();
        Hits hits = is.search(query);
        long end = new Date().getTime();
        System.err.println("Found " + hits.length() +
            " document(s) (in " + (end - start) +
            " milliseconds) that matched query '" +
            q + "':");
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            System.out.println(doc.get("filename"));
        }
    }
}
**
and the error i am getting is

C:\Documents and Settings\Sumit\Desktop\db>javac Searcher.java
Searcher.java:32: parse(java.lang.String) in org.apache.lucene.queryParser.QueryParser
cannot be applied to (java.lang.String,java.lang.String,org.apache.lucene.analysis.standard.StandardAnalyzer)
Query query = QueryParser.parse(q, "contents",new StandardAnalyzer());
 ^
Note: Searcher.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
1 error



please help me out regarding these basic problems...
-- 
View this message in context: 
http://www.nabble.com/Searcher.java-...problem-in-compiling-tp15661355p15661355.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

