Changing wildcard characters
Hi, is it possible to change the wildcard characters which are used by QueryParser? Or do I have to replace them myself in the query string? Thank you

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
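The second route mentioned in the question, rewriting the query string before it reaches QueryParser, could be sketched as a small translation step. The sketch below assumes, purely for illustration, that '%' and '_' are the custom wildcard characters and maps them to '*' and '?', the wildcard characters of Lucene's query syntax. Literal '*' and '?' in the input are backslash-escaped so that they are no longer meant as wildcards. The class and method names are hypothetical, other special characters of the query syntax are left untouched, and handling of escaped wildcards has varied between Lucene versions, so check the result against the version in use.

```java
public class WildcardTranslator {
    // Hypothetical custom wildcard characters; substitute whatever your users type.
    private static final char CUSTOM_MULTI = '%';   // stands for "any run of characters"
    private static final char CUSTOM_SINGLE = '_';  // stands for "exactly one character"

    /** Rewrites a user query so that the custom wildcards become '*' and '?'. */
    public static String toLuceneSyntax(String userQuery) {
        StringBuilder out = new StringBuilder(userQuery.length() + 8);
        for (int i = 0; i < userQuery.length(); i++) {
            char c = userQuery.charAt(i);
            if (c == CUSTOM_MULTI) {
                out.append('*');
            } else if (c == CUSTOM_SINGLE) {
                out.append('?');
            } else if (c == '*' || c == '?') {
                // The user typed a literal '*' or '?': escape it with a backslash.
                out.append('\\').append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The translated string is what would then be handed to the query parser.
        System.out.println(toLuceneSyntax("te_t%"));
    }
}
```

The translated string is then parsed as usual; nothing inside the parser itself has to change.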
VTD-XML 2.3 released
VTD-XML 2.3 is now released. To download the latest version, please visit http://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172.

Below is a list of new features and enhancements in this version.

a. VTDException is now introduced as the root class for all of VTD-XML's other exception classes (per suggestion of Max Rahder).
b. Transcoding capability is now added for inter-document cut and paste. You can cut a chunk of bytes from a UTF-8 encoded document and paste it into a UTF-16 encoded document, and the output document is still well-formed.
c. ISO-8859-10, ISO-8859-11, ISO-8859-12, ISO-8859-13, ISO-8859-14 and ISO-8859-15 support has now been added.
d. Zero-length text nodes are now possible.
e. The ability to dump an in-memory copy of text is added.
f. Various code cleanups, enhancements and bug fixes.

Below are some new articles related to VTD-XML.

a. Index XML documents with VTD-XML http://xml.sys-con.com/read/453082.htm
b. Manipulate XML content the Ximple Way http://www.devx.com/xml/Article/36379
c. VTD-XML: A new vision of XML http://www.developer.com/xml/article.php/3714051
d. VTD-XML: XML Processing for the future http://www.codeproject.com/KB/cs/vtd-xml_examples.aspx

If you (or someone you know) like the concept of VTD-XML, think that it can help solve enterprises' XML processing issues (particularly those related to SOA), and would like to directly influence and contribute to the development of the future of the Internet, please email me ([EMAIL PROTECTED]). We are looking for open source software developers and project management people to take VTD-XML to the next level.
Re: Searching for multiple criteria (accross 2 tables)
Not sure if it's too late for you, but here are my comments if you want to stick with Hibernate and Hibernate Search.

Generally speaking, once you have the query to retrieve the data per id, you can map this query to an entity in Hibernate using either:
- @Formula for simple cases
- @Loader for more complex cases

Once mapped as an entity, the mapping to Lucene via Hibernate Search is business as usual.

Alternatively, you can use a class-level @FieldBridge and map the data the way you want in Lucene from an entity object. Note that I don't think this strategy will suit your current needs.

Emmanuel

On Feb 15, 2008, at 15:24, Chris Lu wrote:

Sorry, sent the previous draft email by mistake. Here is the correct one.

Sounds like a typical SQL pivot problem.

select Id, SIN, data.*
from IdCard,
  (SELECT ID,
     MAX(CASE WHEN name = 'Fname' THEN Value END) AS Fname,
     MAX(CASE WHEN name = 'Lname' THEN Value END) AS Lname,
     MAX(CASE WHEN name = 'Age' THEN Value END) AS Age,
     MAX(CASE WHEN name = 'Country' THEN Value END) AS Country
   FROM DATA_Table
   GROUP BY ID
  ) data

To speed things up, you can split the SQL into two queries for better performance. This is how DBSight does this. You can write your own SQL, but generally it's the same method.

--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding!

On Fri, Feb 15, 2008 at 11:27 AM, lmctndi <[EMAIL PROTECTED]> wrote:

Thanks for your reply. Your idea prompts more questions: I understand what you are saying but don't know how to implement it. How do you go about joining all rows of all the tables belonging to one person and indexing them so that I can actually use "+Fname:john +Country:USA" as a query?

Erick Erickson wrote:

To expand a bit on Chris's first point: take off your DB hat and put on your search hat. It sounds like you have simply moved your database tables into Lucene and want to search across them. My rule is that whenever you find yourself trying to make Lucene act like a DB, you need to pause and reflect on your design.

So, from your example, you select all the data relating to id 1 from *all* your tables, and index that as a single document in Lucene. Very simplistically, your document for ID 1 has the fields Fname, Lname, Age, Country, and SIN. Your query is now very simple, +Fname:john +Country:USA, and to get the related SIN, you iterate over your hits and extract the SIN from each hit. If I understand your problem, that is.

In general, the strategy is to de-normalize your information when you build your index.

Best
Erick

--
View this message in context: http://www.nabble.com/Searching-for-multiple-criteria-%28accross-2-tables%29-tp15502657p15508362.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
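The de-normalization step described in this thread can be illustrated without any Lucene calls. The sketch below is only an assumption about how the data might look: a hypothetical Row type holds one (id, name, value) row of the name/value table, and pivot() folds all rows of one id into a single flat record. Each resulting record corresponds to what would be indexed as one Lucene document, with one field per entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Denormalizer {
    /** One row of the name/value table. The shape is hypothetical. */
    public static final class Row {
        final int id;
        final String name;
        final String value;

        public Row(int id, String name, String value) {
            this.id = id;
            this.name = name;
            this.value = value;
        }
    }

    /** Groups rows by id, producing one flat field map per id. */
    public static Map<Integer, Map<String, String>> pivot(List<Row> rows) {
        Map<Integer, Map<String, String>> byId = new LinkedHashMap<>();
        for (Row r : rows) {
            // Create the per-id record on first sight, then add the field to it.
            byId.computeIfAbsent(r.id, k -> new LinkedHashMap<>()).put(r.name, r.value);
        }
        return byId;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "Fname", "john"));
        rows.add(new Row(1, "Country", "USA"));
        rows.add(new Row(2, "Fname", "mary"));
        // Each entry of the result would become one document at indexing time.
        System.out.println(pivot(rows));
    }
}
```

Values from other tables, such as the SIN, would be added to the same per-id record before indexing, so that a single query can match on any combination of fields.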
Re: Problem in Coding, to get the DOC ID from HITS
You have to set the Hits object to the results of a search. See Searcher.search().

On Fri, Feb 22, 2008 at 4:32 PM, sumittyagi <[EMAIL PROTECTED]> wrote:
>
> here is my code
> package db;
> import java.io.*;
> import java.util.*;
> import java.lang.*;
> import org.apache.lucene.search.Hits;
> import org.apache.lucene.search.Hit;
>
> public class comm{
>     public static void main(String[] args)
>     {
>         System.out.println("hi");
>
>
>         Hits hits;
>         int hitCount = hits.length();
>         for (int i=0;i<hitCount;i++) {
>             int docId = hits.id(i) ;
>
>         }
>     }
> }
>
> and the error i am getting is
>
> C:\Documents and Settings\Sumit\Desktop>javac db/comm.java
> db/comm.java:15: variable hits might not have been initialized
> int hitCount = hits.length();
> ^
> db/comm.java:17: unreported exception java.io.IOException; must be caught or declared to be thrown
> int docId = hits.id(i) ;
> ^
> 2 errors
>
> any help please..
>
> --
> View this message in context:
> http://www.nabble.com/Problem-in-Coding%2C-to-get-the-DOC-ID-from-HITS-tp15641665p15641665.html
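Both compiler errors quoted above are plain Java issues: a local variable must be assigned before it is read, and a checked exception such as IOException must be caught or declared. The sketch below shows the shape of the fix. So that it compiles without Lucene on the classpath, it uses a hypothetical Results interface as a stand-in for Hits and a hypothetical fakeSearch() in place of a real search call; in the actual program, the assignment would come from Searcher.search() as the reply says.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class HitIds {
    /** Hypothetical stand-in for Hits: like hits.id(i), id() may throw IOException. */
    public interface Results {
        int length();
        int id(int n) throws IOException;
    }

    /** Collects every document id; the checked exception is declared in the signature. */
    public static List<Integer> collectIds(Results hits) throws IOException {
        List<Integer> ids = new ArrayList<>();
        int hitCount = hits.length();
        for (int i = 0; i < hitCount; i++) {
            ids.add(hits.id(i));
        }
        return ids;
    }

    /** Hypothetical stand-in for running a search; returns results backed by an array. */
    public static Results fakeSearch(final int[] docIds) {
        return new Results() {
            public int length() { return docIds.length; }
            public int id(int n) { return docIds[n]; }
        };
    }

    // Declaring "throws IOException" resolves the "unreported exception" error.
    public static void main(String[] args) throws IOException {
        // Assigning the variable before use resolves "might not have been initialized".
        Results hits = fakeSearch(new int[] {7, 42});
        System.out.println(collectIds(hits));
    }
}
```

Wrapping the loop in a try/catch block for IOException would work equally well if the caller should handle the failure locally rather than propagate it.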
A regex search engine for what?
Hi, I just wanted to get the community's feedback on potentially disruptive applications of a regular-expression-based search engine before proposing to my professor that I start researching the subject with changes to the Lucene codebase. Thanks.
HELP...compiling first program for lucene Indexer.java
I am new to Lucene and have a problem compiling its first program, Indexer.java. Here is the source code:

import java.io.*;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import java.util.*;
import java.io.IOException;

public class Indexer {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            throw new Exception("Usage: java " + Indexer.class.getName() + " ");
        }
        File indexDir = new File(args[0]);
        File dataDir = new File(args[1]);
        long start = new Date().getTime();
        int numIndexed = index(indexDir, dataDir);
        long end = new Date().getTime();
        System.out.println("Indexing " + numIndexed + " files took " + (end - start) + " milliseconds");
    }

    // open an index and start file directory traversal
    public static int index(File indexDir, File dataDir) throws IOException {
        if (!dataDir.exists() || !dataDir.isDirectory()) {
            throw new IOException(dataDir + " does not exist or is not a directory");
        }
        IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
        writer.setUseCompoundFile(false);
        indexDirectory(writer, dataDir);
        int numIndexed = writer.docCount();
        writer.optimize();
        writer.close();
        return numIndexed;
    }

    // recursive method that calls itself when it finds a directory
    private static void indexDirectory(IndexWriter writer, File dir) throws IOException {
        File[] files = dir.listFiles();
        for (int i = 0; i < files.length; i++) {
            File f = files[i];
            if (f.isDirectory()) {
                indexDirectory(writer, f);
            } else if (f.getName().endsWith(".txt")) {
                indexFile(writer, f);
            }
        }
    }

    // method to actually index a file using Lucene
    private static void indexFile(IndexWriter writer, File f) throws IOException {
        if (f.isHidden() || !f.exists() || !f.canRead()) {
            return;
        }
        System.out.println("Indexing " + f.getCanonicalPath());
        Document doc = new Document();
        doc.add(Field.Text("contents", new FileReader(f)));
        doc.add(Field.Keyword("filename", f.getCanonicalPath()));
        writer.addDocument(doc);
    }
}

and the errors which I am getting are:

C:\Documents and Settings\Sumit\Desktop\db>javac Indexer.java
Indexer.java:60: cannot find symbol
symbol  : method Text(java.lang.String,java.io.FileReader)
location: class org.apache.lucene.document.Field
doc.add(Field.Text("contents", new FileReader(f)));
^
Indexer.java:61: cannot find symbol
symbol  : method Keyword(java.lang.String,java.lang.String)
location: class org.apache.lucene.document.Field
doc.add(Field.Keyword("filename", f.getCanonicalPath()));
^
2 errors

Any suggestions, please?

--
View this message in context: http://www.nabble.com/HELP...compiling-first-program-for-lucene--Indexer.java-tp15661169p15661169.html
Searcher.java ...problem in compiling
import java.io.*;
import org.apache.lucene.document.*;
import org.apache.lucene.document.Field.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import java.util.*;
import java.io.IOException;
import org.apache.lucene.store.*;
import org.apache.lucene.search.*;
import org.apache.lucene.queryParser.*;

public class Searcher {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            throw new Exception("Usage: java " + Searcher.class.getName() + " ");
        }
        File indexDir = new File(args[0]);
        String q = args[1];
        if (!indexDir.exists() || !indexDir.isDirectory()) {
            throw new Exception(indexDir + " does not exist or is not a directory.");
        }
        search(indexDir, q);
    }

    public static void search(File indexDir, String q) throws Exception {
        Directory fsDir = FSDirectory.getDirectory(indexDir, false);
        IndexSearcher is = new IndexSearcher(fsDir);
        Query query = QueryParser.parse(q, "contents",new StandardAnalyzer());
        long start = new Date().getTime();
        Hits hits = is.search(query);
        long end = new Date().getTime();
        System.err.println("Found " + hits.length() + " document(s) (in " + (end - start) + " milliseconds) that matched query '" + q + "':");
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            System.out.println(doc.get("filename"));
        }
    }
}

and the error I am getting is:

C:\Documents and Settings\Sumit\Desktop\db>javac Searcher.java
Searcher.java:32: parse(java.lang.String) in org.apache.lucene.queryParser.QueryParser cannot be applied to (java.lang.String,java.lang.String,org.apache.lucene.analysis.standard.StandardAnalyzer)
Query query = QueryParser.parse(q, "contents",new StandardAnalyzer());
^
Note: Searcher.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
1 error

Please help me out regarding these basic problems...

--
View this message in context: http://www.nabble.com/Searcher.java-...problem-in-compiling-tp15661355p15661355.html