Introduction to PyLucene Community and some doubts

2013-06-11 Thread Vishrut Mehta
Hello everyone, I am Vishrut Mehta, currently a third-year student at IIIT Hyderabad, India. I have been contributing to open source for two years and have contributed to organizations such as E-cidadania, Sahana Software Foundation, GNOME, etc. I am very interested in Search e

Re: Introduction to PyLucene Community and some doubts

2013-06-11 Thread Vishrut Mehta
Hello sir, thank you for the quick reply. I want to integrate this functionality with web2py, so I would need to stick with Python and PyLucene. So the method you are suggesting is: extract text from all the documents using different Python libraries, then index the data, then search the

Re: Introduction to PyLucene Community and some doubts

2013-06-11 Thread Thomas Koch
Hi, I suggest you have a look at Apache TIKA: http://tika.apache.org You can easily call a "java -jar tika.jar" command via Python tools like os.popen and convert files in various formats to text. There's even a Python wrapper based on JCC, but I'm not sure if that's still maintained: http://red
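
The suggestion above (running "java -jar tika.jar" from Python and reading back the text) could look roughly like the sketch below. It uses subprocess rather than os.popen, and it assumes that the jar is the Tika command-line application, that it accepts the --text option for plain-text output, and that java is on the PATH. The function names and the file paths are hypothetical, chosen only for illustration.

```python
import subprocess


def build_tika_command(jar_path, document_path):
    """Build the argument list for a Tika command-line run.

    Assumption: jar_path points to the Tika command-line jar and the
    jar accepts --text to print the extracted plain text to stdout.
    """
    return ["java", "-jar", jar_path, "--text", document_path]


def extract_text(jar_path, document_path):
    """Run Tika on one document and return the extracted text.

    Raises subprocess.CalledProcessError if the java process fails.
    """
    result = subprocess.run(
        build_tika_command(jar_path, document_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    # Decode defensively: the output encoding depends on the JVM settings.
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical paths; replace with a real jar and document.
    print(extract_text("tika.jar", "report.pdf"))
```

The returned string can then be handed to whatever indexing step follows. Passing the command as a list avoids shell quoting problems with file names that contain spaces.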

Re: jcc/sources/functions.cpp:20:23: error: arpa/inet.h: No such file or directory

2013-06-11 Thread Thomas Koch
Samantha, you may want to try a pre-built binary of JCC for Windows: versions for win32 with py26 and py27 are available here: http://code.google.com/a/apache-extras.org/p/pylucene-extra/downloads/list These eggs were built using Python, Java, ant and MSVC (Microsoft Visual Studio 9.0) - plu

Future of pylucene-extra

2013-06-11 Thread Thomas Koch
Hi, the pylucene-extra project started some time ago with the goal of providing pre-built PyLucene and JCC eggs for several OS/Python/Java combos [1]. In fact, we have collected 32 eggs since June 2011, including versions for Windows and Mac OS X. Now that Google announced their EOL support for Downloads