Hi,

I was playing with the idea of creating virtual packages; attached is a working script that illustrates it. I am getting this output:
Dit it work?
==================
from org.apache.lucene.search import SearcherFactory; print SearcherFactory
<type 'SearcherFactory'>
from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer
<type 'Analyzer'>
print sys.modules['org']
<module 'org' (built-in)>
print sys.modules['org.apache']
<module 'org.apache' (built-in)>
print sys.modules['org.apache.lucene']
<module 'org.apache.lucene' (built-in)>
print sys.modules['org.apache.lucene.search']
<module 'org.apache.lucene.search' (built-in)>

Cheers,

roman

On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda <va...@apache.org> wrote:

>
> On Jul 13, 2012, at 18:33, Roman Chyla <roman.ch...@gmail.com> wrote:
>
> > I think this would be great. Let me add little bit more to your
> > observations (whole night yesterday was spent fighting with renames -
> > because I was building a project which imports shared lucene and solr --
> > there were thousands of same classes, I am not sure it would be possible
> > without some sort of a flexible rename...)
> >
> > JCC is a great tool and is used by potentially many projects - so stripping
> > "org.apache" seems right for pylucene, but looks arbitrary otherwise
>
> Yes, I forgot to say that there would be a way to declare one or more
> mappings so that org.apache.lucene becomes lucene.
>
> Andi..
>
> > (unless there is a flexible stripping mechanism). Also, if the full
> > namespace remains original, then the code written in Python would be also
> > executable by Jython, which is IMHO an advantage.
> >
> > But this being Python, the packages cannot be spread in different locations
> > (ie. there can be only one org.apache.lucene.analysis package) - unless
> > there exists (again) some flexible mechanism which populates the namespace
> > with objects that belong there.
> > It may seem an overkill to you, because for
> > single projects it would work, but seems perfectly justifiable in case of
> > imported shared libraries
> >
> > I don't know what is your idea for implementing the python packages, but
> > your last email got me thinking as well - there might be a very simple way
> > of getting to the java packages inside Python without too much work.
> >
> > Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
> > python as org_apache_lucene_search_IndexSearcher
> >
> > and users do:
> >
> > import lucene
> > lucene.initVM()
> >
> > initVM() first initiates java VM (and populates the lucene namespace with
> > all objects), but then it will call jcc.register_module(self)
> >
> > A new piece of code inside JCC grabs the lucene module and creates (on the
> > fly) python packages -- using types.ModuleType (or new.module()) -- the new
> > packages will be inserted into sys.modules
> >
> > so after lucene.initVM() returns
> >
> > users can do "from org.apache.lucene.search import IndexSearcher" and get
> > lucene.org_apache_lucene_search_IndexSearcher object
> >
> > and also, when shared libraries are present (let's say 'solr') users do:
> >
> > import solr
> > solr.initVM()
> >
> > The JCC will just update the existing packages and create new ones if
> > needed (and from this perspective, having fully qualified name is safer
> > than to have lucene.search.IndexSearcher)
> >
> > I think this change is totally possible and will not change the way how
> > extensions are built. Does it have some serious flaw?
> >
> > I would be of course more than happy to contribute and test.
> >
> > Best,
> >
> > roman
> >
> >
> > On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda <va...@apache.org> wrote:
> >
> >>
> >> On Tue, 10 Jul 2012, Andi Vajda wrote:
> >>
> >>>> I would also like to propose a change, to allow for more flexible
> >>>> mechanism of generating Python class names.
> >>>> The patch doesn't change
> >>>> the default pylucene behaviour, but it gives people a way to replace
> >>>> class names with patterns. I have noticed that there are more
> >>>> same-name classes from different packages in the new lucene (and it
> >>>> becomes worse when one has to deal with both lucene and solr).
> >>>>
> >>>
> >>> Another way to fix this is to reproduce the namespace hierarchy used in
> >>> Lucene, following along the Java packages, something I've been dreading to
> >>> do. Lucene just loooooves a really long deeply nested class structure.
> >>> I'm not convinced yet it is bad enough to go down that route, though.
> >>>
> >>> Your proposal to use patterns may in fact yield a much more convenient
> >>> solution. Thanks !
> >>>
> >>
> >> Rethinking this a bit, I'm prepared to change my mind on this. Your
> >> patterned rename patch shows that we're slowly but surely reaching the
> >> limit of the current setup that consists in throwing all wrapped classes
> >> under the one global 'lucene' namespace.
> >>
> >> Lucene 4.0 has seen a large number of deeply nested classes with similar
> >> names added since 3.x. Renaming these one by one (or excluding some)
> >> doesn't scale. Using the proposed patterned rename scales better but makes it
> >> difficult to know what got renamed and how.
> >> Ultimately, the more classes that are like-named, the more classes would
> >> have unstable names from one release to the next as more duplicated names
> >> are encountered.
> >>
> >> What if instead JCC supported the original Java namespaces all the way to
> >> the Python interface (still dropping the original 'org.apache' Java package
> >> tree prefix) ?
> >> The world-rooted style of naming Java classes isn't Pythonic but using the
> >> second half of the package structure feels right at home in the Python
> >> world.
> >>
> >> JCC already re-creates the complete Java package structure in C++ as
> >> namespaces for all the C++ code it generates, for both the JNI wrapper
> >> classes and the C++/Python types. It's only the installation of the class
> >> names into the Python VM that is done in the flat 'lucene' namespace.
> >>
> >> I think it shouldn't be too hard to change the code that installs classes
> >> to create sub-modules of the lucene module and install classes in these
> >> submodules instead (down to however many levels are in the original).
> >>
> >> In other words:
> >> - from lucene import Document
> >> would become
> >> - from lucene.document import Document
> >>
> >> One could of course also say:
> >> - from lucene.document import Document as whateverOneLikes
> >>
> >> If that proposal isn't mortally flawed somewhere, I'm prepared to drop
> >> support for --rename and replace it with this new Python class/module
> >> layout.
> >>
> >> Since this is being talked about in the context of a major PyLucene
> >> release, version 4.0, and that all tests/samples have to be reworked
> >> anyway, this backwards compat break shouldn't be too controversial,
> >> hopefully.
> >>
> >> If it is, the old --rename could be preserved for sure, but I'd prefer
> >> simplifying the JCC interface rather than accreting more to it.
> >>
> >> What do you think ?
> >>
> >> Andi..
> >>
> >>
> >>> Andi..
> >>>
> >>>
> >>>> I can confirm that test/test_BinaryDocument.py crashes the JVM no more.
> >>>>
> >>>> Roman
> >>>>
> >>>>
> >>>> On Tue, Jul 10, 2012 at 8:54 AM, Andi Vajda <va...@apache.org> wrote:
> >>>>
> >>>>>
> >>>>> Hi Roman,
> >>>>>
> >>>>>
> >>>>> On Mon, 9 Jul 2012, Roman Chyla wrote:
> >>>>>
> >>>>>> Thanks, I am attaching a new patch that adds the missing test base.
> >>>>>> Sorry for the tabs, I was probably messing around with a few editors
> >>>>>> (some of them not configured properly)
> >>>>>>
> >>>>>
> >>>>>
> >>>>> I integrated your test class (renaming it to fit the naming scheme used).
> >>>>> Thanks !
> >>>>>
> >>>>>
> >>>>>>>>>> So far, found one serious problem, crashes VM -- see, e.g.,
> >>>>>>>>>> test/test_BinaryDocument.py - when getting the document using:
> >>>>>>>>>> reader.document(0)
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>>> test/test_BinaryDocument.py doesn't seem to crash the VM but fails because
> >>>>> of some API changes. I suspect the crash to be some issue related to using
> >>>>> an older jcc.
> >>>>>
> >>>>> I see a comment saying: "couldn't find any combination with lucene4.0 where
> >>>>> it would raise errors". Most of these unit tests are straight ports from the
> >>>>> original Java version. If you're stumped about a change, check the original
> >>>>> Java test, it may have changed too.
> >>>>>
> >>>>> Andi..
> >>>>>
> >>>>>
> >>>>
> >>>
>
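
For readers without the attachment, the virtual-package idea discussed in this thread can be sketched roughly as follows. This is not the attached script and not JCC code: register_virtual_packages, fake_lucene, fake_solr and the placeholder classes are hypothetical names made up for illustration, and the sketch assumes that wrapped classes are exposed under flat names such as org_apache_lucene_search_IndexSearcher and that class names contain no underscores. It is written for Python 3, whereas the output shown at the top of this message came from Python 2.

```python
import sys
import types


def register_virtual_packages(flat_module, prefix="org_", separator="_"):
    """Expose flat names as importable dotted packages.

    Hypothetical helper (an assumption, not part of JCC): a name such as
    org_apache_lucene_search_IndexSearcher becomes reachable through
    "from org.apache.lucene.search import IndexSearcher".
    """
    for flat_name, obj in list(vars(flat_module).items()):
        if not flat_name.startswith(prefix):
            continue  # ignore everything that is not a flattened Java name
        parts = flat_name.split(separator)
        package_parts, class_name = parts[:-1], parts[-1]
        parent = None
        for depth in range(1, len(package_parts) + 1):
            dotted = ".".join(package_parts[:depth])
            module = sys.modules.get(dotted)
            if module is None:
                # Create the package on the fly and make it importable.
                module = types.ModuleType(dotted)
                module.__path__ = []  # an empty __path__ marks it as a package
                sys.modules[dotted] = module
            if parent is not None:
                setattr(parent, package_parts[depth - 1], module)
            parent = module
        setattr(parent, class_name, obj)


# Stand-ins for the flat extension modules that initVM() would populate.
class IndexSearcher(object):
    pass


class Analyzer(object):
    pass


class SolrCore(object):
    pass


fake_lucene = types.ModuleType("fake_lucene")
fake_lucene.org_apache_lucene_search_IndexSearcher = IndexSearcher
fake_lucene.org_apache_lucene_analysis_Analyzer = Analyzer

fake_solr = types.ModuleType("fake_solr")
fake_solr.org_apache_solr_core_SolrCore = SolrCore

register_virtual_packages(fake_lucene)
register_virtual_packages(fake_solr)  # reuses the existing org.apache package

from org.apache.lucene.search import IndexSearcher as Searcher
from org.apache.lucene.analysis import Analyzer as Banalyzer
from org.apache.solr.core import SolrCore as Core

print(Searcher, Banalyzer, Core)
```

The second call only adds what is missing, which mirrors the point made in the thread that a shared library such as solr would update the existing packages rather than replace them. Splitting on underscores is the weakest assumption here; a real implementation would presumably take the package path from the Java class itself.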