Hi,
I was playing with the idea of creating virtual packages; attached is a
working script that illustrates it. I am getting this output:

Dit it work?
==================
from org.apache.lucene.search import SearcherFactory; print SearcherFactory
<type 'SearcherFactory'>
from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer
<type 'Analyzer'>
print sys.modules['org'] <module 'org' (built-in)>
print sys.modules['org.apache'] <module 'org.apache' (built-in)>
print sys.modules['org.apache.lucene'] <module 'org.apache.lucene' (built-in)>
print sys.modules['org.apache.lucene.search'] <module 'org.apache.lucene.search' (built-in)>
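
For readers without the attachment, here is a minimal sketch of the idea. It
is not the attached script. The name register_module follows the
jcc.register_module idea quoted further down, but its signature, the prefix
and separator parameters, and the stand-in classes and modules are assumptions
made purely for illustration. It is written for Python 3, whereas the output
above comes from Python 2.

```python
import sys
import types


def register_module(flat_module, prefix="org_", separator="_"):
    """Expose flat names such as org_apache_lucene_search_IndexSearcher
    as virtual packages (org.apache.lucene.search) in sys.modules.

    Hypothetical helper: the name and parameters are assumptions.
    """
    for flat_name, obj in list(vars(flat_module).items()):
        if not flat_name.startswith(prefix):
            continue  # ignore anything that is not a flattened Java name
        *package_parts, class_name = flat_name.split(separator)
        parent = None
        for depth, part in enumerate(package_parts, start=1):
            dotted = ".".join(package_parts[:depth])
            module = sys.modules.get(dotted)
            if module is None:
                # Create the package on the fly; reuse it if it already exists.
                module = types.ModuleType(dotted)
                module.__path__ = []  # marks the module as a package
                sys.modules[dotted] = module
            if parent is not None:
                setattr(parent, part, module)
            parent = module
        setattr(parent, class_name, obj)


# Stand-ins for wrapped Java classes and for the extension modules.
class IndexSearcher(object):
    pass


class SolrCore(object):
    pass


lucene = types.ModuleType("lucene")
lucene.org_apache_lucene_search_IndexSearcher = IndexSearcher
solr = types.ModuleType("solr")
solr.org_apache_solr_core_SolrCore = SolrCore

register_module(lucene)
register_module(solr)  # updates the existing org.apache package

from org.apache.lucene.search import IndexSearcher as Searcher
from org.apache.solr.core import SolrCore as Core
```

Splitting on "_" is a simplification: a class or package name that itself
contains an underscore would be split in the wrong place, so real code would
need the package information from JCC rather than from the flattened name.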

Cheers,

  roman


On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda <va...@apache.org> wrote:

>
> On Jul 13, 2012, at 18:33, Roman Chyla <roman.ch...@gmail.com> wrote:
>
> > I think this would be great. Let me add a little bit more to your
> > observations (the whole night yesterday was spent fighting with renames,
> > because I was building a project which imports shared lucene and solr;
> > there were thousands of same-named classes, and I am not sure it would be
> > possible without some sort of flexible rename...)
> >
> > JCC is a great tool and is used by potentially many projects, so stripping
> > "org.apache" seems right for pylucene, but looks arbitrary otherwise
>
> Yes, I forgot to say that there would be a way to declare one or more
> mappings so that org.apache.lucene becomes lucene.
>
> Andi..
>
> > (unless there is a flexible stripping mechanism). Also, if the full
> > namespace stays as in the original, then code written in Python would also
> > be executable by Jython, which is IMHO an advantage.
> >
> > But this being Python, the packages cannot be spread across different
> > locations (i.e. there can be only one org.apache.lucene.analysis package),
> > unless there exists (again) some flexible mechanism which populates the
> > namespace with the objects that belong there. It may seem overkill to you,
> > because for single projects it would work, but it seems perfectly
> > justifiable in the case of imported shared libraries
> >
> > I don't know what your idea for implementing the Python packages is, but
> > your last email got me thinking as well: there might be a very simple way
> > of getting to the Java packages inside Python without too much work.
> >
> > Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
> > python as org_apache_lucene_search_IndexSearcher
> >
> > and users do:
> >
> > import lucene
> > lucene.initVM()
> >
> > initVM() first starts the Java VM (and populates the lucene namespace with
> > all objects), but then it calls jcc.register_module(self)
> >
> > A new piece of code inside JCC grabs the lucene module and creates (on the
> > fly) Python packages, using types.ModuleType (or new.module()); the new
> > packages are inserted into sys.modules
> >
> > so after lucene.initVM() returns
> >
> > users can do "from org.apache.lucene.search import IndexSearcher" and get
> > lucene.org_apache_lucene_search_IndexSearcher object
> >
> > and also, when shared libraries are present (let's say 'solr') users do:
> >
> > import solr
> > solr.initVM()
> >
> > JCC will just update the existing packages and create new ones if
> > needed (and from this perspective, having the fully qualified name is
> > safer than having lucene.search.IndexSearcher)
> >
> > I think this change is totally possible and will not change the way
> > extensions are built. Does it have some serious flaw?
> >
> > I would of course be more than happy to contribute and test.
> >
> > Best,
> >
> >  roman
> >
> >
> > On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda <va...@apache.org> wrote:
> >
> >>
> >> On Tue, 10 Jul 2012, Andi Vajda wrote:
> >>
> >>>> I would also like to propose a change, to allow for a more flexible
> >>>> mechanism of generating Python class names. The patch doesn't change
> >>>> the default pylucene behaviour, but it gives people a way to replace
> >>>> class names with patterns. I have noticed that there are more
> >>>> same-name classes from different packages in the new lucene (and it
> >>>> becomes worse when one has to deal with both lucene and solr).
> >>>>
> >>>
> >>> Another way to fix this is to reproduce the namespace hierarchy used in
> >>> Lucene, following along the Java packages, something I've been dreading
> >>> to do. Lucene just loooooves a really long deeply nested class structure.
> >>> I'm not convinced yet it is bad enough to go down that route, though.
> >>>
> >>> Your proposal to use patterns may in fact yield a much more convenient
> >>> solution. Thanks !
> >>>
> >>
> >> Rethinking this a bit, I'm prepared to change my mind on this. Your
> >> patterned rename patch shows that we're slowly but surely reaching the
> >> limit of the current setup that consists in throwing all wrapped classes
> >> under the one global 'lucene' namespace.
> >>
> >> Lucene 4.0 has seen a large number of deeply nested classes with similar
> >> names added since 3.x. Renaming these one by one (or excluding some)
> >> doesn't scale. Using the proposed patterned rename scales better but
> >> makes it difficult to know what got renamed and how.
> >> Ultimately, the more classes that are like-named, the more classes would
> >> have unstable names from one release to the next as more duplicated
> >> names are encountered.
> >>
> >> What if instead JCC supported the original Java namespaces all the way to
> >> the Python interface (still dropping the original 'org.apache' Java
> >> package tree prefix)?
> >> The world-rooted style of naming Java classes isn't Pythonic but using the
> >> second half of the package structure feels right at home in the Python
> >> world.
> >>
> >> JCC already re-creates the complete Java package structure in C++ as
> >> namespaces for all the C++ code it generates, for both the JNI wrapper
> >> classes and the C++/Python types. It's only the installation of the class
> >> names into the Python VM that is done in the flat 'lucene' namespace.
> >>
> >> I think it shouldn't be too hard to change the code that installs classes
> >> to create sub-modules of the lucene module and install classes in these
> >> submodules instead (down to however many levels are in the original).
> >>
> >> In other words:
> >>  - from lucene import Document
> >> would become
> >>  - from lucene.document import Document
> >>
> >> One could of course also say:
> >>  - from lucene.document import Document as whateverOneLikes
> >>
> >> If that proposal isn't mortally flawed somewhere, I'm prepared to drop
> >> support for --rename and replace it with this new Python class/module
> >> layout.
> >>
> >> Since this is being talked about in the context of a major PyLucene
> >> release, version 4.0, and all tests/samples have to be reworked
> >> anyway, this backwards compat break shouldn't be too controversial,
> >> hopefully.
> >>
> >> If it is, the old --rename could be preserved for sure, but I'd prefer
> >> simplifying the JCC interface to accreting more onto it.
> >>
> >> What do you think ?
> >>
> >> Andi..
> >>
> >>
> >>> Andi..
> >>>
> >>>
> >>>> I can confirm that test_BinaryDocument.py no longer crashes the JVM.
> >>>>
> >>>> Roman
> >>>>
> >>>>
> >>>> On Tue, Jul 10, 2012 at 8:54 AM, Andi Vajda <va...@apache.org> wrote:
> >>>>
> >>>>>
> >>>>> Hi Roman,
> >>>>>
> >>>>>
> >>>>> On Mon, 9 Jul 2012, Roman Chyla wrote:
> >>>>>
> >>>>> Thanks, I am attaching a new patch that adds the missing test base.
> >>>>>> Sorry for the tabs, I was probably messing around with a few editors
> >>>>>> (some of them not configured properly)
> >>>>>>
> >>>>>
> >>>>>
> >>>>> I integrated your test class (renaming it to fit the naming scheme
> >>>>> used).
> >>>>> Thanks !
> >>>>>
> >>>>>
> >>>>> So far, found one serious problem, crashes VM -- see. eg
> >>>>>>>>>> test/test_BinaryDocument.py - when getting the document using:
> >>>>>>>>>> reader.document(0)
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>>> test/test_BinaryDocument.py doesn't seem to crash the VM but fails
> >>>>> because of some API changes. I suspect the crash to be some issue
> >>>>> related to using an older jcc.
> >>>>>
> >>>>> I see a comment saying: "couldn't find any combination with lucene4.0
> >>>>> where it would raise errors". Most of these unit tests are straight
> >>>>> ports from the original Java version. If you're stumped about a change,
> >>>>> check the original Java test; it may have changed too.
> >>>>>
> >>>>> Andi..
> >>>>>
> >>>>>
> >>>>
> >>>
>
