PyHyphen-0.2.1a - OpenOffice-like hyphenation
Hi,

a couple of weeks ago I uploaded PyHyphen-0.1 to PyPI. It is a wrapper around the C library "hnj_hyphen 2.3" that ships with OpenOffice and Mozilla products. You can have a look at PyHyphen at

http://pypi.python.org/pypi/PyHyphen/0.2.1a

I've tested it on Linux, but it should also run on Windows (for which I have no C compiler, and I don't think I will get one as I am using W2K). What PyHyphen can do is shown in the code example on the module's cover page on PyPI.

Unlike a wrapper module called 'pyhnj' written by someone from Berkeley, PyHyphen supports non-standard hyphenation with replacements. It is therefore suitable for all languages. It also accepts unicode objects. There are no inherent limitations apart from a maximum word length of 100 characters.

Just download a hyphenation dictionary for your favorite language at

http://wiki.services.openoffice.org/wiki/Dictionaries

or use the included English one, ... and enjoy! The 10-line 'example.py' included in the tarball shows how it works.

There are no doc strings yet (but there is a README), and the C source of the wrapper module 'hyphenmodule.c' is lengthier than desirable. I will work on this as soon as I can. For now I am not aware of any major bugs, though.

My next plans are as follows:

- shorten and polish the C code
- add short doc strings
- add search paths for the dictionaries
- add a 'wrap' method to the convenience interface that selects the best hyphenation (among a tuple of pairs) to fit in the current line
- modify the 'textwrap' module from the Standard Library to use hyphenation instead of just pushing entire words into the next line

Any feedback or help, preferably by e-mail ([EMAIL PROTECTED]), is highly welcome. I won't read the postings here regularly.

Have fun!

Stefan

--
http://mail.python.org/mailman/listinfo/python-list
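To illustrate what "non-standard hyphenation with replacements" means, here is a minimal pure-Python sketch. It does not use PyHyphen or hnj_hyphen; the helper `split_at` and its arguments are hypothetical, and the break positions are hand-picked for illustration rather than read from a dictionary. A standard break only cuts the word, while a non-standard break also rewrites a few letters around the cut, as in Hungarian.

```python
def split_at(word, start, end, head_repl, tail_repl):
    """Split `word`, replacing word[start:end] by head_repl | tail_repl.

    Hypothetical helper, not part of PyHyphen. A standard break is the
    special case start == end with two empty replacement strings.
    """
    return word[:start] + head_repl, tail_repl + word[end:]


# Standard hyphenation: the letters are unchanged, the word is only cut.
standard = split_at("beautiful", 4, 4, "", "")

# Non-standard hyphenation: Hungarian "ssz" becomes "sz" + "sz" at the
# break, and "nny" becomes "ny" + "ny".
first = split_at("asszonnyal", 1, 4, "sz", "sz")
second = split_at("asszonnyal", 5, 8, "ny", "ny")

print(standard)
print(first)
print(second)
```

A wrapper that only reports cut positions cannot express the second and third cases, because the two halves no longer concatenate to the original word.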
Hyphenation module PyHyphen-0.3 released
I am pleased to share with you the great features of the latest version. Large parts of the sources were completely rewritten. They are also reasonably documented now. Just go to

http://pypi.python.org/pypi/PyHyphen/0.3

I was tempted to classify it as Beta. I am not aware of any bugs, but I haven't spent much time on testing; I just ran some word lists...

Any feedback is greatly appreciated. I would be especially interested in experiences under Windows, as I can only test it under Linux. If some good soul could send me a DLL for Windows ([EMAIL PROTECTED]), that would be terrific.

Best

Leo
Hyphenation module PyHyphen-0.4 released - good news for Windows users
Thank you very much for your interest, helpful comments and suggestions. Two of you have even sent me .pyd files, one of which (presumably compiled with MSVC 2003) is contained in version 0.4 for manual installation. This has spared me the hassle of installing cygwin etc.

I have made the following changes, all of which are due to feedback received from this group:

- restructured the whole thing to form a package rather than two independent modules in the root package (only two days ago I became aware of the beauty of __init__.py files!)
- the setup script installs the default English dictionary in the package dir rather than in a completely useless dict/ subdir. So the trouble with IOErrors upon instantiation should be history.
- improved the module documentation: it now contains the link to the OpenOffice website where you can download your favorite dictionary.
- I also updated the README file and cleaned up the source tree.

Reportedly, and in contrast to gcc, MSVC produces a few signed/unsigned mismatch warnings. This should be easy to fix, but I haven't had the time for it. The pyd runs smoothly with Python 2.5 anyway.

My special thanks go to G.K. for his detailed feedback.

So I think it is worthwhile to download the latest version at http://pypi.python.org/pypi/PyHyphen/0.4.

Finally, I'd like to share with you some ideas regarding potential next steps which might add value to some projects:

- integrating PyHyphen with the standard module 'textwrap'
  'textwrap' is quite a useful thing, but it might benefit from hyphenation capabilities. Consider an optional argument, say, 'use_hyphens', to be passed to the __init__ method of the TextWrapper class. It should default to None for backwards compatibility. The methods doing the wrapping should then invoke hyphenator.wrap(word, width), if available, and do the necessary work. The changes should be easy to implement. I'm not sure whether subclassing TextWrapper would be the preferred approach...
- exploring whether there is appetite to integrate PyHyphen with GUIs and web development frameworks.

Although I would be happy to give it a first go on textwrap, I fear I won't find the time in the coming weeks. Spring is coming in my country, after all... So if you wish to contribute, the above bullets may be good starting points.

Thanks again and enjoy!

Leo
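The 'textwrap' idea above could look roughly like the following sketch, written for current Python 3. It takes the subclassing route and is only one possible design, not the eventual implementation. `StubHyphenator` is a hypothetical stand-in with hand-written split points so that the example runs without PyHyphen installed. Its `wrap(word, width)` method is assumed to return `[head + '-', tail]`, or an empty list if nothing fits. The greedy line filling is deliberately simpler than what the stock `TextWrapper` does.

```python
import textwrap


class StubHyphenator:
    """Hypothetical stand-in for a hyphenator object; knows one word only."""

    # Hand-written split points, not real dictionary output.
    _PAIRS = {"beautiful": [("beau", "tiful"), ("beauti", "ful")]}

    def wrap(self, word, width):
        """Return [head + '-', tail] for the longest head that fits, else []."""
        best = []
        for head, tail in self._PAIRS.get(word, []):
            fits = len(head) + 1 <= width  # + 1 for the hyphen
            if fits and (not best or len(head) + 1 > len(best[0])):
                best = [head + "-", tail]
        return best


class HyphenatingWrapper(textwrap.TextWrapper):
    """TextWrapper with an optional hyphenator; None keeps stock behaviour."""

    def __init__(self, *args, use_hyphens=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.use_hyphens = use_hyphens

    def wrap(self, text):
        if self.use_hyphens is None:
            return super().wrap(text)
        lines, current = [], ""
        words = text.split()[::-1]  # reversed, so pop() yields words in order
        while words:
            word = words.pop()
            candidate = word if not current else current + " " + word
            if len(candidate) <= self.width:
                current = candidate
                continue
            # The word does not fit: try to keep a hyphenated head on this line.
            room = self.width - len(current) - (1 if current else 0)
            parts = self.use_hyphens.wrap(word, room)
            if parts:
                current = parts[0] if not current else current + " " + parts[0]
                words.append(parts[1])  # the tail starts the next line
            elif not current:
                lines.append(word)  # over-long word without a usable break
                continue
            else:
                words.append(word)  # retry the whole word on a fresh line
            lines.append(current)
            current = ""
        if current:
            lines.append(current)
        return lines


text = "a beautiful day"
plain = HyphenatingWrapper(width=10).wrap(text)
hyphenated = HyphenatingWrapper(width=10, use_hyphens=StubHyphenator()).wrap(text)
print(plain)
print(hyphenated)
```

Because 'use_hyphens' defaults to None and that path delegates to the parent class, existing callers would see no change in output.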
Hyphenation: PyHyphen 0.4.1 and textwrap2-0.1.1 released
This latest version of PyHyphen is only important for Python 2.4 addicts who encountered a missing type when compiling. In addition, a few signed/unsigned mismatch warnings coming from MSVC should now be fixed. As I have only Python 2.5, I'd be interested in any experiences with compiling it under Python 2.4.

Visit http://cheeseshop.python.org/pypi/PyHyphen

Further, as suggested here some days ago, I have integrated 'textwrap' with PyHyphen. While I anticipated lots of work and hoped for volunteers, I have now done it myself. And it was a cakewalk! I just had to insert roughly a handful of lines... Pure Python is pure fun!

Visit http://cheeseshop.python.org/pypi/textwrap2

Best

Stefan
Hyphenation: PyHyphen-0.7 released
Hi,

I have just uploaded the latest sources of PyHyphen (http://cheeseshop.python.org/pypi/PyHyphen).

The tarball also contains Windows binaries of the C extension for Python 2.4 and 2.5, so most Windows users will get going without compiling. Just enter the usual 'python setup.py install'.

There are many bug fixes, both in the Python modules on top and in the C extension, which uses a new release of the underlying hyphenation library (hyphen-2.3.1). Further, I have added a module 'dictools' for easy download and installation of dictionaries (see below). Finally, a script for easy testing of the hyphenation functionality with large word lists and multiple dictionaries has been added. Dictionaries are installed on the fly and everything is logged. The default dir for dictionaries and the default repository to download dictionaries from are configurable, so that one can use existing dictionaries, e.g., from an OpenOffice installation. The package also includes and installs the module 'textwrap2', which adds a hyphenation feature to the standard module textwrap.

Code example:

    from hyphen import hyphenator
    from hyphen.dictools import *

    # Download and install some dictionaries in the default directory
    # using the default repository, usually the OpenOffice website
    for lang in ['de_DE', 'fr_FR', 'en_UK', 'hu_HU']:
        if not is_installed(lang): install(lang)

    # Create some hyphenators
    h_de = hyphenator('de_DE')
    h_en = hyphenator('en_US')
    h_hu = hyphenator('hu_HU')

    # Now hyphenate some words
    print h_hu.inserted(u'asszonnyal')
    # 'asz=szony=nyal'

    print h_en.pairs('beautiful')
    # [[u'beau', u'tiful'], [u'beauti', u'ful']]

    print h_en.wrap('beautiful', 6)
    # [u'beau-', u'tiful']

    print h_en.wrap('beautiful', 7)
    # [u'beauti-', u'ful']

    from textwrap2 import fill
    print fill('very long text...', width = 40, use_hyphens = h_en)

My thanks go to those who helped enormously with advice, suggestions, criticism and Windows builds.

Regards

Leo
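The relation between the pairs() and wrap() outputs in the example above can be reproduced with a few lines of pure Python (written for current Python 3). This is a sketch of the selection rule only, not PyHyphen's actual code. The function name `best_split` is hypothetical. It assumes that the hyphen counts toward the width, which matches the two wrap() results quoted above, and that an empty list is returned when no head fits, which is an assumption.

```python
def best_split(pairs, width):
    """Return [head + '-', tail] for the longest head fitting in `width`.

    Hypothetical helper. The hyphen is counted as one character of the
    line, and an empty list signals that no hyphenation fits.
    """
    fitting = [(head, tail) for head, tail in pairs if len(head) + 1 <= width]
    if not fitting:
        return []
    head, tail = max(fitting, key=lambda pair: len(pair[0]))
    return [head + "-", tail]


# The pairs reported for 'beautiful' in the example above.
pairs = [["beau", "tiful"], ["beauti", "ful"]]

print(best_split(pairs, 6))  # ['beau-', 'tiful']
print(best_split(pairs, 7))  # ['beauti-', 'ful']
print(best_split(pairs, 4))  # []
```

Choosing the longest fitting head wastes the least space at the end of the line, which is what a caller filling lines to a fixed width would normally want.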