PyHyphen-0.2.1a - OpenOffice-like hyphenation

2008-02-13 Thread Dr. leo
Hi,

a couple of weeks ago I uploaded PyHyphen-0.1 on the PyPI. It is a wrapper
around the C library "hnj_hyphen 2.3" that ships with OpenOffice and Mozilla
products. You can have a look at PyHyphen at

http://pypi.python.org/pypi/PyHyphen/0.2.1a

I've tested it on Linux only. It should also run on Windows, but I have no C
compiler there and, as I am using W2K, I don't expect to get one.

What PyHyphen can do is shown in the code example on the module's cover page
on PyPI. Unlike a wrapper module called 'pyhnj' written by someone from
Berkeley, PyHyphen supports non-standard hyphenation with replacements. It
is therefore suitable for all languages. It also accepts unicode objects.
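
To illustrate what "non-standard hyphenation with replacements" means, here
is a minimal sketch in present-day Python 3. The function name and the
(start, cut, replacement) encoding are assumptions made for illustration;
they are neither PyHyphen's API nor the dictionary file format. The sample
word is Hungarian 'asszonnyal', whose spelling changes at the break points.

```python
def apply_nonstandard_break(word, start, cut, replacement):
    """Split `word` at a hyphenation point that alters its spelling.

    Hypothetical encoding: `replacement` contains exactly one '=' marking
    the break, and it replaces the `cut` characters of `word` that begin
    at index `start`.
    """
    changed = word[:start] + replacement + word[start + cut:]
    head, tail = changed.split("=", 1)
    return head, tail


# 'ssz' becomes 'sz' + 'sz' when the word is broken there
print(apply_nonstandard_break("asszonnyal", 1, 3, "sz=sz"))
# 'nny' becomes 'ny' + 'ny' when the word is broken there
print(apply_nonstandard_break("asszonnyal", 5, 3, "ny=ny"))
```

A wrapper that only returns break positions cannot express such changes,
which is why replacements matter for languages like Hungarian.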

There are no inherent limitations apart from a maximum word length of 100.

Just download a hyphenation dictionary for your favorite language at
http://wiki.services.openoffice.org/wiki/Dictionaries

or use the included English one, ... and enjoy!

The 10-line 'example.py' included in the tarball shows how it works.

There are no docstrings yet (only a README), and the C source of the wrapper
module 'hyphenmodule.c' is lengthier than desirable. I will work on this as
soon as I can. For now I am not aware of any major bugs, though.

My next plans are as follows:
- shorten and polish the C code
- add short doc strings
- add search paths for the dictionaries
- add a 'wrap' method to the convenience interface that selects the best
hyphenation (among a tuple of pairs) to fit in the current line
- modify the 'textwrap' module from the Standard Library to use hyphenation
instead of just pushing entire words into the next line
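
The planned 'wrap' method could work roughly as sketched below (present-day
Python 3). This is only an assumption about the selection rule, not the
actual implementation: among the candidate pairs, take the longest first
part that still fits into the remaining width once a hyphen is appended.
The name 'choose_split' and the sample pairs are illustrative.

```python
def choose_split(pairs, width, hyphen="-"):
    """Return [head + hyphen, tail] for the longest head that fits.

    `pairs` is a sequence of (head, tail) candidates for one word.
    Returns None if no candidate fits into `width` characters.
    """
    best = None
    for head, tail in pairs:
        if len(head) + len(hyphen) <= width:
            if best is None or len(head) > len(best[0]):
                best = (head, tail)
    if best is None:
        return None
    return [best[0] + hyphen, best[1]]


# Illustrative candidates for the word 'beautiful'
pairs = [("beau", "tiful"), ("beauti", "ful")]
print(choose_split(pairs, 6))
print(choose_split(pairs, 7))
print(choose_split(pairs, 3))
```

Returning None when nothing fits leaves it to the caller to push the whole
word into the next line.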

Any feedback or help, preferably by e-mail ([EMAIL PROTECTED]), is
highly welcome. I won't read the postings here regularly.

Have fun!

Stefan



-- 
http://mail.python.org/mailman/listinfo/python-list


Hyphenation module PyHyphen-0.3 released

2008-02-23 Thread Dr. leo
I am pleased to share with you the great features of the latest version.
Large parts of the sources were completely rewritten. Also, they are now
reasonably documented.

Just go to
http://pypi.python.org/pypi/PyHyphen/0.3

I was tempted to classify it as Beta. Indeed I am not aware of any bugs, but
I haven't spent very much time on testing; I just ran some word lists...

Any feedback is greatly appreciated. I would be especially interested in
experiences under Windows, as I can only test it under Linux.

If some good soul could send me a DLL for Windows ([EMAIL PROTECTED]),
that would be terrific.

Bests

Leo




Hyphenation module PyHyphen-0.4 released - good news for Windows users

2008-02-26 Thread Dr. leo
Thank you very much for your interest, helpful comments and suggestions. Two
of you have even sent me .pyd files, one of which (apparently compiled with
MSVC 2003) is contained in version 0.4 for manual installation. This has
spared me the hassle of installing cygwin etc.

I have made the following changes, all of which are due to feedback received
from this group:
- restructured the whole thing to form a package rather than two independent
modules in the root package (only two days ago I became aware of the beauty
of __init__.py files!)
- setup script installs the default English dictionary in the package dir
rather than in a completely useless dict/ subdir. So the trouble with
IOErrors upon instantiation should be history.
- improved the module documentation: it now contains the link to the
OpenOffice website where you can download your favorite dictionary.
- I also updated the README file and cleaned up the source tree.

Reportedly, and in contrast to gcc, MSVC produces a few signed/unsigned
mismatch warnings. This should be easy to fix, but I haven't had the time
yet. The pyd runs smoothly with Python 2.5 anyway. My special thanks go to
G.K. for his detailed feedback.

So I think it is worthwhile to download the latest version at
http://pypi.python.org/pypi/PyHyphen/0.4

Finally, I'd like to share with you some ideas regarding potential next
steps which might add value to some projects:

- integrating PyHyphen with the standard module 'textwrap'

'textwrap' is quite a useful module, but it might benefit from hyphenation
capabilities. Consider an optional argument, say, 'use_hyphens', to be
passed to the __init__ method of the TextWrapper class. It should default to
None for backwards compatibility. If a hyphenator is given, the methods
doing the wrapping should invoke hyphenator.wrap(word, width) and use the
result. The changes should be easy to implement. I'm not sure whether
subclassing TextWrapper would be the preferred approach...
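
For discussion, here is a minimal sketch of the subclassing variant in
present-day Python 3. It is not an actual implementation: the class names
are hypothetical, the wrapping loop is a simplified greedy one that
overrides only the public wrap() method, and 'StubHyphenator' is a stand-in
that knows a single word. It assumes that hyphenator.wrap(word, width)
returns [head_with_hyphen, tail] with a non-empty head, or None.

```python
import textwrap


class StubHyphenator:
    """Stand-in for a real hyphenator; knows one word only."""

    PAIRS = {"beautiful": [("beau", "tiful"), ("beauti", "ful")]}

    def wrap(self, word, width):
        fitting = [p for p in self.PAIRS.get(word, [])
                   if len(p[0]) + 1 <= width]
        if not fitting:
            return None
        head, tail = max(fitting, key=lambda p: len(p[0]))
        return [head + "-", tail]


class HyphenatingWrapper(textwrap.TextWrapper):
    """TextWrapper with an optional 'use_hyphens' argument (sketch)."""

    def __init__(self, *args, use_hyphens=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.use_hyphens = use_hyphens

    def wrap(self, text):
        if self.use_hyphens is None:
            return super().wrap(text)  # unchanged standard behaviour
        lines, current = [], ""
        words = text.split()[::-1]  # stack: next word is at the end
        while words:
            word = words.pop()
            sep = " " if current else ""
            if len(current) + len(sep) + len(word) <= self.width:
                current += sep + word
                continue
            # Word does not fit: try to fill the rest of the line.
            room = self.width - len(current) - len(sep)
            split = self.use_hyphens.wrap(word, room) if room > 1 else None
            if split:
                head, tail = split
                lines.append(current + sep + head)
                current = ""
                words.append(tail)  # remainder starts the next line
            elif current:
                lines.append(current)
                current = ""
                words.append(word)  # retry on a fresh line
            else:
                lines.append(word)  # too long even for an empty line
        if current:
            lines.append(current)
        return lines


wrapper = HyphenatingWrapper(width=12, use_hyphens=StubHyphenator())
print(wrapper.wrap("a very beautiful day"))
```

Because TextWrapper.fill() calls wrap(), the subclass gets hyphenated fill()
for free, and omitting 'use_hyphens' keeps the standard behaviour.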

- exploring whether there is appetite to integrate PyHyphen with GUIs and
web development frameworks.

Although I would be happy to give it a first go on textwrap, I fear I won't
find the time in the coming weeks. Spring is ahead in my country after
all...

So if you wish to contribute, the above bullets may be good starting points.

Thanks again and enjoy!

Leo




Hyphenation: PyHyphen 0.4.1 and textwrap2-0.1.1 released

2008-03-02 Thread Dr. leo
This latest version of PyHyphen is only important for Python 2.4 addicts who
encountered a missing type when compiling. Further, a few signed/unsigned
mismatch warnings coming from MSVC should be fixed. As I have only Python
2.5, I'd be interested in any experiences when compiling it with Python 2.4.

Visit http://cheeseshop.python.org/pypi/PyHyphen

Further, as suggested here some days ago, I have integrated 'textwrap' with
PyHyphen. While I anticipated lots of work and hoped for volunteers, I have
done it myself now. And it was a cakewalk! I just had to insert roughly a
handful of lines... Pure Python is pure fun!

Visit http://cheeseshop.python.org/pypi/textwrap2

Bests
Stefan




Hyphenation: PyHyphen-0.7 released

2008-04-02 Thread Dr. leo
Hi,

I have just uploaded the latest sources of PyHyphen
(http://cheeseshop.python.org/pypi/PyHyphen). The tarball also contains
Windows binaries of the C extension for Python 2.4 and 2.5. So most Windows
users will get going without compiling. Just enter the usual 'python
setup.py install'.

There are many bug fixes, both in the Python modules on top and in the C
extension, which now uses a new release of the underlying hyphenation
library (hyphen-2.3.1).

Further, I have added a module 'dictools' for easy download and installation
of dictionaries (see below). Finally, a script for easy testing of the
hyphenation functionality with large wordlists and multiple dictionaries has
been added. Dictionaries are installed on the fly and everything is logged.

The default dir for dictionaries and the default repository to download
dictionaries are configurable, so that one can use existing dictionaries,
e.g., from an OpenOffice installation.

The package also includes and installs the module 'textwrap2' which adds a
hyphenation feature to the standard module textwrap.


Code example:

from hyphen import hyphenator
from hyphen.dictools import *

# Download and install some dictionaries in the default directory using the
# default repository, usually the OpenOffice website
for lang in ['de_DE', 'fr_FR', 'en_UK', 'hu_HU']:
    if not is_installed(lang): install(lang)

# Create some hyphenators
h_de = hyphenator('de_DE')
h_en = hyphenator('en_US')
h_hu = hyphenator('hu_HU')

# Now hyphenate some words

print h_hu.inserted(u'asszonnyal')
'asz=szony=nyal'

print h_en.pairs('beautiful')
[[u'beau', u'tiful'], [u'beauti', u'ful']]

print h_en.wrap('beautiful', 6)
[u'beau-', u'tiful']

print h_en.wrap('beautiful', 7)
[u'beauti-', u'ful']

from textwrap2 import fill
print fill('very long text...', width = 40, use_hyphens = h_en)

My thanks go to those who helped enormously with advice, suggestions,
criticism and Windows builds.

Regards

Leo

