Package: g2p-sk
Version: 0.3
Severity: normal

Ä (U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS) is missing from the lc_1
subroutine. As a result, asking for a transcription of Ä yields Ä in
ISO-8859-2 encoding, which is not legal SAMPA.

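To illustrate the failure mode, here is a minimal sketch of a table-driven
lowercasing routine. It is written in Python for brevity and is not the
package's Perl code; the names UPPER, LOWER, make_lowercaser, lc_buggy and
lc_fixed are hypothetical, and the assumption that lc_1 works from a fixed
character table is mine. A letter absent from the table passes through
unchanged, which matches the behaviour described above.

```python
# Illustrative only: a table-driven lowercaser for Slovak letters.
# Nothing here is taken from the g2p-sk sources.
UPPER = "AÁÄBCČDĎEÉFGHIÍJKLĹĽMNŇOÓÔPQRŔSŠTŤUÚVWXYÝZŽ"
LOWER = "aáäbcčdďeéfghiíjklĺľmnňoóôpqrŕsštťuúvwxyýzž"


def make_lowercaser(upper, lower):
    """Return a function mapping each char of `upper` to the one in `lower`."""
    # str.maketrans raises ValueError if the two strings differ in length,
    # which catches a table that has drifted out of sync.
    table = str.maketrans(upper, lower)
    return lambda text: text.translate(table)


# Table with the Ä/ä pair left out, mimicking the reported omission.
lc_buggy = make_lowercaser(UPPER.replace("Ä", ""), LOWER.replace("ä", ""))

# Table with the pair present.
lc_fixed = make_lowercaser(UPPER, LOWER)

if __name__ == "__main__":
    print(lc_buggy("Ä"))  # the capital letter passes through untouched
    print(lc_fixed("Ä"))  # the capital letter is lowercased
```

With the pair missing, the capital letter reaches the later transcription
stage as-is, so no rule matches it.
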
In addition, g2p-sk should probably NOT repeat unknown characters in the
output, since the output is supposed to be SAMPA and any additional
characters confuse the hell out of any output-parsing utilities.

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: powerpc (ppc)
Shell:  /bin/sh linked to /bin/dash
Kernel: Linux 2.6.15-1-powerpc
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages g2p-sk depends on:
ii  perl                          5.8.8-6.1  Larry Wall's Practical Extraction
ii  sylseg-sk                     0.5        Syllabic segmentation for Slovak l

g2p-sk recommends no packages.

-- no debconf information
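
The suggestion about unknown characters could look roughly like the sketch
below: anything without a transcription rule is dropped from the output and
reported on stderr instead of being echoed. This is Python for illustration
only, not the package's Perl code. TOY_RULES and transcribe are hypothetical
names, and the tiny rule table is a placeholder, not the real g2p-sk rule
set.

```python
import sys

# Placeholder rule table, for illustration only; not the g2p-sk rules.
TOY_RULES = {"m": "m", "ä": "{", "s": "s", "o": "o"}


def transcribe(text, rules=TOY_RULES, warn=sys.stderr):
    """Transcribe `text`, skipping characters that have no rule.

    Returns (transcription, skipped), where `skipped` lists the characters
    that were left out of the transcription.
    """
    out = []
    skipped = []
    for ch in text:
        if ch in rules:
            out.append(rules[ch])
        else:
            # Do not echo the character; remember it for the warning.
            skipped.append(ch)
    if skipped:
        codes = " ".join(f"U+{ord(c):04X}" for c in skipped)
        print(f"skipped unknown characters: {codes}", file=warn)
    return "".join(out), skipped


if __name__ == "__main__":
    result, skipped = transcribe("mÄso")
    print(result)
```

Keeping the diagnostics on stderr leaves stdout as pure SAMPA, so a
downstream parser never sees a stray byte.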