[ https://issues.apache.org/jira/browse/CODEC-331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary D. Gregory resolved CODEC-331.
-----------------------------------
    Fix Version/s: 1.19.0
       Resolution: Fixed

Hello [~ilikecode]

This is fixed in git master and snapshot builds in [https://repository.apache.org/content/repositories/snapshots/]

Please verify and close this ticket if your use case is fixed.

TY!

> org.apache.commons.codec.language.bm.Rule.parsePhonemeExpr(String) adds
> duplicate empty phoneme when input ends with |
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: CODEC-331
>                 URL: https://issues.apache.org/jira/browse/CODEC-331
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.18.0
>         Environment: Affected Version: 1.18.1 (I found this version from my
> pom.xml)
> MacOS
> JDK 8
>            Reporter: IlikeCode
>            Priority: Major
>             Fix For: 1.19.0
>
>         Attachments: Screenshot 2025-05-19 at 8.11.02 am.png
>
>
> Component: org.apache.commons.codec.language.bm.Rule
> Method: private static PhonemeExpr parsePhonemeExpr(String ph)
>
> h1. Problem
> When the input string is *(()|)*
> The method *parsePhonemeExpr(String)* first strips the parentheses,
> producing: *body = "()|"*
> Then it executes *body.split("[|]")*
> Due to Java's default behavior, the trailing empty string (after the {*}|{*})
> is discarded, resulting in *["()"]*
> To compensate for this, the following logic is used:
> if (body.startsWith("|") || body.endsWith("|"))
> { phs.add(new Phoneme("", Languages.ANY_LANGUAGE)); }
> However, the *"()"* entry already results in a *Phoneme("")* when parsed.
> As a result, the list ends up containing two empty phonemes, which seems
> unintended.
> h1. Expected Result
> Only one empty phoneme should be added for (()|).
>
> h1. Actual Result
>
> Two empty phonemes are returned:
> - One from parsing "()"
> - One manually added due to .endsWith("|")

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
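
The String.split behavior described in the report can be checked in isolation. The following is a minimal sketch, not the Commons Codec code: the class and method names (SplitTrailingEmptyDemo, splitDefault, splitKeepTrailing) are hypothetical and exist only for illustration. It relies solely on java.lang.String.split and java.util.Arrays from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class for illustration; not part of Commons Codec.
final class SplitTrailingEmptyDemo {

    // Same call shape as quoted in the report: the one-argument split
    // (equivalent to limit 0) removes trailing empty strings.
    static String[] splitDefault(final String body) {
        return body.split("[|]");
    }

    // A negative limit keeps trailing empty strings in the result.
    static String[] splitKeepTrailing(final String body) {
        return body.split("[|]", -1);
    }

    public static void main(final String[] args) {
        // "(()|)" with the outer parentheses stripped, as in the report.
        final String body = "()|";
        // One element: "()"
        System.out.println(Arrays.toString(splitDefault(body)));
        // Two elements: "()" and ""
        System.out.println(Arrays.toString(splitKeepTrailing(body)));
    }
}
```

With the one-argument form, code that needs to know whether the input ended with | has to test for that separately, which is the role of the endsWith check quoted in the report.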