[ https://issues.apache.org/jira/browse/IMAGING-380 ]
Thomas Stieler deleted comment on IMAGING-380:
----------------------------------------
was (Author: JIRAUSER308395):
#
> Option to enforce UTF-8 encoding for IPTC records
> -------------------------------------------------
>
> Key: IMAGING-380
> URL: https://issues.apache.org/jira/browse/IMAGING-380
> Project: Commons Imaging
> Issue Type: New Feature
> Components: Format: JPEG
> Affects Versions: 1.0.0-alpha5
> Reporter: Thomas Stieler
> Priority: Major
> Fix For: 1.0.0-alpha6
>
>
> We are using commons-imaging to add IPTC records to JPEG images.
> Currently the encoding for IPTC values is determined by testing whether all
> values can be encoded with ISO-8859-1 (see
> [here|https://github.com/apache/commons-imaging/blob/master/src/main/java/org/apache/commons/imaging/formats/jpeg/iptc/IptcParser.java#L375-L382]).
>
> * if ISO-8859-1 is used as charset, no envelope record "Coded Character Set"
> is written
> * if ISO-8859-1 cannot be used, then the IPTC values will be converted into
> UTF-8 bytes and the envelope record "Coded Character Set" is set accordingly
>
> The problem with this strategy is that some applications/libraries use
> UTF-8 as the default charset to parse IPTC records if "Coded Character Set" is
> not set. In this case, non-ASCII characters (like "äöü") that were written as
> ISO-8859-1 bytes are parsed incorrectly.
>
> Currently there is no option to enforce UTF-8 encoding. Such an option would
> make it more reliable for other tools to parse IPTC records written by
> commons-imaging correctly.
>
> I already made some small changes to offer an option for enforcing UTF-8
> encoding without changing the current default behaviour. It would be cool if
> someone could review my pull request:
> https://github.com/apache/commons-imaging/pull/477
> Please send me a message if something is missing or incomplete!
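
The charset selection described in the issue, and the effect of an opt-in
"force UTF-8" switch, can be sketched roughly as follows. This is only an
illustration under stated assumptions: the class IptcCharsetSketch, the
method chooseCharset and the forceUtf8 parameter are hypothetical names and
are not taken from commons-imaging or from the pull request.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch; NOT the commons-imaging API.
class IptcCharsetSketch {

    // Picks the charset for a set of IPTC values.
    // forceUtf8 is an assumed opt-in flag; false keeps the behaviour described in the issue.
    static Charset chooseCharset(List<String> values, boolean forceUtf8) {
        if (forceUtf8) {
            // A writer would then also emit the "Coded Character Set" envelope record.
            return StandardCharsets.UTF_8;
        }
        for (String value : values) {
            // Fall back to UTF-8 as soon as one value is not representable in ISO-8859-1.
            if (!StandardCharsets.ISO_8859_1.newEncoder().canEncode(value)) {
                return StandardCharsets.UTF_8;
            }
        }
        // All values fit: ISO-8859-1, and no "Coded Character Set" record is written.
        return StandardCharsets.ISO_8859_1;
    }

    // Simulates a reader that assumes UTF-8 when no "Coded Character Set" record is present.
    static String readAssumingUtf8(String value, Charset writtenAs) {
        return new String(value.getBytes(writtenAs), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String umlauts = "\u00e4\u00f6\u00fc"; // the characters a-umlaut, o-umlaut, u-umlaut
        List<String> values = List.of(umlauts);

        Charset byDefault = chooseCharset(values, false);
        Charset forced = chooseCharset(values, true);

        // With the default strategy the round trip through a UTF-8 reader does not survive.
        System.out.println(byDefault + " intact: "
                + umlauts.equals(readAssumingUtf8(umlauts, byDefault)));
        System.out.println(forced + " intact: "
                + umlauts.equals(readAssumingUtf8(umlauts, forced)));
    }
}
```

With forceUtf8 left at false, the sketch keeps the current default, which
matches the stated goal of not changing existing behaviour.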
--
This message was sent by Atlassian Jira
(v8.20.10#820010)