While working on it, I found that there's an already existing error called no_translation.
I believe that this is the intended error message since it says "device failed to transcode string from ~p to ~p". This is exactly our use case I think. Based on this, I created two PRs: 1. OTP: https://github.com/erlang/otp/pull/5885 2. Elixir: https://github.com/elixir-lang/elixir/pull/11756 The Elixir PR does not depend on the OTP PR since the error was already in place and therefore also works in current OTP versions. It however has a small issue in the tests (ExUnit / iex somehow behave differently). I would therefore appreciate a hint on what is going wrong. Since my proposal is represented completely in the two PRs, I will no longer update this thread. For anyone following along, please have a look at the PRs instead. José Valim schrieb am Dienstag, 12. April 2022 um 18:39:36 UTC+8: > I would do it based on the error_info. In fact if you could solve this all > together with the error_info, then it would be even better. > > On Tue, Apr 12, 2022 at 11:14 AM Jonatan Männchen <[email protected]> > wrote: > >> Great :) >> >> Last question for now: Would you add the details of the error >> to error_info of the badarg error or should this be an entirely different >> error? >> >> On Tuesday, April 12, 2022 at 4:11:45 PM UTC+8 José Valim wrote: >> >>> I would personally send a PR with steps one and two, and also changing >>> one of the callsites for 3 so you have something to test. Then another PR >>> to migrate the remaining call sites! >>> >>> On Tue, Apr 12, 2022 at 9:38 AM Jonatan Männchen <[email protected]> >>> wrote: >>> >>>> Hi all, >>>> >>>> I dug through the code and now know how the parts fits together. >>>> >>>> Since it is a combination of Elixir & OTP, I would like to ask for your >>>> input before continuing: >>>> >>>> When using the following code, the key code paths are: >>>> >>>> *{:ok, io} = StringIO.open("", encoding: :unicode) # Starts a StringIO >>>> GenServer* >>>> *IO.write(io, <<222>>)* >>>> >>>> >>>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L91 >>>> => :io.request(pid, {:put_chars, :unicode,<<222>>}, _ref) >>>> >>>> >>>> https://github.com/elixir-lang/elixir/blob/a64d42f5d3cb6c32752af9d3312897e8cd5bb7ec/lib/elixir/lib/string_io.ex#L289 >>>> => {{:error, req}, state} >>>> >>>> >>>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L340 >>>> => badarg >>>> >>>> >>>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L99 >>>> => raises argument error without any additional info >>>> >>>> Based on this, I think multiple things are needed: >>>> >>>> >>>> 1. `io:conv_reason/1` needs to accept a new tuple for the error >>>> 1. I would return the error directly from >>>> `unicode:characters_to_binary/3`: {:error, {:encoding, result}} >>>> 2. where result: {error, binary(), RestData} | {incomplete, >>>> binary(), binary()} >>>> 2. `io:o_request/2` needs to handle that new error and raise a >>>> better error >>>> 1. new error kind? >>>> 2. existing badarg with more detail data? >>>> 3. All those sources need to produce the new error (does not >>>> produce OTP incompatibility in Elixir sind everything unknown is >>>> converted >>>> to badarg): >>>> 1. `StringIO.put_chars/4` >>>> 2. `file_io_server:put_chars/3` >>>> 3. `group:io_request/5` >>>> 4. `standard_error:wrap_characters_to_binary/3` >>>> 5. `user_drv:io_command/1` >>>> 6. `user:wrap_characters_to_binary/2` >>>> 7. `ssh_cli:io_request/4` >>>> >>>> Does that sound like I'm going into the right direction? >>>> >>>> If yes: Should that be one PR in OTP or several? >>>> >>>> Thanks for any input & Best, >>>> Jonatan >>>> >>>> >>>> On Thursday, April 7, 2022 at 2:14:39 PM UTC+8 José Valim wrote: >>>> >>>>> Make sure to also check out this doc then: >>>>> https://github.com/erlang/otp/blob/master/HOWTO/DEVELOPMENT.md Have >>>>> fun! >>>>> >>>>> On Thu, Apr 7, 2022 at 8:12 AM Jonatan Männchen <[email protected]> >>>>> wrote: >>>>> >>>>>> I’ve never done a PR to OTP so far. I’ll check it out and ping back >>>>>> here for updates. >>>>>> >>>>>> Sent from my iPhone >>>>>> >>>>>> On 7 Apr 2022, at 07:09, José Valim <[email protected]> wrote: >>>>>> >>>>>> >>>>>> I would start with 2 by submitting a PR to Erlang/OTP and then access >>>>>> their appetite from there. :) >>>>>> >>>>>> On Thu, Apr 7, 2022 at 8:03 AM Jonatan Männchen <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> I guess leaving nr. 3 out would be fine. >>>>>>> >>>>>>> What would he the proper path to do this in erlang? Just writing >>>>>>> some erlang example code and opening issues? >>>>>>> >>>>>>> Sent from my iPhone >>>>>>> >>>>>>> On 7 Apr 2022, at 06:46, José Valim <[email protected]> wrote: >>>>>>> >>>>>>> >>>>>>> All three of them would require changes to Erlang and I would love >>>>>>> for someone to chase this path. 2 sounds doable with EEP 54 but 1 would >>>>>>> definitely require more discussion. >>>>>>> >>>>>>> I am not sure about 3. binwrite pretty much means "I know what I am >>>>>>> doing" and we should allow the user to write whatever they want as is. >>>>>>> >>>>>>> On Thu, Apr 7, 2022 at 7:25 AM Jonatan Männchen <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Thanks, that solved my specific issue. I still think that some >>>>>>>> improvements are needed. >>>>>>>> >>>>>>>> Looking at the code for CaptureIO, I think those changes should be >>>>>>>> made directly in the StringIO / IO modules and not specifically for >>>>>>>> CaptureIO. >>>>>>>> >>>>>>>> The following things are not great today IMHO: >>>>>>>> >>>>>>>> 1. The encoding that lets everything through is called latin1. >>>>>>>> I think we should introduce & properly document a new encoding >>>>>>>> called >>>>>>>> something like raw_binary. It would work exactly the same though. >>>>>>>> 2. IO.write with invalid characters into an io device with any >>>>>>>> encoding should have a better error message. (Something like >>>>>>>> "<<222>> is >>>>>>>> not a valid unicode string. Provide `encoding: :raw_binary` when >>>>>>>> opening >>>>>>>> the io device") >>>>>>>> 3. IO.binwrite with invalid characters into an encoding: >>>>>>>> :unicode io device should probably at least warn if invalid >>>>>>>> characters are >>>>>>>> passed. >>>>>>>> >>>>>>>> This would something like this in the form of a test: >>>>>>>> https://gist.github.com/maennchen/f428360a71d23a323538d9b7d51e638b >>>>>>>> >>>>>>>> On Wednesday, April 6, 2022 at 9:29:12 PM UTC+2 Wojtek Mach wrote: >>>>>>>> >>>>>>>>> > Additionally, it would be good if there was a proper error for >>>>>>>>> invalid characters instead of the currently raised ArgumentError. >>>>>>>>> >>>>>>>>> Yeah the error is pretty bad: >>>>>>>>> >>>>>>>>> IO.puts(<<222>>) >>>>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown >>>>>>>>> error: put_chars >>>>>>>>> >>>>>>>>> >>>>>>>>> It is slightly more informative when used inside capture io though: >>>>>>>>> >>>>>>>>> assert capture_io(fn -> >>>>>>>>> IO.puts(<<222>>) >>>>>>>>> end) == <<222>> >>>>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown >>>>>>>>> error: {put_chars,unicode,[<<"Þ">>,10]} >>>>>>>>> >>>>>>>>> >>>>>>>>> We get error because stdio is with unicode encoding: >>>>>>>>> >>>>>>>>> iex> :io.getopts(:standard_io)[:encoding] >>>>>>>>> :unicode >>>>>>>>> >>>>>>>>> >>>>>>>>> and we're writing <<222>> which isn't unicode. >>>>>>>>> >>>>>>>>> For writing in raw mode, use IO.binwrite. This and the previously >>>>>>>>> mentioned :encoding option will make the following test succeed: >>>>>>>>> >>>>>>>>> assert capture_io([encoding: :latin1], fn -> >>>>>>>>> IO.binwrite(<<222>>) >>>>>>>>> end) == <<222>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On April 6, 2022, "maennchen.ch" <[email protected]> wrote: >>>>>>>>> >>>>>>>>> That unfortunately gives me the same result. >>>>>>>>> >>>>>>>>> On Wednesday, April 6, 2022 at 6:57:35 PM UTC+1 Wojtek Mach wrote: >>>>>>>>>> >>>>>>>>>> I believe capture_io(encoding: :latin1, fun) should do the trick, >>>>>>>>>> can you check? >>>>>>>>>> >>>>>>>>>> On April 6, 2022, "maennchen.ch" <[email protected]> wrote: >>>>>>>>>> >>>>>>>>>> Hi everyone, >>>>>>>>>> >>>>>>>>>> *Background* >>>>>>>>>> >>>>>>>>>> While developing tests for a mix task, that returns non UTF8 >>>>>>>>>> binaries into STDOUT (building block to be piped into a file / >>>>>>>>>> pipe), I >>>>>>>>>> found that ExUnit.CaptureIO can only handle UTF8 and Latin1. >>>>>>>>>> >>>>>>>>>> Example Test that does not work: >>>>>>>>>> https://gist.github.com/maennchen/16d411eeda3255fa3d3152fe9d836a82 >>>>>>>>>> >>>>>>>>>> *Proposal* >>>>>>>>>> >>>>>>>>>> For testing this use case, it would be good if any raw binary >>>>>>>>>> would also be passed through. (Maybe via option "encoding: >>>>>>>>>> :raw_binary") >>>>>>>>>> >>>>>>>>>> Additionally, it would be good if there was a proper error for >>>>>>>>>> invalid characters instead of the currently raised ArgumentError. >>>>>>>>>> >>>>>>>>>> *Real World Example* >>>>>>>>>> >>>>>>>>>> Here is a real test, that would be made possible by this change: >>>>>>>>>> https://github.com/elixir-gettext/expo/blob/9048fe242830614f6d4235cbd345de844693f28a/test/mix/tasks/expo.msgfmt_test.exs#L18 >>>>>>>>>> >>>>>>>>>> *PR* >>>>>>>>>> >>>>>>>>>> I'm happy to provide a PR for this as well. >>>>>>>>>> >>>>>>>>>> Thanks & Kind Regards, >>>>>>>>>> Jonatan >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> You received this message because you are subscribed to the >>>>>>>>>> Google Groups "elixir-lang-core" group. >>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>>>> send an email to [email protected]. >>>>>>>>>> To view this discussion on the web visit >>>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com >>>>>>>>>> >>>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>>>>> . >>>>>>>>>> >>>>>>>>>> >>>>>>>>> -- >>>>>>>>> You received this message because you are subscribed to the Google >>>>>>>>> Groups "elixir-lang-core" group. >>>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>>> send an email to [email protected]. >>>>>>>>> >>>>>>>>> To view this discussion on the web visit >>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/9e817fc9-6f61-4476-bb27-c062ed6167fan%40googlegroups.com >>>>>>>>> >>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/9e817fc9-6f61-4476-bb27-c062ed6167fan%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>>>> . >>>>>>>>> >>>>>>>>> -- >>>>>>>> You received this message because you are subscribed to the Google >>>>>>>> Groups "elixir-lang-core" group. >>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>> send an email to [email protected]. >>>>>>>> To view this discussion on the web visit >>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/f2816480-4d91-4576-9241-3bdc9351a920n%40googlegroups.com >>>>>>>> >>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/f2816480-4d91-4576-9241-3bdc9351a920n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>>> . >>>>>>>> >>>>>>> -- >>>>>>> You received this message because you are subscribed to a topic in >>>>>>> the Google Groups "elixir-lang-core" group. >>>>>>> To unsubscribe from this topic, visit >>>>>>> https://groups.google.com/d/topic/elixir-lang-core/RR7nbeHsluQ/unsubscribe >>>>>>> . >>>>>>> To unsubscribe from this group and all its topics, send an email to >>>>>>> [email protected]. >>>>>>> To view this discussion on the web visit >>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KdzYkX%3DL%3D3%3DYHSJk4R8iq1%3Dbe9262Fg54eX1yLrx-c9g%40mail.gmail.com >>>>>>> >>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KdzYkX%3DL%3D3%3DYHSJk4R8iq1%3Dbe9262Fg54eX1yLrx-c9g%40mail.gmail.com?utm_medium=email&utm_source=footer> >>>>>>> . >>>>>>> >>>>>>> -- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups "elixir-lang-core" group. >>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>> send an email to [email protected]. >>>>>>> To view this discussion on the web visit >>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/F3C0BE5C-EC1A-4574-8586-7FC5E7AFD39A%40maennchen.ch >>>>>>> >>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/F3C0BE5C-EC1A-4574-8586-7FC5E7AFD39A%40maennchen.ch?utm_medium=email&utm_source=footer> >>>>>>> . >>>>>>> >>>>>> -- >>>>>> You received this message because you are subscribed to a topic in >>>>>> the Google Groups "elixir-lang-core" group. >>>>>> To unsubscribe from this topic, visit >>>>>> https://groups.google.com/d/topic/elixir-lang-core/RR7nbeHsluQ/unsubscribe >>>>>> . >>>>>> To unsubscribe from this group and all its topics, send an email to >>>>>> [email protected]. >>>>>> To view this discussion on the web visit >>>>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2BSqS4j-VRG_aBfSfs3uc3c0Q_m3svGTKS%2Bw2OR5aO_rg%40mail.gmail.com >>>>>> >>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2BSqS4j-VRG_aBfSfs3uc3c0Q_m3svGTKS%2Bw2OR5aO_rg%40mail.gmail.com?utm_medium=email&utm_source=footer> >>>>>> . >>>>>> >>>>>> -- >>>>>> You received this message because you are subscribed to the Google >>>>>> Groups "elixir-lang-core" group. >>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>> send an email to [email protected]. >>>>>> >>>>> To view this discussion on the web visit >>>>>> https://groups.google.com/d/msgid/elixir-lang-core/08729069-5CF7-4400-BAC3-A93E1DA43B1A%40maennchen.ch >>>>>> >>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/08729069-5CF7-4400-BAC3-A93E1DA43B1A%40maennchen.ch?utm_medium=email&utm_source=footer> >>>>>> . >>>>>> >>>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "elixir-lang-core" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to [email protected]. >>>> >>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/elixir-lang-core/d5886ad9-002e-48fd-948a-067a96770830n%40googlegroups.com >>>> >>>> <https://groups.google.com/d/msgid/elixir-lang-core/d5886ad9-002e-48fd-948a-067a96770830n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>> . >>>> >>> -- >> You received this message because you are subscribed to the Google Groups >> "elixir-lang-core" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> > To view this discussion on the web visit >> https://groups.google.com/d/msgid/elixir-lang-core/dec562c1-0f0c-4f9d-90f9-7865017a3b1en%40googlegroups.com >> >> <https://groups.google.com/d/msgid/elixir-lang-core/dec562c1-0f0c-4f9d-90f9-7865017a3b1en%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "elixir-lang-core" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/2d6875c7-5468-4d54-bf51-b7232ed61120n%40googlegroups.com.
