While working on it, I found that an error called no_translation already 
exists.

I believe this is the intended error, since its message reads "device 
failed to transcode string from ~p to ~p", which is exactly our use case.
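
To illustrate the idea, here is a minimal sketch of how a device could map a failed conversion to that error. The module and function names are hypothetical and not taken from either PR, and the tuple shape `{:no_translation, from, to}` is an assumption based on the two `~p` placeholders in the message. Only `:unicode.characters_to_binary/3` is an existing OTP function.

```elixir
defmodule TranscodeSketch do
  # Hypothetical helper, not part of OTP or Elixir.
  # Converts chardata between encodings and reports failures as a
  # no_translation error tuple (assumed shape, see the note above).
  def put_chars_result(chars, from, to) do
    case :unicode.characters_to_binary(chars, from, to) do
      binary when is_binary(binary) ->
        {:ok, binary}

      # Input contains a sequence that is invalid in the source encoding
      # or cannot be represented in the target encoding.
      {:error, _converted, _rest} ->
        {:error, {:no_translation, from, to}}

      # Input ends in the middle of a multi-byte sequence.
      {:incomplete, _converted, _rest} ->
        {:error, {:no_translation, from, to}}
    end
  end
end
```

For example, `TranscodeSketch.put_chars_result(<<222>>, :unicode, :unicode)` would take the incomplete branch, since byte 222 only starts a two-byte UTF-8 sequence.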

Based on this, I created two PRs:


   1. OTP: https://github.com/erlang/otp/pull/5885
   2. Elixir: https://github.com/elixir-lang/elixir/pull/11756

The Elixir PR does not depend on the OTP PR: the error already exists, so 
it also works with current OTP versions. However, its tests still have a 
small issue (ExUnit and iex somehow behave differently), so I would 
appreciate a hint on what is going wrong.

Since my proposal is represented completely in the two PRs, I will no 
longer update this thread. For anyone following along, please have a look 
at the PRs instead.
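
For anyone who wants to reproduce the behavior discussed in this thread, a rough sketch follows. It only uses `StringIO.open/2`, `IO.write/2`, `IO.binwrite/2` and `StringIO.contents/1`; the variable names are mine. Which exception is raised depends on the Elixir and OTP versions in use, so the sketch deliberately does not assume a specific type.

```elixir
# A device opened with encoding: :unicode rejects the lone byte 222,
# because it is not a complete UTF-8 sequence.
{:ok, unicode_io} = StringIO.open("", encoding: :unicode)

unicode_result =
  try do
    IO.write(unicode_io, <<222>>)
  rescue
    # The exact exception type depends on the Elixir/OTP version.
    exception -> {:raised, exception.__struct__}
  end

# A device opened with encoding: :latin1 accepts the same byte
# when it is written in raw mode with IO.binwrite/2.
{:ok, latin1_io} = StringIO.open("", encoding: :latin1)
:ok = IO.binwrite(latin1_io, <<222>>)
{_input, latin1_output} = StringIO.contents(latin1_io)
```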

On Tuesday, April 12, 2022 at 6:39:36 PM UTC+8 José Valim wrote:

> I would do it based on the error_info. In fact if you could solve this all 
> together with the error_info, then it would be even better.
>
> On Tue, Apr 12, 2022 at 11:14 AM Jonatan Männchen <[email protected]> 
> wrote:
>
>> Great :)
>>
>> Last question for now: Would you add the details of the error 
>> to error_info of the badarg error or should this be an entirely different 
>> error?
>>
>> On Tuesday, April 12, 2022 at 4:11:45 PM UTC+8 José Valim wrote:
>>
>>> I would personally send a PR with steps one and two, and also changing 
>>> one of the callsites for 3 so you have something to test. Then another PR 
>>> to migrate the remaining call sites!
>>>
>>> On Tue, Apr 12, 2022 at 9:38 AM Jonatan Männchen <[email protected]> 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I dug through the code and now know how the parts fit together.
>>>>
>>>> Since it is a combination of Elixir & OTP, I would like to ask for your 
>>>> input before continuing:
>>>>
>>>> When using the following code, the key code paths are:
>>>>
>>>>     {:ok, io} = StringIO.open("", encoding: :unicode) # Starts a StringIO GenServer
>>>>     IO.write(io, <<222>>)
>>>>
>>>>
>>>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L91
>>>> => :io.request(pid, {:put_chars, :unicode,<<222>>}, _ref)
>>>>
>>>>
>>>> https://github.com/elixir-lang/elixir/blob/a64d42f5d3cb6c32752af9d3312897e8cd5bb7ec/lib/elixir/lib/string_io.ex#L289
>>>> => {{:error, req}, state}
>>>>
>>>>
>>>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L340
>>>> => badarg
>>>>
>>>>
>>>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L99
>>>> => raises argument error without any additional info
>>>>
>>>> Based on this, I think multiple things are needed:
>>>>
>>>>
>>>>    1. `io:conv_reason/1` needs to accept a new tuple for the error
>>>>       1. I would return the error directly from 
>>>>       `unicode:characters_to_binary/3`: {:error, {:encoding, result}}
>>>>       2. where result: {error, binary(), RestData} | {incomplete, 
>>>>       binary(), binary()}
>>>>    2. `io:o_request/2` needs to handle that new error and raise a 
>>>>    better error
>>>>       1. new error kind?
>>>>       2. existing badarg with more detail data?
>>>>    3. All those sources need to produce the new error (this does not 
>>>>    cause an OTP incompatibility in Elixir, since every unknown error is 
>>>>    converted to badarg):
>>>>       1. `StringIO.put_chars/4`
>>>>       2. `file_io_server:put_chars/3`
>>>>       3. `group:io_request/5`
>>>>       4. `standard_error:wrap_characters_to_binary/3`
>>>>       5. `user_drv:io_command/1`
>>>>       6. `user:wrap_characters_to_binary/2`
>>>>       7. `ssh_cli:io_request/4`
>>>>    
>>>> Does that sound like I'm going in the right direction?
>>>>
>>>> If yes: Should that be one PR in OTP or several?
>>>>
>>>> Thanks for any input & Best,
>>>> Jonatan
>>>>
>>>>
>>>> On Thursday, April 7, 2022 at 2:14:39 PM UTC+8 José Valim wrote:
>>>>
>>>>> Make sure to also check out this doc then: 
>>>>> https://github.com/erlang/otp/blob/master/HOWTO/DEVELOPMENT.md Have 
>>>>> fun!
>>>>>
>>>>> On Thu, Apr 7, 2022 at 8:12 AM Jonatan Männchen <[email protected]> 
>>>>> wrote:
>>>>>
>>>>>> I’ve never done a PR to OTP so far. I’ll check it out and ping back 
>>>>>> here for updates.
>>>>>>
>>>>>> On 7 Apr 2022, at 07:09, José Valim <[email protected]> wrote:
>>>>>>
>>>>>> 
>>>>>> I would start with 2 by submitting a PR to Erlang/OTP and then assess 
>>>>>> their appetite from there. :)
>>>>>>
>>>>>> On Thu, Apr 7, 2022 at 8:03 AM Jonatan Männchen <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> I guess leaving nr. 3 out would be fine.
>>>>>>>
>>>>>>> What would be the proper path to do this in Erlang? Just writing 
>>>>>>> some Erlang example code and opening issues?
>>>>>>>
>>>>>>> On 7 Apr 2022, at 06:46, José Valim <[email protected]> wrote:
>>>>>>>
>>>>>>> 
>>>>>>> All three of them would require changes to Erlang and I would love 
>>>>>>> for someone to chase this path. 2 sounds doable with EEP 54 but 1 would 
>>>>>>> definitely require more discussion.
>>>>>>>
>>>>>>> I am not sure about 3. binwrite pretty much means "I know what I am 
>>>>>>> doing" and we should allow the user to write whatever they want as is.
>>>>>>>
>>>>>>> On Thu, Apr 7, 2022 at 7:25 AM Jonatan Männchen <[email protected]> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks, that solved my specific issue. I still think that some 
>>>>>>>> improvements are needed.
>>>>>>>>
>>>>>>>> Looking at the code for CaptureIO, I think those changes should be 
>>>>>>>> made directly in the StringIO / IO modules and not specifically for 
>>>>>>>> CaptureIO.
>>>>>>>>
>>>>>>>> The following things are not great today IMHO:
>>>>>>>>
>>>>>>>>    1. The encoding that lets everything through is called latin1. 
>>>>>>>>    I think we should introduce & properly document a new encoding 
>>>>>>>>    called something like raw_binary. It would work exactly the same 
>>>>>>>>    though.
>>>>>>>>    2. IO.write with invalid characters into an io device with any 
>>>>>>>>    encoding should have a better error message. (Something like 
>>>>>>>>    "<<222>> is not a valid unicode string. Provide 
>>>>>>>>    `encoding: :raw_binary` when opening the io device")
>>>>>>>>    3. IO.binwrite with invalid characters into an encoding: 
>>>>>>>>    :unicode io device should probably at least warn if invalid 
>>>>>>>>    characters are passed.
>>>>>>>>
>>>>>>>> In the form of a test, this would look something like this: 
>>>>>>>> https://gist.github.com/maennchen/f428360a71d23a323538d9b7d51e638b
>>>>>>>>
>>>>>>>> On Wednesday, April 6, 2022 at 9:29:12 PM UTC+2 Wojtek Mach wrote:
>>>>>>>>
>>>>>>>>> > Additionally, it would be good if there was a proper error for 
>>>>>>>>> > invalid characters instead of the currently raised ArgumentError.
>>>>>>>>>
>>>>>>>>> Yeah the error is pretty bad:
>>>>>>>>>
>>>>>>>>> IO.puts(<<222>>)
>>>>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown 
>>>>>>>>> error: put_chars
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It is slightly more informative when used inside capture io though:
>>>>>>>>>
>>>>>>>>>     assert capture_io(fn ->
>>>>>>>>>              IO.puts(<<222>>)
>>>>>>>>>            end) == <<222>>
>>>>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown 
>>>>>>>>> error: {put_chars,unicode,[<<"Þ">>,10]}
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We get an error because stdio uses the unicode encoding:
>>>>>>>>>
>>>>>>>>> iex> :io.getopts(:standard_io)[:encoding]
>>>>>>>>> :unicode
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> and we're writing <<222>> which isn't unicode.
>>>>>>>>>
>>>>>>>>> For writing in raw mode, use IO.binwrite. This and the previously 
>>>>>>>>> mentioned :encoding option will make the following test succeed:
>>>>>>>>>
>>>>>>>>>     assert capture_io([encoding: :latin1], fn ->
>>>>>>>>>              IO.binwrite(<<222>>)
>>>>>>>>>            end) == <<222>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On April 6, 2022, "maennchen.ch" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> That unfortunately gives me the same result.
>>>>>>>>>
>>>>>>>>> On Wednesday, April 6, 2022 at 6:57:35 PM UTC+1 Wojtek Mach wrote:
>>>>>>>>>>
>>>>>>>>>> I believe capture_io(encoding: :latin1, fun) should do the trick, 
>>>>>>>>>> can you check?
>>>>>>>>>>
>>>>>>>>>> On April 6, 2022, "maennchen.ch" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> *Background*
>>>>>>>>>>
>>>>>>>>>> While developing tests for a mix task that writes non-UTF-8 
>>>>>>>>>> binaries to STDOUT (a building block meant to be piped into a 
>>>>>>>>>> file or another command), I found that ExUnit.CaptureIO can only 
>>>>>>>>>> handle UTF-8 and Latin-1.
>>>>>>>>>>
>>>>>>>>>> Example Test that does not work:
>>>>>>>>>> https://gist.github.com/maennchen/16d411eeda3255fa3d3152fe9d836a82
>>>>>>>>>>
>>>>>>>>>> *Proposal*
>>>>>>>>>>
>>>>>>>>>> For testing this use case, it would be good if any raw binary 
>>>>>>>>>> would also be passed through. (Maybe via option "encoding: 
>>>>>>>>>> :raw_binary")
>>>>>>>>>>
>>>>>>>>>> Additionally, it would be good if there was a proper error for 
>>>>>>>>>> invalid characters instead of the currently raised ArgumentError.
>>>>>>>>>>
>>>>>>>>>> *Real World Example*
>>>>>>>>>>
>>>>>>>>>> Here is a real test that would be made possible by this change: 
>>>>>>>>>> https://github.com/elixir-gettext/expo/blob/9048fe242830614f6d4235cbd345de844693f28a/test/mix/tasks/expo.msgfmt_test.exs#L18
>>>>>>>>>>
>>>>>>>>>> *PR*
>>>>>>>>>>
>>>>>>>>>> I'm happy to provide a PR for this as well.
>>>>>>>>>>
>>>>>>>>>> Thanks & Kind Regards,
>>>>>>>>>> Jonatan
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>> Google Groups "elixir-lang-core" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com
>>>>>>>>>>  
>>>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>> .
>>>>>>>>>>
>>>>>>>>>>
