All three would require changes to Erlang, and I would love for someone
to pursue this. Item 2 sounds doable with EEP 54, but item 1 would
definitely require more discussion.

I am not sure about 3. binwrite pretty much means "I know what I am doing"
and we should allow the user to write whatever they want as is.
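
For anyone following along, the behaviour discussed in this thread can be reproduced without ExUnit by talking to a StringIO device directly. This is only a minimal sketch, assuming an Elixir version where StringIO.open/2 accepts the :encoding option; the variable names are made up for illustration, and the exact exception raised in the second case may differ between Elixir/OTP versions.

```elixir
# Sketch only: reproduces the thread's behaviour with StringIO directly.
# Assumes StringIO.open/2 accepts the :encoding option (:unicode | :latin1).

# 1. A :latin1 device plus IO.binwrite/2 passes the raw byte through as is.
{:ok, latin1_dev} = StringIO.open("", encoding: :latin1)
:ok = IO.binwrite(latin1_dev, <<222>>)
{_input, latin1_out} = StringIO.contents(latin1_dev)

# 2. IO.write/2 on a default (:unicode) device rejects the same byte,
#    because <<222>> on its own is not valid UTF-8. The thread shows an
#    ArgumentError here; we rescue any exception to stay version-agnostic.
{:ok, unicode_dev} = StringIO.open("")

write_result =
  try do
    IO.write(unicode_dev, <<222>>)
  rescue
    error -> {:raised, error}
  end

IO.inspect(latin1_out, label: "latin1 + binwrite")
IO.inspect(write_result, label: "unicode + write")
```

The first case is the combination that made the test pass earlier in the thread; the second is the one whose error message item 2 would like to improve.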

On Thu, Apr 7, 2022 at 7:25 AM Jonatan Männchen <[email protected]>
wrote:

> Thanks, that solved my specific issue. I still think that some
> improvements are needed.
>
> Looking at the code for CaptureIO, I think those changes should be made
> directly in the StringIO / IO modules and not specifically for CaptureIO.
>
> The following things are not great today IMHO:
>
>    1. The encoding that lets everything through is called latin1. I think
>    we should introduce & properly document a new encoding called something
>    like raw_binary. It would work exactly the same though.
>    2. IO.write with invalid characters into an io device with any
>    encoding should have a better error message. (Something like "<<222>> is
>    not a valid unicode string. Provide `encoding: :raw_binary` when opening
>    the io device")
>    3. IO.binwrite with invalid characters into an encoding: :unicode io
>    device should probably at least warn if invalid characters are passed.
>
> This would look something like this in the form of a test:
> https://gist.github.com/maennchen/f428360a71d23a323538d9b7d51e638b
>
> On Wednesday, April 6, 2022 at 9:29:12 PM UTC+2 Wojtek Mach wrote:
>
>> > Additionally, it would be good if there was a proper error for invalid
>> > characters instead of the currently raised ArgumentError.
>>
>> Yeah the error is pretty bad:
>>
>> IO.puts(<<222>>)
>> ** (ArgumentError) errors were found at the given arguments: unknown error: put_chars
>>
>>
>> It is slightly more informative when used inside capture io though:
>>
>>     assert capture_io(fn ->
>>              IO.puts(<<222>>)
>>            end) == <<222>>
>> ** (ArgumentError) errors were found at the given arguments: unknown error: {put_chars,unicode,[<<"Þ">>,10]}
>>
>>
>> We get the error because stdio uses the unicode encoding:
>>
>> iex> :io.getopts(:standard_io)[:encoding]
>> :unicode
>>
>>
>> and we're writing <<222>>, which isn't valid unicode (UTF-8).
>>
>> For writing in raw mode, use IO.binwrite. This and the previously
>> mentioned :encoding option will make the following test succeed:
>>
>>     assert capture_io([encoding: :latin1], fn ->
>>              IO.binwrite(<<222>>)
>>            end) == <<222>>
>>
>>
>> On April 6, 2022, "maennchen.ch" <[email protected]> wrote:
>>
>> That unfortunately gives me the same result.
>>
>> On Wednesday, April 6, 2022 at 6:57:35 PM UTC+1 Wojtek Mach wrote:
>>>
>>> I believe capture_io(encoding: :latin1, fun) should do the trick, can
>>> you check?
>>>
>>> On April 6, 2022, "maennchen.ch" <[email protected]> wrote:
>>>
>>> Hi everyone,
>>>
>>> *Background*
>>>
>>> While developing tests for a mix task that writes non-UTF-8 binaries
>>> to STDOUT (a building block meant to be redirected into a file or pipe),
>>> I found that ExUnit.CaptureIO can only handle UTF-8 and Latin-1.
>>>
>>> Example Test that does not work:
>>> https://gist.github.com/maennchen/16d411eeda3255fa3d3152fe9d836a82
>>>
>>> *Proposal*
>>>
>>> For testing this use case, it would be good if any raw binary would also
>>> be passed through. (Maybe via option "encoding: :raw_binary")
>>>
>>> Additionally, it would be good if there was a proper error for invalid
>>> characters instead of the currently raised ArgumentError.
>>>
>>> *Real World Example*
>>>
>>> Here is a real test that would be made possible by this change:
>>> https://github.com/elixir-gettext/expo/blob/9048fe242830614f6d4235cbd345de844693f28a/test/mix/tasks/expo.msgfmt_test.exs#L18
>>>
>>> *PR*
>>>
>>> I'm happy to provide a PR for this as well.
>>>
>>> Thanks & Kind Regards,
>>> Jonatan
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elixir-lang-core" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>>

