On Tue, Oct 22, 2013 at 9:13 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Antoine Pelisse <apeli...@gmail.com> writes:
>
>> git-fast-import documentation says that paths can be C-style quoted.
>> Unfortunately, the current remote-hg helper doesn't unquote quoted
>> paths and passes them as-is to Mercurial when the commit is created.
>>
>> This results in the following situation:
>>
>> - clone a mercurial repository with git
>> - Add a file with space: `mkdir dir/foo\ bar`

Note to self: mkdir doesn't create a "file".

>> - Commit that new file, and push the change to mercurial
>> - The mercurial repository now has a new directory named '"dir', which
>> contains a file named 'foo bar"'
>>
>> Use python ast.literal_eval to unquote the string if it starts with ".
>> It has been tested with quotes, spaces, and utf-8 encoded file-names.
>>
>> Signed-off-by: Antoine Pelisse <apeli...@gmail.com>
>> ---
>
> A path you read in fast-import input indeed needs to be unquoted
> when it begins with a dq, and I _think_ by using ast.literal_eval(),
> you probably can correctly unquote any valid C-quoted string.
>
> But it bothers me somewhat that what the patch does seems to be
> overly broad.  Doesn't ast.literal_eval() take a lot more than just
> strings?

Good point.

>     ast.literal_eval(node_or_string)
>
>         Safely evaluate an expression node or a Unicode or Latin-1
>         encoded string containing a Python expression. The string or
>         node provided may only consist of the following Python literal
>         structures: strings, numbers, tuples, lists, dicts, booleans,
>         and None.

Fortunately, I don't believe any of the other types can start with a
dq. So currently, I don't believe we can end up with anything but a
string. We could certainly check that this is always true, though.
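
Such a check could look roughly like the sketch below. The name
unquote_path() is made up for illustration and is not what the patch
uses; the only thing it adds on top of ast.literal_eval() is the type
test.

```python
import ast


def unquote_path(path):
    """Hypothetical helper: unquote a fast-import path if it is quoted."""
    if not path.startswith('"'):
        # Unquoted paths are taken literally.
        return path
    value = ast.literal_eval(path)
    # literal_eval() accepts more than strings, so reject anything else.
    if not isinstance(value, str):
        raise ValueError("quoted path is not a string: %r" % (path,))
    return value
```

That would also reject malformed input such as `"a", "b"`, which does
start with a dq yet evaluates to a tuple.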

> Also doesn't Python's double-quoted string have a lot more magic
> than C-quoted string, e.g.
>
>         $ python -i
>         >>> import ast
>         >>> not_cq_path = '"abc" "def"'
>         >>> ast.literal_eval(not_cq_path)
>         'abcdef'

It is true that I assumed valid output from git-fast-export. I can't
think of an easy way to detect output that is broken yet still
accepted as a valid string by Python. We could obviously write an
unquote_c_style() equivalent in Python if needed.
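
A minimal sketch of what such an equivalent might look like follows.
It is not a port of git's C function: the set of escapes (the
single-letter ones, \" and \\, and three-digit octal starting with
0-3) is my reading of what git emits, so treat it as an assumption. It
is also written for Python 3 bytes, and it insists that the closing dq
is the last character.

```python
# Single-character escapes assumed to be used by git's C-style quoting.
_SIMPLE_ESCAPES = {
    ord('a'): 0x07, ord('b'): 0x08, ord('f'): 0x0c, ord('n'): 0x0a,
    ord('r'): 0x0d, ord('t'): 0x09, ord('v'): 0x0b,
    ord('"'): ord('"'), ord('\\'): ord('\\'),
}


def unquote_c_style(quoted):
    """Sketch: decode a C-style quoted path (bytes in, bytes out)."""
    if len(quoted) < 2 or quoted[:1] != b'"' or quoted[-1:] != b'"':
        raise ValueError("not a C-style quoted string: %r" % (quoted,))
    body = quoted[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        c = body[i]
        if c == ord('"'):
            # A bare dq inside the body means the input was not one string.
            raise ValueError("unescaped double quote in %r" % (quoted,))
        if c != ord('\\'):
            out.append(c)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash in %r" % (quoted,))
        c = body[i]
        if c in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[c])
            i += 1
        elif ord('0') <= c <= ord('3'):
            # Octal escape: exactly three octal digits encode one byte.
            digits = body[i:i + 3]
            if len(digits) != 3 or any(d < ord('0') or d > ord('7')
                                       for d in digits):
                raise ValueError("bad octal escape in %r" % (quoted,))
            out.append(int(digits.decode('ascii'), 8))
            i += 3
        else:
            raise ValueError("unknown escape in %r" % (quoted,))
    return bytes(out)
```

With this, b'"abc" "def"' raises ValueError instead of silently
becoming 'abcdef'.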

Thanks,