Copilot commented on code in PR #1292:
URL:
https://github.com/apache/cassandra-python-driver/pull/1292#discussion_r3184422737
##########
docs/getting_started.rst:
##########
@@ -215,6 +215,56 @@ Althought it is not recommended, you can also pass parameters to non-prepared
 statements. The driver supports two forms of parameter place-holders: positional
 and named.
+.. warning::
+
+ Never use Python string formatting (f-strings, ``str.format()``, the ``%``
+ operator) to interpolate query results directly into CQL strings.
+ Collection types such as ``map``, ``list``, and ``set`` are returned by the
+ driver as Python objects (e.g. :class:`~cassandra.util.OrderedMapSerializedKey`).
+ Their ``__str__`` representation uses Python's ``repr()`` format for nested
+ string values, which escapes backslashes (``\`` -> ``\\``). Embedding this
+ representation in a CQL string will corrupt data containing backslashes or
+ other special characters because CQL does not treat backslash as an escape
+ character.
+
+ For example, suppose a row contains ``data = 'https:\/\/example.com'``
+ (one backslash before each slash) and
+ ``map_data = {'url': 'https:\/\/example.com'}``:
+
+ .. code-block:: python
+
+ row = session.execute("SELECT * FROM t WHERE id = 'id1';").one()
+
+ # row.data is a plain Python str – str() prints it as-is:
+ print(row.data)
+ # -> https:\/\/example.com (1 backslash – correct)
+
+ # row.map_data is an OrderedMapSerializedKey – str() uses repr()
+ # format for nested strings, which escapes every backslash:
+ print(row.map_data)
+ # -> {'url': 'https:\\/\\/example.com'} (2 backslashes – wrong!)
+
+ # Embedding str(row.map_data) in a CQL string sends the doubled
+ # backslashes to Cassandra, corrupting the stored value:
+ session.execute( # WRONG
+ f"UPDATE t SET data='{row.data}', "
+ f"map_data={row.map_data} WHERE id={row.id}"
+ )
+ # The CQL Cassandra receives:
+ # UPDATE t SET map_data={'url': 'https:\\/\\/example.com'} ...
Review Comment:
This example no longer demonstrates the intended corruption bug if `id` is
the string shown in the preceding `SELECT` (`'id1'`). Interpolating `WHERE
id={row.id}` produces invalid CQL for text/ascii primary keys, so readers
copying the snippet will hit a syntax error before they ever observe the
backslash corruption described here.
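
A minimal sketch of the difference, runnable without a cluster. `Row` is a
hypothetical stand-in for the driver's row object, and a plain `dict`
substitutes for `OrderedMapSerializedKey` on the assumption that its string
form escapes backslashes the same way, as the docs text above describes:

```python
from collections import namedtuple

# Hypothetical stand-in for a driver row. A plain dict replaces
# OrderedMapSerializedKey (assumption: only its repr()-style string
# form matters for this illustration).
Row = namedtuple("Row", ["id", "data", "map_data"])
row = Row(
    id="id1",
    data="https:\\/\\/example.com",  # one backslash before each slash
    map_data={"url": "https:\\/\\/example.com"},
)

# As written in the docs example: the text key is emitted unquoted.
unquoted = f"UPDATE t SET map_data={row.map_data} WHERE id={row.id}"

# With the key quoted, the statement is well-formed and the doubled
# backslashes in the map literal become the observable problem.
quoted = f"UPDATE t SET map_data={row.map_data} WHERE id='{row.id}'"

print(unquoted)
print(quoted)
```

Quoting the key (`WHERE id='{row.id}'`) would keep the WRONG snippet focused
on the backslash problem rather than on a syntax error.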
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]