abhishekrb19 commented on code in PR #19162:
URL: https://github.com/apache/druid/pull/19162#discussion_r2944491179
##########
server/src/main/java/org/apache/druid/segment/metadata/AbstractSegmentMetadataCache.java:
##########
@@ -991,11 +991,6 @@ static RowSignature analysisToRowSignature(final
SegmentAnalysis analysis)
{
final RowSignature.Builder rowSignatureBuilder = RowSignature.builder();
for (Map.Entry<String, ColumnAnalysis> entry :
analysis.getColumns().entrySet()) {
- if (entry.getValue().isError()) {
Review Comment:
Thanks for taking a look!
Yes, these analysis errors are indeed coming from `fold`. I think I got
tripped up by the error logs I had added: `Folding column[xyz] is an
error[error:cannot_merge_diff_types: [json] and [STRING]] for
mergeStrategy[strict] dataSources[abc]`.
As for why this happens here, I’m not entirely sure, but in our case these
errors occurred only on the realtime tasks. So I suspected that the data for
some of these JSON columns in the ingestion spec was numeric strings, nulls, or
sparse data, and that this contributed to the analysis errors
(`error:cannot_merge_diff_types: [json] and [STRING]`).
1. For the unstable signature issue, the instability can cause spurious query
failures or queries that return no data. Unfortunately, there’s no good
workaround other than forcing a reorder of the columns, so this fix should help
address it.
2. For any type-related correctness issues, I think users can likely work
around them with some form of casting in the queries. (I haven't seen this issue
reported by users so far.)
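To make the discussion above concrete, here is a minimal, self-contained sketch of how a folded column can end up as an error and how skipping such columns changes the resulting column list. Everything in it is a hypothetical stand-in: `FoldSketch`, `Col`, `fold`, and `signature` are illustrative names, not Druid's `ColumnAnalysis` or `RowSignature`, and the strict-merge rule is an assumption modeled on the log message quoted earlier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FoldSketch {
    // Hypothetical stand-in for a per-column analysis: either a type name or an error message.
    public static final class Col {
        public final String type;
        public final String error;

        public Col(String type, String error) {
            this.type = type;
            this.error = error;
        }

        public boolean isError() {
            return error != null;
        }
    }

    // Assumed strict fold: an existing error is propagated, and differing
    // types yield a new error instead of a merged type.
    public static Col fold(Col lhs, Col rhs) {
        if (lhs.isError()) {
            return lhs;
        }
        if (rhs.isError()) {
            return rhs;
        }
        if (!lhs.type.equals(rhs.type)) {
            return new Col(null,
                "error:cannot_merge_diff_types: [" + lhs.type + "] and [" + rhs.type + "]");
        }
        return lhs;
    }

    // Collects column names in map order, optionally skipping errored columns.
    public static List<String> signature(Map<String, Col> columns, boolean skipErrors) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Col> entry : columns.entrySet()) {
            if (skipErrors && entry.getValue().isError()) {
                continue;
            }
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Col> columns = new LinkedHashMap<>();
        columns.put("__time", new Col("LONG", null));
        // The same column seen as json in one analysis and STRING in another.
        columns.put("xyz", fold(new Col("json", null), new Col("STRING", null)));
        columns.put("dim", new Col("STRING", null));

        System.out.println(columns.get("xyz").error);
        System.out.println(signature(columns, true));
        System.out.println(signature(columns, false));
    }
}
```

With errored columns skipped, `xyz` disappears from the list entirely; with them kept, it stays in its original position. This only illustrates the shape of the problem and does not reproduce the actual cache logic.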
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]