KalleOlaviNiemitalo commented on code in PR #2519:
URL: https://github.com/apache/avro/pull/2519#discussion_r1335403993


##########
lang/csharp/src/apache/main/Schema/Schema.cs:
##########
@@ -243,11 +244,28 @@ internal static Schema Parse(string json, SchemaNames names, string encspace)
             Schema sc = PrimitiveSchema.NewInstance(json);
             if (null != sc) return sc;
 
+            // Refer to https://issues.apache.org/jira/browse/AVRO-3856
+            // Refer to https://github.com/JamesNK/Newtonsoft.Json/pull/2904
+            // Newtonsoft author advised to use JObject.Load/JArray.Load instead of JObject.Parse()/JArray.Parse()
+            // The reason is we can set the MaxDepth property on the JsonReader.
+            JsonReader reader = new JsonTextReader(new StringReader(json));
+            // Another issue discovered is JsonReader.Push(JsonContainerType value) method overcounting the depth
+            // level of Avro schema.  Here are the observation of over-counting depth level in Newtonsoft's JsonReader:
+            // Avro Schema Depth       JsonReader Depth Level Count
+            // 4                       11
+            // 16                      44
+            // 32                      92
+            // 64                      188
+            // So, roughly speaking, the depth level count is about 2.75 times of Avro schema depth.
+            // Below is the hard-coded value to compensate over-counting of depth level in Newtonsoft
+            // to support Avro schema depth level to 64 slightly beyond.

Review Comment:
   Please reword this so that it does not give the impression that the 
Newtonsoft.Json library has a bug that makes it count the depth incorrectly.  
The difference in depth counts is instead caused by how Avro schemas are 
represented in JSON: each nested Avro schema requires multiple nested JSON 
containers.  The Newtonsoft.Json library is not specific to Avro and is not 
designed to count Avro schema levels, so its behaviour seems correct to me.
   
   The Avro schema in 
<https://github.com/JamesNK/Newtonsoft.Json/pull/2904#issuecomment-1732764055> 
has 4 levels of record schemas, but its JSON representation has 12 levels of 
nested containers: 
   
   ```JSON
   { /* depth 1: object */
     "type": "record",
     "name": "Level1",
     "fields": [ /* depth 2: array */
       {
         "name": "field1",
         "type": "string"
       },
       {
         "name": "field2",
         "type": "int"
       },
       { /* depth 3: object */
         "name": "level2",
         "type": { /* depth 4: object */
           "type": "record",
           "name": "Level2",
           "fields": [ /* depth 5: array */
             {
               "name": "field3",
               "type": "boolean"
             },
             {
               "name": "field4",
               "type": "double"
             },
             { /* depth 6: object */
               "name": "level3",
               "type": { /* depth 7: object */
                 "type": "record",
                 "name": "Level3",
                 "fields": [ /* depth 8: array */
                   {
                     "name": "field5",
                     "type": "string"
                   },
                   {
                     "name": "field6",
                     "type": "int"
                   },
                   { /* depth 9: object */
                     "name": "level4",
                     "type": { /* depth 10: object */
                       "type": "record",
                       "name": "Level4",
                       "fields": [ /* depth 11: array */
                         { /* depth 12: object */
                           "name": "field7",
                           "type": "boolean"
                         },
                         {
                           "name": "field8",
                           "type": "double"
                         }
                       ]
                     }
                   }
                 ]
               }
             }
           ]
         }
       }
     ]
   }
   ```
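
   The same count can be reproduced without any JSON library. The following is 
only a rough sketch, not code from this PR and not how Newtonsoft.Json tracks 
depth internally: `JsonDepth`, `MaxContainerDepth` and `NestedRecordSchema` are 
hypothetical names, and the generated schemas are assumed to nest each record 
as the type of a field in the parent record, like the example above.

   ```csharp
   using System;

   static class JsonDepth
   {
       // Returns the deepest nesting of JSON objects and arrays in the text.
       // Hypothetical helper for illustration; not part of Avro or Newtonsoft.Json.
       public static int MaxContainerDepth(string json)
       {
           int depth = 0, max = 0;
           bool inString = false, escaped = false;
           foreach (char c in json)
           {
               if (inString)
               {
                   // Skip string contents so that brackets inside strings are not counted.
                   if (escaped) escaped = false;
                   else if (c == '\\') escaped = true;
                   else if (c == '"') inString = false;
                   continue;
               }
               switch (c)
               {
                   case '"':
                       inString = true;
                       break;
                   case '{':
                   case '[':
                       depth++;
                       if (depth > max) max = depth;
                       break;
                   case '}':
                   case ']':
                       depth--;
                       break;
               }
           }
           return max;
       }

       // Builds a chain of `levels` Avro records, each nested as a field type of its parent.
       public static string NestedRecordSchema(int levels)
       {
           string schema = "{\"type\":\"record\",\"name\":\"Level" + levels
               + "\",\"fields\":[{\"name\":\"leaf\",\"type\":\"int\"}]}";
           for (int i = levels - 1; i >= 1; i--)
           {
               schema = "{\"type\":\"record\",\"name\":\"Level" + i
                   + "\",\"fields\":[{\"name\":\"level" + (i + 1)
                   + "\",\"type\":" + schema + "}]}";
           }
           return schema;
       }
   }
   ```

   With schemas of this shape, every record level contributes three containers 
(the record object, the `fields` array, and the field object), so 
`JsonDepth.MaxContainerDepth(JsonDepth.NestedRecordSchema(4))` evaluates to 12. 
Schemas shaped differently will give a different ratio.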
   


