Thiago Romão Barcala created AVRO-4090:
------------------------------------------

             Summary: PHP data is validated multiple times for nested schemas
                 Key: AVRO-4090
                 URL: https://issues.apache.org/jira/browse/AVRO-4090
             Project: Apache Avro
          Issue Type: Improvement
            Reporter: Thiago Romão Barcala


Consider the test script below:
{code:php}
<?php

use Apache\Avro\Datum\AvroIOBinaryEncoder;
use Apache\Avro\Datum\AvroIODatumWriter;
use Apache\Avro\IO\AvroStringIO;
use Apache\Avro\Schema\AvroSchema;

require_once 'vendor/autoload.php';

$writer = new AvroIODatumWriter();

$schemaJson = <<<'JSON'
    {
        "type": "record",
        "name": "A",
        "fields": [
            {
                "name": "a",
                "type": {
                    "type": "record",
                    "name": "B",
                    "fields": [
                        {
                            "name": "b",
                            "type": {
                                "type": "record",
                                "name": "C",
                                "fields": [
                                    {
                                        "name": "c",
                                        "type": {
                                            "type": "record",
                                            "name": "D",
                                            "fields": [
                                                {
                                                    "name": "d",
                                                    "type": {
                                                        "type": "record",
                                                        "name": "E",
                                                        "fields": [
                                                            {
                                                                "name": "e",
                                                                "type": "string"
                                                            }
                                                        ]
                                                    }
                                                }
                                            ]
                                        }
                                    }
                                ]
                            }
                        }
                    ]
                }
            }
        ]
    }
    JSON
    ;

$data = ['a' => ['b' => ['c' => ['d' => ['e' => 'value']]]]];

$schema = AvroSchema::parse($schemaJson);
$io = new AvroStringIO();
$writer->writeData($schema, $data, new AvroIOBinaryEncoder($io));
var_dump($io->__toString());
{code}
Running the script above with the command below and inspecting the profiler 
output shows that the method AvroSchema::isValidDatum is called 21 times:
{code:bash}
php -dxdebug.start_with_request=true -dxdebug.mode=profile \
  -dxdebug.output_dir=$(pwd) test.php
{code}
Validation should run only 6 times, though: once for each of the five records 
and once for the string value. The extra calls happen because writeData is 
called for every field of a record, and each writeData call validates the 
entire data graph below that field again. This is consistent with the observed 
count: 6 + 5 + 4 + 3 + 2 + 1 = 21.
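
The counting can be reproduced with a small stand-alone model. This is only a 
sketch and does not use the Avro library: the functions isValid, 
writeValidated and countValidations and the array-based schema shape are 
hypothetical stand-ins for illustration, assuming the writer revalidates each 
field value before writing it.
{code:php}
<?php

// Toy model, NOT the Avro library: a schema is either the string 'string'
// or an array ['fields' => [name => schema, ...]] describing a record.

$validations = 0;

// Recursively validates $datum against $schema, counting every call.
function isValid($schema, $datum): bool
{
    global $validations;
    $validations++;
    if ($schema === 'string') {
        return is_string($datum);
    }
    if (!is_array($datum)) {
        return false;
    }
    foreach ($schema['fields'] as $name => $fieldSchema) {
        if (!array_key_exists($name, $datum) || !isValid($fieldSchema, $datum[$name])) {
            return false;
        }
    }
    return true;
}

// Walks the data as a writer would. With $revalidate = true every field
// value is validated again before it is written (the assumed current
// behavior); with false the walk trusts the initial validation.
function writeValidated($schema, $datum, bool $revalidate): void
{
    if ($schema === 'string') {
        return; // a real writer would encode the string here
    }
    foreach ($schema['fields'] as $name => $fieldSchema) {
        if ($revalidate && !isValid($fieldSchema, $datum[$name])) {
            throw new InvalidArgumentException("Invalid field: $name");
        }
        writeValidated($fieldSchema, $datum[$name], $revalidate);
    }
}

// Validates the whole datum once, writes it, and returns the call count.
function countValidations($schema, $datum, bool $revalidate): int
{
    global $validations;
    $validations = 0;
    if (!isValid($schema, $datum)) {
        throw new InvalidArgumentException('Invalid datum');
    }
    writeValidated($schema, $datum, $revalidate);
    return $validations;
}

// Build the same shape as the test script: records A..E around one string.
$schema = 'string';
$data = 'value';
foreach (['e', 'd', 'c', 'b', 'a'] as $name) {
    $schema = ['fields' => [$name => $schema]];
    $data = [$name => $data];
}

$naive = countValidations($schema, $data, true);
$once = countValidations($schema, $data, false);
echo "revalidating: $naive, validate once: $once\n";
{code}
In this model, validating once at the top level and then writing without 
further checks brings the count down from 21 to 6. Whether the same split 
applies directly to AvroIODatumWriter is an assumption that would need to be 
checked against the actual code.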



