ashb commented on a change in pull request #21445:
URL: https://github.com/apache/airflow/pull/21445#discussion_r806227197



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -707,6 +710,9 @@ def deserialize_operator(cls, encoded_op: Dict[str, Any]) -> Union[BaseOperator,
                 v = cls._deserialize_timedelta(v)
             elif k in encoded_op["template_fields"]:
                 pass
+            elif k == "resources" and Resources.is_operator_resources_json_string(v):

Review comment:
       Can you explain why this `is_...()` function is needed? It's not a pattern we have elsewhere in serialization.
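For context, a minimal Airflow-free sketch (hypothetical names, not the project's actual code) of the tagged-encoding pattern the comment alludes to: the encoder stamps each value with an explicit type marker, so the decoder dispatches on the tag instead of sniffing the value with an `is_...()` check.

```python
import datetime

def encode(value):
    # Stamp a type tag alongside the payload so decoding never has to guess.
    if isinstance(value, datetime.timedelta):
        return {"__type": "timedelta", "__var": value.total_seconds()}
    return {"__type": "dict", "__var": value}

def decode(blob):
    # Dispatch purely on the tag, not on the shape of the payload.
    if blob["__type"] == "timedelta":
        return datetime.timedelta(seconds=blob["__var"])
    return blob["__var"]

assert decode(encode(datetime.timedelta(minutes=5))) == datetime.timedelta(minutes=5)
assert decode(encode({"cpus": 0.1})) == {"cpus": 0.1}
```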

##########
File path: tests/serialization/test_dag_serialization.py
##########
@@ -1174,6 +1175,24 @@ def test_operator_deserialize_old_names(self):
         serialized_op = SerializedBaseOperator.deserialize_operator(blob)
         assert serialized_op.downstream_task_ids == {'foo'}
 
+    def test_task_resources(self):
+        """
+        Test task resources serialization/deserialization.
+        """
+        from airflow.operators.dummy import DummyOperator
+
+        execution_date = datetime(2020, 1, 1)
+        task_id = 'task1'
+        with DAG("test_task_resources", start_date=execution_date) as dag:
+            task = DummyOperator(task_id=task_id, resources={"cpus": 0.1, "ram": 2048})
+
+        SerializedDAG.validate_schema(SerializedDAG.to_dict(dag))
+

Review comment:
       Please can you add a test here that checks the json/dict structure, not 
just the end-to-end. (This is important for making sure that we can continue to 
load the serialized blobs we store in the DB.)
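A hedged, Airflow-free sketch of the distinction being requested (`serialize_resources` is a stand-in, not the real serializer): a round-trip test passes even if the serialized shape silently changes, while pinning the exact dict fails loudly on any format drift that would break blobs already stored in the DB.

```python
import json

def serialize_resources(cpus, ram):
    # Stand-in for the real serializer; the point is the two assertion styles.
    return {"cpus": cpus, "ram": ram}

blob = serialize_resources(0.1, 2048)

# End-to-end style: only proves the value survives a round trip.
assert json.loads(json.dumps(blob)) == blob

# Structure style: pins the exact shape persisted to the DB, so any change
# to the serialized format is caught here.
assert blob == {"cpus": 0.1, "ram": 2048}
```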

##########
File path: tests/serialization/test_dag_serialization.py
##########
@@ -1174,6 +1175,24 @@ def test_operator_deserialize_old_names(self):
         serialized_op = SerializedBaseOperator.deserialize_operator(blob)
         assert serialized_op.downstream_task_ids == {'foo'}
 
+    def test_task_resources(self):
+        """
+        Test task resources serialization/deserialization.
+        """
+        from airflow.operators.dummy import DummyOperator
+
+        execution_date = datetime(2020, 1, 1)
+        task_id = 'task1'
+        with DAG("test_task_resources", start_date=execution_date) as dag:
+            task = DummyOperator(task_id=task_id, resources={"cpus": 0.1, "ram": 2048})
+
+        SerializedDAG.validate_schema(SerializedDAG.to_dict(dag))
+

Review comment:
       ...that the resources part of the task is correctly represented -- crucially, that it's tested in this file, so that it's more obvious if someone makes a breaking change to how this is serialized in the future.

##########
File path: tests/serialization/test_dag_serialization.py
##########
@@ -1174,6 +1175,24 @@ def test_operator_deserialize_old_names(self):
         serialized_op = SerializedBaseOperator.deserialize_operator(blob)
         assert serialized_op.downstream_task_ids == {'foo'}
 
+    def test_task_resources(self):
+        """
+        Test task resources serialization/deserialization.
+        """
+        from airflow.operators.dummy import DummyOperator
+
+        execution_date = datetime(2020, 1, 1)
+        task_id = 'task1'
+        with DAG("test_task_resources", start_date=execution_date) as dag:
+            task = DummyOperator(task_id=task_id, resources={"cpus": 0.1, "ram": 2048})
+
+        SerializedDAG.validate_schema(SerializedDAG.to_dict(dag))
+

Review comment:
       This is sort of what I was thinking of 
https://github.com/apache/airflow/blob/bff0b101388dc2a7c0fdf550540d38240b020181/tests/serialization/test_dag_serialization.py#L1676-L1726

##########
File path: tests/serialization/test_dag_serialization.py
##########
@@ -1654,6 +1673,28 @@ def test_mapped_operator_xcomarg_serde():
     assert xcom_arg.operator is serialized_dag.task_dict['op1']
 
 
+def test_task_resources_serde():
+    """
+    Test task resources serialization/deserialization.
+    """
+    from airflow.operators.dummy import DummyOperator
+
+    execution_date = datetime(2020, 1, 1)
+    task_id = 'task1'
+    with DAG("test_task_resources", start_date=execution_date) as _:
+        task = DummyOperator(task_id=task_id, resources={"cpus": 0.1, "ram": 2048})
+
+    serialized = SerializedBaseOperator._serialize(task)
+    assert serialized['resources'] == (
+        '{'
+        '"cpus": {"name": "CPU", "qty": 0.1, "units_str": "core(s)"}, '
+        '"disk": {"name": "Disk", "qty": 512, "units_str": "MB"}, '
+        '"gpus": {"name": "GPU", "qty": 0, "units_str": "gpu(s)"}, '
+        '"ram": {"name": "RAM", "qty": 2048, "units_str": "MB"}'
+        '}'
+    )

Review comment:
       This shouldn't be a string of a JSON object nested inside a JSON object -- it should be the JSON object directly.
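A small illustration of the double-encoding problem the reviewer is pointing at (plain `json`, no Airflow): when the resources payload is stored as a JSON string inside the outer object, consumers must parse twice and cannot query fields directly; storing it as a nested object avoids both issues.

```python
import json

resources = {"cpus": {"name": "CPU", "qty": 0.1, "units_str": "core(s)"}}

# Double-encoded: the payload is a JSON *string* inside the outer JSON
# object; readers must call json.loads twice to reach a field.
double = json.dumps({"resources": json.dumps(resources)})
assert isinstance(json.loads(double)["resources"], str)

# Encoded directly: the payload is a nested JSON object, queryable after a
# single json.loads.
direct = json.dumps({"resources": resources})
assert json.loads(direct)["resources"]["cpus"]["qty"] == 0.1
```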




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

