Hi Shlomi,

If you intend to use GPUs for machine learning inference, the following
resources may also be of interest to you.

RunInference transform information:
https://beam.apache.org/documentation/sdks/python-machine-learning/

You may also want to have a look at:
https://cloud.google.com/dataflow/docs/machine-learning
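
Resource hints can also be set for the whole pipeline instead of on a
single transform. A minimal sketch, assuming the Java SDK's resourceHints
pipeline option as described on the Beam resource hints page; the values
shown (50GB, the GPU type and count) are placeholders, not recommendations:

```
--resourceHints=min_ram=50GB
--resourceHints=accelerator="type:nvidia-tesla-t4;count:1;install-nvidia-driver"
```

The accelerator line is only relevant when the workers should have a GPU
attached, and whether any hint is honored depends on the runner.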

Cheers

Reza

On Mon, 6 Feb 2023 at 13:24, Bruno Volpato via dev <dev@beam.apache.org>
wrote:

> Hi Shlomi,
>
> Unfortunately, those cited references are about as much as we have
> available. I acknowledge that they are not very comprehensive -- so I'll
> try to share some insight.
>
> Regarding your sample, I believe there are relevant pieces missing, as I
> am not sure what the input looks like (bounded / unbounded, and what the
> triggering looks like if unbounded) or how KVs became Rows.
> But regarding ResourceHints, they are applicable to any PTransform, so in
> your example, you can apply it directly when composing
> AvroIO.parseFilesGenericRecords:
>
> .apply("Match file names", FileIO.matchAll())
> .apply("Read Avro files", FileIO.readMatches())
> .apply("Parse Avro files into GenericRecord",
>         AvroIO.parseFilesGenericRecords(new CustomerTransformFn())
>                 .withCoder(KvCoder.of(Customer.keyCoder(), Customer.valueCoder()))
>                 .setResourceHints(ResourceHints.create().withMinRam("50GB")))
> .apply("Chunk customer", GroupIntoBatches.<Row, Row>ofSize(size)
>         .withMaxBufferingDuration(Duration.standardSeconds(duration)))
>
>
> Accelerators mostly concern the use of GPUs (
> https://cloud.google.com/dataflow/docs/guides/using-gpus), which may
> outperform CPUs in certain scenarios (such as graphics or ML workloads that
> require heavy parallelization/vectorization), but I don't think the
> transforms mentioned here are ready to leverage them.
>
> Besides providing good resource hints so the workers are sized
> accordingly, I'd suggest analyzing which steps are being fused together
> (please check
> https://cloud.google.com/dataflow/docs/guides/right-fitting#right_fitting_and_fusion),
> as it may be the case that you could separate file discovery / matching
> (again, without analyzing the missing parts of the graph, it may be hard to
> make good suggestions).
>
>
> Best,
> Bruno
>
> On Mon, Feb 6, 2023 at 2:50 PM Ahmet Altay <al...@google.com> wrote:
>
>> Adding @John Casey <johnjca...@google.com> @Bruno Volpato
>> <bvolp...@google.com> - who might be able to point to relevant docs.
>>
>> On Sat, Feb 4, 2023 at 11:59 AM Shlomi Elbaz <shlom...@optimove.com>
>> wrote:
>>
>>> Hello All,
>>>
>>>
>>>
>>> We developed a service with Apache Beam that reads an Avro file
>>> located in a GCP bucket.
>>>
>>> We ran load and benchmark tests, and during the pipeline we hit a
>>> bottleneck and *out-of-memory* issues in the stage where the service
>>> reads the Avro files with AvroIO.parseFilesGenericRecords.
>>>
>>>
>>>
>>> The issue occurs in the "Parse Avro files into GenericRecord" step:
>>>
>>> .apply("Match file names", FileIO.matchAll())
>>> .apply("Read Avro files", FileIO.readMatches())
>>> .apply("Parse Avro files into GenericRecord",
>>>         AvroIO.parseFilesGenericRecords(new CustomerTransformFn())
>>>                 .withCoder(KvCoder.of(Customer.keyCoder(), Customer.valueCoder())))
>>> .apply("Chunk customer", GroupIntoBatches.<Row, Row>ofSize(size)
>>>         .withMaxBufferingDuration(Duration.standardSeconds(duration)))
>>>
>>>
>>>
>>> We saw a tutorial on resource hints on the Apache Beam website, but
>>> it has no examples or information on how to use them with
>>> AvroIO.parseFilesGenericRecords.
>>>
>>> https://beam.apache.org/documentation/runtime/resource-hints/
>>>
>>>
>>>
>>> Is there more information, or are there examples, where we can read
>>> about ResourceHints and accelerators?
>>>
>>>
>>>
>>> Also, could you please recommend optimal settings for using
>>> ResourceHints?
>>>
>>>
>>>
>>> The additional tutorials that we rely on:
>>>
>>> https://www.youtube.com/watch?v=9fc2MNQHQ2s
>>>
>>> https://cloud.google.com/dataflow/docs/guides/right-fitting
>>>
>>>
>>> https://cloud.google.com/blog/products/data-analytics/introducing-vertical-autoscaling-in-dataflow-prime
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Shlomi Elbaz,
>>>
>>>
>>>
>>>
>>>
>>> *Shlomi Elbaz*
>>> Fullstack Developer
