Hi Becket,

Regarding the IPC question: there is no separate process involved. DJL calls the DL engines' C/C++ APIs in-process via JNI/JNA, so the overhead of crossing between Java and C++ is minimal (~10ns). Performance-wise, DJL offers true multi-threaded inference in Java: you load a model once and use it from as many threads as you like. It has been used in streaming tasks with a 1ms latency budget, where DJL adds 400ns or less per inference for recommendation models.
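To make the load-once/share-across-threads pattern concrete, here is a minimal sketch. It assumes DJL's Criteria/ZooModel/Predictor API; the model URL and the String-to-Classifications types are placeholders, not a specific recommended model. The key point is that the ZooModel (which holds the native engine handle reached via JNI) is shared, while each thread creates its own lightweight Predictor, since Predictors are not thread-safe:

import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class MultiThreadedInference {
    public static void main(String[] args) throws Exception {
        // Describe the model once; types and URL below are placeholders.
        Criteria<String, Classifications> criteria =
                Criteria.builder()
                        .setTypes(String.class, Classifications.class)
                        .optModelUrls("djl://ai.djl.pytorch/distilbert") // placeholder model URL
                        .build();

        // Load the model once; the ZooModel is safe to share across threads.
        try (ZooModel<String, Classifications> model = criteria.loadModel()) {
            Runnable task = () -> {
                // Predictors are cheap but not thread-safe: one per thread.
                try (Predictor<String, Classifications> predictor = model.newPredictor()) {
                    System.out.println(predictor.predict("I love Flink!"));
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            };
            Thread t1 = new Thread(task);
            Thread t2 = new Thread(task);
            t1.start();
            t2.start();
            t1.join();
            t2.join();
        }
    }
}

A companion sketch of the Flink-UDF integration path discussed in the quoted thread is appended at the bottom of this message.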
Thanks,
Qing

> On Jan 15, 2021, at 12:36 AM, Becket Qin <becket....@gmail.com> wrote:
>
> Hi Qing,
>
> Thanks for raising the discussion. It is great to learn about the DJL project.
>
> If I understand correctly, the discussion is mostly about inference. DJL
> essentially provides a uniform Java API for using different deep learning
> engines. It is useful for people to combine Flink and DJL so they can
> essentially have a "deep learning UDF" in their Flink job to do the
> inference. That is what the Sentiment Analysis example does, and it makes
> a lot of sense to me.
>
> Personally, I think it is already simple enough for people to leverage DJL
> in the inference case via Flink UDFs. But there are a couple of things we
> could do to make this solution more visible:
>
> 1. Have a built-in DJLPredictorMapper which wraps the DJL predictor. This
> makes the solution more visible to users, but I am not sure it is worth
> doing, because it would introduce an external dependency on DJL in Flink,
> which is something we may want to avoid.
> 2. Add the DJLPredictorMapper to a 3rd-party project (personally I don't
> think it is necessary; a code snippet example seems good enough), list it
> on the flink-packages website [1], and add a Flink ML use case page to the
> Flink website to advertise the usage alongside other Flink ML usages.
>
> I am in favor of option 2 here.
>
> Apart from that, I am very curious about the exact latency and performance
> overhead of IPC in DJL - I assume there is an IPC between the JVM and other
> processes under the hood.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> [1] https://flink-packages.org/
>
>> On Fri, Jan 15, 2021 at 10:12 AM Qing Lan <lanking...@live.com> wrote:
>>
>> Hi all,
>>
>> On behalf of the AWS DJL team, I would like to discuss Apache Flink's ML
>> integration. We would like to contribute more Deep Learning (DL) based
>> applications to Flink, including but not limited to TensorFlow, PyTorch,
>> Apache MXNet, Apache TVM, and more, through DJL. Do you have any thoughts
>> on having these DL engines in the Flink ML module?
>>
>> Here is an example using Apache Flink to do Sentiment Analysis with
>> PyTorch:
>> https://github.com/aws-samples/djl-demo/tree/master/flink/sentiment-analysis
>>
>> Some background about DJL: DJL (https://github.com/awslabs/djl) is an
>> open-source project (licensed Apache 2.0) that aims to bridge DL
>> applications into the Java world.
>> It offers a fully multi-threaded,
>> low-memory inference experience across all DL engines and has been used in
>> online services, streaming applications, and distributed inference.
>>
>> Thanks,
>> Qing
>>
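P.S. For anyone looking at the "deep learning UDF" approach discussed above, here is a rough sketch of a Flink RichMapFunction wrapping a DJL predictor. It is modeled loosely on the sentiment-analysis demo linked above, not copied from it; the class name, model URL, and input/output types are illustrative placeholders:

import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// Illustrative "deep learning UDF": one model and predictor per parallel subtask.
public class DJLSentimentMapper extends RichMapFunction<String, String> {

    private transient ZooModel<String, Classifications> model;
    private transient Predictor<String, Classifications> predictor;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Load the model once per task slot, not once per record.
        Criteria<String, Classifications> criteria =
                Criteria.builder()
                        .setTypes(String.class, Classifications.class)
                        .optModelUrls("djl://ai.djl.pytorch/distilbert") // placeholder model URL
                        .build();
        model = criteria.loadModel();
        predictor = model.newPredictor();
    }

    @Override
    public String map(String sentence) throws Exception {
        // Each subtask invokes map() from a single thread, so one predictor is safe here.
        return predictor.predict(sentence).best().getClassName();
    }

    @Override
    public void close() {
        if (predictor != null) predictor.close();
        if (model != null) model.close();
    }
}

Usage would be an ordinary map, e.g. stream.map(new DJLSentimentMapper()); whether something like this ships in Flink itself or lives on flink-packages is exactly the option-1-vs-option-2 question Becket raised.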