Hi people,

This message is neither good news nor a request for help. I'm writing to share some of my points about deep learning framework packaging, after re-evaluating the state of TensorFlow's latest build system. My conclusions are drawn from failures rather than successes. That said, they should be helpful to future maintainers who would like to maintain similar packages[1]. You will probably also find some of my original motivations for DUPR[2] and SIMDebian[6] in these points.

In Debian's context, maintainers have to face three obstacles:

1. License. Unfortunately, the de facto dominant performance library is cuDNN[3]. I'd say no serious user[4] would use a deep learning framework without cuDNN or TPU[5] acceleration. Maintaining a bunch of contrib or non-free stuff is not a good experience in Debian. Packaging for cuDNN is available on Salsa under nvidia-team, but the plan to upload it was aborted because its license looks too scary.

2. ISA baseline. If you remember SIMDebian[6], or some of my motivations for DUPR[2], it is easy to understand how the absence of SIMD code affects critical computational performance. People offered helpful suggestions on this point, including ld.so[7] tricks and some gcc features that allow run-time code selection according to CPU capability[8]. The ld.so trick would bloat the resulting .deb packages, but it is the most applicable solution. In contrast, patching a million lines of TensorFlow code to enable the "function attributes" feature is probably impossible for a volunteer.

3. Build system. Look at the build systems of TensorFlow and PyTorch[10]. They are volatile due to the fast pace of development. Specifically, TensorFlow's build system, "bazel", is very hard to package for Debian, and a certain amount of patching work is still required to prevent bazel from downloading ~3.0GiB of ???[9] before building TensorFlow. PyTorch's setup.py+cmake+shell build system ... requires some patching work too.

So I recommend that any future contributor who is about to deal with deep learning packages carefully assess the three aspects above. To some extent I envy some other distros such as Arch and Gentoo, since they have already made great progress in this field.

Some time ago (maybe several months?), I said in the Debian Science team that I was aborting deep learning framework related development. Today Paul Liu poked me and asked about the status of src:tensorflow (in experimental).
I spent several hours re-evaluating the situation, and finally decided to give up completely and write the points above, because I'm no longer willing to take on the workload. At the same time, I filed Orphan[11] bugs against tensorflow and several of its dependencies, except for src:nsync, which contains a neat set of cmake files. I plan to convert those Orphan bugs into RM bugs after a year if no one touches them.

I do research with neural networks and use these frameworks frequently. Anaconda and pip are already good enough for me, so DUPR[2] is the best choice for me if I want some .deb packages.

This time I'm really giving up all related efforts[12], and shall never touch them again. I don't regret it, even though these points are tightly connected to some of my Debian activities. Apart from that, I'm still willing to offer personal opinions on related packaging work, machine learning datasets, pretrained neural networks, etc.

Well, this result looks bad. Let's hope for a sunrise.

Best,
Mo

[1] Please take extra care with computational performance.
[2] https://github.com/dupr/duprkit
[3] (non-free) https://developer.nvidia.com/cudnn
[4] Business groups, researchers.
[5] Google's computation acceleration hardware.
[6] https://github.com/SIMDebian/SIMDebian
[7] man ld.so -> search for "hardware capabilities"
[8] info gcc "Function Attributes"; see Guillem's recent reply to "SIMDebian: ..." (d-devel@l.d.o)
[9] I don't know what they are. They are more than build-deps.
[10] They are the top-2 frameworks.
[11] What a relief.
[12] My ongoing work on intel-mkl / BLAS / LAPACK is unrelated. I still have a strong interest in many other aspects of Debian development.