Dear mentors,

Wookey and I are trying to come up with a sane concept to package the googleapis project [1]. During our initial investigation a few questions came up that we would like to discuss publicly.

BACKGROUND:

'googleapis' is a collection of protocol buffer [2] files (protocol buffers being an interface description language for stable binary encoding of data) and gRPC service files. From those files, bindings for a variety of target languages (Python, Ruby, Java, C++ etc.) can be generated using the 'protoc' compiler with gRPC plugins.

Upstream _does_ offer a Makefile for generating those bindings, although it appears to be quite outdated (it ignores the protos in the subfolder 'grafeas'). However, this Makefile only generates source files (and headers), which is fine for Python etc. but not particularly useful for Java/C++ etc.

Furthermore, compiling these source files yourself can be quite tedious, because you need to know the dependency structure within the project, and this structure changes rather frequently.

Example: For an older version of the project (143084a2624b6591ee1f9d23e7f5241856642f4d), depending on 'google/longrunning/operations.pb.cc' requires you to also compile and link
  - google/api/annotations.pb.cc
  - google/rpc/status.pb.cc
whereas on current master you additionally need to compile and link google/api/client.pb.cc.

Most users probably do not want to deal with such internal dependencies and would just like to do:

  apt install libgoogleapis-dev
  pkg-config --libs googleapis_longrunning

Therefore, Wookey's idea was to also compile the Java/C++ bindings and package the resulting libraries. Here is where things become difficult:
  - We do not have a build description for most of the bindings (some subfolders have Bazel BUILD files, but most do not).
  - We are talking about ~3,500 proto files. Building all of them results in extremely large files:
    - jar: ~160MB
    - shared lib: ~3GB (with debug info), ~180MB (after stripping)
    - static lib: ~11GB (with debug info), ~600MB (after stripping)

QUESTIONS:

1. Due to the missing build description, is it ok if the maintainer provides a Makefile for building the C++ libraries in ./debian?

2. With such large libraries, I guess it makes sense to split them up. I think a good approach to grouping proto files (for separation into different libraries) would be to look at their 'package' identifier (like a namespace; it can be read from the file). Some packages belong to "sub packages" that might cause cyclic dependencies (e.g., grafeas.v1beta1.discovery). Therefore, I would suggest using a heuristic that cuts off the package ID after the first segment that matches '^v[1-9]+' (e.g., grafeas.v1beta1.discovery is cut to grafeas.v1beta1, resulting in libgoogleapis_grafeas_v1beta1.{a,so}). Doing this will result in 'only' 413 different packages/libraries. What do you think about this approach?

3. What granularity should we use for packaging? Should we provide these separated libraries via
  - a single debian package and a single dev package?
  - a debian package and dev package per library?
  - a debian package per library, but a single dev package for all headers?

4. Such a Makefile (and control file) will be quite lengthy. My current solution is to use a Python script for analysing the proto files, grouping them according to their package ID, building up a dependency graph, checking it for cycles, and finally generating the Makefile (and control file/pkg-config files etc.). With upcoming library releases this script could be extended and rerun.

5. The Java bindings are considerably smaller. In my opinion, those could be provided in a single debian package containing a single jar file. What do you think?

6. As the googleapis repository is not versioned, it is hard to judge which protoc version is compatible with the current proto source base. I was talking to an ex-Googler and he told me I should look at the PiperOrigin-RevId (shown in some of the commits). That's their internal linear commit counter. According to him, we should look up the PiperOrigin-RevId of the protobuf-compiler version that is currently packaged in that release. Then we should look through the googleapis commits and package the revision that fulfils the condition:

  PiperId(googleapis) <= PiperId(protoc)

According to him, that's what has been tested at Google internally and is guaranteed to work. The same applies to the packaged protobuf-compiler-grpc, which is also a build dependency of googleapis. Do you think this is a valid approach?

7. With no version given, what version should we use for this package?

That's all for now. Any suggestions are very welcome.

Many thanks!
Oliver

[1] https://github.com/googleapis/googleapis
[2] https://protobuf.dev/
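
P.S. To make the grouping described in questions 2 and 4 a bit more concrete, here is a minimal sketch in Python. It is not the actual generator script, only an illustration under some assumptions: the function names (truncate_package, group_protos, build_order) are hypothetical, the regular expressions for 'package' and 'import' statements are simplified and ignore comments, imports of files outside the given set (e.g. google/protobuf/*.proto) are silently skipped, and graphlib requires Python 3.9 or newer.

```python
import re
from collections import defaultdict
from graphlib import CycleError, TopologicalSorter  # Python >= 3.9

# Simplified patterns (assumption: statements are not hidden inside comments).
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(
    r'^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;', re.MULTILINE
)
VERSION_RE = re.compile(r"v[1-9]+")  # the '^v[1-9]+' heuristic, applied per segment


def truncate_package(package):
    """Cut the package ID after the first segment that looks like a version."""
    segments = package.split(".")
    for i, segment in enumerate(segments):
        if VERSION_RE.match(segment):
            return ".".join(segments[: i + 1])
    return package


def group_protos(protos):
    """Group proto files by truncated package ID.

    `protos` maps a proto path to its file content. Returns (files, deps):
    files[group] lists the proto paths in that group, deps[group] is the set
    of other groups it imports from.
    """
    group_of = {}
    for path, text in protos.items():
        match = PACKAGE_RE.search(text)
        # Files without a package statement end up in the "" group.
        group_of[path] = truncate_package(match.group(1)) if match else ""

    files = defaultdict(list)
    deps = defaultdict(set)
    for path, text in protos.items():
        group = group_of[path]
        files[group].append(path)
        deps[group]  # make sure groups without dependencies are present too
        for imported in IMPORT_RE.findall(text):
            other = group_of.get(imported)  # None for files outside `protos`
            if other is not None and other != group:
                deps[group].add(other)
    return dict(files), dict(deps)


def build_order(deps):
    """Return groups with dependencies first; raises CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())
```

With something like this, each key of `files` would correspond to one candidate library, and `build_order(deps)` would either give an order in which the libraries can be linked or fail with `graphlib.CycleError` if the chosen cut-off still leaves a cyclic dependency between groups.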