Re: Incoming PRs related to 'tree' enhancements

Bear Giles Mon, 07 Apr 2025 09:01:41 -0700

It's getting close to PR time - we should have my JIRA account situation
straightened out but I need to verify that.


The code is substantially complete - I just need to track down some new
errors. I suspect they're related to (safe) changes to the generated output
but need to be sure.

I ended up converting all of the existing drivers to velocity templates, or
more precisely velocity macros. It was mostly a CYA decision since I wanted
to ensure that all of the resources were available if new formats were
added and of course each format had just enough of a difference from the
rest to require a bit more work. E.g., do you include a final comma or not?
Do you just add the children or wrap them in [ ], etc.

The main things to show the differences are:

New images created by dot/graphviz:
https://github.com/coyotesong/maven-dependency-plugin/tree/convert-tree-dot-to-velocity/examples/dot/with-velocity

Overall changes:
https://github.com/coyotesong/maven-dependency-plugin/tree/convert-tree-dot-to-velocity/examples

Note in particular that the PNG image went from 32767 x 472 to 7687 x
1243.  That's still large - but it's usable. The original was actually
scaled down (since it was too large for PNG format) and some viewers still
couldn't handle it.

Note that the graphml and tgf formats went from an 'id' based on the
DependencyNode's hashCode() to a sequentially assigned one. This makes the
generated output much smaller and easy to check by hand. I suspect it may
also be the cause of some tests breaking since checking the generated id
against the object's hashCode() is a natural test.

The templates are at
https://github.com/coyotesong/maven-dependency-plugin/tree/convert-tree-dot-to-velocity/src/main/resources/templates
and
especially
https://github.com/coyotesong/maven-dependency-plugin/blob/convert-tree-dot-to-velocity/src/main/resources/templates/dependency-tree.vm.
I've managed to support all current formats with a single template - this
means all of the work can be done in the format-specific macros. It also
guarantees an overall consistency. We should still look at adding template
name and/or macro libraries as mojo parameters since I want to come back to
outputs group by groupid in addition to dependencies.

Of course new implementations could use their own templates, or even skip
velocity in general.

The individual macros were... an adventure. But they all work and I'll add
some documentation explaining how you get from something like "no final ,"
to the corresponding macro. Compare the macros to the original java code....

(Before you ask - I plan to move all of the macros to a single subdirectory
since there's only one file per format at the moment.)

Finally ALL of the existing *DependencyNodeVisitor implementations now look
like this:

/**
 * A dependency node visitor that serializes visited nodes to a writer
using the
 * <a href="https://en.wikipedia.org/wiki/GraphML";>graphml format</a>.
 *
 * @author <a href="mailto:jerome.creig...@gmail.com";>Jerome Creignou</a>
 * @author <a href="mailto:bgi...@coyotesong.com";>Bear Giles</a> (3.9)
 * @since 2.1
 */
public class GraphmlDependencyNodeVisitor extends
VelocityDependencyNodeVisitor {
    public static final String DEFAULT_MACRO_LIBRARY =
"templates/macros/graphml/graphml-macros-for-dependency-tree.vm";

    /**
     * {@inheritDoc}
     */
    public GraphmlDependencyNodeVisitor(Writer writer) {
        super(writer, DEFAULT_TEMPLATE_NAME, DEFAULT_MACRO_LIBRARY);
    }
}

JIRA issue and PR should soon follow...

Bear


On Sun, Apr 6, 2025 at 10:43 AM Bear Giles <bgi...@coyotesong.com> wrote:

> I tried sending an update but hit the attachment size limit.
>
> After sleeping on it I realized that making several small patches might be
> better from the review perspective but many decisions won't make sense
> until you see it in action. So I'm back to a single larger patch, but one
> without the advanced stuff I was experimenting with earlier. That includes
> taking advantage of the poorly documented (imo) fact that the
> DependencyNodeVisitor can know that it's the last call to endVisit() since
> the node will not have a parent. My patch includes this check but then does
> all of the real work in a different method.
>
> This also let me check whether velocity could handle the different output
> formats without much effort and that's lead to a possible separate
> contribution to the velocity team. The class can still be duplicated in
> this plugin since it's not very large and it would be difficult to insist
> the next release depends on the most recent velocity release. However it
> could be marked optional since velocity is currently only used during
> testing....
>
> So the changes are:
>
>  - introducing a velocity util + a few related classes due to the
> ability/need for .dot files to include additional information
>  - a conversion of .dot, .json, and possibly a few other formats. I had
> only planned to convert .dot but it looks like several other formats will
> also be easy to convert and will have the same improved maintainability,
> etc.
>  - introducing a YamlDependencyNodeVistor. I doubt there's an outcry for
> it but it's an additional demonstration of how easy it is to support many
> formats once you have a template engine.
>  - all changes should be limited to the 'tree' package and a new templates
> or velocity utils package. I looked at the other repo but didn't see an
> obvious place to put these changes. I guess the velocity utils could go
> there...
>
> One other item came up during this research - most formats include all
> properties, including null. This is usually fine but it makes the output
> files larger and that can have secondary effects. Like hitting the limit on
> a max attachment size. It would be easy to add a bit of logic to drop any
> null values but I know it can't be the default since some applications will
> want every value to be provided.
>
> Is it worth adding a mojo parameter to indicate whether null values should
> be suppressed?
>
> FWIW I'm already planning to add a mojo parameter for the name of the
> velocity template since users may want to customize it. The plugin will
> always default to the existing format (modulo possible reformatting) but it
> would be easy to include a second template that removes anything with a
> null value.
>
> I'll follow up with some links to examples on my forked repo...
>
>
> Bear
>
> On Sun, Mar 30, 2025 at 10:38 PM Bear Giles <bgi...@coyotesong.com> wrote:
>
>> My primary motivation is making the `mvn dependency:tree
>> -DoutputType=dot` output usable. That's why I've been looking at that code
>> - I've used 'dependency' for years.
>>
>> I can definitely look at repeating the effort but unless the existing
>> implementation is going away soon I would prefer to finish my current work
>> and then circle back to this. (This is actually about 3 levels deep from my
>> primary focus - it's been a series of "oh, if I have *this* then I can do
>> *that* more quickly. I would like to get back to those other tasks asap.)
>>
>> I'm having trouble creating a PR so I can point you to what I have as a
>> start. (I suspect it's because I added an 'examples' directory so I can
>> include links instead of attachments.)
>>
>> The fork is at https://github.com:/coyotesong/maven-dependency-plugin.
>>
>> This directory contains the results of running 'mvn dependency:tree' on
>> this project.
>> https://github.com/coyotesong/maven-dependency-plugin/tree/master/examples.
>> The generated image files are located in
>>
>> https://github.com/coyotesong/maven-dependency-plugin/tree/master/examples/dot/3.8.2-SNAPSHOT
>> .
>>
>> The proposed changes are pretty much all in this package:
>> https://github.com/coyotesong/maven-dependency-plugin/tree/add-velocity-utils/src/main/java/org/apache/maven/plugins/dependency/utils/templates
>>
>> There's also the test class, some test templates, etc.
>>
>> Bear
>>
>>
>>
>> On Sun, Mar 30, 2025 at 1:57 PM Guillaume Nodet <gno...@apache.org>
>> wrote:
>>
>>> Which project are you targeting exactly ?
>>> If you're targeting Maven Core, do you think you could target the new API
>>> rather than the resolver one ?
>>>
>>>
>>> https://github.com/apache/maven/blob/master/api/maven-api-core/src/main/java/org/apache/maven/api/Node.java
>>> Or are you targeting the resolver ?
>>>
>>> Le dim. 30 mars 2025 à 19:04, Bear Giles <bgi...@coyotesong.com> a
>>> écrit :
>>>
>>> > I've taken the time to look around the other 'tree' classes, or more
>>> > precisely the '*DependencyNodeVisitor" classes, and noticed that most
>>> if
>>> > not all of them follow the anti-pattern of doing more than one thing in
>>> > each method. More precisely they mix logic ('what to include') and
>>> > presentation ('what to send to the writer). This makes code difficult
>>> to
>>> > understand, to test, to extend by a third party, etc.
>>> >
>>> > This is why template engines were developed! :-)
>>> >
>>> > Fortunately there's already a dependency on 'velocity', although it
>>> appears
>>> > to only be used during the extensive testing. However it's still there
>>> in
>>> > the 'compile' scope so using it won't require adding new dependencies.
>>> >
>>> > That's a very clean solution to the problem identified above. For most
>>> > formats the velocity template won't need anything beyond what's already
>>> > included in the WrappedNode (DependencyNode). If a format can benefit
>>> from
>>> > adding a tiny bit of additional information it can be added to the
>>> > WrappedNode since that class is exclusive to the 'tree' package. In
>>> more
>>> > complex cases the individual class can maintain it's own data
>>> structures
>>> > and add them to the Context provided to velocity.
>>> >
>>> > This should work with all of the existing classes although I'm only
>>> > planning to modify one at this time. It should make the classes much
>>> easier
>>> > to maintain since they can focus on the logic of how to convert the
>>> data
>>> > provided by the DependencyNode to the data usable by the Velocity
>>> engine. I
>>> > know other template engines, e.g., jinjava, have a way to provide
>>> > user-defined functions but I don't think velocity does - that's
>>> unfortunate
>>> > since it will require a modest amount of pre-computation. Fortunately
>>> that
>>> > can be handled separately, immediately before calling the Velocity
>>> 'merge'
>>> > method.
>>> >
>>> > This will only require a few classes - basically just a wrapper to set
>>> up
>>> > the velocity engine. It will also require a standard place to put the
>>> > templates and I recommend 'templates/{format}/{name} and
>>> > 'templates/{format}/macros/{name}. The 'format' would be 'dot', 'json',
>>> > etc., and the 'macros' allows velocity macros to be kept separately
>>> from
>>> > the templates since they're much more likely to be reused. (Hmm...
>>> perhaps
>>> > also 'templates/macros/' for macros that are format-agnostic.
>>> >
>>> > -----------------------------
>>> >
>>> > The second PR will probably be a bit more controversial since there's
>>> > multiple ways to implement it and comments in some of the 'upstream'
>>> > methods suggest there's already WIP that may affect how the data is
>>> > provided to the *DependencyNodeVisitor classes.
>>> >
>>> > At the moment these classes only include two methods - visit(node) and
>>> > endVisit(node). It is the responsibility of the endVisit(node) method
>>> to
>>> > determine that all data has been received. I think that's an obvious
>>> > omission but the DependencyNodeVisitor interface is defined in a
>>> different
>>> > project. It would be trivial to add to the existing implementations,
>>> and
>>> > java8 supports optional methods so there would be no impact on others
>>> if
>>> > this method is added to the interface with an empty optional method.
>>> >
>>> > That's not a big problem if your visitor is able to run in a
>>> 'streaming'
>>> > mode, but some formats can benefit from using a 'document' mode
>>> instead.
>>> > In most cases this doesn't matter since the results are not intended
>>> for
>>> > human consumption - they need to be a standard format accepted by other
>>> > applications and these formats are usually designed to work in a
>>> streaming
>>> > fashion.
>>> >
>>> > However the dot format, and potentially other, are intended for human
>>> > consumption. The file itself isn't human-friendly but there are widely
>>> > available tools to convert it into human-friendly images and even
>>> > interactive formats like svg. In these cases it makes much more sense
>>> to
>>> > use a document mode when creating the final document since it is often
>>> > appropriate to provide the same information, or at least related
>>> > information, in multiple places in the document.
>>> >
>>> > So there's clearly a benefit to a new method that is called immediately
>>> > after the final endVisit(), one that provides the complete dependency
>>> tree.
>>> > The implementations should be able to ignore visit() and endVisit().
>>> (And
>>> > in fact they can be added to the abstract class if not the
>>> > DependencyNodeVisitor interface)
>>> >
>>> > The question is how to implement this, esp. when looking at other
>>> potential
>>> > enhancements to the class.
>>> >
>>> > For instance it's possible to have DependencyNodeVisitor implement
>>> > Consumer<DependencyNode> and the 'accept()' method calls visit(). It
>>> may
>>> > not be practical since many implementations will need the nesting that
>>> > endVisit() provides.
>>> >
>>> > Another possibility is to have DependencyNodeVisitor implement
>>> > Function<DependencyNode, Boolean> that's called after the final
>>> endVisit()
>>> > call. Again the idea is to simplify integration into streams but
>>> there's no
>>> > reason you couldn't just use the (object::function) notation.
>>> >
>>> > I think this ultimately comes down to what's best with future
>>> improvements
>>> > to the upstream code that creates the data sent to these classes.
>>> There are
>>> > notes that indicate the desire to shift to a streaming pattern (while
>>> > maintaining legacy interfaces, of course) and it would make sense to
>>> use a
>>> > consistent approach here - even if it means reformatting the data
>>> > immediately before calling the new method.
>>> >
>>> > For now I'll just add a new method to the abstract class that all of
>>> the
>>> > implementations extend but I consider it a stopgap measure since this
>>> > functionality would be useful to third party developers, etc.
>>> >
>>> > (Sidenote - it could even quietly replace some of the existing logic
>>> since
>>> > the final method could easily perform a recursive descent of rootNode
>>> and
>>> > call visit() and endVisit() itself. I think this could be put into the
>>> > DependencyNodeVisitor interface as a default method so there would be
>>> no
>>> > risk to removing the existing code. The primary drawback would be that
>>> the
>>> > current approach allows the implementation to bail out early if it
>>> hits a
>>> > problem while this approach would require the entire rootNode
>>> structure to
>>> > be generated first.)
>>> >
>>> > Bear
>>> >
>>>
>>>
>>> --
>>> ------------------------
>>> Guillaume Nodet
>>>
>>

Re: Incoming PRs related to 'tree' enhancements

Reply via email to