Re: [pve-devel] [PATCH v1 pve-storage 2/8] pluginbase: add high-level plugin API description

Max Carrara Wed, 02 Apr 2025 09:31:47 -0700

On Mon Mar 31, 2025 at 5:13 PM CEST, Fabian Grünbichler wrote:
> On March 26, 2025 3:20 pm, Max Carrara wrote:
> > Add a short paragraph in DESCRIPTION serving as an introduction as
> > well as the GENERAL PARAMETERS and CACHING EXPENSIVE OPERATIONS
> > sections.
> > 
> > These sections are added in order to avoid repeatedly describing the
> > same parameters as well as to elaborate on / clarify a couple terms,
> > e.g. what the $cache parameter does or what a volume in our case is.
> > 
> > Signed-off-by: Max Carrara <m.carr...@proxmox.com>
> > ---
> >  src/PVE/Storage/PluginBase.pm | 77 +++++++++++++++++++++++++++++++++++
> >  1 file changed, 77 insertions(+)
> > 
> > diff --git a/src/PVE/Storage/PluginBase.pm b/src/PVE/Storage/PluginBase.pm
> > index e56aa72..16977f3 100644
> > --- a/src/PVE/Storage/PluginBase.pm
> > +++ b/src/PVE/Storage/PluginBase.pm
> > @@ -4,6 +4,83 @@ C<PVE::Storage::PluginBase> - Storage Plugin API Interface
> >  
> >  =head1 DESCRIPTION
> >  
> > +This module documents the public Storage Plugin API of PVE and serves
> > +as a base for C<L<PVE::Storage::Plugin>>. Plugins must B<always> inherit 
> > from
> > +C<L<PVE::Storage::Plugin>>, as this module is for documentation purposes
> > +only.
>
> does this make sense? if we now provide a clean base for the structure
> of plugins, why shouldn't plugins be able to use that, but instead have
> to inherit from PVE::Storage::Plugin which has a lot of extra stuff that
> makes things messy?
>
> granted, switching over to load from PluginBase could be done as a
> follow up or with 9.0 (or not at all, if there is a rationale)..
>
> after this series we have:
>
> SectionConfig
> -> PluginBase (not an actual base plugin w.r.t. SectionConfig, and not
> something you base plugins on as a result)
> --> Plugin (a combination of base plugin and base of all our
> directory-based plugins)
> ---> other plugins, including third party ones
>
> which seems unfortunate, even if the contents of PluginBase are helpful
> when implementing your own..
>
> IMHO this should either be
>
> SectionConfig
> -> PluginBase (actual base plugin, with SectionConfig implementations
> moved over from current Plugin.pm as needed)
> --> PluginTemplate (what's currently PluginBase in this series - nicely
> documented parent of third party plugins, not actually registered, just
> contains the storage.cfg interface without the low-level SectionConfig
> things)
> ---> NewThirdPartyPlugin (clean slate implementation just using the
> nicely documented interfaces, guaranteed to not use any helpers from
> Plugin since it's not a (grand)parent)
> --> Plugin (base of built-in plugins, probably base of existing third
> party plugins, should ideally have a different name in the future!)
> ---> DirPlugin
> ---> ...
> ---> ExistingThirdPartyPlugin (this might rely on helpers from Plugin,
> so we can't just rename that one unless we wait for 9.0)
>
> or
>
> SectionConfig
> -> PluginBase (actual base plugin + docs of Plugin API)
> --> Plugin (base of our plugins and existing third party ones, dir-related 
> helpers, ..)
> ---> other plugins, including third party ones
> --> NewThirdPartyPlugin (clean slate as above)


I agree with your point here -- for now, `::PluginBase` should IMO still
only be about documentation and enumerating what's part of the Plugin
API, *but* adding more layers inbetween can still be done eventually.

However, I think we shouldn't have two different parents for internal
and external plugins, as it would introduce yet another thing we'd have
to track wrt. APIAGE resets etc. Maybe it's not actually an issue in
this case, but ...

That actually gives me an idea that's similar to your first suggestion:

As of this series, the hierarchy would be as follows:

PluginBase
└── Plugin
    ├── ExistingThirdPartyPlugin
    ├── DirPlugin
    └── ...

We could keep `PluginBase` as it is, since IMO having the docs and the
SectionConfig-related stuff in one place is fine, unless we really want
to keep the docs separate from the rest of the code. (Would seem a bit
redundant to introduce another inheritance layer in that case, but I
personally don't mind.)

Then, we eventually introduce `PluginTemplate` on the same layer as
`Plugin`. `PluginTemplate` would only contain implementations for the
most basic methods (and not provide defaults for file-based storages).

PluginBase
├── Plugin
│   ├── ExistingThirdPartyPlugin
│   ├── DirPlugin
│   └── ...
└── PluginTemplate

The idea behind this is that we could then "migrate" each plugin and
base it off `PluginTemplate` instead. Helpers that are shared between
plugins could go into `PVE::Storage::Common::*` instead of being
implicitly becoming part of plugins' modules due to inheritance.

PluginBase
├── Plugin
│   ├── ...
│   └── ExistingThirdPartyPlugin
└── PluginTemplate
    ├── ...
    └── DirPlugin (cleaned up)

That way we could step by step "disentangle" the existing plugins from
each other without having to constantly keep the original behaviour(s)
of `Plugin` in the back of one's head and account for them. Instead,
each plugin would implement precisely what it itself needs.

Since both the old `Plugin` and the new `PluginTemplate` share the same
interface, namely `PluginBase`, we could still support the old `Plugin`
until everything's been moved over and third-party devs have had enough
time to adapt their own code, too.

While doing all this, we could also rework parts of the API that didn't
age that gracefully, perhaps deprecate certain old methods, introduce
new methods, etc. as we'd have a "clean" break, sotospeak.


So, to summarize my idea:
- Keep `PluginBase` as it is right now, but also include
  SectionConfig-related code
- Introduce `PluginTemplate` with minimal implementation later down the
  line on the same inheritance layer as `Plugin`
- Slowly migrate our plugins, basing them off of `PluginTemplate` while
  tossing out old code, making them independent from one another, and
  collecting shared helpers in `PVE::Storage::Common::*`


I'd really like to hear your thoughts on this, because I'm not sure if
this is *actually* feasible or provides any ROI down the line. One
alternative that I can think of is to just keep the inheritance
hierarchy as it is (as in this series) and disentangle the plugins as
they are right now, without changing their parent (so, almost the same
as your second idea). I did start breaking apart our plugins like that
last year, but that was too much all at once [1].

[1]: 
https://lore.proxmox.com/pve-devel/20240717094034.124857-1-m.carr...@proxmox.com/

>
> side-note: should we mention somewhere that plugin code is not called
> directly (pre-existing exceptions that we want to get rid off ignored),
> but that PVE::Storage is the "gateway"/interface for it?

Yeah, good point; we should definitely mention that. Will include it in v2!

>
> > +
> > +=head2 DEFAULT IMPLEMENTATIONS
> > +
> > +C<L<PVE::Storage::Plugin>> implements most of the methods listed in
> > +L</PLUGIN INTERFACE METHODS> by default. These provided implementations are
> > +tailored towards file-based storages and can therefore be used as-is in 
> > that
> > +case. Plugins for other kinds of storages will most likely have to adapt 
> > each
> > +method for their individual use cases.
>
> see above
>
> > +
> > +=head2 GENERAL PARAMETERS
> > +
> > +The parameter naming throughout the code is kept as consistent as possible.
> > +Therefore, common reappearing subroutine parameters are listed here for
> > +convenience:
> > +
> > +=over
> > +
> > +=item * C<< \%scfg >>
> > +
> > +The storage configuration associated with the given C<$storeid>. This is a
> > +reference to a hash that represents the section associated with 
> > C<$storeid> in
> > +C</etc/pve/storage.cfg>.
> > +
> > +=item * C<< $storeid >>
> > +
> > +The ID of the storage. This ID is user-provided; the IDs for existing
> > +storages can be found in the UI via B<< Datacenter > Storage >>.
>
> The unique ID of the storage, as used in C</etc/pve/storage.cfg> and as
> part of every volid.
>
> this is not really user-provided in most cases (that makes it sound like
> you have to be super careful when handling it to prevent exploits),
> although the user of course initially named the section like that ;)

ACK! Thanks for the suggestion!

>
> > +
> > +=item * C<< $volname >>
> > +
> > +The name of a volume. The term I<volume> can refer to a disk image, an ISO
> > +image, a backup, etc. depending on the content type of the volume.
>
> backup archive/snapshot ?
>
> should we list all types here?

Perhaps not here, but maybe in a separate section instead to which we
can refer to? Then again, they're listed in the docstring for
`plugindata()`. Perhaps its docs could benefit from such a section, too.

I'll see what I can do.

>
> > +
> > +=item * C<< $volid >>
> > +
> > +The ID of a volume, which is essentially C<"${storeid}:${volname}">. Less 
> > used
> > +within the plugin API, but nevertheless relevant.
>
> s/essentially// (it is exactly that, not essentially)
>
> the second sentence doesn't really add much, if we want to keep it, then
> I suggest replacing it with something like
>
> Frequently used in guest-specific API calls and passed to
> PVE::Storage::parse_volume_id to split it into storeid and volname parts
> before calling into storage plugin code.

Agree, this is much better.

>
> > +
> > +=item * C<< $snapname >> (or C<< $snap >>)
> > +
> > +The name of a snapshot associated with the given C<$volname>.
>
> what is "given" here? the phrasing doesn't really make sense IMHO ;)

Woops, that was a remnant of when I was moving this around. Thanks for
spotting!

>
> The name of a snapshot of a volume.
>
> > +
> > +=item * C<< \%cache >>
> > +
> > +See L<CACHING EXPENSIVE OPERATIONS>.
> > +
> > +=item * C<< $vmid >>
> > +
> > +The ID of a guest (so, either of a VM or a container).
> > +
> > +=back
> > +
> > +=head2 CACHING EXPENSIVE OPERATIONS
> > +
> > +Certain methods take a C<\%cache> as parameter, which is used to store the
> > +results of time-consuming / expensive operations specific to certain 
> > plugins.
>
> this is not really accurate. the cache is used to store different
> information, currently (according to a quick grep through the code):
>
> for storages (activate_storage/activate_storage_list):
> - parsed /proc/mounts for various dir-based storages where the plugin
>   handles mounting, to speed up "is storage already mounted" checks
> - udev sequence numbers (to know whether udev needs settling after
>   activating a storage)
> - a flag that a storage was activated (as an optimization to skip
>   "is it already active" checks)
>
> this already mixes state of PVE::Storage with plugin-specific data,
> which is a mess.
>
> when listing images across multiple storages:
> - vgs/lvs output (LVM plugins)
>
> that one we've eliminated from most plugins, since it rarely makes
> sense. LVM is a bit of an exception there, as we query global LVM state
> that is valid for multiple storage instances. this maybe could be
> changed to drop the usage as well, and instead query VG-specific
> information only?
>
> > +The exact lifetime of the cached data is B<unspecified>, since it depends 
> > on
> > +the exact usage of the C<L<PVE::Storage>> API, but can generally be 
> > assumed to
> > +be B<short-lived>.
>
> this is not really something that helps me as a plugin developer - what
> kind of information can/should/must I store in the cache (or not)? how
> is it structured?
>
> > +
> > +For example, the C<L<PVE::Storage::LVMPlugin>> uses this cache to store the
> > +parsed output of the C<lvs> command. Since the number and precise 
> > information
> > +about LVM's logical volumes is unlikely to change within a short time, 
> > other
> > +API methods can then use this data in order to avoid repeatedly calling and
> > +parsing the output of C<lvs>.
>
> this reads like the cache is used across PVE API calls (although maybe
> you meant "storage plugin API"?). the original (flawed) reason was/is
> that if we activate 20 volumes on a single storage for a certain task,
> then it is enough to check with LVM once and re-use the (slightly stale)
> data. this since got eliminated from most plugins as it was not working.
> the other use case (see above) is if we (potentially) activate 10
> storages, we can check once what is already mounted and re-use that for
> the subsequent 9 activations. I am not sure this is worth it either to
> be honest.
>
> > +
> > +Third-party plugin developers should ensure that the data stored and 
> > retrieved
> > +is specific to their plugin, and not rely on the data that other plugins 
> > might
> > +store in C<\%cache>. Furthermore, the names of keys should be rather 
> > unique in
> > +the sense that they're unlikely to conflict with any future keys that may 
> > be
> > +introduced internally. To illustrate, e.g. C<myplugin_mounts> should be 
> > used
> > +instead of a plain C<mounts> key.
>
> and this here clearly shows that the current interface is bogus and
> under-specified in any case. so until that is fixed, this here should
> read "ignore the cache parameter it is for internal use only". if a
> plugin needs to cache things internally, it can do so anyway based on
> its own criteria..

In the initial description I tried to simplify / generalize its
description and not really document every single implementation detail
so that third-party devs could use it for their own purposes in order to
avoid running the same time-consuming code in short succession.

I hadn't really thought of the fact that plugins can just keep an
internal cache if they want to cache stuff. 🤦

So yeah, after reading all this, I agree that we should just deter third
parties from using the parameter altogether.

Side-note: I'll also refrain from using it in the SSHFS example plugin I
recently cooked up [2].

[2]: 
https://lore.proxmox.com/pve-devel/20250328171209.503132-1-m.carr...@proxmox.com/

>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

Re: [pve-devel] [PATCH v1 pve-storage 2/8] pluginbase: add high-level plugin API description

Reply via email to