Bug#1068890: diffoscope: --hard-timeout option

2024-04-18 Thread Chris Lamb
Holger Levsen wrote:

>> (1) You suggest it should start again with "--max-container-depth 3",
>> but it would surely need some syntax (or another option?) to control
>> that "3" (but for the second time only).
>
> another option, --second-pass-max-container-depth or some such
>
>> (2) In fact, it's easy to imagine that one would want to restart with
>> other restrictions as well: not just --max-container-depth. For
>> instance, excluding external commands like readelf and objdump that
>> you know to be slow.
>
> yes, that's a good idea and IMO should be automatically implied for the
> 2nd pass or round or try.

It's definitely a "good idea" in the sense that I can definitely see
someone wanting to achieve that as an end result :)

Yet… upon thinking about it a bit, I don't think it is a good idea at
all for diffoscope to grow a bunch of new options or hardcoded
defaults for a second run. What (1) and (2) show here is that as soon
as a user would like to adjust these second pass options in any way,
then the whole interface becomes very unwieldy. Not only that, but
from the user's point of view it's neither flexible nor transparent as
well, especially when compared to "just" running diffoscope twice with
different options. There's no "magic" there, if you see what I mean.

Can we implement running diffoscope twice on tests.r-b.org manually
first and see how that goes? I'm not 100% against the idea of
implementing this in diffoscope eventually, but it would make a lot of
sense to try out the "manual" version first and gain some real-world
experience.
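
A minimal sketch of what the "manual" two-pass version could look like in a
wrapper script, assuming Python on the CI side. The function names and
constants are hypothetical; only the --max-container-depth and --html options
and the 2h and "3" values come from this thread. diffoscope exits non-zero
when it finds differences, so the sketch deliberately does not treat a
non-zero exit code as a failure.

```python
import subprocess

FULL_TIMEOUT = 2 * 60 * 60  # outer timeout in seconds (2h, as in the bug report)
SHALLOW_DEPTH = 3           # depth suggested in the bug report


def build_command(a, b, html_path, extra=()):
    """Return the diffoscope argv for one pass (hypothetical helper)."""
    return ["diffoscope", *extra, "--html", html_path, a, b]


def two_pass(a, b, html_path, run=subprocess.run,
             full_timeout=FULL_TIMEOUT, shallow_timeout=FULL_TIMEOUT):
    """Try a full comparison; if it times out, retry with a shallow depth.

    Returns "full" or "shallow" depending on which pass finished.
    A timeout in the second pass propagates to the caller.
    """
    try:
        run(build_command(a, b, html_path), timeout=full_timeout)
        return "full"
    except subprocess.TimeoutExpired:
        shallow = ["--max-container-depth", str(SHALLOW_DEPTH)]
        run(build_command(a, b, html_path, shallow), timeout=shallow_timeout)
        return "shallow"
```

Because the second pass is just another command line, any further
restriction for it is a change to the `shallow` list rather than a new
diffoscope option.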


Regards,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 💠
⬊   ⬋
  o

___
Reproducible-builds mailing list
Reproducible-builds@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/reproducible-builds


Bug#1068890: diffoscope: --hard-timeout option

2024-04-18 Thread Chris Lamb
Vagrant Cascadian wrote:

> On 2024-04-16, Chris Lamb wrote:
>> However, I think this first iteration of --hard-timeout time has a few
>> things that would need ironing out first, and potentially make it not
>> worth implementing:
>>
>> (1) You suggest it should start again with "--max-container-depth 3",
>> but it would surely need some syntax (or another option?) to control
>> that "3" (but for the second time only).
>
> What about going the other direction ... starting with a very small
> value for max-container-depth, and incrementally increasing it,
> generating a report (or at least storing sufficient data to generate
> one) in between each increment, so you always get some information, but
> essentially incrementally increase the resolution?
>
> Or would that approach just be too inefficient?

This is probably a separate request best suited to another issue at
this point, but I do like the idea of being able to incrementally
increase the resolution over time. Depending on how it worked in
practice, there should not be significant overhead in managing this
if, say, the commands that could not be run "in time" would have token
placeholders internally that rendered to text in the output rather
than non-trivial/expensive binary diffs.

On the negative side though, I think this would still require a robust
way of killing long-running processes as outlined previously. But
moreover it would require a HUGE reworking of how diffoscope handles
containers and recurses into nested structures in its tree-like style.
Indeed, thinking about it, this change would pretty much be exactly
the same work needed to make diffoscope run in parallel (!) which
hopefully communicates both the scope of the changes that would be
needed to achieve this, and that making diffoscope run in parallel
also has other benefits. Anyway, mini brain dump over.
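
For comparison, here is a rough sketch of how the incremental idea could be
approximated from outside diffoscope, without the internal rework described
above. It is only an approximation: every pass starts from scratch, so
shallower work is repeated, which is exactly the inefficiency being asked
about. The function name, the per-depth report naming and the `max_depth`
default are all hypothetical.

```python
import subprocess
import time


def incremental_reports(a, b, html_path, budget, max_depth=10,
                        run=subprocess.run, clock=time.monotonic):
    """Re-run diffoscope with depth 1, 2, ... until the time budget is spent.

    Returns the deepest depth whose run completed, or 0 if none did.
    Each depth writes its own report so that an interrupted run cannot
    clobber the last complete one.
    """
    deadline = clock() + budget
    completed = 0
    for depth in range(1, max_depth + 1):
        remaining = deadline - clock()
        if remaining <= 0:
            break
        cmd = ["diffoscope", "--max-container-depth", str(depth),
               "--html", f"{html_path}.depth{depth}", a, b]
        try:
            run(cmd, timeout=remaining)
        except subprocess.TimeoutExpired:
            break
        completed = depth
    return completed
```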


Regards,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 💠
⬊   ⬋
  o



Bug#1068890: marked as done (diffoscope: --hard-timeout option)

2024-04-18 Thread Debian Bug Tracking System
Your message dated Thu, 18 Apr 2024 13:09:44 +
with message-id 
and subject line Re: Bug#1068890: diffoscope: --hard-timeout option
has caused the Debian Bug report #1068890,
regarding diffoscope: --hard-timeout option
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
1068890: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1068890
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: diffoscope
Version: 264
Severity: wishlist

Dear Maintainer,

currently diffoscope has a --timeout option

   --timeout SECONDS
      Best-effort attempt at a global timeout in seconds. If enabled,
      diffoscope will not recurse into any further sub-archives
      after X seconds of total execution time.  (default: no timeout)
      [experimental]

however this doesn't give any guarantees about how long diffoscope will be
running, so so far we haven't used it for the RB CI tests, mostly because I'm
not sure what would be a good inner timeout (=for diffoscope) and what would be
a good outer timeout (=for killing diffoscope from the outside no matter what).

Currently we use 2h as outer timeout, but have no inner timeout. Maybe we should
use --timeout 1h?
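
A minimal sketch of combining the two timeouts, assuming a POSIX system and a
Python wrapper: the inner timeout is passed to diffoscope via --timeout, and
the outer one kills the whole process group so that external helper commands
started by diffoscope do not linger. The function names are hypothetical, and
1h/2h are just the values floated above, not recommendations.

```python
import os
import signal
import subprocess


def diffoscope_command(a, b, inner_timeout):
    """Build the argv with the best-effort inner timeout (in seconds)."""
    return ["diffoscope", "--timeout", str(inner_timeout), a, b]


def run_with_hard_kill(cmd, outer_timeout):
    """Run cmd in its own session; kill its process group on timeout.

    Returns the exit code, or None if the outer timeout fired.
    """
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=outer_timeout)
    except subprocess.TimeoutExpired:
        try:
            # The child is a session leader, so its pgid equals its pid.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited just before we tried to kill it
        proc.wait()
        return None
```

Usage would be along the lines of
`run_with_hard_kill(diffoscope_command("a.deb", "b.deb", 3600), 7200)`.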

Anyhow, about my --hard-timeout option idea:

my idea of "--hard-timeout $time" is that diffoscope terminates itself after
$time, no matter what *and* then re-starts itself with "--max-container-depth 3"
(or whatever is useful to get a glimpse of what files in a Debian package
are different) (probably also with another hard timeout set...) so as to
guarantee to always produce meaningful output (especially html output if
specified with --html).

What do you think?

Else we could also extend the current code for tests.r-b.o/debian, which
currently just kills diffoscope after 2h, to then run
diffoscope --max-container-depth 3 :)

https://tests.reproducible-builds.org/debian/index_breakages.html lists
251 pkg/suite/arch combinations where diffoscope runs into a timeout...


& many thanks for rocking diffoscope airlines..! \o/

-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Bottled water companies don't produce water, they produce plastic bottles.


--- End Message ---
--- Begin Message ---
On Thu, Apr 18, 2024 at 12:31:59PM +0100, Chris Lamb wrote:
> It's definitely a "good idea" in the sense that I can definitely see
> someone wanting to achieve that as an end result :)

heh
 
> Yet… upon thinking about it a bit, I don't think it is a good idea at
> all for diffoscope to grow a bunch of new options or hardcoded
> defaults for a second run. What (1) and (2) show here is that as soon
> as a user would like to adjust these second pass options in any way,
> then the whole interface becomes very unwieldy. Not only that, but
> from the user's point of view it's neither flexible nor transparent as
> well, especially when compared to "just" running diffoscope twice with
> different options. There's no "magic" there, if you see what I mean.

right, you convinced me, thank you!

> Can we implement running diffoscope twice on tests.r-b.org manually
> first and see how that goes?

sure!

> I'm not 100% against the idea of
> implementing this in diffoscope eventually, but it would make a lot of
> sense to try out the "manual" version first and gain some real-world
> experience.

As you will have noted I'm closing this bug with this email, because based
on what you wrote further above, I'm (now) very sceptical that this option
can be done sensibly in diffoscope for all...


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

There are many ways to kill. You can stab someone in the guts, take their bread
away,  not heal someone from disease,  put someone in a bad living space,  work
someone to death,  drive them to suicide, lead someone to war etc.  Only few of
these are prohibited in our state." (Bertolt Brecht)


--- End Message ---