Re: RFCv2: debuginfod debian archive support

2019-12-22 Thread Mark Wielaard
Hi Frank,

On Fri, 2019-12-13 at 14:25 -0500, Frank Ch. Eigler wrote:
> > I see, I missed that both functions are only called after first
> > checking the archive type. I think it might be helpful/clearer if
> > both methods would be called with the intended archive type then,
> > also because that might make it simpler to...
> 
> The archive subtype (rpm vs deb) is not stored in the database, as
> this would break the current schema, and is really not needed.  In any
> case, it's easily recovered for purpose of separate metrics.  Following
> sur-patch added to fche/debuginfod-deb branch:

I think this is ready to be squashed and put on master. I had hoped our
Debian/Ubuntu friends would speak up. But they had their chance and
will now have to live with the code as is :)

Cheers,

Mark


Re: rfc/patch: debuginfod client $DEBUGINFOD_PROGRESS env var

2019-12-22 Thread Mark Wielaard
Hi Frank,

On Wed, 2019-12-18 at 19:47 -0500, Frank Ch. Eigler wrote:
> [...]
> > I would add something like:
> > 
> >   /* Make sure there is at least some progress,
> >  try to get at least 1K per progress timeout seconds.  */
> >   curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, 5 * 1024L);
> >   curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, progress_timeout);
> > 
> > The idea being that if we didn't at least get 1K per 5 seconds then the
> > connection is just so bad that it doesn't make sense to wait for it to
> > finish, since that will most likely be forever (or feel like it for the
> > user).

Note that the comment and the pseudo code are off. That "5 *" shouldn't
be there in the code.
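For reference, curl's CURLOPT_LOW_SPEED_LIMIT takes a rate in bytes/sec
and CURLOPT_LOW_SPEED_TIME a duration in seconds, so the two values in
the quoted snippet also look transposed. A self-contained sketch of the
abort rule those options implement (illustrative names, not actual
debuginfod code):

```c
#include <stddef.h>

/* The corrected curl setup would pair the options like this:
 *   curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, 1024L);
 *   curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, progress_timeout);
 * The function below mimics that rule: give up once the observed rate
 * has stayed below `limit` bytes/sec for `window` consecutive seconds.
 * `rates[i]` is the number of bytes received in second i, newest last. */
static int should_abort(const long *rates, size_t n, long limit, long window)
{
  if ((long) n < window)
    return 0;                 /* not enough history yet */
  for (size_t i = n - (size_t) window; i < n; i++)
    if (rates[i] >= limit)
      return 0;               /* at least one second was fast enough */
  return 1;                   /* stalled for the whole window: give up */
}
```

With limit = 1024 and window = progress_timeout this matches the intent
of the quoted comment: at least 1K per progress-timeout seconds.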

> The problem with that is that, for a large download such as a kernel,
> it can take almost a minute to just decompress the kernel-debuginfo
> rpm far enough to start streaming the vmlinux file.  (In the presence
> of caching somewhere in the http proxy tree, it gets much better the
> second+ time.)  So any small default would be counterproductive to
> e.g. systemtap users: they'd be forced to override this for basic
> usage.

I can see how 5 seconds might be too low in such a case. But a whole
minute surprises me. But... indeed I tried it myself and although it
wasn't a whole minute, it does seem to take more than 40 seconds. Most
of it is spent in rpm2cpio (I even tried a python implementation to
compare). There is of course some i/o delay involved. But the majority
of the time is cpu bound because the file needs to be decompressed.

Not that it would help us now, but I wonder if it would be worth it to
look at parallel compression/decompression to speed things up.

So 5 seconds of no progress seems too low. But I still don't like
infinite as the default. It seems unreasonable to let the user just
wait indefinitely when the connection seems stuck. How about requiring
at least 450K in 90 seconds as the default? I am picking 90 seconds
because that seems to be twice the worst-case time to decompress, which
leaves about 45 seconds to provide ~10K/sec. But if you are seeing 60
seconds as the worst case we could pick something like 120 seconds
instead.
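As a back-of-envelope check on those numbers (the ~45 second decompress
time and ~10K/sec rate are estimates from this thread, not measured
defaults):

```c
/* Proposed default: a 90 s progress window, of which the first ~45 s
 * may be eaten by server-side rpm decompression, leaving ~45 s in
 * which we expect at least ~10 KiB/sec of actual data to arrive. */
static long min_progress_bytes(long window_sec, long decompress_sec,
                               long rate_bytes_per_sec)
{
  return (window_sec - decompress_sec) * rate_bytes_per_sec;
}
/* min_progress_bytes(90, 45, 10 * 1024) comes out to 450 KiB;
 * with a 60 s worst case and a 120 s window it would be 600 KiB. */
```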

But it should probably be a separate timeout from the connection
timeout, and maybe from the total timeout (or maybe replace it?). What
do you think?

Cheers,

Mark


Re: oss-fuzz

2019-12-22 Thread Mark Wielaard
Hi Berkeley,

On Fri, 2019-12-20 at 17:21 +0200, Berkeley Churchill wrote:
> Any interest in integrating with oss-fuzz?  It's a google project
> that supports open source projects by fuzzing. It allows Google to
> find and report bugs, especially security bugs, to the project.
> I'm willing to work on writing fuzzers and performing the integration,
> if this would be welcome by the maintainers.   Thoughts?

Certainly interested. I have been running afl-fuzz on various utilities
and test cases. That has found lots of issues. But it isn't very
structured. And it often needs to go through a completely valid ELF
file before fuzzing the more interesting data structures inside it.

The only request I would have is that if the fuzzer targets are added
to elfutils itself then they should also be made to work locally. So
someone could also use them with e.g. afl-fuzz or some other fuzzing
framework, or simply as extra testcases.
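One common way to satisfy both worlds is a libFuzzer-style entry point
plus an optional standalone driver. The parse function below is a toy
stand-in (a real target would hand the buffer to a libelf entry point
such as elf_memory and walk the result), so the whole sketch is
illustrative:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for a real elfutils entry point: just checks the ELF
 * magic bytes.  A real fuzz target would call into libelf instead. */
static int parse_input(const uint8_t *data, size_t size)
{
  return size >= 4 && data[0] == 0x7f && data[1] == 'E'
         && data[2] == 'L' && data[3] == 'F';
}

/* libFuzzer/oss-fuzz entry point. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
  parse_input(data, size);
  return 0;
}

#ifdef STANDALONE_DRIVER
/* Standalone driver so the same target can be fed corpus files by
 * afl-fuzz or run directly as an extra test case:
 *   ./fuzz-target testfile1 testfile2 ... */
int main(int argc, char **argv)
{
  static uint8_t buf[1 << 20];
  for (int i = 1; i < argc; i++) {
    FILE *f = fopen(argv[i], "rb");
    if (f == NULL)
      continue;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    LLVMFuzzerTestOneInput(buf, n);
  }
  return 0;
}
#endif
```

Building once with libFuzzer and once with -DSTANDALONE_DRIVER gives
the same target in both environments.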

Please also see:
https://sourceware.org/git/?p=elfutils.git;f=CONTRIBUTING;hb=HEAD

Cheers,

Mark


Re: rfc/patch: debuginfod client $DEBUGINFOD_PROGRESS env var

2019-12-22 Thread Frank Ch. Eigler
Hi -


> There is of course some i/o delay involved. But the majority of the
> time is cpu bound because the file needs to be decompressed.
> Not that it would help us now, but I wonder if it would be worth it to
> look at parallel compression/decompression to speed things up.

Yeah, maybe.

> picking 90 seconds because that seems to be twice the worst-case time
> to decompress, which leaves about 45 seconds to provide ~10K/sec. But
> if you are seeing 60 seconds as the worst case we could pick something
> like 120 seconds instead.

That's a possibility.

> But it should probably be a separate timeout from the connection
> timeout, and maybe from the total timeout (or maybe replace
> it?). What do you think?

Yeah, a connection timeout per se is probably not really worth having.
A URL with an unresolvable host will fail immediately.  A reachable
http server that is fairly busy will connect, just take time.  The
only common cases a connection timeout would catch are a running http
server that is so overloaded that it can't even service its accept(2)
backlog, or a nonexistent one that has been tarpit/firewalled.  A
minimal progress timeout can subsume those cases too.

OTOH, it's worth noting that these requests only take this kind of
time if they are being seriously serviced, i.e., "they are worth it".
Error cases fail relatively quickly.  It's the success cases - and
these huge vmlinux files - that take time.  And once the data starts
flowing - at all - the rest will follow as rapidly as the network
allows.

That suggests one timeout could be sufficient - the progress timeout,
the one you found - just not set too short.


- FChE



Re: oss-fuzz

2019-12-22 Thread Berkeley Churchill
Great, thanks for the feedback!

One of my first tasks will be to support llvm/clang builds.  I've seen some
prior discussion on what's needed for that, but if you have any extra tips
I'll take them.  I'll be sure to create a build target for the fuzzers so
they can be run standalone.

Berkeley

On Mon, Dec 23, 2019 at 3:12 AM Mark Wielaard  wrote:

> Hi Berkeley,
>
> On Fri, 2019-12-20 at 17:21 +0200, Berkeley Churchill wrote:
> > Any interest in integrating with oss-fuzz?  It's a google project
> > that supports open source projects by fuzzing. It allows Google to
> > find and report bugs, especially security bugs, to the project.
> > I'm willing to work on writing fuzzers and performing the integration,
> > if this would be welcome by the maintainers.   Thoughts?
>
> Certainly interested. I have been running afl-fuzz on various utilities
> and test cases. That has found lots of issues. But it isn't very
> structured. And it often needs to go through a completely valid ELF
> file before fuzzing the more interesting data structures inside it.
>
> The only request I would have is that if the fuzzer targets are added
> to elfutils itself then they should also be made to work locally. So
> someone could also use them with e.g. afl-fuzz or some other fuzzing
> framework, or simply as extra testcase.
>
> Please also see:
> https://sourceware.org/git/?p=elfutils.git;f=CONTRIBUTING;hb=HEAD
>
> Cheers,
>
> Mark
>