It's June 2014 and this bug still hits the latest LTS release, Trusty
Tahr.

# Why we are here

Some people say "fix the noise" and others say "it's not noise".
What's needed is to fix the noise and keep the signal.

Has anyone used Kevin's patches (see comments #34 and #35)?

My guess is that many people have made local changes to their systems
and not bothered to follow up afterwards, which is why the bug still
occurs in the latest LTS release.

I'm patching my system now but it's important to fix this.


# Why this is important

Reason 1: Noise in mailboxes is a more serious issue than it appears.
It's a fact that each uninformative message received from a system makes
it more probable that important messages will get missed.

Reason 2: The current situation produces the same message in normal
operation (anacron instances overlap) and in a dangerous situation
(security updates not applied for extended periods of time, see comment
#31).

Please raise this bug's "importance".


To have the bug fixed once and for all, let's clear up the situation.


# Why this is a bug in anacron and not only apt or whatever.

Normal operations:

* (1) Long job duration in normal operation.
* (2) Asynchronous uncontrollable job start times making overlaps happen in 
normal operation (e.g. daily).
* (3) Overlaps are reported by e-mail.

Any issue arising only from (1), (2) or (3) is a bug in anacron.

Abnormal operations:

* (4) Some jobs get stuck forever (abnormal operation).
* (5) Stuck jobs prevent other jobs from running for a possibly unlimited time.

Any issue arising from (4) or (5) may be an anacron bug or a wish to make
anacron more robust, just as we generally expect our system to
robustly stop buggy programs without crashing the computer.

## Normal operations

(1) Some jobs are designed to wait for a long time (up to half an hour), even
when everything is fine, for example /etc/cron.daily/apt. Some people don't
see this because of their configuration. My fresh 14.04 Trusty has the package
update-notifier-common installed, which seems sufficient to trigger a sleep of
up to half an hour every time that job is invoked. This is normal operation.

(2) anacron is set up to run on several occasions to minimize delay. For
example, on top of running daily at 07:30, it also runs at boot and on
resume from suspend. Nothing prevents a boot or resume minutes before 07:30,
an interval shorter than the duration of normal jobs.

(3) (nothing more to say)

(1)+(2) makes overlaps part of normal operation.
(1)+(2)+(3) makes *noise* in mailbox about overlaps.

(1)+(2)+(3) makes this bug an *actual anacron bug*.
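
To make the arithmetic behind (1)+(2) concrete, here is a rough sketch in
Python. It is not anacron code: the function name and the 07:20 boot time are
made up for illustration, and only the 07:30 schedule and the half-hour sleep
come from the description above.

```python
from datetime import datetime, timedelta

# Figures taken from the description above: a job may legitimately
# sleep for up to half an hour, and the daily run is at 07:30.
MAX_NORMAL_JOB_DURATION = timedelta(minutes=30)


def overlaps(first_start: datetime, second_start: datetime,
             job_duration: timedelta) -> bool:
    """Return True if a second start happens while the first run is still active."""
    return first_start <= second_start < first_start + job_duration


boot = datetime(2014, 6, 1, 7, 20)       # hypothetical boot or resume time
scheduled = datetime(2014, 6, 1, 7, 30)  # daily scheduled start

print(overlaps(boot, scheduled, MAX_NORMAL_JOB_DURATION))  # True
```

Any boot or resume in the half hour before 07:30 can give the same result,
with nothing wrong on the machine.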

## Abnormal operations

(4) Some jobs break and get stuck forever. This should not happen on ordinary
users' machines. Sysadmins who care a little wish to be notified.
Heavyweight sysadmins already have other ways to get notified and/or get jobs
killed automatically.

(5) Prevented jobs may include security updates, which makes it a serious issue
(see comment #31).

(1)+(3)+(4) makes *signal* in mailbox about stuck jobs, but it looks like
noise.
(5) prevented jobs make the whole issue serious.


# What to do?

Now we know where the anacron bug is and where the feature wish is.

## Raise bug importance

Noise in mailboxes is a more serious issue than it appears.  It's a fact
that each uninformative message received from a system makes it more
probable that important messages will get missed.  Plus the current
situation produces the same message in harmless and dangerous situations.

For these reasons, I request that the importance of this bug be raised.

## Fix anacron bug: disable noise mail that reports overlap because it's
really noise.

Actually (3) reports overlaps, not stuck jobs.  On a server running all
the time and never rebooting, it effectively only reports stuck jobs, but
Ubuntu is not only for servers that never reboot.  Turning on or resuming
your system minutes before a scheduled run is normal operation for many
computers.  On such machines there's no way (3) can be reliably used to
detect stuck jobs, yet it makes noise.

Kevin's patch (comment #34) fixes (3). No more noise in mailbox, but no
more signal either.

## Grant anacron's wish: get mail about *stuck jobs* (not overlap)
because that's what sysadmins really need.

Kevin's patch (comment #35) actually reports stuck jobs because it can
measure the duration and react to it.

It is important because it makes an explicit signal saying that there's
a stuck job, not some dull "already started" noise.
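
To illustrate the difference between the two kinds of message, here is a
minimal sketch in Python. It is not Kevin's patch and not anacron code: the
function name, the two-hour threshold and the message wording are all
assumptions of mine. The only idea borrowed from comment #35 is that the
decision is based on how long the previous instance has been running.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical threshold: an instance still running after this long
# is treated as stuck rather than as an ordinary overlap.
STUCK_THRESHOLD = timedelta(hours=2)


def message_for_running_instance(
        running_for: timedelta,
        threshold: timedelta = STUCK_THRESHOLD) -> Optional[str]:
    """Decide what, if anything, to mail when a previous instance is still running."""
    if running_for < threshold:
        return None  # ordinary overlap: stay silent, no noise
    hours = running_for.total_seconds() / 3600
    return f"anacron job appears stuck: running for {hours:.1f} hours"
```

With this kind of rule, a ten-minute overlap produces no mail at all, while an
instance that has been running for a day produces a message that explicitly
says "stuck".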

## Grant another anacron wish: be more resilient to stuck jobs

Find a clean solution to ensure that other jobs are not just ignored
when one job gets stuck.
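
One possible direction, sketched in Python only to show the idea: give each
job a time limit so that a stuck one is killed and the next ones still run.
This is not how anacron works today and not a proposed patch; the function
name and the timeout are hypothetical. It relies on the standard
subprocess.run, which kills the child process when its timeout expires
(processes spawned by that child are not covered by this sketch).

```python
import subprocess
from typing import List, Sequence


def run_jobs(jobs: Sequence[List[str]], timeout_seconds: float) -> List[str]:
    """Run each command in turn and return one status string per job."""
    statuses = []
    for command in jobs:
        try:
            completed = subprocess.run(command, timeout=timeout_seconds)
            statuses.append(f"exit {completed.returncode}")
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the child at this point,
            # so the loop simply moves on to the next job.
            statuses.append("timed out")
    return statuses
```

A real fix would also have to decide what a sensible limit is for jobs that
sleep on purpose, such as the one described in (1).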


Thank you for your attention and for any comment.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/606491

Title:
  start: Job is already running: anacron
