On Sun, Nov 05, 2006 at 10:41:50AM +0100, Ed Schofield was heard to say:
> Update: I can run aptitude fine under valgrind. It prints out lots of
> messages describing (real or imagined) memory errors, but it runs
> without segfaulting. Running the same command without valgrind produces
> a segfault every time. Valgrind's memcheck tool traps every single
> memory access and performs it synthetically. It also uses much _more_
> memory than the program would alone. So I think we can conclude that
> this segfault is not just due to my machine running out of RAM, but due
> to old-fashioned heap corruption.
>
> I'll attach the valgrind log file below. Perhaps the lines
>
> ==1444== Mismatched free() / delete / delete []
> ==1444== at 0x401CCBC: operator delete(void*) (vg_replace_malloc.c:244)
> ==1444== by 0x82708E2: reset_surrounding_or_memoization() (apt.cc:89)
>
> indicate the problem??
Might be.
> ==1444== Conditional jump or move depends on uninitialised value(s)
> ==1444== at 0x826D942:
> aptitudeDepCache::build_selection_list(OpProgress&, bool, bool, char
> const*) (aptcache.cc:349)
> ==1444== by 0x826E71E: aptitudeDepCache::Init(OpProgress*, bool,
> bool, char const*) (aptcache.cc:193)
> ==1444== by 0x826E94F: aptitudeCacheFile::Open(OpProgress&, bool,
> bool, char const*) (aptcache.cc:1638)
> ==1444== by 0x8272E1C: apt_load_cache(OpProgress*, bool, char const*)
> (apt.cc:295)
> ==1444== by 0x81D8022: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:37)
> ==1444== by 0x80E6C08: main (main.cc:480)
This looks like it might be a real bug; I'm not sure how it's lasted
so long. Apparently one of aptitude's internal state parameters, the
one recording the current selection state of a package, is being left
uninitialized when the program starts up. I think that initializing it
to "Unknown" is probably right here.
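
To make the failure mode concrete, here is a minimal sketch of the pattern valgrind is complaining about. The names (PackageState, SelectionState, init_states, is_selected_for_install) are hypothetical stand-ins chosen for illustration, not aptitude's real types. An array obtained from new[] holds indeterminate values until each field is assigned, so any branch on a field the init loop skipped is a "conditional jump depends on uninitialised value".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a package selection state; not apt's real enum.
enum SelectionState { Unknown, Install, Hold, DeInstall, Purge };

// Hypothetical stand-in for a per-package state record.
struct PackageState {
    bool reinstall;
    SelectionState selection_state;
};

// Give every field of every record an explicit value. Leaving
// selection_state out of this loop would keep it indeterminate,
// because new PackageState[n] does not zero its elements.
void init_states(PackageState* states, std::size_t count) {
    assert(states != NULL || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        states[i].reinstall = false;
        states[i].selection_state = Unknown;
    }
}

// A later branch of the kind that would trip valgrind if the field
// had never been written.
bool is_selected_for_install(const PackageState& state) {
    return state.selection_state == Install;
}
```

After calling init_states on a freshly allocated array, every record reads back as Unknown, so is_selected_for_install has a well-defined answer for each of them.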
> ==1444== Conditional jump or move depends on uninitialised value(s)
> ==1444== at 0x4069187: pkgTagSection::Scan(char const*, unsigned
> long) (in /usr/lib/libapt-pkg-libc6.3-6.so.3.11.0)
> ==1444== by 0x82C1075: insert_tags(pkgCache::VerIterator const&,
> pkgCache::VerFileIterator const&) (tags.cc:164)
> ==1444== by 0x82C1A6E: load_tags(OpProgress&) (tags.cc:221)
> ==1444== by 0x8272FB3: apt_load_cache(OpProgress*, bool, char const*)
> (apt.cc:331)
> ==1444== by 0x81D8022: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:37)
> ==1444== by 0x80E6C08: main (main.cc:480)
I'm not sure where this comes from. It looks to me like the values
that should influence Scan's behavior are all either initialized by
aptitude or generated by apt routines. I know that I've noticed
valgrind apparently being confused by references into the apt cache in
the past; maybe that's what this is.
> ==1444== Mismatched free() / delete / delete []
> ==1444== at 0x401CCBC: operator delete(void*) (vg_replace_malloc.c:244)
> ==1444== by 0x82708E2: reset_surrounding_or_memoization() (apt.cc:89)
> ==1444== by 0x8270EB1: apt_close_cache() (signal.h:544)
> ==1444== by 0x828054E:
> download_install_manager::finish(pkgAcquire::RunResult, OpProgress&)
> (download_install_manager.cc:179)
> ==1444== by 0x81D98CA: cmdline_do_download(download_manager*)
> (cmdline_util.cc:185)
> ==1444== by 0x81D8474: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:110)
> ==1444== by 0x80E6C08: main (main.cc:480)
> ==1444== Address 0x6F67028 is 0 bytes inside a block of size 616,560
> alloc'd
> ==1444== at 0x401D7C1: operator new[](unsigned) (vg_replace_malloc.c:195)
> ==1444== by 0x8271CD7: surrounding_or(pkgCache::DepIterator,
> pkgCache::DepIterator&, pkgCache::DepIterator&, pkgCache*) (apt.cc:479)
> ==1444== by 0x8272489: package_recommended(pkgCache::PkgIterator
> const&) (apt.cc:570)
> ==1444== by 0x81C80D5: cmdline_show_preview(bool,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&, bool, bool, bool, int)
> (cmdline_prompt.cc:493)
> ==1444== by 0x81C873D: cmdline_do_prompt(bool,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&,
> std::set<pkgCache::PkgIterator, std::less<pkgCache::PkgIterator>,
> std::allocator<pkgCache::PkgIterator> >&, bool, bool, bool, bool, int,
> bool, bool) (cmdline_prompt.cc:736)
> ==1444== by 0x81D8434: cmdline_upgrade(int, char**, char const*,
> bool, bool, bool, bool, bool, bool, bool, bool, bool, int)
> (cmdline_upgrade.cc:99)
> ==1444== by 0x80E6C08: main (main.cc:480)
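That looks like a definite aptitude bug: the block is allocated with
operator new[] in surrounding_or() but freed with plain operator delete
in reset_surrounding_or_memoization(). I don't know if it's causing
your crash, though.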
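
The rule behind that valgrind message is that memory obtained with new[] has to be released with delete[]; pairing it with plain delete is undefined behavior. A minimal sketch of the correct pairing follows, using a hypothetical memoization table (cached_table, build_table, and reset_table are illustrative names, not aptitude's code).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical memoization table, allocated lazily as an array.
int* cached_table = NULL;
std::size_t cached_size = 0;

// Allocate (or reallocate) the table with the array form of new.
void build_table(std::size_t n) {
    assert(n > 0);
    delete[] cached_table;        // delete[] on NULL is a no-op
    cached_table = new int[n]();  // the trailing () zero-initializes
    cached_size = n;
}

// Release the table. Because it came from new[], it must go back
// through delete[]; plain delete here is what valgrind reports as
// "Mismatched free() / delete / delete []".
void reset_table() {
    delete[] cached_table;
    cached_table = NULL;
    cached_size = 0;
}
```

Calling reset_table twice in a row is harmless, since the pointer is set back to NULL after the first release.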
Could you see what happens if you apply the attached patch?
Thanks,
Daniel
diff -rN -u old-head/src/generic/apt/aptcache.cc new-head/src/generic/apt/aptcache.cc
--- old-head/src/generic/apt/aptcache.cc 2006-11-07 17:46:59.000000000 -0800
+++ new-head/src/generic/apt/aptcache.cc 2006-11-07 17:46:59.000000000 -0800
@@ -226,6 +226,7 @@
package_states[i].reinstall=false;
package_states[i].install_reason=manual;
package_states[i].remove_reason=manual;
+ package_states[i].selection_state = pkgCache::State::Unknown;
}
if(WithLock && lock==-1)
diff -rN -u old-head/src/generic/apt/apt.cc new-head/src/generic/apt/apt.cc
--- old-head/src/generic/apt/apt.cc 2006-11-07 17:46:59.000000000 -0800
+++ new-head/src/generic/apt/apt.cc 2006-11-07 17:46:59.000000000 -0800
@@ -80,13 +80,13 @@
static void reset_interesting_dep_memoization()
{
- delete cached_deps_interesting;
+ delete[] cached_deps_interesting;
cached_deps_interesting = NULL;
}
static void reset_surrounding_or_memoization()
{
- delete cached_surrounding_or;
+ delete[] cached_surrounding_or;
cached_surrounding_or = NULL;
}