On Fri, Feb 27, 2026 at 5:12 PM Tomasz Kaminski <[email protected]> wrote:

>
>
> On Fri, Feb 27, 2026 at 4:48 PM Jonathan Wakely <[email protected]>
> wrote:
>
>> On Wed, 07 Jan 2026 at 15:30 +0100, Tomasz Kamiński wrote:
>> >The warning was produced by following seqeunce, given an
>> istream_iterator<char>
>> >it, such that *it will result in hitting EoF in it->_M_get(), and thus
>> clearing
>> >_M_sbuf, the subsequent call to ++it, will result in _M_sbuf->sbumpc()
>> call
>> >on null pointer dereference. This is however an false-positive, as in
>> such sitution
>> >it == istream_iteator() returns true, and the iterator should not be
>> dereferenced
>> >in first place.
>> >
>> >This patch addresses the above by clearing the _M_sbuf in operator++,
>> instead of
>> >_M_get(). This removes the need for making _M_sbuf mutable, and thus
>> make the
>> >implementation conforming with regards to C++11 [res.on.data.races] p3.
>> >
>> >This change should not or positive have performance impact on the usual
>> iteration
>>
>> I think this should be:
>>
>> "This change should have zero or positive performance impact ..."
>>
>> >patterns, in form:
>> > while (it != end) { process(*it); ++it; }
>> >In case when it is end-of-stream iterator, the it != end returns in one
>> call of
>> >_M_sbuf->sgetc() both before and after the change. However we do not
>> modify _M_sbuf
>> >in this case. For non-empty range, we replace call to _M_sbuf->sbumpc()
>> with
>> >_M_sbuf->snextc() in pre-increment, and extract the check again from *it
>> to
>> >++it. However, as _M_sbuf is now cleared during increment, so last it !=
>> end
>> >check avoids _M_sbuf->sgetc() call to check against EoF.
>> >
>> >However, this change impact the behavior of the post-increment (*it++),
>> as
>> >we now load both current character (for return value) and next character
>> (to
>> >check against EoF). In consequence we call both sgetc() and snextc(), in
>> contrast
>> >to previous single sbumpc() call.
>> >
>> >       PR libstdc++/105580
>> >
>> >libstdc++-v3/ChangeLog:
>> >
>> >       * include/bits/streambuf_iterator.h
>> (istreambuf_iterator::_M_sbuf):
>> >       Remove mutable and adjust whitespace.
>> >       (istreambuf_iterator::_M_c): Adjust whitespace.
>> >       (istreambuf_iterator::operator++()): Clear _M_sbuf if next
>> character
>> >       is EoF.
>> >       (istreambuf_iterator::operator++(int)): Use _M_sbuf->sgetc() to
>> >       load current character, and define in terms of __*this.
>>
>> This should be ++ not __
>>
>> >       (istreambuf_iterator::_M_get()): Do not clear _M_sbuf in case of
>> EoF.
>> >       * testsuite/24_iterators/istreambuf_iterator/2.cc: Test for using
>> >       multiple iterators to same rdbuf.
>> >---
>> >This warning turned out to be false-positive (as explained in the commit
>> >decription), so we can just add add pragrams to address that. But this
>> >change seem to be beneficial anyway, as we are no longer causing
>> >data-races.
>> >
>> >Tested on x86_64-linux. Locally tested with -Wnull-dereference,
>> >and the istream_iterator warning in 24_iterators/istreambuf_iterator/2.cc
>> >disappears (there are some other warning present I haven't looked into
>> >that yet).
>>
>> Can we add a new testsuite/24_iterators/istreambuf_iterator/105580.cc
>> using exactly the testcase from the bug, with -Wnull-dereference in
>> the dg-options? I don't think the additions to 2.cc cover it. I'd like
>> us to have a test that gives the warning today and doesn't warn after
>> the fix.
>>
> OK will do.
>
>>
>> > .../include/bits/streambuf_iterator.h         | 20 +++----
>> > .../24_iterators/istreambuf_iterator/2.cc     | 58 +++++++++++++++++++
>> > 2 files changed, 66 insertions(+), 12 deletions(-)
>> >
>> >diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h
>> b/libstdc++-v3/include/bits/streambuf_iterator.h
>> >index 93d3dd24f93..095928ca4d8 100644
>> >--- a/libstdc++-v3/include/bits/streambuf_iterator.h
>> >+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
>> >@@ -112,8 +112,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >       // the "end of stream" iterator value.
>> >       // NB: This implementation assumes the "end of stream" value
>> >       // is EOF, or -1.
>> >-      mutable streambuf_type* _M_sbuf;
>> >-      int_type                        _M_c;
>> >+      streambuf_type* _M_sbuf;
>>
>> I've been trying to figure out if this change (removing 'mutable')
>> could be unsafe if we mix objects compiled with old and new GCC.
>>
>> In a TU compiled by an old GCC, _M_sbuf could change by changed when
>> the iterator reaches EOF. In a TU compiled by the new GCC, could the
>> compiler optimize based on the fact that if only const member
>> functionss are used, the member can't change?
>>
>> I don't think it's a problem, because the new compiler has to assume
>> that any non-inline code could already do a const_cast on the iterator
>> and so call non-const members that way. The only way it can optimize
>> by assuming _M_sbuf never, ever changes is if it can see the original
>> definition of the iterator and it was defined as const. But in that
>> case, nobody can ever call non-const members on it anyway.
>>
> I thought about it, and the only situation when removing the mutable could
> have
> any impact, is that you declare an iterator as a const object, so the
> compiler can in
> that case now can propagate _M_sbuf value. But in that case, there is no
> way to
> mutate _M_sbuf without UB.
>
>
>>
>> So I think this is safe. I would feel more comfortable if we left the
>> 'mutable' for GCC 16 with a comment saying it's not needed now, and
>> then remove it for GCC 17 early in stage 1.
>>
> Given that the warning is FP, and is emitted from system headers, do we
> even
> need to have fix for GCC16? Maybe I should simply add pragma to ignore
> warnings
> for now, and postpone the whole endeavor for GCC17? Or just leave the
> warning as is.
> is emitted from system headers anyway.
>
I also feel uneasy about touching istream iterator now. In the beginning,
we prioritized the
change because we thought this was a true bug, so I think a better approach
now may be to
add test emmiting and pragmas disabling the warning now (with backports)
 and land the changes to GCC17.

>
>> >+      int_type        _M_c;
>> >
>> >     public:
>> >       ///  Construct end of input stream iterator.
>> >@@ -172,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >
>>  _M_message(__gnu_debug::__msg_inc_istreambuf)
>> >                               ._M_iterator(*this));
>> >
>> >-      _M_sbuf->sbumpc();
>> >+      if (_S_is_eof(_M_sbuf->snextc()))
>> >+        _M_sbuf = 0;
>>
>> Again, I'm considering old vs new compilers. What if one TU inlines
>> the old definition of operator++ which doesn't set _M_sbuf to null,
>> and makes a non-inline call to the new _M_get, which uses a definition
>> from a new TU, which doesn't set _M_sbuf either?
>>
>> I think that means we will never set _M_sbuf = 0 but we still have the
>> right behaviour, because comparing to the past-the-end iterator uses
>> _M_at_eof() which will get a character from the streambuf, which will
>> say we're at EOF. So it still works, it's just a bit slower because we
>> have to go to the streambuf every time instead of just seeing that
>> _M_sbuf is null.
>
> Again, I do not think that matters in practice, because "every time"
> would be repeated comparision against end, after we determine that end
> was reached.
>
>
>>
>> So I think this isn't an ABI change. I've been wondering whether we
>> could be sure it's safe by adding __attribute__((__always_inline__))
>> to all the increment oeprators, and everything that directly or
>> indirectly calls _M_get. But that's basically *everything*. We'd need
>> to add it to equal() and _M_at_eof() and operator== etc. etc.
>>
>> If we did that, old TUs would continue to use their own definitions
>> (via implicit instantiations) and new TUs would have force-inlined
>> new definitions. But we could still get the behaviour above, if all
>> the increments happen in the old TU and all the deferences happen in
>> the new TU. So I don't think using always_inline really solves
>> anything. And I don't think there's a real problem to solve anyway.
>>
>> >       _M_c = traits_type::eof();
>> >       return *this;
>> >       }
>> >@@ -181,14 +182,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >       istreambuf_iterator
>> >       operator++(int)
>> >       {
>> >-      __glibcxx_requires_cond(_M_sbuf &&
>> >-                              (!_S_is_eof(_M_c) ||
>> !_S_is_eof(_M_sbuf->sgetc())),
>> >-
>> _M_message(__gnu_debug::__msg_inc_istreambuf)
>> >-                              ._M_iterator(*this));
>> >-
>> >       istreambuf_iterator __old = *this;
>> >-      __old._M_c = _M_sbuf->sbumpc();
>> >-      _M_c = traits_type::eof();
>> >+      __old._M_c = _M_sbuf->sgetc();
>> >+      ++*this;
>> >       return __old;
>> >       }
>> >
>> >@@ -206,8 +202,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >       _M_get() const
>> >       {
>> >       int_type __ret = _M_c;
>> >-      if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret =
>> _M_sbuf->sgetc()))
>> >-        _M_sbuf = 0;
>> >+      if (_M_sbuf && _S_is_eof(__ret))
>>
>> Would it make sense to use __builtin_expect here?
>>
>>         if (__builtin_expect(_M_sbuf && _S_is_eof(__ret), 1))
>>
>> We should only be calling _M_get() on dereferenceable iterators, so
>> they should always have non-null _M_sbuf, and __ret is always eof
>> except for the prvalue returned from the post-increment
>> operator++(int).
>>
>> If that might pessimize process(*it++) then we could make it so the
>> prediction is only on _M_sbuf:
>>
>>         if (__builtin_expect(_M_sbuf, 1) && _S_is_eof(__ret))
>>
>> Oh no ... because we call _M_get() to check if an iterator is
>> past-the-end, which we do in operator==, so that happens on every
>> iterator of a loop like while (it != eof_it) process(*it++);
>>
>> Drat.
>>
>> Comparing to the default_sentinel doesn't have that problem, because
>> we only call _M_at_eof() for the iterator, not for both the iterator
>> and the sentinel. That's an interesting data point I'd never
>> considered before: it's better to use the default sentinel here,
>> because istreambuf_iterator does more than just compare the data
>> members of two iterators.
>>
>> We could maybe consider changing _M_at_eof() to do:
>>
>>    return _M_sbuf && _S_is_eof(_M_get());
>>
>> so that we can have the branch prediction in _M_get() and it would be
>> correct, because we would only call it for operator* and when we
>> already know _M_sbuf is not null.
>>
>> But I don't think we need to do that as part of this patch.
>>
> I tried, but I haven't seen any gains from it. What have actual benefit is
> to
> change the return type of postfix++ to proxy (as in your prototype),
> eliminate writes to _M_c, and then adjust _M_get to:
>         if (!_S_is_eof(_M_c)) [[unlikely]]
>           return _M_c;
>         return _M_sbuf->sgetc()
> The unlikely case of _M_c not being eof is someone explicitly converting
> it++ to iterator, or getting in returned from older TU, compiled with older
> standard, that returned ++it.
>
>> >+        __ret = _M_sbuf->sgetc();
>> >       return __ret;
>> >       }
>> >
>> >diff --git
>> a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> >index 78701d71cee..0318a677f35 100644
>> >--- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> >+++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> >@@ -109,8 +109,66 @@ void test02(void)
>> >     }
>> > }
>> >
>> >+void
>> >+test_empty()
>> >+{
>> >+  std::istreambuf_iterator<char> null(0), end;
>> >+  VERIFY( null == end );
>> >+
>> >+  std::istringstream ess;
>> >+  // Thie specification hear seem to indicate, that such iterators
>> >+  // are not end-of-stream iterators, as rdbuf pointer is nullptr,
>> >+  // however we treat them as such, as otherwise we would produce
>> >+  // a range containing single eof character.
>>
>> Could you mention LWG 2366 here please. I think the deviation from the
>> standard that you describe is fixing the problem identified in 2366,
>> but this patch makes the implementation simpler than what I proposed
>> there.
>>
>> >+  std::istreambuf_iterator<char> it1(ess), it2(ess);
>> >+  VERIFY( it1 == end );
>> >+  VERIFY( it2 == end );
>> >+}
>> >+
>> >+void
>> >+test_multi()
>> >+{
>> >+  // C++98 (and later) define operator* in [istreambuf.iterator.ops] as:
>> >+  // Returns: The character obtained via the streambuf member
>> sbuf_->sgetc().
>> >+  // This nails down behavior of mulptiple iterators to same sequence.
>>
>> "multiple"
>>
>> >+  std::istringstream iss("abcd");
>> >+  std::istreambuf_iterator<char> it1(iss), it2(iss), end;
>> >+
>> >+  VERIFY( it1 != end );
>> >+  VERIFY( it2 != end );
>> >+  VERIFY( *it1 == 'a' );
>> >+  VERIFY( *it2 == 'a' );
>> >+  ++it1;
>> >+
>> >+  VERIFY( it1 != end );
>> >+  VERIFY( it2 != end );
>> >+  VERIFY( *it1 == 'b' );
>> >+  VERIFY( *it2 == 'b' );
>> >+  ++it2;
>> >+
>> >+  VERIFY( it1 != end );
>> >+  VERIFY( it2 != end );
>> >+  VERIFY( *it1 == 'c' );
>> >+  VERIFY( *it2 == 'c' );
>> >+  ++it2;
>> >+
>> >+  VERIFY( it1 != end );
>> >+  VERIFY( it2 != end );
>> >+  VERIFY( *it1 == 'd' );
>> >+  VERIFY( *it2 == 'd' );
>> >+  // second dereference
>> >+  VERIFY( *it1 == 'd' );
>> >+  VERIFY( *it2 == 'd' );
>> >+  ++it1;
>> >+
>> >+  VERIFY( it1 == end );
>> >+  VERIFY( it2 == end );
>> >+}
>> >+
>> > int main()
>> > {
>> >   test02();
>> >+  test_empty();
>> >+  test_multi();
>> >   return 0;
>> > }
>> >--
>> >2.52.0
>> >
>> >
>>
>>

Reply via email to