When I run this code segment, underflow is always called. There is data in a.wav (it is an audio file). This is the output of the three prints
(which I have commented out):
Calling underflow
gptr is NULL
C = 82
Calling underflow
gptr is NULL
C = 82
......
......
Why is underflow always called? I understand the first call, since the buffer needs to be filled, but why is gptr always NULL
when C contains data and the buffer is not empty (which is why I suspect underflow is looping)?
Essentially, this goes into a tight loop and the read in main never finishes.
Your main() calls istream::read() which, as an Unformatted Input Function, is required to extract characters "as if" by repeated calls to streambuf::sgetc(), which in turn calls streambuf::underflow() (or overridden) when streambuf::gptr() == streambuf::egptr(). According to the [streambuf.virt.get] clause of the C++ Standard, underflow() is expected to have the following effects:
The function sets up the gptr() and egptr() satisfying one of:
a) If the pending sequence is non-empty, egptr() is non-null and egptr() - gptr() characters starting at gptr() are the characters in the pending sequence
b) If the pending sequence is empty, either gptr() is null or gptr() and egptr() are set to the same non-NULL pointer.
An underflow() that does not do that causes istream::getc() to have unspecified behavior (anything can happen).
Btw., underflow() shouldn't call uflow() because uflow() is required to call underflow(). That could be one of the reasons why your program loops indefinitely (it simply fails with our latest implementation of iostreams).
With FILE_buffer::underflow() modified as follows the program works as I would expect with Apache C++ Standard Library as well as with libCstd and libstlport: it reads the first 4 characters from the file and prints them to stdout.
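For illustration only, and not the code from the original reply: a minimal sketch of an underflow() that meets the requirements quoted above by either returning EOF or setting up a non-empty get area with setg(). The class layout, the 64-byte internal buffer, and the first_chars() helper are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for the FILE_buffer class discussed in this thread.
class FILE_buffer : public std::streambuf {
public:
    explicit FILE_buffer(std::FILE* fp) : fp_(fp) {
        assert(fp_ != NULL);
        // Empty get area: the first extraction will call underflow().
        setg(buf_, buf_, buf_);
    }

protected:
    virtual int_type underflow() {
        // Characters still buffered: report the current one.
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());

        // Refill the internal buffer from the FILE.
        const std::size_t n = std::fread(buf_, 1, sizeof buf_, fp_);
        if (n == 0)
            return traits_type::eof();   // nothing left: report EOF

        // Non-empty pending sequence: make gptr() < egptr().
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::FILE* fp_;
    char buf_[64];   // assumed size, chosen arbitrarily
};

// Hypothetical helper: read up to n characters from fp through an istream.
std::string first_chars(std::FILE* fp, std::size_t n) {
    FILE_buffer fb(fp);
    std::istream in(&fb);
    std::string out(n, '\0');
    in.read(&out[0], static_cast<std::streamsize>(n));
    out.resize(static_cast<std::size_t>(in.gcount()));
    return out;
}
```

Note that underflow() does not advance gptr() and does not call uflow(); the default uflow() calls underflow() and then advances the pointer itself.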
Thanks for the update - I'll look at adding setg into the original code, although I have taken this tack once before.
Also, I took out the braindead uflow at the end of the test code (what can I say, I was cutting and pasting), but it
does not stop the looping of underflow.
Let me see if I can re-iterate:
- underflow is called because it meets at least one of the criteria that you have listed, the underflow as coded,
does nothing more than peek at the stream and puts it back to the way it found it and returns.
- Read should start to read the stream because underflow has returned and should not be called again because
it will not meet any of the criteria.
- What I think I am seeing - one of the criteria seems to always be met, and underflow is always getting called.
And it seems that gptr is always NULL.
- The code segment you sent - works because you are reading again ?? what happened to the
original read call ??
langston2 wrote:
- underflow is called because it meets at least one of the criteria that you have listed,
underflow() is called because istream::read() is required to extract characters "as if" by calling sgetc() which, in turn, is required to call underflow() when (gptr() == egptr()), a condition that is required to hold initially (when a streambuf object is first constructed).
the underflow as coded,
does nothing more than peek at the stream and puts it back to the way it found it and returns.
Right. I was mistaken when I suggested that underflow() must either set up (gptr() < egptr()) or return EOF. After re-reading the spec I believe that underflow() is allowed to return a non-EOF value without setting (gptr() < egptr()).
- Read should start to read the stream because underflow has returned and should not be called again because
it will not meet any of the criteria.
read() will call underflow() until it has extracted the specified number of characters or until it has reached the EOF. An underflow() that doesn't set up a non-empty get area will cause read() to call it for every extracted character.
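To make the cost concrete, here is a small sketch that counts underflow() calls for two get-area sizes. It stays within defined behavior by always setting up a non-empty get area; only the size of that area changes. The counting_buf class and the underflow_calls() helper are hypothetical names invented for this example, and the exact call counts depend on the library implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical streambuf over in-memory data that exposes at most `chunk`
// characters per underflow() call and counts how often it is called.
class counting_buf : public std::streambuf {
public:
    counting_buf(const std::string& data, std::size_t chunk)
        : data_(data.begin(), data.end()), pos_(0), chunk_(chunk), calls_(0) {
        assert(chunk_ > 0);
    }
    std::size_t calls() const { return calls_; }

protected:
    virtual int_type underflow() {
        ++calls_;
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        if (pos_ >= data_.size())
            return traits_type::eof();
        // Expose the next `chunk_` characters (or whatever is left).
        const std::size_t n = std::min(chunk_, data_.size() - pos_);
        char* p = &data_[pos_];
        setg(p, p, p + n);
        pos_ += n;
        return traits_type::to_int_type(*gptr());
    }

private:
    std::vector<char> data_;
    std::size_t pos_, chunk_, calls_;
};

// Hypothetical helper: read all of `data` with one read() call and report
// how many times underflow() ran.
std::size_t underflow_calls(const std::string& data, std::size_t chunk) {
    counting_buf buf(data, chunk);
    std::istream in(&buf);
    std::vector<char> out(data.size() + 1);
    in.read(&out[0], static_cast<std::streamsize>(data.size()));
    return buf.calls();
}
```

With a one-character get area, underflow() has to run at least once per extracted character; with a larger get area the same read() needs far fewer calls.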
You can see stdcxx 4.2.1 implementation of read() starting on line 362 of istream.cc.
This implementation calls sgetn() rather than sgetc() in a loop as an optimization. sgetn() is the public interface to xsgetn() which you can see here starting on line 71 of streambuf.cc.
Looking at it closely, I think there is a bug on lines 93-94. The function assumes that when the pending sequence is empty the input sequence has been exhausted, i.e., that it's reached the EOF. But that's not true when underflow() sets (gptr() == egptr()) and returns something other than EOF. I opened STDCXX-1026 and checked in a fix for it. See the corrected xsgetn on the 4.2.x branch of stdcxx.
- What I think I am seeing - one of the criteria seems to always be met, and underflow is always getting called.
And it seems that gptr is always NULL.
- The code segment you sent - works because you are reading again ?? what happened to the
original read call ??
- Is this a bug w/ studio or not ?
- Is the expected behavior in stdcxx and studio with your stdcxx change now the same ?
- For my last question - in the test case, a read occurs, then in the code
segment you provided, you added an fread in the underflow, did this second
read take care of what is happening, if so, what happened with the first read ? Even
though the test case only does one read, my original code will read in a loop until the
file has been read. Can I just change the first read to a fread and get the same
result, without changing underflow ?
langston2 wrote:
- Is this a bug w/ studio or not ?
I believe the test case in STDCXX-1026 should pass. If it fails it's a bug in the implementation of the library.
- Is the expected behavior in stdcxx and studio with your stdcxx change now the same ?
No. In my experiments, no implementation except for the unreleased stdcxx 4.2.2 (with the fix for STDCXX-1026) passes the test.
- For my last question - in the test case, a read occurs, then in the code
segment you provided, you added an fread in the underflow, did this second
read take care of what is happening, if so, what happened with the first read ? Even
though the test case only does one read, my original code will read in a loop until the
file has been read. Can I just change the first read to a fread and get the same
result, without changing underflow ?
With the fix for STDCXX-1026, this is the simplest implementation of underflow() that "works" i.e., that will let you call istream::read() in a loop to read a whole file.
To test it, I replaced the call to read() in main() with this loop (I also increased the size of buffer to bufferLength + 1 characters to fit the terminating NUL):
    do {
        in.read(buffer, bufferLength);
        buffer[in.gcount()] = '\0';
        cout << buffer;
    } while (in);
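A self-contained variant of the same read()/gcount() loop, as a sketch: it collects the data into a std::string instead of printing it, and uses an in-memory stream in place of the file. The drain() and drain_string() helpers and the 4-character buffer are assumptions for illustration. Because binary data such as a .wav file can contain NUL bytes, the sketch appends exactly gcount() characters rather than relying on a terminating NUL.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read `in` to the end in fixed-size chunks.
std::string drain(std::istream& in) {
    const std::size_t bufferLength = 4;   // assumed chunk size
    char buffer[bufferLength];
    std::string result;
    do {
        in.read(buffer, static_cast<std::streamsize>(bufferLength));
        // gcount() reports how many characters the last read() extracted;
        // it never exceeds the requested count.
        const std::size_t n = static_cast<std::size_t>(in.gcount());
        assert(n <= bufferLength);
        result.append(buffer, n);
    } while (in);   // stops once read() hits EOF or fails
    return result;
}

// Hypothetical usage: run the loop over an in-memory stream.
std::string drain_string(const std::string& data) {
    std::istringstream in(data);
    return drain(in);
}
```

The final, short read() sets eofbit and failbit, which ends the loop, but the characters it did extract are still counted by gcount() and appended.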
No other implementation that I've tried works. libCstd appears to hang in an infinite loop, libstlport4 crashes with a SIGSEGV, and so does g++ 4.3.1. The Dinkumware implementation that comes with IBM XLC++ fails as well: it reads 4 NULs from the stream instead of "0123" as expected.
Thanks for the ideas and pointers. Digging further into the code, the scenario is this: on Linux/g++, the read does not go into underflow, whereas the same code segment on Solaris/Studio does. When Solaris/Studio goes into underflow, it goes into the tight loop, recursively calling itself. On the Linux/g++ side, the code never enters underflow; instead it reads the data and loops until the file is read.
This is the read which loops on Solaris/Studio via underflow. I am researching what parameters would need
to be set for this read to skip the underflow.
I'm afraid that, now that I've opened the bugs, I think I was actually right the first time: underflow() must either return EOF indicating that the pending sequence is empty or set up (gptr() < egptr()) so that the pointers denote an initial subsequence of the non-empty pending sequence. Neither your program nor the test case I created does this: they both return a non-EOF and leave (gptr() == egptr()), for which the standard doesn't specify behavior. To use your FILE_buffer with iostream classes your underflow() must either set up a non-empty read area or return EOF, otherwise the behavior is undefined. Sorry about confusing things!
Do you have a conclusion from your test case that shows the behavior of underflow you describe, which I can apply to my test case?
I'm more than happy to rewrite the underflow segment if it is not correct, but I have not been able to find clear examples that show the
true desired behavior. I am not clear on what underflow does when it is not overridden, or on what it should look like when I'm trying to gather
information yet leave the stream in a state where the read continues as normal after I have gathered my information.
Also, if we shuffle the pointer, does this have a negative effect on read(), which is what is starting the sequence ?
I don't think the spec is quite clear on this so I ended up raising the issue on the C++ committee's list (the full text of my post is in this [comment|http://tinyurl.com/6cf58y#action_12655387|difference between null and empty pending sequence] on [STDCXX-1026|http://issues.apache.org/jira/browse/STDCXX-1026|sgetn() fails to extract characters from an unbuffered stream]). In the meantime, you'll be safer if you set up a read area in your underflow() (i.e., gptr() < egptr()), along the lines I showed in comment #2.
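On the question of gathering information without disturbing read(): one possible approach, sketched here under assumptions, is to inspect the characters right after they land in the buffer and then hand the same characters to the stream through setg(). Nothing is consumed by the inspection, so read() sees every byte. The inspecting_buf wrapper, its 16-byte buffer, and the peek_and_read() helper are hypothetical names invented for this sketch, and the "RIFF" test data is only an example.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical wrapper: pulls data from another streambuf, records what it
// saw, and still exposes every character through a non-empty get area.
class inspecting_buf : public std::streambuf {
public:
    explicit inspecting_buf(std::streambuf* src)
        : src_(src), seen_(0), first_(-1) {
        assert(src_ != NULL);
    }
    std::size_t bytes_seen() const { return seen_; }
    int first_byte() const { return first_; }   // -1 until data arrives

protected:
    virtual int_type underflow() {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());

        const std::streamsize n =
            src_->sgetn(buf_, static_cast<std::streamsize>(sizeof buf_));
        if (n <= 0)
            return traits_type::eof();

        // Inspect the buffered data; this does not move any stream pointer.
        if (seen_ == 0)
            first_ = traits_type::to_int_type(buf_[0]);
        seen_ += static_cast<std::size_t>(n);

        // Hand the same characters to the stream: gptr() < egptr().
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::streambuf* src_;
    std::size_t seen_;
    int first_;
    char buf_[16];   // assumed size, chosen arbitrarily
};

// Hypothetical usage: read head.size() characters through the wrapper and
// return the first byte that underflow() observed.
int peek_and_read(const std::string& data, std::string& head) {
    std::istringstream src(data);
    inspecting_buf ib(src.rdbuf());
    std::istream in(&ib);
    in.read(&head[0], static_cast<std::streamsize>(head.size()));
    head.resize(static_cast<std::size_t>(in.gcount()));
    return ib.first_byte();
}
```

Because the get area is set up before underflow() returns, read() copies straight from the buffer and only calls underflow() again once that buffer is exhausted.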
To understand how underflow() works you need to study a good reference. Looking at/stepping through an open implementation such as [stdcxx|http://stdcxx.apache.org/|Apache C++ Standard Library] will also help. You can view the definition of basic_filebuf::underflow() starting on line 167 of fstream.cc.
Here are some of the references that I've found helpful: