[py-dev] Getting 'close failed: [Errno 9] Bad file descriptor' in several tests

holger krekel hpk at trillke.net
Fri May 13 22:33:06 CEST 2005


Hey Grig, 

at the very minute your mail arrived i wanted to start mailing you :-) 

On Fri, May 13, 2005 at 12:49 -0700, Grig Gheorghiu wrote:
> Just a heads-up. I'm still looking into this issue -- I got deep in the
> bowels of execnet gateways, channels and messages. This may be
> caused by some sort of race condition which causes a channel to get
> deleted, then to continue to receive data.  

Actually i noticed problems on linux platforms similar to the
ones you mentioned earlier, usually while execing
with e.g. '--exec=python2.4'.  There probably is some race
condition and likely it's "just" an unclean shutdown procedure
(shutdown is really quite involved with network code, and i
guess the corresponding py.execnet code needs a thorough human
review because it's usually hard to find problems by just
looking at failing or oddly-running tests). 

> Also, I always see this
> message, which may also be a symptom:
> 
> waiting for pid <PID>
> child process <PID> already dead? error:[Errno 10] No child processes

That's probably related to my not knowing how to properly 
wait for a child process to terminate on win32.  If you
don't do a waitpid(childprocessid) on unix then your child
process will become a "zombie".  Is that even a problem on 
win32?  If not, then one might try to just skip this
code for win32?! 
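
A minimal sketch of what "skip this code for win32" could look like.
The helper name wait_for_child is hypothetical and not part of py.execnet.
It also assumes that skipping the wait on win32 is acceptable, which is
exactly the open question above:

```python
import errno
import os
import sys


def wait_for_child(pid):
    """Reap child `pid`; return its raw wait status, or None if nothing was reaped.

    Hypothetical helper, not taken from py.execnet.
    """
    if sys.platform == "win32":
        # Assumption: there is no unix-style zombie to reap here, so skip.
        return None
    try:
        # Blocks until the child exits, then removes its zombie entry.
        _, status = os.waitpid(pid, 0)
    except OSError as exc:
        if exc.errno == errno.ECHILD:
            # "No child processes": already reaped, or never our child.
            return None
        raise
    return status
```

With this, an already-reaped child yields None instead of an error
message, and any other OSError still propagates.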

> Here's an output with debugging messages. The test_session1.py file is
> a copy of py/test/testing/test_session.py file, with only the test_exec
> test in it.
>
> Note how the channel with id 5 gets deleted at some point, then it
> still receives data (I modified a bunch of files in execnet so that I
> can print more helpful debugging info):

feel free to commit tracing additions. 

> C:\py\py\test\testing>py.test test_session1.py  -s
> inserting into sys.path: C:\py
> ============================= test process starts
> =============================
> testing-mode: inprocess
> executable:   C:\Python24\python.exe  (2.4.0-final-0)
> ***** svn info C:\py\py
> using py lib: C:\py\py <rev unknown>
> 
> test_session1.py[1] C:\Python24\python.exe
> sending gateway bootstrap code
> Creating channel with id 1
> sent -> <Message.CHANNEL_OPEN channelid=1 len=54>
> Creating channel with id 3
> Creating channel with id 5
> Creating channel with id 7

The last two channels are created for receiving the output
from the redirected sys.stdout/stderr from the remote side. 

> sent -> <Message.CHANNEL_OPEN channelid=3 len=89>
> sent -> <Message.CHANNEL_DATA channelid=3 len=114>
> received <- <Message.CHANNEL_DATA channelid=1 2960>
> Got CHANNEL_DATA for channel 1
> received <- <Message.CHANNEL_CLOSE channelid=1 ''>
> Got CHANNEL_CLOSE; closing channel 1
> deleting channel mapping 1
> received <- <Message.CHANNEL_DATA channelid=5 len=83>
> Got CHANNEL_DATA for channel 5
> received <- <Message.CHANNEL_DATA channelid=5 'testing-mode: child
> process\n'>

So channelid=5 is really responsible for stdout. 

> Got CHANNEL_DATA for channel 5
> received <- <Message.CHANNEL_DATA channelid=5 len=59>
> ... 
> ...
> Got CHANNEL_DATA for channel 3
> received <- <Message.CHANNEL_CLOSE channelid=5 ''>

here the stdout-channel gets closed at the other side ... 

> Got CHANNEL_CLOSE; closing channel 5
> deleting channel mapping 5
> received <- <Message.CHANNEL_DATA channelid=5 'Traceback (most recent
> call last)
> :\n'>

but we still receive stdout-data from the other side.  
This should never happen.  I guess that the channel close-logic 
is not correct in that it deletes the channel mapping too early. 
Or the redirection-per-thread code in py/thread/io.py is not 
working completely.  Or both :-/  
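
To make the symptom easier to spot while tracing, something along these
lines might help.  It is only a sketch: ChannelRegistry and its methods are
made-up names, not the actual py.execnet classes.  It remembers which
channel ids were closed, so data arriving after a CHANNEL_CLOSE is recorded
instead of hitting a missing mapping:

```python
class ChannelRegistry:
    """Hypothetical bookkeeping for channel ids; not the py.execnet code."""

    def __init__(self):
        self._channels = {}    # channelid -> list of received data items
        self._closed = set()   # ids for which a close was already seen
        self.late = []         # (channelid, data) that arrived after close

    def open(self, channelid):
        self._channels[channelid] = []

    def on_data(self, channelid, data):
        """Store data; return False if the channel was already closed."""
        if channelid in self._closed:
            # This is the "should never happen" case from the trace.
            self.late.append((channelid, data))
            return False
        self._channels[channelid].append(data)
        return True

    def on_close(self, channelid):
        """Mark the channel closed and hand back whatever it received."""
        self._closed.add(channelid)
        return self._channels.pop(channelid, None)
```

Replaying the trace above (open 5, data, close 5, data) would leave the
'Traceback ...' item in registry.late, which at least shows which side
sends after close without changing the close-logic itself.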

cheers, 

    holger

