Ship it :-)
From: Mitch Hayenga <mitch.hayenga+g...@gmail.com>
Reply-To: gem5 users mailing list <gem5-users@gem5.org>
Date: Monday, 5 August 2013 03:57
To: gem5 users mailing list <gem5-users@gem5.org>
Subject: Re: [gem5-users] panic: ListenSocket(listen): listen() failed!
Hi Ali,
There is actually a minor bug/race condition in the gem5 ListenSocket::listen
function (src/base/socket.cc). I think Hao might be hitting this; I just
haven't had time to upload the patch for it to the mainline. I hit this
when launching hundreds of simulations at the same time (on the same cluster
that Hao is using).
gem5 makes the incorrect assumption that by "binding" a socket, it effectively
has allocated a port. Linux only allocates ports once you call listen on the
given socket, not when you call bind. So even if the port was free when bind
was called, another process (gem5 instance) could race in between the bind &
listen calls and steal the port. It's a small race condition, but it is there.
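
To make the bind/listen window concrete, here is a small standalone sketch in Python (not gem5 code). It assumes both sockets set SO_REUSEADDR, which I believe gem5's ListenSocket does; the function name and return convention are made up for illustration. On Linux I would expect both bind calls to succeed and the second listen to fail with EADDRINUSE; other systems may reject the second bind instead, so the sketch reports whichever stage fails.

```python
import errno
import socket


def second_claim_failure(host="127.0.0.1"):
    """Bind two SO_REUSEADDR sockets to one port; report where the loser fails.

    Returns (stage, errno), where stage is "bind" or "listen", or
    (None, None) if the second socket unexpectedly succeeds at both.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for s in (first, second):
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind((host, 0))              # let the kernel pick a free port
        port = first.getsockname()[1]
        try:
            second.bind((host, port))      # same port, nobody listening yet
        except OSError as e:
            return "bind", e.errno
        first.listen(1)                    # the first socket claims the port
        try:
            second.listen(1)               # the loser of the race
        except OSError as e:
            return "listen", e.errno
        return None, None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    stage, err = second_claim_failure()
    print(stage, errno.errorcode.get(err))
```

If the reported stage is "listen", a successful bind did not reserve the port, which is the situation described above.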
In the current code, if the call to bind fails due to the port being in use
(EADDRINUSE), gem5 retries with a different port. However, if listen fails, gem5
just panics. The fix is to test the return value of listen and retry if the
failure was due to EADDRINUSE.
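
The same retry idea can be sketched outside gem5 in a few lines of Python. This is only an illustration of the pattern, not the gem5 implementation: listen_on_free_port, start_port and max_tries are hypothetical names, and the sketch assumes that trying the next port number is an acceptable retry policy.

```python
import errno
import socket


def listen_on_free_port(start_port, host="127.0.0.1", max_tries=100):
    """Return (sock, port) for the first port where bind AND listen succeed."""
    for port in range(start_port, start_port + max_tries):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen(1)      # may still fail if another process won the race
        except OSError as e:
            sock.close()        # do not leak the descriptor on a failed attempt
            if e.errno != errno.EADDRINUSE:
                raise           # a genuine failure, not a lost race
            continue            # port taken: try the next one
        return sock, port
    raise RuntimeError("no free port found")
```

The point of the sketch is that EADDRINUSE is handled identically whether it comes from bind or from listen, and that any other error is still treated as fatal.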
Here is my file's diff:
diff -r a5943fcb8b22 src/base/socket.cc
--- a/src/base/socket.cc  Sun May 05 16:38:11 2013 -0500
+++ b/src/base/socket.cc  Sun Aug 04 21:48:46 2013 -0500
@@ -103,11 +103,13 @@
         return false;
     }
 
-    if (::listen(fd, 1) == -1)
-        panic("ListenSocket(listen): listen() failed!");
+    if (::listen(fd, 1) == -1) {
+        if (errno != EADDRINUSE)
+            panic("ListenSocket(listen): listen() failed!");
+        return false;
+    }
 
     listening = true;
-
     anyListening = true;
     return true;
 }
On Sun, Aug 4, 2013 at 9:26 PM, Ali Saidi <sa...@umich.edu> wrote:
gem5 opens a number of ports when it starts (for the terminal, debugging,
etc.). However, if several gem5 instances start up at the same time, they can
conflict and you'll see the issue below.
If you add m5.disableAllListeners() to the Python script, your problem will go
away.
Ali
On Aug 4, 2013, at 8:45 PM, Hao Wang <pkuwa...@gmail.com> wrote:
> Hi.
>
> I get the following error:
> panic: ListenSocket(listen): listen() failed!
> @ cycle 0
> [listen:build/ALPHA/base/socket.cc, line 107]
>
> when I tried to run hundreds of simulations on a cluster.
>
> Each one is a single-core simulation in SE mode.
> And I tried to limit the number of simulations on one node/machine to 4, but
> this error still happens randomly.
>
> Any suggestions?
>
> Hao
>
> _______________________________________________
> gem5-users mailing list
> gem5-users@gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users