[ LUGOS ] [Fwd: kerneld/multicast bug (tickled by gated)]
Andrej Presern
andrejp at luz.fe.uni-lj.si
Sun Jun 15 16:02:17 CEST 1997
This may be an explanation for the problems you have been having with request-route.
Andrej
Brian Candler wrote:
>
> Greetings,
>
> I am writing this from Malaysia, where I am very privileged to be working
> alongside Sue Hares (of gated fame) and other experts at the Inet'97
> developing countries workshop.
>
> This year, for a number of reasons, we have decided to use Linux as the
> teaching platform for Track 1 (as opposed to FreeBSD in previous years). We
> have a lab of 23 Linux boxes, running Red Hat 4.1 + some packages from 4.2 +
> a custom 2.0.30 kernel with Multicast enabled. A nice side effect is that I
> have been able to persuade Sue to make gated compile and run cleanly under
> Linux :-)
>
> In running gated on this network, we have discovered a Linux kernel/kerneld
> bug, and I wonder if anyone on this list might be able to shed some light on
> it (or even propose a fix).
>
> The symptoms are as follows:
>
> 1. gated sometimes hangs.
>
> 2. When it is in the hung state, it can still be made to dump core
> ('gdc COREDUMP'), and gdb shows that it was waiting in setsockopt() for
> an IP_ADD_MEMBERSHIP request.
>
> 3. At the same time as gated hangs, an extra process appears in the process table:
> request-route <zombie>
> which is a child of kerneld
>
> 4. If you kill kerneld, gated suddenly wakes up again (if you didn't make
> it dump core first, that is)
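
The setsockopt() call named in symptom 2 is the standard way for a process to ask the kernel to join a multicast group. A minimal sketch in Python of what such a request looks like, assuming an IPv4 socket. The helper names build_mreq and join_group are hypothetical, and the message does not say which group or interface gated was using:

```python
import socket


def build_mreq(group, interface="0.0.0.0"):
    """Pack the equivalent of a C struct ip_mreq.

    Layout: 4-byte multicast group address followed by the 4-byte
    address of the local interface ("0.0.0.0" lets the kernel choose).
    """
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(sock, group, interface="0.0.0.0"):
    """Ask the kernel to add `group` to the interface's multicast group list.

    This is the IP_ADD_MEMBERSHIP request that gdb showed gated waiting in.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, interface))
```

Calling join_group(sock, "224.0.0.9") on a UDP socket would issue this kind of request. 224.0.0.9 is the RIP-2 routers group and is used here purely as an illustration.
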
>
> Our solution for the workshop is simple - we run the whole system without
> kerneld, and everything is fine. Perhaps we could instead delete or rename
> /sbin/request-route. It would be nice to get to the root of this problem
> though, and for me there are a number of questions:
>
> - why is the kernel telling kerneld to invoke a userland routing script
> when you change the interface multicast group list?
>
> - why is kerneld not reaping its child?
>
> - why is setsockopt blocking on kerneld?
>
> Red Hat 4.1 has the modules-2.0.0 package. As far as I can see, kerneld
> forks before execlp()ing the script so I have no idea how this can cause the
> blocking.
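
For reference, the fork-then-exec pattern described above, together with the reaping step that the zombie in symptom 3 suggests is missing, can be sketched as follows. This is a generic POSIX illustration in Python, not kerneld's actual code. HELPER is a hypothetical stand-in for /sbin/request-route, and spawn and reap are names invented for the example:

```python
import os
import sys

# Hypothetical stand-in for the helper script: a child that exits with status 3.
HELPER = [sys.executable, "-c", "raise SystemExit(3)"]


def spawn(argv):
    """Fork, then exec argv in the child; the parent gets the child's pid."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # does not return on success
        finally:
            os._exit(127)  # reached only if exec failed
    return pid


def reap(pid):
    """Wait for the child and collect its exit status.

    Until the parent does this, an exited child remains in the
    process table as a zombie.
    """
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
```

A parent that calls spawn(HELPER) and never calls reap() on the returned pid leaves exactly the kind of zombie entry described in symptom 3. Calling reap(pid) removes it.
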
>
> Thanks for any ideas you have...
>
> Regards,
>
> Brian Candler.