Linux board (Digest area)
From: netiscpu (说不如做), Board: Unix
Subject: [M] Networking problems in Linux 2.1.99
Site: Lilac BBS (紫丁香) (Tue May 19 00:50:11 1998), forwarded
This article was compiled from the Linux-kernel Mailing List
[Question] 2.1.98 link freezes --> 2.1.99 machine locks up
Bug report (FYI)
Network topology:
      ethernet           28.8kbps modem line
adam------------asgard-------------------------adamhome
2.1.99          2.1.98                         2.0.0
When my desk workstation (adam.yggdrasil.com) was running
Linux 2.1.~75-2.1.98 a direct telnet session from my home notebook
computer (adamhome) to my desk workstation would sometimes freeze
for about ten minutes and then finally close with an error like
"connection reset by peer." During this time, I could make other
direct telnet connections, but those connections seemed to freeze
much more easily if one connection was already frozen. The freezes
always occurred when at least one screenful of data was in the process
of being sent from my desk workstation to my home notebook computer.
Note that rlogin connections would freeze as often as telnet connections,
but I never had any other type of connection freeze, such as FTP,
although I use such connections very rarely.
Somebody had suggested to me that the problem was probably
in the fact that my home computer was running 2.0.0, and that I
should upgrade that first, which is completely reasonable. (I am
fighting what appears to be a bug in the regular msdos filesystem
under 2.1.99 to do that upgrade right now, but that's another matter.)
Anyhow, when I upgraded my desk workstation, the behavior changed.
Instead of just the one telnet connection freezing, my 2.1.99 desktop
workstation locks up hard. There is no final kernel message and the
keyboard is completely nonresponsive. So, this definitely is
a problem with the 2.1.99 networking.
This problem never occurs if I telnet to the machine
in the middle and then to my desktop workstation, and it only
occurs when a lot of output is about to be sent to my home
machine. Therefore, I believe that the problem is probably
triggered by a packet being dropped.
Anyhow, I hope this information is useful to those tinkering
with the networking code (I think this is primarily Dave Miller right
now) and anyone trying to isolate a similar problem.
Adam J. Richter          4880 Stevens Creek Blvd, Suite 205
adam@yggdrasil.com       San Jose, California 95129-1034
+1 408 261-6630          United States of America
fax +1 408 261-6631      Yggdrasil: "Free Software For The Rest Of Us."
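The report above ties the freeze to a burst of output (at least one screenful) travelling from the desk workstation to the notebook. A minimal sketch of how that traffic pattern might be exercised without telnet, assuming Python 3 is available on both ends. The function names (serve_burst, receive_all, run_local_demo) are hypothetical and not part of the original thread, and the 600-second timeout merely mirrors the roughly ten-minute freeze described. Run over loopback, as in the demo, it only checks the script itself; it cannot reproduce the reported problem.

```python
import socket
import threading


def serve_burst(listener: socket.socket, payload: bytes) -> None:
    """Accept one connection and send the whole payload in one burst."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(payload)


def receive_all(host: str, port: int, timeout: float = 600.0) -> int:
    """Connect, read until the peer closes, and return the byte count.

    A socket timeout raised here would indicate a stalled transfer.
    """
    total = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            total += len(chunk)
    return total


def run_local_demo(n_bytes: int = 80 * 24 * 10) -> int:
    """Send n_bytes over loopback (default: ten 80x24 screenfuls)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    sender = threading.Thread(target=serve_burst, args=(listener, b"x" * n_bytes))
    sender.start()
    try:
        return receive_all("127.0.0.1", port)
    finally:
        sender.join()
        listener.close()


if __name__ == "__main__":
    print(run_local_demo())
```

To use it across the modem link, serve_burst would run on the desk workstation and receive_all on the notebook, with the host and port chosen by whoever runs the test.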
[Reply 1]
There is nothing in the current 2.1.pre100 networking which should
ever cause a total lockup, I stress the code pretty heavily on
multiple platforms, so I'd imagine I'd see any really bad lockups
before anyone else. So let's not jump to TCP stack conclusions for
the moment.
Instead, what sort of networking card do you use in this machine, and
is it SMP?
Later,
David S. Miller
davem@dm.cobaltmicro.com
[Reply 2]
The ethernet cards in both my desktop workstation and the
server with the modems on it are from NetGear and use the DEC tulip
chip (with the tulip.o driver). The ethernet runs at 100 Mbps.
Both machines are uniprocessors running SMP kernels where
everything that can be a module is (except for the initial ramdisk
and romfs which are used to bootstrap everything else).
Adam J. Richter
adam@yggdrasil.com
[Reply 3]
There has been a lot of activity in the Intel/SMP kernel changes area,
in particular with interrupt handling and the like, let's make sure
this is not where you are getting bit before we consider it a
networking issue for now ok?
Later,
David S. Miller
davem@dm.cobaltmicro.com
[Reply 4]
I've noticed with my PCI NE2000 clone that when I lose network
connectivity there are kernel logs about lost interrupts. Have you
noticed any of these?
Regards,
Richard...
[Reply 5]
Richard Gooch writes, regarding my machine locking up under 2.1.99
in the midst of transmitting data to my 2.0.0 notebook computer
via an intermediate modem server running 2.1.98:
>I've noticed with my PCI NE2000 clone that when I lose network
>connectivity there are kernel logs about lost interrupts. Have you
>noticed any of these?
No. I have checked the logs. I also went to the trouble of deactivating
the screen saver before the most recent lock up to make sure there
were no "last gasp" messages from the kernel.
Adam J. Richter
adam@yggdrasil.com
[Reply 6]
I rebuilt the kernel without SMP and was able to recreate the problem.
Adam J. Richter
adam@yggdrasil.com
[Reply 7]
Ok, plan B then:
1) Are you using the tulip.o driver in the 2.1.{99,100} tree or
the "Becker driver release of the day"? If the latter, please
try with the driver in the tree as that is what I am testing
in all my machines.
2) Barring that, and since you can reproduce it easily, run tcpdump
on the same subnet as the machine which locks up, have it just
listen to packets to/from this test machine, once it locks, send
the tcpdump log to me.
It should be trackable soon enough.
Later,
David S. Miller
davem@dm.cobaltmicro.com
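Step 2 above asks for a tcpdump capture, taken on a second machine on the same subnet, limited to traffic to and from the machine that locks up. A minimal sketch of how such a command line might be assembled, assuming a tcpdump that supports the -n, -i and -w options and the "host" filter primitive. The interface name eth0, the output file lockup.pcap and the helper name build_tcpdump_argv are illustrative assumptions, not taken from the thread.

```python
import shlex
from typing import List


def build_tcpdump_argv(host: str, interface: str, outfile: str) -> List[str]:
    """Build a tcpdump command that records only traffic to/from `host`."""
    return [
        "tcpdump",
        "-n",             # do not resolve addresses to names
        "-i", interface,  # capture on this interface
        "-w", outfile,    # write raw packets to a file for later analysis
        "host", host,     # filter: packets to or from this host only
    ]


if __name__ == "__main__":
    argv = build_tcpdump_argv("adam.yggdrasil.com", "eth0", "lockup.pcap")
    # Print the command rather than run it; capturing usually needs root.
    print(shlex.join(argv))
```

Because the capture runs on a different machine from the one that hangs, the file survives the lockup and can be sent along afterwards.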
[Reply 8]
Seems like an MTU problem to me. What does ping -s 30000 say?
Pavel
--
The best software in life is free (not shareware)! Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+
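The ping -s 30000 suggestion works as an MTU probe because a 30000-byte ICMP payload cannot fit in one packet on either link, so it has to be fragmented at the IP layer. A minimal sketch of the arithmetic, assuming IPv4 with a 20-byte header and no options. The 1500-byte value is the standard Ethernet MTU; the MTU of the modem link is not stated in the thread, so 576 is used purely as an illustration. The helper name fragment_count is hypothetical.

```python
import math

IP_HEADER = 20   # bytes, assuming an IPv4 header without options
ICMP_HEADER = 8  # bytes, ICMP echo request header


def fragment_count(icmp_payload: int, mtu: int) -> int:
    """Number of IP fragments needed for one echo request of this size."""
    ip_payload = icmp_payload + ICMP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)


if __name__ == "__main__":
    for mtu in (1500, 576):
        print(mtu, fragment_count(30000, mtu))
```

Under these assumptions a single such ping becomes 21 fragments on Ethernet and 55 at an MTU of 576, and losing any one fragment loses the whole echo request.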
[End]
--
Enjoy Linux!
-----It's FREE!-----
※ Source: Lilac BBS (紫丁香) bbs.hit.edu.cn [FROM: mtlab.hit.edu.cn]