Question : Apache - Tomcat TCP Connection problem - CLOSE_WAIT / FIN_WAIT_2

Hi,
We're running Apache 6.0.20 on Solaris 10. I get an error if the Tomcat instance runs for a long time, and I have noticed that the problem goes away when I restart Apache. The problem is that TCP connections in the CLOSE_WAIT and FIN_WAIT_2 states are created each time a user connects and logs out. If this continues, no more users can connect, and I have to bounce Apache with apache2/bin/apachectl stop / start (which clears the TCP/IP problem). I would appreciate any help in resolving this issue.

admin@webserver: netstat -an | grep 9034
      *.9034               *.*                0      0 49152      0 LISTEN
127.0.0.1.64130      127.0.0.1.9034       49152      0 49152      0 CLOSE_WAIT
127.0.0.1.9034       127.0.0.1.64130      49152      0 49152      0 FIN_WAIT_2
127.0.0.1.64131      127.0.0.1.9034       49152      0 49152      0 CLOSE_WAIT
127.0.0.1.9034       127.0.0.1.64131      49152      0 49152      0 FIN_WAIT_2
127.0.0.1.64132      127.0.0.1.9034       49152      0 49152      0 CLOSE_WAIT
127.0.0.1.9034       127.0.0.1.64132      49152      0 49152      0 FIN_WAIT_2
admin@webserver: netstat -an | grep 9044
      *.9044               *.*                0      0 49152      0 LISTEN
127.0.0.1.64127      127.0.0.1.9044       49439      0 49152      0 CLOSE_WAIT
127.0.0.1.9044       127.0.0.1.64127      49152      0 49439      0 FIN_WAIT_2
127.0.0.1.64128      127.0.0.1.9044       49152      0 49152      0 CLOSE_WAIT
127.0.0.1.9044       127.0.0.1.64128      49152      0 49152      0 FIN_WAIT_2
127.0.0.1.64129      127.0.0.1.9044       49152      0 49152      0 CLOSE_WAIT
127.0.0.1.9044       127.0.0.1.64129      49152      0 49152      0 FIN_WAIT_2

Answer : Apache - Tomcat TCP Connection problem - CLOSE_WAIT / FIN_WAIT_2

Connections in CLOSE_WAIT are generally an application problem. I have seen many middleware applications interact with web applications in ways that cause CLOSE_WAIT. Note that the FIN_WAIT_2 connections are just the other side of the connections that are in CLOSE_WAIT.

CLOSE_WAIT occurs when one side has closed its end of the TCP connection but the other end is still open. On UNIX-based systems, a socket is not closed until the last process that has it open closes it. That means that either the application that owns the socket is not reading it (and thus does not know that the other end closed it), or it just isn't closing it, or another process has that same socket open.
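To make that concrete, here is a minimal Python sketch (not taken from Apache or Tomcat; the function name is made up for illustration) that opens a loopback TCP connection, closes one end, and shows how the other end finds out. The side that has not closed yet sits in CLOSE_WAIT until it calls close() itself.

```python
import socket

def demo_close_wait():
    """Open a loopback TCP connection, close one end, detect it on the other."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.close()                         # the client sends its FIN

    # server_side is now heading into CLOSE_WAIT: the peer is done,
    # but this process still holds the socket open.
    data = server_side.recv(1024)          # b"" means the peer closed its end

    # Only this close() lets the kernel leave CLOSE_WAIT and finish the teardown.
    server_side.close()
    listener.close()
    return data

if __name__ == "__main__":
    print(demo_close_wait())
```

An application that never reads the socket never sees the empty read, and one that sees it but skips the close() leaves the connection in CLOSE_WAIT for as long as the process lives.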

This is why we see this problem with middleware. There tend to be a lot of incoming connections all the time, and different threads accept the connections and then fork off a new process to handle each one. If another connection is accepted between the accept and the fork, then when the fork happens the second process has both sockets open but only knows about one. That means that the first process cannot really close its socket until the second process exits: its own close only drops one of the two references. If the second process is very long lived, then the connection sticks in CLOSE_WAIT.
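The inheritance effect can be reproduced in a few lines. The following is only a sketch under simplifying assumptions (a single loopback connection, a made-up function name, and a child that does nothing but hold the descriptor). It observes the effect from the peer's side: the peer sees no end-of-stream while the child still holds its inherited copy, even though the parent has already called close().

```python
import os
import socket

def demo_inherited_socket():
    """Show that a forked child holding an inherited socket delays the close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()

    wake_r, wake_w = os.pipe()           # lets the parent tell the child to exit
    pid = os.fork()
    if pid == 0:
        # Child: inherited a copy of conn but never uses or closes it.
        os.read(wake_r, 1)               # block until the parent signals
        os._exit(0)                      # exiting closes every inherited descriptor

    conn.close()                         # parent closes; the child's copy remains

    client.settimeout(0.5)
    try:
        client.recv(1)
        closed_early = True              # EOF arrived: nobody else held the socket
    except socket.timeout:
        closed_early = False             # no FIN yet: the child still holds it

    os.write(wake_w, b"x")               # let the child exit
    os.waitpid(pid, 0)

    client.settimeout(5)
    closed_after_exit = client.recv(1) == b""   # FIN arrives once the last holder is gone

    client.close()
    listener.close()
    os.close(wake_r)
    os.close(wake_w)
    return closed_early, closed_after_exit

if __name__ == "__main__":
    print(demo_inherited_socket())
```

A common way to avoid this in real servers is to close every descriptor the child does not need right after the fork, or to mark sockets close-on-exec when the child is going to exec another program.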

I also saw a case where there was a race condition in the server itself and the sockets accumulated there.

Sometimes the middleware or application server expects the application to close all unused sockets. Sometimes it is just a bug at one of the three levels.

What you should do is identify which process has the socket open. You can use the pfiles command (cd /proc ; pfiles *) to find the processes involved. Then try to figure out whether it is a process that legitimately had the socket and is not closing it, or a process that should not have had it at all. This will give you an idea of what is going on.
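When there are many connections, it may help to first collect the local ports that are stuck in CLOSE_WAIT, so you know which port numbers to look for in the pfiles output. Here is a small Python sketch for that; it assumes the Solaris netstat -an layout shown in the question (seven columns, address and port joined by a dot), and the function name is just an illustration.

```python
def close_wait_local_ports(netstat_text):
    """Return the local ports of sockets that netstat reports as CLOSE_WAIT."""
    ports = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed Solaris TCP row: local addr, remote addr,
        # four numeric columns, then the state.
        if len(fields) == 7 and fields[-1] == "CLOSE_WAIT":
            # Solaris joins address and port with a dot: 127.0.0.1.64130
            ports.append(int(fields[0].rsplit(".", 1)[1]))
    return ports
```

Fed the output of netstat -an | grep 9034 from the question, this would return the ephemeral ports 64130, 64131 and 64132; the process that pfiles shows holding those local ports is the one that has not closed its end.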