Ticket #97 (Closed)

Short Description compiler problem
Entered By: PeteL When: 1998-07-21 06:07:17 Build: 1.03.19a
Categories Type: Problem   Department: Product   Category: DevCenter Compiling
Description
The last couple of times I have tried to compile on the production server (multi-processor), it has gone into a never-ending loop saying something like "waiting for term-port stop". This is very early in the morning, when there is definitely nobody on the server. When that happens, I have to reboot, because the surfer control function no longer works. This is a very bad situation. I guess I could take the service down first, but again, very bad.
Append By: PeteL  When: 1998-07-21 06:24:21  New Status: Pending IE
Comment Update: the surfer service controller did eventually get the service down, so I didn't have to reboot.
Append By: WindSurfer  When: 1998-07-21 13:09:07  New Status: Pending Customer
Comment You can also kill the ielua.exe process from the task manager in this situation.

This is part of the ongoing attempt to prevent the trap. The next step is to determine where the tasks are during the loop: we will put in additional diagnostics to identify where the task thread(s) are when they don't go away.

What was happening before was that the task stop didn't wait for the threads to finish, the recompile went ahead anyway, and you got the trap.

Append By: PeteL  When: 1998-07-21 13:54:36  New Status: Pending IE
Comment I assume that a thread in this case is supposed to represent a live, in-transit HTTP request? There shouldn't be one for each active session, right?
Append By: WindSurfer  When: 1998-07-21 18:07:49  New Status: Pending Customer
Comment There is one per concurrent "hit", and I believe something like five are started at the beginning as a pool. When the number of concurrent hits exceeds the number of active threads, the number is expanded.

There is logic which then tells threads to go away after a bit, until they return to the base number.

We will review the logic which keeps track of the number of open threads-- along with logic relating to how a thread could remain open even while being signaled to end.

This is a bit loose, mainly because we don't want to just kill threads to do the compile; we need to make sure each active hit is complete before quiescing all the web gateway threads and proceeding with the compile.
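The quiesce step described above could be sketched roughly as follows. This is only an illustration in Python, not the product's actual code: the `HitGate` class and its method names are hypothetical. The idea is that each hit registers itself on entry and exit, and the compile refuses new hits and then waits (with a timeout, so it cannot loop forever) until the active count reaches zero.

```python
import threading


class HitGate:
    """Hypothetical sketch: tracks in-flight hits so a compile can wait for them."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0          # hits currently being served
        self._quiescing = False   # True while a compile is pending or running

    def begin_hit(self):
        """Register a new hit. Returns False if a compile is pending."""
        with self._cond:
            if self._quiescing:
                return False
            self._active += 1
            return True

    def end_hit(self):
        """Mark one hit as complete and wake any waiting compile."""
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()

    def quiesce(self, timeout=None):
        """Refuse new hits, then wait for active hits to drain.

        Returns True if all hits completed, False if the timeout expired.
        """
        with self._cond:
            self._quiescing = True
            return self._cond.wait_for(lambda: self._active == 0, timeout)

    def resume(self):
        """Accept hits again; call this whether or not the compile succeeded."""
        with self._cond:
            self._quiescing = False
```

A caller would run `quiesce(timeout=...)`, compile only if it returned True, and call `resume()` in either case so that requests are never left refused after the compile step.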

Append By: WindSurfer  When: 1998-07-22 10:37:35  New Status: Pending Customer
Comment New fix available at ftp://ftp.ieinc2.com/pub/surfer/t1_3_20a.zip
Append By: PeteL  When: 1998-07-22 10:53:21  New Status: Pending IE
Comment How clean is the fix? I'm in a very precarious position when it comes to changing code that might break...
Append By: WindSurfer  When: 1998-07-22 13:21:36  New Status: Pending Customer
Comment Build 20 has only a few more fixes past 1.3.19f, all relating to the termport closing properly.

We think that this time the problem has been properly identified: a single integer value was being incremented and decremented by multiple threads without the protection of a mutex semaphore, which is particularly aggravated on a multi-CPU machine. Not only is the HTTP thread counter now properly managed with a new semaphore, but the compile code additionally checks whether or not threads are still active using another method.
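For illustration, the kind of protection described above might look like the sketch below. It is written in Python rather than the product's own language, and `ActiveThreadCounter` and `run_demo` are hypothetical names. Without the lock, an increment or decrement is a read-modify-write sequence that two threads can interleave, so updates can be lost and the counter may never return to zero.

```python
import threading


class ActiveThreadCounter:
    """Hypothetical sketch: a shared counter guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # The lock makes the read-modify-write a single atomic step.
        with self._lock:
            self._count += 1

    def decrement(self):
        with self._lock:
            self._count -= 1

    def value(self):
        with self._lock:
            return self._count


def run_demo(workers=8, rounds=10000):
    """Hammer the counter from several threads and return the final value."""
    counter = ActiveThreadCounter()

    def worker():
        for _ in range(rounds):
            counter.increment()
            counter.decrement()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Because every increment is paired with a decrement and both are serialized by the lock, `run_demo()` returns 0 regardless of how the threads are scheduled.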

Append By: PeteL  When: 1998-07-23 10:27:03  New Status: Pending IE
Comment I have not had a compiler crash since putting the new code up, but the most recent time I compiled on the server, it appeared that the port was never reopened: all requests just hung. I recycled the service and all was well.
Unrelated-but-related question: since you know internally how many threads are active, have you given any thought to providing some sort of statistics, e.g. max concurrent threads?
Append By: WindSurfer  When: 1998-07-23 13:30:54  New Status: Pending Customer
Comment Were there any error messages in the event log when the terminal port didn't re-open? This is as bad as a crash. Can you try doing a shutdown and restart from the console next time this happens?

Sure-- we will add some statistics to give you an idea of the maximum number of concurrent hits.
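As a rough idea of what such statistics could involve, here is a minimal sketch in Python. The `HitStats` class and its field names are hypothetical and do not describe what the product will actually report. It keeps a current count, a high-water mark, and a running total, all updated under one lock.

```python
import threading


class HitStats:
    """Hypothetical sketch: tracks current, peak, and total hits."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = 0
        self._peak = 0
        self._total = 0

    def enter(self):
        """Call when a hit starts being served."""
        with self._lock:
            self._current += 1
            self._total += 1
            if self._current > self._peak:
                self._peak = self._current  # new high-water mark

    def leave(self):
        """Call when a hit has completed."""
        with self._lock:
            self._current -= 1

    def snapshot(self):
        """Return a consistent copy of all three figures."""
        with self._lock:
            return {
                "current": self._current,
                "peak": self._peak,
                "total": self._total,
            }
```

Taking the snapshot under the same lock as the updates keeps the three figures consistent with one another.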