Ticket #49: crash

Ticket #49 ( Closed )

Short Description	crash
Entered	By: PeteL When: 1998-06-05 07:01:18 Build: 1.03.13b
Categories	Type: Problem Department: Product Category: Not Categorized
Description	more Dr. Watsons, sent via email

Append	By: WindSurfer When: 1998-06-05 07:43:45 New Status: Pending Customer
Comment	Any chance tracing was running when these occurred? Any other environmental clues? Traps occurred at 6/4/1998 @ 13:35:52.933 and 6/4/1998 @ 9:52:57.348--can you check your event log for any reports at these times? Both faults occur when only one HTTP task is active--was IELUA running for very long prior to the traps? Finally--any chance IELUA had just been started and the machine was totally out of memory? On complete analysis, this trap looks a lot like an out-of-memory situation.

Append	By: PeteL When: 1998-06-05 09:47:48 New Status: Pending IE
Comment	No tracing. Pretty sure that both times there was an out-of-memory error box showing. This is probably the clue you're looking for, though: "Socket error code 2740 occurred during TermPort Socket Bind. The system description for this error code is: WSAEADDRINUSE: The specified address is already in use. (See the SO_REUSEADDR socket option under setsockopt)." It immediately preceded both abends. Also, we did have some network problems yesterday (router problems): the PU was up and down a couple of times. I don't think the times directly corresponded, but maybe SNA Services didn't clean up very gracefully, or maybe consumed too much memory or something.
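For reference, the "address already in use" condition quoted above can be reproduced in isolation. The following is only a minimal sketch in Python, not the product's code: `try_bind` and `is_addr_in_use` are hypothetical helper names, and the loopback address and ephemeral port are assumptions chosen for the demo. It shows a second bind to a port that still has a listener failing with the address-in-use error, and where the SO_REUSEADDR option mentioned in the error text would be set.

```python
import errno
import socket


def try_bind(host, port, reuse_addr=False):
    """Try to bind a TCP listening socket.

    Returns (sock, None) on success or (None, OSError) on failure.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # SO_REUSEADDR must be set before bind() to have any effect.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(5)
        return sock, None
    except OSError as exc:
        sock.close()  # do not leak the socket when bind/listen fails
        return None, exc


def is_addr_in_use(exc):
    """True if the error is the address-in-use condition (WSAEADDRINUSE on Windows)."""
    return exc is not None and exc.errno == errno.EADDRINUSE


if __name__ == "__main__":
    first, _ = try_bind("127.0.0.1", 0)      # port 0: let the OS pick a free port
    port = first.getsockname()[1]
    second, err = try_bind("127.0.0.1", port)  # same port while `first` is still open
    print("second bind hit address-in-use:", is_addr_in_use(err))
    first.close()
```

The exact effect of SO_REUSEADDR differs between platforms, so whether it is appropriate for a given server port is a separate judgment call; the sketch only shows how the error arises when an earlier socket on the same address has not been closed.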

Append	By: WindSurfer When: 1998-06-08 13:47:05 New Status: Pending Customer
Comment	The out-of-memory condition behind that error box is most likely the cause. The error analysis showed that an out-of-memory condition was the most likely culprit, and the box showing is good evidence (about as good as you can get!!). We have put in error detection on the allocation of the HTTP task management structures, which is a fairly large allocation made at startup of the Web server port. So the next time it runs out of memory at this point, you won't get a trap! Meanwhile, unless it was a wild rampage by SNA Server, you might want to bump up your page file sizes in the system settings to reduce the incidence of out-of-memory conditions...
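The kind of startup check described above might look roughly like the following. This is a sketch under stated assumptions, not the actual fix: `allocate_task_table`, `start_port`, the slot size, and the injectable `allocator` parameter are all hypothetical names introduced for illustration. The point is only that a failed allocation is detected and reported, so the port declines to start instead of trapping later on a missing structure.

```python
def allocate_task_table(task_count, slot_bytes, allocator=bytearray):
    """Allocate one buffer per HTTP task; return None if memory runs out.

    `allocator` is injectable so the failure path can be exercised
    without actually exhausting memory (hypothetical design choice).
    """
    try:
        return [allocator(slot_bytes) for _ in range(task_count)]
    except MemoryError:
        return None


def start_port(task_count, slot_bytes, allocator=bytearray):
    """Start a (hypothetical) server port; report failure instead of crashing."""
    table = allocate_task_table(task_count, slot_bytes, allocator)
    if table is None:
        return False, "out of memory allocating HTTP task structures; port not started"
    return True, "port started with %d task slots" % len(table)


if __name__ == "__main__":
    def exhausted(_size):
        # Stand-in allocator that simulates an out-of-memory machine.
        raise MemoryError

    print(start_port(4, 1024))             # normal startup
    print(start_port(4, 1024, exhausted))  # simulated out-of-memory startup
```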

Append	By: PeteL When: 1998-06-08 14:05:17 New Status: Pending IE
Comment	i'll be installing the new server in the next day or two, so probably won't be spending much time tuning the test box. although there was obviously a memory problem, doesn't the "address in use" situation cause some alternative concern? since we had network problems, could it be that you were going through some sort of socket allocation when the initial one was never cleaned up, possibly even causing the memory problem? (keep in mind that i know-not-of-what-i-speak)

Append	By: WindSurfer When: 1998-06-11 11:31:37 New Status: Pending Customer
Comment	It's hard to say, but the trap is definitely caused by an out-of-memory condition, and my guess is that the socket error is another symptom of the same problem, as the "address in use" could be a red herring (out of memory has real spotty error detection in most code; in a virtual environment like NT, it is such a "fatal" that in many cases even detecting it has no real "path to salvation"!).

Append	By: PeteL When: 1998-06-15 10:53:06 New Status: Pending IE
Comment	I think you had more updates, probably via email instead of in here - isn't this the problem that you suspected may have been a small hole in the compile quiesce?

Append	By: WindSurfer When: 1998-06-15 17:13:31 New Status: Pending Customer
Comment	It's hard to tell, but the other crash was on the console port, and really would only occur there...but when the stack is affected it is hard to tell.

Append	By: PeteL When: 1998-06-16 06:07:27 New Status: Pending IE
Comment	so do we just close this one? it's been pretty stable for a few days now, and I have a new build in, so .. ?