[Info-Ingres] Unable to make outbound connections all of a sudden
steveg15
steve.guy at atosorigin.com
Thu Feb 8 07:56:35 CST 2007
On Feb 8, 12:23 pm, "steveg15" <steve.... at atosorigin.com> wrote:
> Hi folks
>
> We have a live production installation with 250+ clients connected via
> an OpenROAD 4.1 application; the b/e is AIX 5.2. This has had no h/w
> changes for 2 years, no upgrade to Ingres (II 2.6/0305 (rs4.us5/00)
> patch 10384), and no CBF or PC changes (NODES, that we've been
> told about). The o/r application has not changed for some months. We
> have load balancing with 5 iigcc's starting (one interesting point is
> the fact we now only appear to have one of them clocking up CPU time
> when looking at a ps -ef output).
>
> Two days ago the following error started appearing in the Ingres
> errlog :-
>
> Y8707E ::[37528 IIGCC, 0000000d]: Wed Feb 7 16:03:18 2007
> E_CLFE05_BS_CONNECT_ERR Unable to make outgoing connection.
> Y8707E ::[37528 IIGCC, 0000000d]: System communication error:
> Socket is not connected.
>
> After a restart of Ingres everything works OK for an hour or so.
>
> Another company supports the AIX server. Although we do have Ingres
> access (but not root), they are saying this is an Ingres problem.
> We also have access to errpt, and there is an error showing :-
>
> LABEL: GOENT_LINK_DOWN
> IDENTIFIER: DED8E752
>
> Date/Time: Thu 8 Feb 08:03:23 2007
> Sequence Number: 515851
> Machine Id: 005C90DF4C00
> Node Id: Y8707E
> Class: H
> Type: TEMP
> Resource Name: ent1
> Resource Class: adapter
> Resource Type: 14106902
> Location: U0.1-P1-I1/E1
> VPD:
> Product Specific.( ).......10/100/1000 Base-TX PCI-X Adapter
> Part Number.................00P4501
> FRU Number..................00P4501
> EC Level....................H12511
> Manufacture ID..............YL1021
> Network Address.............000255535899
> ROM Level (alterable).......GOL002
> Description
> ETHERNET DOWN
>
> Probable Causes
> CABLE
> CSMA/CD ADAPTER
>
> Failure Causes
> LINK TIMEOUT
>
> Recommended Actions
> CHECK CABLE AND ITS CONNECTIONS
>
> Detail Data
> FILE NAME
> line: 191 file: goent_limbo.c
> PCI ETHERNET STATISTICS
> 0000 0005 0063 0853 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0001
> 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0002 0000 0001 0000 0001 0000 0000
> 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 BB80 00F0 0249 0C00 0000 0000 01A0
> 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
> DEVICE DRIVER INTERNAL STATE
> 3333 3333 0000 0000 0000 0000
> SOURCE ADDRESS
> 0002 5553 5899
>
> We are being told this is a physical NIC which is not being used and
> has nothing to do with the problem.
>
> Whilst the fault is present you cannot access IPM (except with -s) or
> iinamu. Even when you first re-start Ingres, although you can start
> IPM you cannot access the "Server" option?
>
> I've seen a number of old posts on here on the subject, but little in
> the way of responses, so I assume this is either a silly question,
> too vague, or not a well-known subject. Funnily enough, those looking
> after the server have told us they've requested an engineer to look at
> the server (probably 24-48 hours away...). My knowledge of TCP/IP is nil,
> but from what I've read this looks like a problem related to
> TCP sockets, i.e. a server problem. But any suggestions would be most
> welcome as there are 250 users who can't work.
>
> Steve
Paul
The config.dat parameters have not changed.
Ingres is started in batch (by us) as Ingres.
Netstat shows a number of users connected (when any of them can
connect), but the bulk of them are using a single port. Out of 194
users, 180 are on the second port (18065 in our case) and the
rest are split over the other 4.
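
For anyone wanting to check the same distribution on their own box, here is
a minimal sketch that counts established connections per local port from
saved "netstat -an" output. The sample lines, the addresses, and every port
other than 18065 are made up for illustration, and the column layout
(state in the sixth field, port after the last "." or ":" of the local
address) is an assumption that should be checked against real output.

```python
from collections import Counter


def count_by_local_port(netstat_text, ports):
    """Count ESTABLISHED connections per local port, for the given ports only."""
    counts = Counter({p: 0 for p in ports})
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local-address foreign-address state
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        local = fields[3]
        # Some systems print addr.port, others addr:port; accept either.
        sep = max(local.rfind("."), local.rfind(":"))
        port_text = local[sep + 1:]
        if port_text.isdigit() and int(port_text) in counts:
            counts[int(port_text)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; only port 18065 comes from the post above.
    sample = """\
tcp4  0  0  10.0.0.5.18064  10.0.1.20.1301  ESTABLISHED
tcp4  0  0  10.0.0.5.18065  10.0.1.21.1302  ESTABLISHED
tcp4  0  0  10.0.0.5.18065  10.0.1.22.1303  ESTABLISHED
tcp4  0  0  *.18065  *.*  LISTEN
"""
    listener_ports = [18064, 18065, 18066]  # hypothetical iigcc listen ports
    for port, n in sorted(count_by_local_port(sample, listener_ports).items()):
        print(port, n)
```

Run against real output, a heavy skew towards one port (as with our 180 out
of 194) would show up straight away.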
One significant factor that's since come to light is that the
licence has been out of date since 01/01/07 (this is not supported by
us). The CA site indicates there is some problem, but what this
results in is somewhat vague:
http://supportconnectw.ca.com/public/ca_common_docs/lic_expire.asp
What's been said there looks to me like CA do not know the implications,
but I think this needs to be fixed a.s.a.p. There is an engineer looking
at the AIX server as I write.
Steve