Author: Syltrem
Posted: Tue Jul 18, 2006 5:45 pm
Post subject: 2 CPU's in timeout on my ES40
Has anyone experienced CPUs dropping out of service on an ES40?
I've never seen this before; there are no errors on the system or reported
by DIAG.
2 CPUs out of 4 are not working right now.
$ sh cpu
System: HELIOS, AlphaServer ES40
CPU ownership sets:
Active 0,2
Configure 0-3
CPU state sets:
Potential 0-3
Autostart 0-3
Powered Down None
Not Present None
Failover None
$ sh err
Device Error Count
$1$DQB0: (HELIOS) 1
$1$DQB1: (HELIOS) 1
I tried:
$ set cpu/start 1
%SYSTEM-W-WRONGSTATE, CPU 1 is in the wrong state for the requested
operation
$
But I'm not familiar with these commands; such a problem has never happened
to me (and we never had Galaxy).
If you have an idea why this would be, and have a solution, please write!
Is this a situation where VMS needs to be rebooted, just like Windows?
The 2 CPUs may not have restarted since the system was powered back on after
it failed because of the high temperature in the computer room. The
reading from f$getsyi("temperature_vector") showed 100 F at the time. The
system was rebooted when the temperature in the room started to go down.
Thanks
--
Syltrem
http://pages.infinit.net/syltrem (OpenVMS information and help, in French)

Author: John Santos
Posted: Tue Jul 18, 2006 10:26 pm
Post subject: Re: 2 CPU's in timeout on my ES40
Syltrem wrote:
> Has anyone experienced CPU's ceasing to be in service on an ES40 ?
> I've never seen this before, there are no errors on the system or reported
> by DIAG
> 2 CPUs out of 4 are not working right now.
> $ sh cpu
$ show cpu/full might tell you more.
> System: HELIOS, AlphaServer ES40
> CPU ownership sets:
> Active 0,2
> Configure 0-3
> CPU state sets:
> Potential 0-3
> Autostart 0-3
> Powered Down None
> Not Present None
> Failover None
> $ sh err
> Device Error Count
> $1$DQB0: (HELIOS) 1
> $1$DQB1: (HELIOS) 1
> I tried:
> $ set cpu/start 1
> %SYSTEM-W-WRONGSTATE, CPU 1 is in the wrong state for the requested
> operation
Maybe you need $ start/cpu? The difference between "set cpu/start"
and "start/cpu", at least in the online help, is clear as mud, but
they might do different things.
--
John Santos
Evans Griffiths & Hart, Inc.
781-861-0670 ext 539

Author: Volker Halle
Posted: Wed Jul 19, 2006 7:28 am
Post subject: Re: 2 CPU's in timeout on my ES40
Syltrem,
$ ANAL/SYS
SDA> CLUE CONFIG
may provide additional information about the state of those 2 CPUs.
Consider capturing the console output during the next boot to catch
any messages logged on the console.
Volker.

Author: Syltrem
Posted: Wed Jul 19, 2006 10:26 am
Post subject: Re: 2 CPU's in timeout on my ES40
"Volker Halle" <volker_halle@hotmail.com> wrote in message news:
1153294081.483595.17480@i3g2000cwc.googlegroups.com...
> Syltrem,
> $ ANAL/SYS
> SDA> CLUE CONFIG
> may provide additional information about the state of those 2 CPUs.
> Consider to capture the console output during the next boot to find
> possible messages being logged on the console.
> Volker.
There were no messages on the console.
Those 2 CPUs were simply ignored and not made part of the active set.
We finally had one of the 2 replaced last night, and all is fine.
The system never reported an error, though.
Thanks to all who responded.
Syltrem

Author: dave.baxter@bannerhealth.
Posted: Thu Jul 20, 2006 8:47 pm
Post subject: Re: 2 CPU's in timeout on my ES40
Syltrem wrote:
> Has anyone experienced CPU's ceasing to be in service on an ES40 ?
> I've never seen this before, there are no errors on the system or reported
> by DIAG
> 2 CPUs out of 4 are not working right now.
> $ sh cpu
> System: HELIOS, AlphaServer ES40
> CPU ownership sets:
> Active 0,2
> Configure 0-3
> CPU state sets:
> Potential 0-3
Strangely enough, I had a similar event last week. I had an ES40
crash and reboot on Saturday afternoon. Since it was not critical, I
didn't get to it until Monday; however, I found a situation similar to
that described above (one I had never seen before either!). One CPU
(#2) was in a "TIMEOUT" state. For me, the command above showed:
CPU ownership sets:
Active 0,1,3
Configure 0-3
A "show CPU 2 /full" showed
CPU 2 State TIMEOUT
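For anyone who wants to automate the comparison shown above, here is a minimal Python sketch (not something from the original posts). It assumes the Active and Configure lines look exactly like the SHOW CPU output pasted in this thread, with each set written without spaces. The function names parse_cpu_set and inactive_cpus are made up for illustration.

```python
import re


def parse_cpu_set(text):
    """Expand a CPU set such as "0,2", "0-3" or "None" into a set of CPU IDs."""
    text = text.strip()
    if not text or text.lower() == "none":
        return set()
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            # A range like "0-3" covers both endpoints.
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def inactive_cpus(show_cpu_output):
    """Return the configured CPUs that are missing from the active set."""
    sets = {}
    for line in show_cpu_output.splitlines():
        # Assumed layout: the set name, whitespace, then the set itself.
        match = re.match(r"\s*(Active|Configure)\s+(\S+)\s*$", line)
        if match:
            sets[match.group(1)] = parse_cpu_set(match.group(2))
    return sorted(sets.get("Configure", set()) - sets.get("Active", set()))


sample = """\
CPU ownership sets:
Active 0,1,3
Configure 0-3
"""
print(inactive_cpus(sample))  # prints [2]
```

Fed the output from the first post (Active 0,2 and Configure 0-3), the same function would report CPUs 1 and 3. It only tells you which CPUs are out of the active set, not why; the state still has to come from show cpu/full or SDA.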
Interestingly, although the crash occurred on Saturday afternoon,
checking back showed that the CPU had entered this state sometime
overnight Thursday/Friday; however, the system didn't crash at that
time. Also, the system rebooted and ran through to Tuesday evening,
when it was taken down for repair. The diagnostics run on the system
by the FE showed that the CPU was toast, and it was replaced.
I mention this only so that the poster doesn't feel so bad about it, and
also to show that although this seemed to be clearly a hardware issue,
neither HP nor I was able to find any errors in the system error log
relating to it.
The crash dump indicated a BUGCHECK:
System crash information
------------------------
CPU bugcheck codes:
CPU 00 -- CPUSPINWAIT, CPU spinwait timer expired
2 others -- CPUEXIT, Shutdown requested by another CPU
CPU 02 failed to service the bugcheck request
Dave

Author: Volker Halle
Posted: Fri Jul 21, 2006 5:32 am
Post subject: Re: 2 CPU's in timeout on my ES40
Dave,
once a CPU has entered the active set, it should not be able to get
into a TIMEOUT state all by itself, except through a STOP/CPU followed
by a START/CPU that fails to start it.
If a CPU ceases operation while in the active set, you'll get either a
CPUSANITY or a CPUSPINWAIT bugcheck. From the dump, SDA> CLUE CONFIG
will show the state of the CPUs.
In your CPUSPINWAIT case, the fact that CPU #2 did not service the
bugcheck request may be enough evidence to warrant a look at the state
of that CPU. Note that if AUTO_ACTION is not set to RESTART, you may get
strange CPUSPINWAIT crashes instead of HALT crashes (like MCHECKPAL or
KRNLSTAKNV). Look at the state of CPU #2 with SDA> CLUE CONFIG
(especially the HALT code).
Valentin,
once your ES40 was hung, did you try pressing the HALT button? If the
system does not react when you press HALT, this is most likely a
hardware hang; otherwise it could be software, and you can force a crash
by entering >>> CRASH.
Volker.