Title: AIX 5.2 ML09 suddenly hung
Post by: cucumber90 on May 08, 2008, 03:14:34 AM

Hi all,
I am a newbie here. Yesterday one of my LPARs on a p570 suddenly hung :( The status shown on the HMC is Running, but I cannot telnet to or ping the LPAR. I cannot even open the terminal window from the HMC. I logged a call with IBM; they advised me to reboot and collect a dump, but when I triggered the reboot from the HMC the LPAR LED panel stopped at D200A200. IBM then advised me to force a shutdown and bring the server back up. After the server came up, the error report did not show anything :'(

Has anybody had a similar situation before? Is it a hardware or an OS problem?

Thanks in advance...
Eddy

Title: Re: AIX 5.2 ML09 suddenly hung
Post by: Michael on May 08, 2008, 11:02:02 AM

1) The code D200A200 is fairly 'normal'. It means AIX has been sent a rc.shutdown signal - in HMC terms: delayed shutdown, not operating system shutdown.
2) If the system "panic" stopped, and you have the system set to save dumps, there should be a file in /var/adm/ras named vmcore.X (X starts at zero and increments for each core dump saved).

To check this setting use:
lsattr -El sys0 -a fullcore

To check if the system/partition auto-restarts on a crash:
lsattr -El sys0 -a autorestart

On my system I do not collect fullcore dumps, but I do restart. Note: I added the H flag for header information and requested both attributes. Skip the -a arguments to list all attributes.

# lsattr -EHl sys0 -a fullcore -a autorestart   # Note added H to generate headers
attribute   value description                               user_settable
fullcore    false Enable full CORE dump                     True
autorestart true  Automatically REBOOT system after a crash True

p.s. AIX 5.2 is old, and ML09 is old within AIX 5.2. Check: AIX 5.2 Service Bulletins (http://rootvg.net/component/option,com_bookmarks/Itemid,90/mode,0/catid,29/navstart,0/search,*/)

Title: Re: AIX 5.2 ML09 suddenly hung
Post by: cucumber90 on May 09, 2008, 02:30:50 AM

Hi Michael,
Thank you for your reply.

Quote
1) The code D200A200 is fairly 'normal'. It means AIX has been sent a rc.shutdown signal - in HMC terms: delayed shutdown, not operating system shutdown.

How long does that 'normal' process take? That time we waited for about 20 minutes and the LED code was still D200A200 ??? We can't afford to have the system down for more than 30 minutes as it is a production server :(

For no. 2, I will check it out.

Quote
p.s. AIX 5.2 is old, and ML09 is old within AIX 5.2. Check: AIX 5.2 Service Bulletins

Yes sir, I know, but our application is not compatible with a higher OS level or even a higher ML :(

Title: Re: AIX 5.2 ML09 suddenly hung
Post by: Michael on May 09, 2008, 10:10:01 PM

The code simply means, in practice, that the signal has been sent to the system - the operating system is still given time to shut down the system (i.e. respond to the signal). Think of this as a short press on a PC power button - how the system responds depends on OS settings and state. If you feel this is taking too long - normally mine shut down within 3 minutes - then you can do an immediate shutdown. Think of the immediate shutdown as pressing, and holding, a PC power button until the system shuts off.
In other words - the LCD code does not guarantee that the partition will stop - it only indicates that a signal has been sent to the partition.
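
The two lsattr checks from Michael's first reply can be scripted. What follows is only a minimal sketch, assuming the four-column output layout quoted in that reply (attribute, value, description, user_settable). attr_value and report_dump_settings are hypothetical helper names, not AIX commands. The helpers read lsattr output on stdin, so they can be tried against saved output as well as on the partition itself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of AIX.

# Print the value column for one attribute from `lsattr -El <device>` output
# read on stdin. Assumes the layout: attribute value description user_settable.
# Returns non-zero if the attribute is not present in the input.
attr_value() {
    awk -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}

# Summarise the two sys0 attributes discussed in the thread.
report_dump_settings() {
    settings=$(cat)    # keep the input so it can be scanned once per attribute
    for attr in fullcore autorestart; do
        value=$(printf '%s\n' "$settings" | attr_value "$attr") || value="(not reported)"
        printf '%s=%s\n' "$attr" "$value"
    done
}

# On the AIX partition itself (not executed in this sketch):
#   lsattr -El sys0 -a fullcore -a autorestart | report_dump_settings
```

Fed the output quoted in the thread, this would report fullcore=false and autorestart=true. A header line produced by the H flag is ignored, because its first field matches neither attribute name.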