Topic: Ledcode 0555 fsck error (Read 3319 times)
fbergenh (Full Member), Reply #22 on: November 08, 2007, 11:19:00 AM
By the way, for both of you: thanks for the help.
fbergenh (Full Member), Reply #21 on: November 08, 2007, 11:18:07 AM
John, when I arrived at work last Monday, it was my intention to ask somebody to replace the disk. I never needed to ask, because the machine was active again.
This morning I ran:
1) diagnostics on the disks (no errors)
2) certify on hdisk0, output:
Certify media task
device: hdisk0 in location U0.1-P1-I2/Z1-A0
The certify operation is in progress
please standby
... % completed
Disk drive capacity.....................18200 MB
Data errors recovered.......................1
Data errors not recovered..................0
Equipment check errors recovered.......0
equipment check errors not recovered..0
3) fsck on the filesystems /dev/hd1, /dev/hd2, /dev/hd3, /dev/hd4 and /dev/hd9var (no errors).
I guess this means everything is ok right now.
The only thing I'll have to look into is the quorum checking. And yes, I will keep an eye on the errpt....
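The two follow-ups mentioned above (quorum checking on the two-disk rootvg, and watching the errpt) might look roughly like the sketch below. It is a dry run that only prints commands; the helper name `followup_plan` and the disk names passed to it are assumptions for illustration. Depending on the AIX level, a quorum change on rootvg may only take effect after the next reboot.

```shell
#!/bin/sh
# Dry run only: prints the commands, does not execute them.
# followup_plan is a hypothetical helper; VG and disk names are examples.
followup_plan() {
    vg=$1
    shift
    echo "lsvg $vg"          # the QUORUM field shows the current setting
    echo "chvg -Qn $vg"      # turn quorum checking off for the volume group
    for disk in "$@"; do
        echo "errpt -d H -N $disk"   # hardware-class error log entries for this disk
    done
}

followup_plan rootvg hdisk0 hdisk1
```

Printing the plan first makes it easy to review each command before typing it on the real system.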
Michael (Administrator, Hero Member), Reply #20 on: November 08, 2007, 08:45:57 AM
John,
regarding VIO - I take that as a challenge to do a writeup on best practices. And for those who can't wait - there is an IBM training course on APV (Advanced POWER Virtualization) best practices. The course is very hands-on (60%+ of the class time goes to setting up dual VIO and shared ethernet adapter failover).
In most of the world the class code starts with AU78. The code in the USA was Q1378, but that has been changed to AU780 (so the USA is now also in "most of the world").
John Peck (Global Moderator, Senior Member), Reply #19 on: November 08, 2007, 01:06:20 AM
Oh sorry but you said:
Quote
However, I think I am going to stop the efforts and ask the appropriate people to replace the disk (and add another to make mirroring of the rootvg possible).
This has reminded me of another thread where we were talking about running diagnostics to certify (and maybe format) a suspect disk.
As Michael noted, a system can die without being able to write an error to the log on the disk - such dying throes can be extracted from a dump etc. However, it is entirely possible that a disk can fail, perhaps only in one tiny part of it, and even kill the system without leaving an error log entry anywhere.
You can test this sort of thing by, for example, adding a disk to the rootvg to put some paging space on, then just pulling out that part of the paging space to simulate failure; as soon as the system decides to try and use that area, bang. Like having a piece of your brain removed I suppose ;-)
Mirroring is always the answer.
You've done all this now apparently, but when mirroring from a potentially suspect disk, be sure to check that suspect disk thoroughly first with the diags and also run fsck checks - requiring maintenance mode etc. While the mirror is being built, watch especially for errors, since that is the point at which all of the data gets accessed.
In future at the first sign of any disk errors with that suspect disk, I would replace it. When adding any new disk, it's a good idea (although boringly slow) to do the diag format and certify first.
As an aside, just imagine how troublesome such things can be when you have VIO and maybe dozens of different operating system instances sharing a disk - eggs, basket,...
« Last Edit: November 08, 2007, 01:23:13 AM by John Peck »
fbergenh (Full Member), Reply #18 on: November 07, 2007, 08:24:16 PM
Quote from: John Peck on November 07, 2007, 07:13:44 PM
Going back to the beginning of this thread, and the original problem...
It appears that the original cause was likely to have been a disk bad block/sector, probably under the hd8 JFSLOG LV, in an unmirrored rootvg.
Sadly the effect of a disk error can be fatal to any system, that's the main reason why it is recommended (here anyway) that you mirror rootvg at least - and turn quorum checking off on a two disk mirroring set.
You said that you were replacing the duff disk and adding another to mirror it, so there's no reason why that should occur again. Obviously you will monitor your error log and act on disk errors when and if they are logged. When you appear to have a failing disk, simply un-mirror it out of the volume group, replace and re-mirror - all of which can be done on the fly these days usually.
Not entirely correct. I didn't replace the disk, I just added another disk to the rootvg and did the mirror. There are no errors in the errpt (not even on the formerly 'corrupt' disk) and there are no signs of, for example, stale partitions.
I am new at this company (I started working for them on the 9th of September) and was checking all 'my' servers for strange things. I hadn't got to this one yet, otherwise I would have mirrored the rootvg immediately, well before the problems occurred....
John Peck (Global Moderator, Senior Member), Reply #17 on: November 07, 2007, 07:13:44 PM
Going back to the beginning of this thread, and the original problem...
It appears that the original cause was likely to have been a disk bad block/sector, probably under the hd8 JFSLOG LV, in an unmirrored rootvg.
Sadly the effect of a disk error can be fatal to any system, that's the main reason why it is recommended (here anyway) that you mirror rootvg at least - and turn quorum checking off on a two disk mirroring set.
You said that you were replacing the duff disk and adding another to mirror it, so there's no reason why that should occur again. Obviously you will monitor your error log and act on disk errors when and if they are logged. When you appear to have a failing disk, simply un-mirror it out of the volume group, replace and re-mirror - all of which can be done on the fly these days usually.
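The un-mirror, replace and re-mirror cycle described above can be sketched as the command sequence below. This is a dry run that only prints the commands, assuming a two-disk rootvg mirror; `replace_plan` is a hypothetical helper and the disk names are examples. Special cases, such as a dump device on the failing disk, are not handled here.

```shell
#!/bin/sh
# Dry run only: prints the commands in order, does not execute them.
# replace_plan is a hypothetical helper: VG, failing disk, surviving disk.
replace_plan() {
    vg=$1; bad=$2; good=$3
    echo "unmirrorvg $vg $bad"            # drop the copies held on the failing disk
    echo "reducevg $vg $bad"              # take the disk out of the volume group
    echo "rmdev -dl $bad"                 # remove the device definition before the swap
    echo "cfgmgr"                         # after the swap: discover the new disk
    echo "extendvg $vg $bad"              # assumes the new disk comes back under the same name
    echo "mirrorvg $vg $bad"              # rebuild the mirror copies
    echo "bosboot -ad /dev/$bad"          # rootvg only: write a boot image to the new disk
    echo "bootlist -m normal $good $bad"  # rootvg only: allow booting from either disk
}

replace_plan rootvg hdisk0 hdisk1
```

The last two commands matter only for rootvg; for a data volume group the sequence would stop after `mirrorvg`.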
fbergenh (Full Member), Reply #16 on: November 07, 2007, 01:13:06 PM
I am glad the problem is resolved, but I find it a little bit disturbing that I don't have a clue what caused the problem in the first place, or why a powerdown can solve this kind of problem.
As far as I am concerned, it is possible it will occur again sometime in the future. Luckily, the problem won't be as big as it was this time, but still...
Michael (Administrator, Hero Member), Reply #15 on: November 06, 2007, 10:05:30 PM
Thanks for the updates. Amazing what a power down can resolve!
fbergenh (Full Member), Reply #14 on: November 05, 2007, 11:05:35 AM
I already found an answer to my question:
"readvgda /dev/hdisk1" shows some old LVs from 2002, so I did an "extendvg -f rootvg hdisk1".
fbergenh (Full Member), Reply #13 on: November 05, 2007, 10:01:24 AM
New development on this case:
Last weekend there was some power maintenance that required all the machines to be powered down for the whole weekend. Last Friday I did a shutdown -Fh on all my machines (all but this machine, which of course was already down due to the problems) and then I went home.
This morning I pinged all the machines to check that they were active again and, automatically, I tried to ping this machine too.
I was very surprised when the ping returned a normal response..... It seems that the powerdown was the key to the recovery of the system.
Naturally, the first thing I am trying to do is to mirror the rootvg, but that also causes some problems:
<hostname>:/root # extendvg rootvg hdisk1
0516-1398 extendvg: The physical volume hdisk1, appears to belong to
another volume group. Use the force option to add this physical volume
to a volume group.
0516-792 extendvg: Unable to extend volume group.
When I look at the disks, it seems hdisk1 is not assigned to any vg:
<hostname>:/root # lspv
hdisk0 005b835aa01e3d28 rootvg active
hdisk2 005b835af7e00482 vgSQLTST_MOS_20 active
hdisk3 005b835a1195b4dc vg_bak active
hdisk4 005b835aa01e3d68 vg_esf active
hdisk6 005b835af7e190cf vgSQLTST_MOS_20 active
hdisk1 005b835a1affe1d3 None
hdisk5 00c8140e11571b1d None
hdisk7 none None
hdisk8 none None
hdisk9 none None
hdisk10 none None
dlmfdrv7 005b835a2d2517bf vg_p44 active
dlmfdrv8 005b835a2d2518e6 vg_p44 active
dlmfdrv9 005b835abef765e6 vg_esf active
dlmfdrv10 005b835ac3bc2959 vg_p44 active
dlmfdrv none None
I am not sure if I should use the extendvg -f option, because I am not sure what's on hdisk1. Is there another way to check what is on the disk?
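One way to narrow this down is to list the disks that carry a PVID but are not assigned to any volume group, since those may still hold leftover LVM data. A minimal sketch, assuming the three-column `lspv` layout shown above; the function name `unassigned_with_pvid` is hypothetical, and the sample rows are copied from the output in this post.

```shell
#!/bin/sh
# Print disks that have a PVID (column 2 is not "none") but no volume group
# (column 3 is "None"). Reads lspv-style output on stdin.
unassigned_with_pvid() {
    awk '$3 == "None" && $2 != "none" { print $1 }'
}

# Sample rows taken from the lspv output above.
# On a live system this would be: lspv | unassigned_with_pvid
unassigned_with_pvid <<'EOF'
hdisk0 005b835aa01e3d28 rootvg active
hdisk1 005b835a1affe1d3 None
hdisk5 00c8140e11571b1d None
hdisk7 none None
dlmfdrv none None
EOF
```

For the sample rows this lists hdisk1 and hdisk5. Each disk listed can then be inspected before forcing anything, for example with `readvgda /dev/hdisk1`, the command reported in the follow-up reply.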
fbergenh (Full Member), Reply #12 on: October 29, 2007, 01:41:55 PM
Thanks for the suggestions, John.
However, I think I am going to stop the efforts and ask the appropriate people to replace the disk (and add another to make mirroring of the rootvg possible).
I tried the following commands this morning:
remove hd8 from disk:
# rmlv -f hd8
output on console:
0516-062 lqueryvg: unable to read or write logical volume manager record. PV may be permanently corrupted. Run diagnostics.
0516-912: rmlv: unable to remove lv hd8
Make another loglv on disk:
# mklv -a e -t jfslog -y loglv00 rootvg 1 hdisk0
output on console:
0516-062 lqueryvg: unable to read or write logical volume manager record. PV may be permanently corrupted. Run diagnostics.
0516-822: mklv: unable to create logical volume
I even tried to move hd8 to another part of the disk with the chlv -a e hd8 command, but that also failed, with a message along the lines of "cannot move logical volume".
I guess the disk is really gone....
John Peck (Global Moderator, Senior Member), Reply #11 on: October 26, 2007, 05:05:11 PM
Sounds then like it may be the hd8 LV which is over an area of bad disk.
A similarly terminal experience may be had when some disk goes in your paging space area, if you haven't mirrored.
From the maintenance shell, try removing hd8 (rmlv) and then recreate it in a different place on the disk:
- Before that, check where it was: doubtless "midway", although it should be "center", but there probably isn't any space there anyway, so try "edge" for the worst performance (move it later maybe).
- The key thing to note is that you have to say the type is "jfslog", not the default jfs, and there is of course no help or tab list on that smit field.
- Then on the new hd8 LV do the logform before the fsck again.
Incidentally, the log doesn't have to be called hd8: you could change the log associated with each rootvg filesystem to some other new name and leave hd8 alone and corrupted (like Britney ?-)
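The steps above can be sketched as a command sequence. This is a dry run that only prints commands, assuming a maintenance shell with the rootvg filesystems unmounted; `jfslog_plan` is a hypothetical helper, the `mklv` options follow the "edge" suggestion, and the disk and filesystem LV names are examples.

```shell
#!/bin/sh
# Dry run only: prints the commands in order, does not execute them.
# jfslog_plan is a hypothetical helper: VG, disk, then the filesystem LVs to check.
jfslog_plan() {
    vg=$1; disk=$2
    shift 2
    echo "lslv -l hd8"                             # note where the log LV currently sits
    echo "rmlv -f hd8"                             # remove the damaged log LV
    echo "mklv -t jfslog -a e -y hd8 $vg 1 $disk"  # 1 LP, type jfslog, edge position
    echo "logform /dev/hd8"                        # format the new log before any fsck
    for lv in "$@"; do
        echo "fsck -y /dev/$lv"                    # then check each filesystem
    done
}

jfslog_plan rootvg hdisk0 hd4 hd2
```

If the disk itself is failing, `rmlv` and `mklv` may well fail too, in which case no amount of relocating the log will help.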
fbergenh (Full Member), Reply #10 on: October 26, 2007, 10:52:54 AM
I tried to do that, but with the following result:
# /usr/sbin/logform /dev/hd8
logform: destroy /dev/hd8 (y)? y
/usr/sbin/logform: I/O error
Michael (Administrator, Hero Member), Reply #9 on: October 26, 2007, 10:09:38 AM
# logform /dev/hd8
fbergenh (Full Member), Reply #8 on: October 26, 2007, 09:47:03 AM
This morning I repeated the whole operation one more time, but the result was the same as yesterday afternoon.
I received exactly the same error messages:
importing volume group
rootvg
checking the / file system
log redo processing for /dev/rhd4
syncpt record at 20c0c8
error writing 7,48
write of map failed 4
update of maps failed
update of superblock failed
write of map failed 5
update of maps failed
update of superblock failed
write of map failed 6
update of maps failed
update of superblock failed
write of map failed 7
update of maps failed
update of superblock failed
write of map failed 8
update of maps failed
update of superblock failed
write of map failed 9
update of maps failed
update of superblock failed
end of log 2142ff8
syncpt record at 20cc0c8
syncpt address 20bf0d4
number of log records = 4998
number of do blocks = 84
number of nodo blocks = 0
failure replaying log = 0
/dev/rhd4: unable to read superblock (TERMINATED)
checking the /usr filesystem
/dev/rhd2: unable to read superblock (TERMINATED)
exit from this shell to continue the process of accessing the root volume group
#
I guess I am in a deadlock situation: I need to fsck / and /usr. For the fsck on / to succeed, the jfslog needs to be OK. But the logform on hd8 gives me an I/O error.
Is there another way to clean the jfslog partition (or am I using the wrong command)?
Or is the I/O error a sign of a hardware failure?
Frank
Copyright 2001-2008 Michael Felt and ROOTVG.NET