Re: ASM bug?
From: Stefan Koehler <contact_at_soocs.de>
Date: Thu, 2 Oct 2014 20:47:40 +0200 (CEST)
Message-ID: <110692609.136344.1412275660587.open-xchange_at_app03.ox.hosteurope.de>
Hi Maureen,
thanks for the clarification. Unfortunately my Oracle RAC 12.1.0.2 lab runs with udev, so I have no ASMLIB setup here to trace, debug or verify. However, here are a few things to consider regarding the ASMLIB behavior you described:
- DM devices get different major/minor numbers (major 253, minor <X>) than the underlying paths (e.g. 67/<X>, 68/<X>); for more details check this documentation ( http://www.lanana.org/docs/device-list/devices-2.6+.txt ). It is much more likely that the major/minor device number of a path changes because of a failed/deleted/added path. This should be no big issue, as long as you keep using the multipath bindings (= DM devices) in the right ASMLIB scan order. However, if you flush all unused device maps (including DM devices), the kernel assigns the next available major/minor pair for the device type at the next (SCSI) rescan. If the "device publishing order" has not changed in the meantime (a SAN admin can control that to a certain degree), this should be no issue either. There is certainly some risk (from LUN changes in the meantime), and I cannot think of a valid reason to flush all unused device maps (-F) instead of specific ones (-f) for an intended disk change / redesign. Maybe it was just an unintended typo.
- Based on the behavior you described, it seems that ASMLIB either does not handle netlink socket messages (e.g. NETLINK_KOBJECT_UEVENT) from the Linux kernel properly (= bug) or does not handle them at all (= not implemented), whereas udev does, and therefore does not pick up the new device numbers. You can use stack tracing / sampling to figure it out.
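
To compare the major/minor numbers from the first point before and after a rescan, something like the following Python sketch could be used. It assumes a Linux host; the function names block_majors and device_numbers are made up for this illustration, and the device paths in the usage comment are placeholders, not paths from your system.

```python
import os
import stat


def block_majors(proc_devices_text):
    """Map driver name -> list of major numbers, taken from the
    'Block devices:' section of /proc/devices content."""
    majors = {}
    in_block = False
    for raw in proc_devices_text.splitlines():
        line = raw.strip()
        if line == "Block devices:":
            in_block = True
            continue
        if line.endswith("devices:"):
            # Any other section header (e.g. "Character devices:").
            in_block = False
            continue
        if in_block and line:
            number, name = line.split(None, 1)
            # A driver such as "sd" can own several majors.
            majors.setdefault(name, []).append(int(number))
    return majors


def device_numbers(path):
    """Return (major, minor) of a block device node."""
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{path} is not a block device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


# Possible usage on a Linux host (paths are placeholders):
#   with open("/proc/devices") as f:
#       print(block_majors(f.read()).get("device-mapper"))
#   print(device_numbers("/dev/dm-0"))
```

Running this once before and once after the flush/rescan and comparing the output would show whether the numbers really moved. Reading the device-mapper major from /proc/devices avoids hard-coding 253.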
As this is a new RAC system you may want to consider udev instead of ASMLIB. It removes a separate (possibly buggy) software layer, but this whole thing is more like a "religious battle": http://ardentperf.com/2008/10/08/asmlib-performance-vs-udev/
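
For the uevent point above, here is a rough Python sketch of what a uevent consumer such as udev receives from the kernel. It says nothing about what ASMLIB actually does internally. Assumptions: a Linux host, the protocol number 15 for NETLINK_KOBJECT_UEVENT and multicast group 1 for kernel events (both from linux/netlink.h conventions), and the function names parse_uevent and watch_block_uevents, which are invented for this example.

```python
import socket

NETLINK_KOBJECT_UEVENT = 15  # value from <linux/netlink.h>
KERNEL_GROUP = 1             # multicast group used for kernel uevents


def parse_uevent(data):
    """Split a raw kernel uevent into a dict of KEY -> value.

    The message is NUL-separated: a summary such as 'add@/devices/...'
    followed by KEY=value pairs (ACTION, SUBSYSTEM, MAJOR, MINOR, ...).
    """
    fields = {}
    for part in data.split(b"\0")[1:]:  # skip the 'action@devpath' summary
        key, sep, value = part.partition(b"=")
        if sep:
            fields[key.decode()] = value.decode(errors="replace")
    return fields


def watch_block_uevents():
    """Yield parsed uevents for block devices until interrupted."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_GROUP))  # pid 0: let the kernel pick the port id
    try:
        while True:
            event = parse_uevent(sock.recv(65535))
            if event.get("SUBSYSTEM") == "block":
                yield event
    finally:
        sock.close()


# Possible usage (may need sufficient privileges):
#   for ev in watch_block_uevents():
#       print(ev.get("ACTION"), ev.get("DEVNAME"),
#             ev.get("MAJOR"), ev.get("MINOR"))
```

Leaving this running during a multipath flush and rescan would show the add/remove events with the new MAJOR/MINOR values that a listener is expected to react to.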
Best Regards
Stefan Koehler
Oracle performance consultant and researcher http://www.soocs.de
> Maureen English <maureen.english_at_alaska.edu> wrote on 2 October 2014 at 18:55:
>
> Stefan, Andew;
>
> Yes, we are using ASMLIB.
>
> - Maureen
>
>
>
-- http://www.freelists.org/webpage/oracle-l
Received on Thu Oct 02 2014 - 20:47:40 CEST