SQL Server 2005 cluster - second node - SQL service will not start - node is showing down in Cluster Administrator


I am trying to get the second node in a 2-node cluster to start the SQL services and come up in the cluster. The second node was configured and running in the cluster when, at some point, the SQL service and/or the cluster service went down. The node will not start, as in the SQL service fails, and the system log error is:

"The disk associated with cluster disk resource 'Disk Q:' could not be found. The expected signature of the disk was BF17D33C. If the disk was removed from the server cluster, the resource should be deleted. If the disk was replaced, the resource must be deleted and created again in order to bring the disk online. If the disk has not been removed or replaced, it may be inaccessible at this time because it is reserved by another server cluster node."
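
When triaging this event, it can help to pull the resource name and the expected signature out of the message so they can be compared with what the disks actually report. A minimal Python sketch, assuming the message wording quoted above; the function name `parse_disk_event` and the regular expression are illustrative and not part of any Microsoft tool:

```python
import re

# Illustrative pattern: tolerant of small wording differences in the event text.
_EVENT_RE = re.compile(
    r"resource '(?P<resource>[^']+)'.*?"
    r"signature of (?:the )?disk (?:was )?(?P<signature>[0-9A-Fa-f]{8})",
    re.IGNORECASE | re.DOTALL,
)


def parse_disk_event(message):
    """Return (resource_name, signature_as_int), or None if the text does not match."""
    match = _EVENT_RE.search(message)
    if match is None:
        return None
    return match.group("resource"), int(match.group("signature"), 16)


sample = (
    "The disk associated with cluster disk resource 'Disk Q:' could not be found. "
    "The expected signature of the disk was BF17D33C."
)
resource, signature = parse_disk_event(sample)
print(resource, f"{signature:08X}")
```

The signature is kept as an integer so it can be compared directly with values read elsewhere and formatted as hex only for display.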

and the application log shows:

Error 2(The system cannot find the file specified.) occurred while opening file 'c:\program files\microsoft sql server\mssql.1\mssql\data\master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.

I had to edit the startup parameters on the second node because it appears that it cannot "see" the original path, which is a network drive on the primary node. However, I cannot copy the master.mdf file from the primary node (it won't allow me to copy it..?). If I move it, then I suspect it will bring down the primary node. Given the error above, I believe that if I can rebuild the master.mdf file and place it where the second node can see it, I may be able to restart the services on that node. I am using the default path in the startup parameters:

-dc:\program files\microsoft sql server\mssql.1\mssql\data\master.mdf;-ec:\program files\microsoft sql server\mssql.1\mssql\log\errorlog;-lc:\program files\microsoft sql server\mssql.1\mssql\data\mastlog.ldf
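
Because the application log error points at a missing master.mdf, it is worth confirming that each path in the startup string actually exists on the node. A minimal Python sketch that splits the parameter string shown above into its -d, -e, and -l parts; `parse_startup_parameters` and `missing_files` are hypothetical helper names, and the existence check is only meaningful when run on the node itself:

```python
import os

# Meaning of the three standard startup options used in the string above.
_OPTION_NAMES = {
    "-d": "master data file",
    "-e": "error log",
    "-l": "master log file",
}


def parse_startup_parameters(parameters):
    """Split a ';'-separated startup string into {option: path}."""
    parsed = {}
    for item in parameters.split(";"):
        item = item.strip()
        if not item:
            continue
        option, path = item[:2].lower(), item[2:]
        if option in _OPTION_NAMES:
            parsed[option] = path
    return parsed


def missing_files(parsed):
    """Return the options whose path does not exist on this machine."""
    return [option for option, path in parsed.items() if not os.path.exists(path)]


startup = (
    r"-dc:\program files\microsoft sql server\mssql.1\mssql\data\master.mdf;"
    r"-ec:\program files\microsoft sql server\mssql.1\mssql\log\errorlog;"
    r"-lc:\program files\microsoft sql server\mssql.1\mssql\data\mastlog.ldf"
)
for option, path in parse_startup_parameters(startup).items():
    print(option, _OPTION_NAMES[option], path)
```

Calling `missing_files(parse_startup_parameters(startup))` on the failing node would list which of the three options point at a path that is not there.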

As you can see, I have brute-forced this proposed solution, since I do not have a valid master.mdf file. On the bright side, the system log error is what I am getting that is preventing startup.

Hi,

Here is an article on the issue you are having: http://support.microsoft.com/kb/243195.

If you haven't replaced a disk or had any failures, the key steps are going to be finding the GUIDs and resetting them. Please be sure to read the entire article.

  1. Gather information about the disk resources:
    1. Start Registry Editor (Regedt32.exe).
    2. Open the HKEY_LOCAL_MACHINE hive, and then click the root (HKEY_LOCAL_MACHINE) to select it.
    3. On the Registry menu, click Load Hive.
    4. Locate the %SystemRoot%\Cluster folder, and then click Clusdb. When you are prompted for the key name, type Cluster.
    5. Locate the following registry key:
      HKEY_LOCAL_MACHINE\Cluster\Resources\<GUID>\Parameters\Signature
      • Under each GUID, there is a Type value of Physical Disk. After you identify the physical disks, verify the Name key. It should have a value of the form Disk <drive>: (for example, Disk Q:). This is how you can identify that the GUID belongs to a physical disk resource.
      Note: Each resource is listed by a globally unique identifier (GUID). A GUID has 32 alphanumeric values in the following format: x11xx1x1-x11x-11x1-xxx1-11111xxx111x.
    6. Document the disk signature that belongs to each physical disk resource.
  2. Update the disk information in the cluster registry:
    1. Compare the disk signatures for each hard disk that you observed by using the Ftedit.exe tool (step 4 of the article) with the disk resource signature values in the cluster registry (step 5 of the article).
    2. The disk signature from Ftedit that is not the same as the one listed in the Signature key is the correct new disk signature.
    3. Document the correct disk signature for each disk.
    4. Replace the information in the Signature key in the cluster registry with the correct disk signature value for the new disk. Make sure that you make the changes in hex.
    5. Locate the following registry key for the disk that is being replaced:
      HKEY_LOCAL_MACHINE\Cluster\Resources\<GUID>\Parameters\DiskInfo
      Note: The data type for the DiskInfo key should be REG_BINARY, and it should be located directly above the Signature key that you modified earlier in the article.
    6. Delete the DiskInfo registry key.

      Note: Make sure that you delete the entire key, and not just the value for the key. This key is dynamically re-created the next time the Cluster service is started.
    7. Select the Cluster hive, and then on the Registry menu, click Unload Hive.
  3. Remove the disk signature information from the Cluster Disk driver:
    1. Locate and then delete the following registry key:
      HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\ClusDisk\Parameters\Signatures
      Note: There are several registry keys under the Signatures key. These are the disk signatures for the drives on the shared resource. This key and its subkeys are re-created the next time the Cluster service brings the disk resource online on this node.
    2. Quit Regedt32.exe.
  4. Remove the duplicate entry in the DISK key:
    1. Start Ftedit.exe again.
    2. Locate and click the disk signature for the old disk (determined in step 6 of the article).
    3. Click Edit, and then click Delete Drive.
    4. Quit Ftedit.exe, and then click Yes to save the changes.
  5. Create a copy of the cluster registry:
    • Copy the Clusdb file that is located in the %SystemRoot%\Cluster folder to a floppy disk.
  6. Remove the cluster registry on the quorum drive:
    • Start Windows Explorer, go to the MSCS folder on the quorum drive, and delete the Quolog.log file and any Chkxxx.tmp files.

      Note: The cluster registry is dynamically re-created on the quorum drive when the Cluster service starts.
  7. Start the Cluster service on node 1:
    1. Change the following startup values to:
      • System for the ClusDisk driver (by using the Devices tool in Control Panel)
      • Automatic for the Cluster service (by using the Services tool in Control Panel)
    2. Manually start the ClusDisk driver.
    3. Manually start the Cluster service.
      Important: If the quorum drive is the drive that failed, you may need to start the Cluster service by using the /fixquorum switch and then temporarily change which drive is the quorum drive.
    4. Verify on node 1 that the ClusDisk driver and the Cluster service started correctly. Start Cluster Administrator and verify that the physical disk resources are online and that there is one for each volume on the shared disk.
  8. Boot node 2:
    • Keep node 1 on, and start node 2.

      Reminder: Node 2 will fail to join the cluster because the Cluster service is disabled.
  9. Manually synchronize the cluster registries:
    • Replace the %SystemRoot%\Cluster\Clusdb file on node 2 with the Clusdb file that you copied to the floppy disk from node 1.

      Important: After the Cluster service starts on node 1, it is impossible to copy this file without stopping the Cluster service and unloading the registry.
  10. Remove the disk signature information from the Cluster Disk driver on node 2:
    • Locate and then delete the following registry key:
      HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\ClusDisk\Parameters\Signatures
  11. Remove the duplicate entry in the DISK key:
    1. Start Ftedit.exe again.
    2. Locate and click the disk signature for the old disk (determined in step 6 of the article).
    3. Click Edit, and then click Delete Drive.
    4. Quit Ftedit.exe, and then click Yes to save the changes.
  12. Start the Cluster service on node 2:
    1. Change the following startup values to:
      • System for the ClusDisk driver (by using the Devices tool in Control Panel)
      • Automatic for the Cluster service (by using the Services tool in Control Panel)
    2. Manually start the ClusDisk driver.
    3. Manually start the Cluster service.
    4. Verify that node 2 joined the cluster by using Cluster Administrator. Verify that you are able to manually move the new hard disk resource back and forth between the two nodes. To do so, right-click the resource group that the disk is in, and then click Move Group. This option changes the ownership of the resource group and its contents to the other node.
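
The comparison in steps 1 and 2 of the list above can be sketched as a set difference: signatures seen on the disks but absent from the cluster registry belong to new disks, and registry signatures with no matching disk are stale. A minimal Python sketch with made-up values; the GUID, the second signature, and the function names are hypothetical, and the real data would be read by hand from Regedt32.exe and Ftedit.exe rather than by this script:

```python
import re

# 8-4-4-4-12 hexadecimal layout, 32 hex digits in total.
GUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.IGNORECASE)


def is_guid(text):
    """Check that a registry subkey name has the GUID layout."""
    return GUID_RE.match(text) is not None


def compare_signatures(registry, observed):
    """Compare cluster registry signatures with signatures read from the disks.

    registry: {resource GUID: signature}; observed: iterable of disk signatures.
    Returns (stale, new): resources whose signature matches no disk, and
    disk signatures that no resource references.
    """
    observed = set(observed)
    stale = {guid: sig for guid, sig in registry.items() if sig not in observed}
    new = sorted(observed - set(registry.values()))
    return stale, new


# Hypothetical example data.
registry = {"a11bc2d3-e44f-55a6-bcd7-88899eee000f": 0xBF17D33C}
observed = [0x1A2B3C4D]

stale, new = compare_signatures(registry, observed)
candidates = [f"{sig:08X}" for sig in new]
for guid, sig in stale.items():
    print(f"{guid}: registry has {sig:08X}, unreferenced disk signatures: {candidates}")
```

Printing the values as eight hex digits matches the advice to make the registry change in hex. If both results come back empty, the registry and the disks already agree and the signature is probably not the cause.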

-Ivan


Ivan Sanders (@iasanders)

