Friday, March 30, 2012

Instance will not come online

2 out of 4 instances on my 3-node cluster will not come online on one of
the cluster nodes unless I add an alias (which I removed after
installing SP3). The application event log complains about not being able
to connect to the instance. I can connect to the instance remotely, but
apparently the cluster service cannot. I cannot find any differences in
the registry or files. Reinstalling SQL did not help. Any ideas?
Hans
Hans de Bruin wrote:
UDP 1434 isn't always responding. So when there is no entry in
hkey_local_machine\software\mssqlserverclient\..\lastconnect on my
workstation, osql will not connect. All nodes and instances report
8.00.818 (SP3 + latest hotfix). According to netstat, all the nodes are
listening on UDP 0.0.0.0:1434. Could W2k3 still be blocking 1434, or is
there a problem with registering the instance with the UDP 1434 listener?
Hans
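For anyone who wants to check whether UDP 1434 answers for a given instance without going through osql, here is a minimal Python sketch. It assumes the listener speaks the SQL Server Resolution Protocol as Microsoft documents it (a 0x04 request carrying the instance name, a 0x05 reply carrying a semicolon-separated string). Whether a SQL 2000 SP3 node replies in exactly this layout is an assumption, and the host and instance names are hypothetical.

```python
import socket

SSRP_PORT = 1434  # the UDP port discussed in this thread


def build_instance_request(instance: str) -> bytes:
    """Build a single-instance request: 0x04, instance name, NUL terminator."""
    return b"\x04" + instance.encode("ascii") + b"\x00"


def parse_response(data: bytes) -> dict:
    """Parse a reply: 0x05, 2-byte little-endian length, 'Key;Value;...;;' text."""
    if len(data) < 3 or data[0] != 0x05:
        raise ValueError("not a resolution reply")
    length = int.from_bytes(data[1:3], "little")
    text = data[3:3 + length].decode("ascii", errors="replace")
    fields = text.rstrip(";").split(";")
    # Fields alternate key, value, key, value, ...
    return dict(zip(fields[0::2], fields[1::2]))


def probe(host: str, instance: str, timeout: float = 2.0):
    """Ask host:1434 about one instance; return parsed fields, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_instance_request(instance), (host, SSRP_PORT))
        try:
            data, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None
    return parse_response(data)
```

Calling probe("SQLVS1", "I1") against a virtual server (hypothetical names) would return a dict with entries such as "tcp" when the listener answers, and None when the request times out, which is the symptom described above.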
|||Hans de Bruin wrote:
Hmm, on either node:
- clear the node of all SQL instances.
- clear the registry on the client.
- fail over one SQL instance to the node.
- connect to the instance: success
- fail over a second instance to the node.
- connect to the second instance: timeout on UDP 1434
- clear the registry on the client.
- connect to the first instance again: success
- move the first instance to another node.
- connect to the second instance again: success
So I am only able to connect to the first activated instance on a node
using UDP 1434. Where did I go wrong?
Hans
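The "connect to the instance" steps in a test like this can be scripted so each attempt is run the same way. The sketch below is only an illustration: it builds an osql command line using the -S, -E, -l and -Q switches and runs it. The server and instance names are hypothetical, and how to interpret osql's exit code on a failed connection is left to the reader, so the helper just returns the exit code and the output.

```python
import subprocess


def osql_command(server: str, instance: str, login_timeout: int = 5) -> list:
    """Build an osql call: trusted connection, short login timeout, one query."""
    target = f"{server}\\{instance}"
    return [
        "osql",
        "-S", target,               # server\instance to connect to
        "-E",                       # use a trusted (Windows) connection
        "-l", str(login_timeout),   # login timeout in seconds
        "-Q", "SELECT @@SERVERNAME",  # run one query, then exit
    ]


def try_connect(server: str, instance: str):
    """Run osql once; return (exit code, combined output), or None if it could not run."""
    try:
        result = subprocess.run(
            osql_command(server, instance),
            capture_output=True,
            text=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.returncode, result.stdout + result.stderr
```

Running try_connect for each instance after every failover, with the client registry cleared in between, gives a repeatable record of which instance names still resolve.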
|||Hans,
I have seen a similar problem to yours in one of our clusters.
Can you post the errors from the event log?
Thanks,
PK
|||It looks like this one:
http://groups.google.nl/groups?threa...GP11.phx. gbl
My instances were running without local admin rights. After I
fixed this, UDP 1434 behaved normally, and I could remove the
alias on the cluster nodes.
I want to run some of the instances without local admin rights so I can
give developers sysadmin rights on their instance without giving them
access to the whole system.
Hans
|||Hans de Bruin wrote:
Oops, try this one:
http://groups.google.nl/groups?threa...0a% 40phx.gbl

|||Hans de Bruin wrote:
Apparently what happens is this:
- i1 comes online on the node as non-admin and takes hold of UDP 1434.
- MSCS tries to connect to i1 and requests the port number.
- i1 looks in its own registry and finds the port number.
- MSCS gets an answer and connects. i1 is online.
- i2 tries to come online on the node as either admin or non-admin.
- MSCS tries to connect to i2 and requests the port number.
- i1 tries to look in i2's registry and gets an access denied.
- MSCS does not get any answer, makes interesting calls to WINS and
DNS, fails to figure out the port number or pipe name, and fails i2.
- i3 tries to come online on the node as either admin or non-admin.
- MSCS tries to connect to i3 and requests the port number.
- i1 tries to look in i3's registry and gets an access denied.
- MSCS does not get any answer, makes interesting calls to WINS and
DNS, fails to figure out the port number or pipe name, and fails i3.
...
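The sequence above can be written down as a toy model to make the dependency on the first instance explicit. This is only a sketch of the behaviour as described in this thread, not of how SQL Server is implemented: the class names, the is_admin flag and the port numbers are all hypothetical.

```python
class Instance:
    def __init__(self, name: str, port: int, is_admin: bool):
        self.name = name
        self.port = port          # port stored in this instance's registry key
        self.is_admin = is_admin  # does the service account have local admin rights?


class Node:
    def __init__(self):
        self.online = []  # instances in the order they came online

    def udp_owner(self):
        """The first instance online on the node holds UDP 1434."""
        return self.online[0] if self.online else None

    def lookup_port(self, target: Instance):
        """The UDP 1434 owner reads the target's registry to find its port."""
        owner = self.udp_owner()
        if owner is target or owner.is_admin:
            return target.port
        return None  # access denied: no answer goes back on UDP 1434

    def bring_online(self, inst: Instance) -> bool:
        """Start the instance, then let the cluster service try to connect."""
        self.online.append(inst)
        if self.lookup_port(inst) is None:
            self.online.remove(inst)  # the cluster service fails the resource
            return False
        return True


node = Node()
print(node.bring_online(Instance("i1", 1433, is_admin=False)))  # first instance
print(node.bring_online(Instance("i2", 2433, is_admin=True)))   # second instance
```

In this model the second call fails even though i2 itself runs as admin, because the lookup is done by i1. If i1 is created with is_admin=True instead, both calls succeed.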
Conclusion:
When running multiple SQL instances (clustered or not), SQL needs to run
as a local admin or system account, or you need to hack the registry
permissions to let UDP 1434 work, or you need to create an alias on the
server and the clients so SQLServerAgent and the clients can connect.
Hans
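For the alias workaround, the client network library keeps TCP/IP aliases as string values under the ConnectTo key, in the form DBMSSOCN,host,port. The sketch below shows one way to write such a value from Python. The alias, host and port are hypothetical, the instance is assumed to listen on a fixed port, and write_alias only works on Windows with rights to modify HKEY_LOCAL_MACHINE.

```python
CONNECT_TO_KEY = r"SOFTWARE\Microsoft\MSSQLServer\Client\ConnectTo"


def alias_value(host: str, port: int) -> str:
    """Registry data for a TCP/IP alias: network library, host, port."""
    return f"DBMSSOCN,{host},{port}"


def write_alias(alias: str, host: str, port: int) -> None:
    """Create or overwrite the alias (Windows only, needs write access to HKLM)."""
    import winreg  # imported here so the module still loads on other platforms

    with winreg.CreateKey(winreg.HKEY_LOCAL_MACHINE, CONNECT_TO_KEY) as key:
        winreg.SetValueEx(key, alias, 0, winreg.REG_SZ, alias_value(host, port))
```

For example, write_alias("SQLVS2\\I2", "sqlvs2", 2433) would make clients on that machine connect straight to port 2433 and skip the UDP 1434 lookup for that name.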
