beowulf cluster...questions?

Status
Not open for further replies.

Trekkie

Technical User
Apr 29, 2000
150
CA
hi,

can someone help me with the following questions about beowulf clusters? i was looking at the mini-howto for building a beowulf cluster, and my questions are:

1) if the master node fails...will the whole cluster fail?
a) if so, is it possible to have two master nodes in one cluster?

2) is it possible to have nodes mirror each other in a cluster? (e.g., mysql database, qmail)

i very much appreciate any advice.

T.k.
 
I did see an ideal scenario where there were two masters using Linux-HA.

I found it's better to use the databases' built-in clustering (if available) like that of Oracle and MySQL, else you have to start using things like Coda or GPFS filesystems to keep nodes in sync.


"If you always do what you've always done, you will always be where you've always been."
 
thanks rzs0502,

i'll take a look at it.

T.k.
 
1) if the master node fails...will the whole cluster fail?
In general, with one control node, if the main node goes down, the cluster is inaccessible from the outside world. There are three solutions to this problem:

- Have two (or more) control nodes, see (a). (we don't do this)
- Back up to a large medium on a regular basis, and be able to "quickly" turn a compute node into the control node while the control node is out.
- Have the master node run with RAID 1 -- and be able to move the mirror and second NIC to a compute node "quickly," see (b). (we don't do this either, but I've been pushing for it)

We have one "master" node which, among other things, holds the compiler (licensed to only one machine), the backup (of the whole cluster), the interface to the outside world, and the control node for most of our parallel work. However, once compiled, most or all of the simulations we run can be run using another node as the control node. If the control node went down, we could replace it with a compute node in the short term by moving the NIC (but we won't be able to compile in parallel). While in this state we can fix the main node (and, if necessary, restore its contents to a new hard drive -- this takes a very long time, but the restored system is EXACTLY as it was at 1:00am on the day it crashed). Work is hindered, but not stopped.
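
To make the "swap in a compute node" idea concrete, here is a minimal sketch of how you might decide which node should act as control node at a given moment. Everything in it is an assumption for illustration, not what we actually run: the host names are hypothetical, and "alive" is taken to mean "accepts a TCP connection on port 22", which you may want to replace with your own health check.

```python
import socket


def tcp_alive(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: an open SSH port is a good enough sign that the node is up.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


def pick_control_node(candidates, is_alive=tcp_alive):
    """Return the first reachable node in priority order, or None."""
    for host in candidates:
        if is_alive(host):
            return host
    return None


# Hypothetical host names: the usual master first, then the compute
# nodes you would be willing to promote, in order of preference.
CANDIDATES = ["master", "node01", "node02"]
```

Calling pick_control_node(CANDIDATES) from a cron job or a wrapper script would tell you where to start the control process. It does not move the NIC or the licensed compiler for you, so it only covers the "already compiled" case described above.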


a) if so, is it possible to have two master nodes in one cluster?
"master" is an interesting word. There are three different types of nodes a cluster can have:
compute - dumb node that just runs the parallel code
control - the "master" or control for the parallel processes
display/output - the node which collects/displays output from parallel processes, stats about system health, CPU usage and the like.

In general a "master" is thought to be just the second and third rolled together. There is nothing stopping you from separating that functionality or having multiple of each.
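
As a rough illustration of why splitting the roles and duplicating them helps, the sketch below checks which roles are left with no live node after a failure. The node names and both layouts are hypothetical, and a real cluster would of course get the "down" list from some monitoring tool rather than by hand.

```python
ROLES = ("compute", "control", "display")


def uncovered_roles(cluster, down=()):
    """Return the roles that no live node currently provides.

    cluster maps a node name to the set of roles it carries;
    down is a collection of node names that have failed.
    """
    live = {name: roles for name, roles in cluster.items() if name not in down}
    covered = set().union(*live.values())
    return [role for role in ROLES if role not in covered]


# Hypothetical layout 1: one classic "master" doing control + display.
combined = {
    "master": {"control", "display"},
    "node01": {"compute"},
    "node02": {"compute"},
}

# Hypothetical layout 2: roles separated, two nodes for each role.
separated = {
    "ctl1": {"control"},
    "ctl2": {"control"},
    "disp1": {"display"},
    "disp2": {"display"},
    "node01": {"compute"},
    "node02": {"compute"},
}
```

With the combined layout, losing "master" leaves both control and display uncovered. With the separated layout, losing any single node leaves every role covered.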

2) is it possible to have nodes mirror each other in a cluster? (e.g., mysql database, qmail)
Yes, but it is easier to use RAID 1. When the system goes down, just move the mirror drive to another computer and stick in a second NIC.

But all this is just my humble opinion as the sys admin of a 22-node Beowulf for going on 2 years.

 