From 2 Management nodes down to 1 (R.Pi, Cluster n Cream spin-off)

While testing MySQL Cluster on the Raspberry Pis, I thought I’d share this little extract, just in case someone tries the same some day, somewhere. Why? I don’t know.

Ok, so when we pull the plug on one of the Pis, we lose one of each component (management node, data node and SQL node). But because the management node on that Pi is the arbitrator (node-id=2), the whole cluster falls over.

Before the ‘accident’:

  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Whoops, who pulled that plug?

Everything on mypi02 is instantly down, as seen by mgmt_node on mypi01:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

but… a few seconds later:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3 (not connected, accepting connect from mypi01)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10 (not connected, accepting connect from mypi01)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Check the mgmt_node log:
  vi /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log

..
2013-09-04 14:06:44 [MgmtSrvr] WARNING  -- Node 3: Node 2 missed heartbeat 2
..
..
2013-09-04 14:07:02 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
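
Rather than paging through the log in vi, the interesting lines can be pulled out with grep. This is only a minimal sketch: the function name cluster_log_alerts is made up for illustration, and the patterns simply match the two kinds of message quoted above.

```shell
# Hypothetical helper (not part of MySQL Cluster): print only the
# missed-heartbeat warnings and ALERT lines from a cluster log.
cluster_log_alerts() {
    # $1 = path to the management node's cluster log
    grep -E 'ALERT|missed heartbeat' "$1"
}

# Usage, with the log path from this post:
#   cluster_log_alerts /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log
```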

So two physical hosts are too few for this layout. Let’s trim the logical architecture down to 1 Management Node:
  vi config.ini
--> comment out the nodeid=2 entry in the [ndb_mgmd] section.
  vi my.cnf
--> remove mypi02:1186 from the ndb-connectstring entries in both the [mysqld] and [mysql_cluster] sections.
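
If you would rather script the my.cnf change than do it in vi, something like the following might do. It is a sketch under a loud assumption: that the entries look like ndb-connectstring=mypi01:1186,mypi02:1186 (a comma-separated list on one line), which the post does not actually show. The function name drop_mgm_host is invented here, and config.ini is still edited by hand.

```shell
# Hypothetical helper: remove one host:port from every ndb-connectstring
# line of a my.cnf-style file. Assumes a comma-separated list on one line.
drop_mgm_host() {
    # $1 = host:port to remove, $2 = file to edit (a .bak copy is kept)
    sed -i.bak \
        -e "/ndb-connectstring/s/,$1//" \
        -e "/ndb-connectstring/s/$1,//" \
        "$2"
}

# Usage, with the path used later in this post:
#   drop_mgm_host mypi02:1186 /usr/local/mysql/conf/my.cnf
```

The two expressions cover the host appearing after a comma or before one; a line that lists only that single host is left alone, since emptying the connect string is probably not what you want.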

Now to start it all back up, but this time with --reload, so that ndb_mgmd re-reads config.ini instead of using its cached configuration.

On mypi01:
  ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf --ndb-nodeid=1 --reload
  ndbd -c mypi01
On mypi02:
  ndbd -c mypi01
Check status, make sure the data nodes have started ok.
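
One way to do that check from a script is to feed the output of ndb_mgm -e "all status" through a small filter. This is a sketch: it assumes each data node is reported on a line of the form "Node 3: started (...)", and the function name all_data_nodes_started is made up for illustration.

```shell
# Hypothetical helper: reads `ndb_mgm -e "all status"` output on stdin and
# succeeds only if at least one "Node N:" line is present and every such
# line reports the state "started".
all_data_nodes_started() {
    awk '/^Node [0-9]+:/ { seen++; if ($3 != "started") bad++ }
         END { exit !(seen > 0 && bad == 0) }'
}

# Usage:
#   ndb_mgm -c mypi01 -e "all status" | all_data_nodes_started && echo "data nodes up"
```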
Then on mypi01:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
On mypi02:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &

And to make sure we’re all connected up fine:
From mypi02 (remember, with only 1 mgmtnode on mypi01 now):
  ndb_mgm -e show -c mypi01
Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

That’s better.
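
To confirm the management node count without eyeballing the listing, the number can be pulled out of the show output. A small sketch, assuming the section header keeps the "[ndb_mgmd(MGM)] N node(s)" shape seen above; mgm_node_count is a name invented for this example.

```shell
# Hypothetical helper: reads `ndb_mgm -e show` output on stdin and prints
# the number of management nodes from the "[ndb_mgmd(MGM)]" header line.
mgm_node_count() {
    sed -n 's/^\[ndb_mgmd(MGM)][[:space:]]*\([0-9][0-9]*\) node(s).*/\1/p'
}

# Usage:
#   ndb_mgm -c mypi01 -e show | mgm_node_count
```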
