From 2 Management nodes down to 1 (R.Pi, Cluster n Cream spin-off)

While testing MySQL Cluster on the Raspberry Pis, I thought I’d share this little extract, just in case someone tries the same some day, somewhere. Why? I don’t know.

OK, so when we pull the plug on one of the Pis, one of each component goes down with it. But because one of them is the arbitrator (node-id=2), the whole cluster falls over.

Before the ‘accident’:

  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Whoops, who pulled that plug?

Everything on mypi02 is instantly down, as seen by mgmt_node on mypi01:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

but… a few seconds later:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3 (not connected, accepting connect from mypi01)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10 (not connected, accepting connect from mypi01)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Check the mgmt_node log:
  vi /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log

..
2013-09-04 14:06:44 [MgmtSrvr] WARNING  -- Node 3: Node 2 missed heartbeat 2
..
..
2013-09-04 14:07:02 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
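
Rather than paging through the whole log in vi, the interesting lines can be pulled out with grep. This is only a rough sketch: cluster_log_events is a made-up helper name, and the pattern simply matches the wording of the messages shown above (missed heartbeats, anything mentioning arbitration, forced shutdowns).

```shell
# Hypothetical helper (not part of MySQL Cluster): print only the
# heartbeat, arbitration and forced-shutdown lines from a cluster log.
# $1 = path to the management node's cluster log
cluster_log_events() {
    grep -E 'missed heartbeat|[Aa]rbitrat|Forced node shutdown' "$1"
}

# Usage, with the log path used in this post:
#   cluster_log_events /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log
```

The pattern can be widened or narrowed to taste; it is just a filter over the log text.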

So the physical architecture is too limited. Let’s limit the logical architecture to 1 Management Node:
  vi config.ini
--> comment out the nodeid=2 entry in the [ndb_mgmd] section.
  vi my.cnf
--> remove mypi02:1186 from the ndb-connectstring entries in both the [mysqld] and [mysql_cluster] sections.
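
A quick sanity check after editing my.cnf is to confirm that no ndb-connectstring line still mentions the second management host. A minimal sketch, assuming the option is written one per line as ndb-connectstring=... (or with an underscore); connectstring_mentions is a hypothetical helper name, not a MySQL tool.

```shell
# Hypothetical helper: succeeds (exit 0) if any ndb-connectstring line
# in the given file still mentions the given host.
# $1 = path to my.cnf, $2 = host name to look for
connectstring_mentions() {
    grep -E '^[[:space:]]*ndb[-_]connectstring' "$1" | grep -F -q -- "$2"
}

# Usage, with the my.cnf path used in this post:
#   if connectstring_mentions /usr/local/mysql/conf/my.cnf mypi02; then
#       echo "mypi02 is still in an ndb-connectstring entry"
#   fi
```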

Now to start it all back up, but this time with --reload, to make sure the changes are accepted.

On mypi01:
  ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf --ndb-nodeid=1 --reload
  ndbd -c mypi01
On mypi02:
  ndbd -c mypi01
Check status, make sure the data nodes have started ok.
Then on mypi01:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
On mypi02:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
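
The "check status" step in the middle of that sequence can be scripted instead of eyeballed. The sketch below assumes the `ndb_mgm -e show` layout seen in this post; down_data_nodes is a hypothetical helper that reads that output on stdin and prints the ids of any data nodes still marked "not connected". It says nothing about nodes that are connected but still working through their start phases, so it is a first check rather than a full one.

```shell
# Hypothetical helper: read `ndb_mgm -e show` output on stdin and print
# the id of every data node listed as "not connected".
down_data_nodes() {
    awk '
        /^\[ndbd\(NDB\)\]/ { in_ndbd = 1; next }  # start of the data node section
        /^\[/              { in_ndbd = 0 }        # any other section header ends it
        in_ndbd && /not connected/ { print $1 }   # e.g. prints "id=4"
    '
}

# Usage (empty output means both data nodes are connected):
#   ndb_mgm -c mypi01 -e show | down_data_nodes
```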

And to make sure we’re all connected up fine:
From mypi02 (remember, with only 1 mgmtnode on mypi01 now):
  ndb_mgm -e show -c mypi01
Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

That’s better.

About Keith Hollman

Focused on RDBMS' for over 25 years, both Oracle and MySQL on -ix's of all shapes 'n' sizes. Small and local, large and international or just cloud. Whether it's HA, DnR, virtualization, containered or just plain admin tasks, a philosophy of sharing out-and-about puts a smile on my face. Because none of us ever stop learning. Teams work better together.