NDB 7.4 & SYS schema: When getting locks, detecting the guilty SQL & o.s.pid.

Here’s a way to detect the SQL query causing a lock or a session to fail, and also to identify the o.s. PID if need be (btw, no rocket science). It’s only “a” way: I’m sure there are many others, so please feel free to suggest alternatives.

So, we’re using MCM, and have created a MySQL Cluster as described in the cluster intro session (in Spanish, I’m afraid), using MySQL Cluster 7.4.6, which ships with MySQL Server 5.6.24.

With the env up and running, set up a schema, some data and run a few queries:

mysql> create database world;
mysql> use world;
Database changed
mysql> source world_ndb.sql

(world_ndb.sql, as you might guess, is the world_innodb tables script, with a small adjustment as to which storage engine is used.)
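That adjustment can be as simple as sourcing the stock world_innodb.sql and then switching the tables over to NDB afterwards. A sketch (table names are from the standard world sample database; if your copy of the script defines foreign keys, you may need to drop those before changing engines):

```sql
-- After "source world_innodb.sql", move each table to the NDB engine:
ALTER TABLE Country ENGINE=NDBCLUSTER;
ALTER TABLE City ENGINE=NDBCLUSTER;
ALTER TABLE CountryLanguage ENGINE=NDBCLUSTER;
```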

Once created, let’s lock things up in Cluster:

mysql -uroot -h127.0.0.1 -P3306
mysql> use world; begin; select * from City where Name = 'Valencia' for update; update City set Name = 'Bilbao' where ID=3;

As there’s no ‘commit’, this transaction will sit there open, and we can see it as a single transaction:

select * from ndbinfo.cluster_transactions ;
+---------+----------------+-------------------+---------+------------------+------------------------+------------------+----------------+------------------+
| node_id | block_instance | transid           | state   | count_operations | outstanding_operations | inactive_seconds | client_node_id | client_block_ref |
+---------+----------------+-------------------+---------+------------------+------------------------+------------------+----------------+------------------+
|       1 |              0 | 36083772600358092 | Started |                4 |                      0 |               55 |             50 |            32776 |
+---------+----------------+-------------------+---------+------------------+------------------------+------------------+----------------+------------------+
1 row in set (0,04 sec)

Now for the 2nd session:

mysql -uroot -h127.0.0.1 -P3307
mysql> use world; begin; select * from City for update; update City set Name = 'Sevilla' where ID=2;

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
Warning (Code 1297): Got temporary error 274 'Time-out in NDB, probably caused by deadlock' from NDB
Warning (Code 1297): Got temporary error 274 'Time-out in NDB, probably caused by deadlock' from NDB
Error (Code 1205): Lock wait timeout exceeded; try restarting transaction
Query OK, 1 row affected (0,00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Now we’ll see the 2 transactions:

select * from ndbinfo.cluster_transactions ;
+---------+----------------+-------------------+---------+------------------+------------------------+------------------+----------------+------------------+
| node_id | block_instance | transid           | state   | count_operations | outstanding_operations | inactive_seconds | client_node_id | client_block_ref |
+---------+----------------+-------------------+---------+------------------+------------------------+------------------+----------------+------------------+
|       1 |              0 | 36083772600358092 | Started |                4 |                      0 |              176 |             50 |            32776 |
|       2 |              0 | 36084872111980548 | Started |                1 |                      0 |                2 |             51 |            32776 |
+---------+----------------+-------------------+---------+------------------+------------------------+------------------+----------------+------------------+
2 rows in set (0,03 sec)

Note that client_node_id reflects the cluster’s node_id, 50 & 51 being the sqlnodes.

If this were InnoDB, some people might start looking into “show engine innodb status;”; the equivalent here would be “show engine ndb status;”. But of course, this is SQL running on the sqlnode, i.e. it will only give us the info pertaining to cluster from this session, on this sqlnode. We’d have to connect to each and every mysqld to get more info. Not practical.

Let’s see what cluster operations are being run (and compare to what we already know):

select * from ndbinfo.cluster_operations;
+---------+----------------+-------------------+----------------+----------+---------+------------+----------------+------------------+------------+-------------+-------------------+
| node_id | block_instance | transid           | operation_type | state    | tableid | fragmentid | client_node_id | client_block_ref | tc_node_id | tc_block_no | tc_block_instance |
+---------+----------------+-------------------+----------------+----------+---------+------------+----------------+------------------+------------+-------------+-------------------+
|       1 |              1 | 36083772600358092 | UPDATE         | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|       1 |              1 | 36083772600358092 | READ-SH        | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|       1 |              1 | 36083772600358092 | READ-SH        | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|       2 |              1 | 36083772600358092 | UPDATE         | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|       2 |              1 | 36083772600358092 | READ-SH        | Prepared |      12 |          1 |             50 |            32776 |          1 |         245 |                 0 |
+---------+----------------+-------------------+----------------+----------+---------+------------+----------------+------------------+------------+-------------+-------------------+
5 rows in set (0,02 sec)

In order to see the operations on the actual server, we’d need to connect to the appropriate mysqld (the one given by ndbinfo.cluster_transactions.client_node_id) and run:

select * from ndbinfo.server_operations;
+---------------------+---------+----------------+-------------------+----------------+----------+---------+------------+----------------+------------------+------------+-------------+-------------------+
| mysql_connection_id | node_id | block_instance | transid           | operation_type | state    | tableid | fragmentid | client_node_id | client_block_ref | tc_node_id | tc_block_no | tc_block_instance |
+---------------------+---------+----------------+-------------------+----------------+----------+---------+------------+----------------+------------------+------------+-------------+-------------------+
|                   4 |       1 |              1 | 36083772600358092 | UPDATE         | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|                   4 |       1 |              1 | 36083772600358092 | READ-SH        | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|                   4 |       1 |              1 | 36083772600358092 | READ-SH        | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|                   4 |       2 |              1 | 36083772600358092 | UPDATE         | Prepared |      12 |          0 |             50 |            32776 |          1 |         245 |                 0 |
|                   4 |       2 |              1 | 36083772600358092 | READ-SH        | Prepared |      12 |          1 |             50 |            32776 |          1 |         245 |                 0 |
+---------------------+---------+----------------+-------------------+----------------+----------+---------+------------+----------------+------------------+------------+-------------+-------------------+
5 rows in set (0,00 sec)

So, now that we know the mysql_connection_id, via a “show processlist” / “show full processlist” we can see the client port:

+----+-------------+-----------------+-------+---------+------+-----------------------------------+------------------+
| Id | User        | Host            | db    | Command | Time | State                             | Info             |
+----+-------------+-----------------+-------+---------+------+-----------------------------------+------------------+
|  1 | system user |                 |       | Daemon  |    0 | Waiting for event from ndbcluster | NULL             |
|  4 | root        | localhost:50493 | world | Sleep   |    1 |                                   | NULL             |
|  7 | root        | localhost:50559 | NULL  | Query   |    0 | init                              | show processlist |
|  8 | root        | localhost:50621 | NULL  | Sleep   |   42 |                                   | NULL             |
+----+-------------+-----------------+-------+---------+------+-----------------------------------+------------------+
4 rows in set (0,00 sec)

And now, with an o.s. command, we can get the offending process id:

$ netstat -pt | grep 50493
tcp        0      0 localhost:50493         localhost:mysql         ESTABLISHED 26352/mysql     
tcp6       0      0 localhost:mysql         localhost:50493         ESTABLISHED 24483/mysqld

26352 is the o.s. PID of the client. We could kill it from this sqlnode, but first let’s see what it’s doing: if we don’t find out what it’s running, how can we fix it?
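And if we did decide to kill it, there’s no need to touch the o.s. at all. Since we already know the mysql_connection_id (4, in this example), the SQL-level alternative is just:

```sql
-- Connected to the same sqlnode, terminate connection 4
-- (the id we saw in server_operations / show processlist):
KILL CONNECTION 4;
```

NDB will then abort the open transaction and release its locks, same as killing the o.s. process would.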

To get the SQL text, it’s going to be far easier to use the SYS schema and activate the data collection:

:: via Workbench 6.1 upwards.
:: via https://github.com/MarkLeith/mysql-sys.

and using Cluster Manager, set performance schema instrumentation on:

mcm> set performance-schema-instrument:mysqld='%=ON' mycluster;

and let’s see what happened:

select a.mysql_connection_id, a.operation_type, a.client_node_id, b.inactive_seconds
   from ndbinfo.server_operations a, ndbinfo.cluster_transactions b
   where a.transid = b.transid
   and b.inactive_seconds > 2;

+---------------------+----------------+----------------+------------------+
| mysql_connection_id | operation_type | client_node_id | inactive_seconds |
+---------------------+----------------+----------------+------------------+
|                  14 |        READ-SH |             50 |             1501 |
|                  14 |         UPDATE |             50 |             1501 |
|                  14 |        READ-SH |             50 |             1501 |
|                  14 |         UPDATE |             50 |             1501 |
|                  14 |        READ-SH |             50 |             1501 |
+---------------------+----------------+----------------+------------------+
5 rows in set (0,03 sec)

SELECT a.thread_id, a.SQL_TEXT, b.processlist_id
FROM     performance_schema.events_statements_history a, performance_schema.threads b
WHERE     a.thread_id = b.thread_id
and     b.processlist_id =
  (select distinct a.mysql_connection_id
   from ndbinfo.server_operations a, ndbinfo.cluster_transactions b
   where a.transid = b.transid
   and b.inactive_seconds > 2)
order by TIMER_START;

+-----------+-------------------------------------------------------+----------------+
| thread_id | SQL_TEXT                                              | processlist_id |
+-----------+-------------------------------------------------------+----------------+
|        32 | NULL                                                  |             14 |
|        32 | begin                                                 |             14 |
|        32 | select * from City where Name = 'Valencia' for update |             14 |
|        32 | update City set Name = 'Bilbao' where ID=3            |             14 |
|        32 | commit                                                |             14 |
|        32 | SELECT DATABASE()                                     |             14 |
|        32 | NULL                                                  |             14 |
|        32 | begin                                                 |             14 |
|        32 | select * from City where Name = 'Valencia' for update |             14 |
|        32 | update City set Name = 'Bilbao' where ID=3            |             14 |
+-----------+-------------------------------------------------------+----------------+
10 rows in set (0,05 sec)

You got me: I ran it twice in the same session. So now we can go and fix the code, and add that little “commit” where it’s needed.
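For completeness, the fix itself is trivial; the first session just needed to end its transaction so the row locks get released:

```sql
use world;
begin;
select * from City where Name = 'Valencia' for update;
update City set Name = 'Bilbao' where ID=3;
commit;  -- releases the row locks held in the data nodes
```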

Hope this has helped someone.

MySQL in a USB, with a little help of Fabric

If you’ve wanted to see how well MySQL Fabric actually works, and how easy it makes setting up replication and switching masters & slaves back and forth (with a sample use case of keeping the master on a USB stick), then feel free to have a look at:

http://www.slideshare.net/keithhollman5/mysql-fabric-in-a-usb

or the Spanish version:

http://www.slideshare.net/keithhollman5/mysql-fabric-en-un-usb

Please let me know what you think.

MySQL Cluster: An introduction in Spanish.

MySQL Cluster: the ‘what’ and the ‘how’.

For those who find plenty about MySQL in English but little in Spanish: my small contribution.
The link has information on what MySQL Cluster is and how it works, including technical examples to break down the barriers and (I hope) help simplify it.

Enjoy!

MySQL 5.7.7 RC & Multi Source Replication 40 to 1.

One of the cool new features in the 5.7 Release Candidate is Multi Source Replication, which I previously looked into in the 5.7.5 DMR.

I’ve had more and more people like the idea of this, so here’s a quick set-up as to what’s needed and how it could work.

1. Prepare master environments.
2. Prepare 40 masters for replication.
3. Create slave.
4. Insert data into the Masters.
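On the 5.7 slave, each master gets its own replication channel, so step 3 boils down to repeating something like the following once per master (a sketch: host, user, password and channel names here are placeholders, and the masters are assumed to use GTIDs):

```sql
-- Run on the slave, once per master, each with its own channel:
CHANGE MASTER TO
  MASTER_HOST='master01', MASTER_PORT=3306,
  MASTER_USER='repl', MASTER_PASSWORD='repl_pass',
  MASTER_AUTO_POSITION=1
  FOR CHANNEL 'master01';
START SLAVE FOR CHANNEL 'master01';
```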


Enterprise Monitor: “Add Bulk MySQL Instances” 50 in 1-click.

Carrying on with my MySQL 5.7 Labs Multi Source Replication scenario, I wanted to evaluate performance impact via MySQL Enterprise Monitor.

While opening up my environment, I remembered that I had generated lots of different skeleton scripts that had allowed me to deploy the 50 servers quickly, and I didn’t want to add each of my targets one by one in MEM. So I used one of the many features available, “Add Bulk MySQL Instances”.


A multisource replication scenario: 10 masters, 1 slave.

A customer asked whether they could have 10 or more masters consolidating all their data into a single central slave. After getting a bit more information from them and seeing the application’s functionality, it was clear that MySQL Labs 5.7.5 Multi Source Replication could be a good candidate. Why?:
– Each master is independent from the rest of the masters.
– One-way traffic: there is only one way to update a row, and that’s from the master.
– All the masters use the same schema and table, but no single master will ever need to, nor be able to update a row from another master.
– PK determined via app & master env.

Multisource replication is still a labs feature, available at http://labs.mysql.com, but here’s what I did to test it out.
