{"id":2003,"date":"2015-01-21T06:49:24","date_gmt":"2015-01-21T06:49:24","guid":{"rendered":"http:\/\/kedar.nitty-witty.com\/?p=2003"},"modified":"2023-04-25T05:14:32","modified_gmt":"2023-04-25T05:14:32","slug":"debugging-percona-xtradb-galera-cluster-node-startup-sst-errors","status":"publish","type":"post","link":"https:\/\/kedar.nitty-witty.com\/blog\/debugging-percona-xtradb-galera-cluster-node-startup-sst-errors","title":{"rendered":"Debugging Percona Xtradb (Galera) Cluster node startup \/ SST errors"},"content":{"rendered":"\n<p>We were setting up a Percona XtraDB (Galera) Cluster following the documentation for version 5.6+. This blog post walks through how we debugged, troubleshot &amp; fixed the different SST\u00a0issues we faced while bringing up the cluster nodes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><em>WSREP: Failed to read uuid:seqno from joiner script.<\/em><\/h3>\n\n\n\n<h3 class=\"wp-block-heading\"><em>WSREP: SST failed: 32 (Broken pipe)<\/em><\/h3>\n\n\n\n<p>We already had a successfully bootstrapped node-1, and adding the second node (node-2) caused the following error:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>\"2015-01-15 17:33:19 24618 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '10.10.10.1' --auth 'sstuser:xxx' --datadir '\/var\/lib\/mysql\/' --defaults-file '\/etc\/my.cnf' --parent '24618'  '' : 32 (Broken pipe)\n2015-01-15 17:33:19 24618 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.\n2015-01-15 17:33:19 24618 [ERROR] WSREP: SST failed: 32 (Broken pipe)\n2015-01-15 17:33:19 24618 [ERROR] Aborting\"\n<\/em><\/pre>\n\n\n\n<!--more Continue Reading...-->\n\n\n\n<p><br>Verified that the SST user had the proper privileges and decided to check the donor side.<br>Checked the &#8220;innobackup.backup.log&#8221; file on the donor (node-1), which logs the xtrabackup output during SST:<\/p>\n\n\n\n<pre 
class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# tail -5 innobackup.backup.log\n\n150115 17:33:19  innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=\/etc\/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=\/var\/lib\/mysql\/mysql.sock' as 'sstuser'  (using password: YES).\ninnobackupex: got a fatal error with the following stacktrace: at \/usr\/\/bin\/innobackupex line 2990\n        main::mysql_connect('abort_on_error', 1) called at \/usr\/\/bin\/innobackupex line 1530\ninnobackupex: Error: Failed to connect to MySQL server as DBD::mysql module is not installed at \/usr\/\/bin\/innobackupex line 2990.\n<\/em><\/pre>\n\n\n\n<p>As innobackup.backup.log shows, the error is caused by a missing DBD::mysql module. Tried installing Perl&#8217;s DBD module for MySQL through yum as follows:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# yum install perl-DBD-MySQL\nLoaded plugins: fastestmirror\nLoading mirror speeds from cached hostfile\n * base: mirror.netaddicted.ca\n * epel: mirror.csclub.uwaterloo.ca\n * extras: mirror.netaddicted.ca\n * updates: mirror.netaddicted.ca\nSetting up Install Process\nPackage perl-DBD-MySQL-4.013-3.el6.x86_64 already installed and latest version\nNothing to do\n<\/em><\/pre>\n\n\n\n<p>Well, it turned out that the DBD module was already installed! 
Tried running the backup command manually and received the same error described above:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# innobackupex --user=sstuser --password=xxxx --defaults-file=\/etc\/my.cnf --slave-info  --tmpdir=\/tmp --stream=tar .\/   2&gt; \/tmp\/xtrabackup.log  |gzip &gt; backup.tar.gz\n[root@node-1 mysql]# vi \/tmp\/xtrabackup.log\n<\/em><\/pre>\n\n\n\n<p>Checking the DBD::mysql module version with the following command pointed to a missing shared library that DBD depends on:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]#  perl -MDBD::mysql -e 'print $DBD::mysql::VERSION'\nCan't load '\/usr\/lib64\/perl5\/auto\/DBD\/mysql\/mysql.so' for module DBD::mysql: libmysqlclient.so.16: cannot open shared object file: No such file or directory at \/usr\/lib64\/perl5\/DynaLoader.pm line 200.\n at -e line 0\nCompilation failed in require.\nBEGIN failed--compilation aborted.\n<\/em><\/pre>\n\n\n\n<p>Now that the missing dependency was clear, looked for the library and identified the package that provides it:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# locate libmysqlclient.so.16\n[root@node-1 mysql]# yum list installed| grep perl-DBD-MySQL\nperl-DBD-MySQL.x86_64  4.013-3.el6      @base\n[root@node-1 mysql]# yum deplist perl-DBD-MySQL.x86_64|grep mysql\n  dependency: libmysqlclient.so.16()(64bit)\n   provider: mysql-libs.x86_64 5.1.73-3.el6_5\n  dependency: libmysqlclient.so.16(libmysqlclient_16)(64bit)\n   provider: mysql-libs.x86_64 5.1.73-3.el6_5\n<\/em><\/pre>\n\n\n\n<p>Installed the missing dependency:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# yum install mysql-libs.x86_64\nLoaded plugins: fastestmirror\nLoading mirror speeds from cached hostfile\n * base: mirror.netaddicted.ca\n * epel: mirror.csclub.uwaterloo.ca\n * extras: mirror.netaddicted.ca\n * updates: mirror.netaddicted.ca\nSetting up Install Process\nPackage mysql-libs is obsoleted by 
Percona-Server-shared-51, trying to install Percona-Server-shared-51-5.1.73-rel14.12.624.rhel6.x86_64 instead\nResolving Dependencies\n--&gt; Running transaction check\n---&gt; Package Percona-Server-shared-51.x86_64 0:5.1.73-rel14.12.624.rhel6 will be installed\n--&gt; Finished Dependency Resolution\n\nDependencies Resolved\n\n========================================================================================================================================================================\n Package                                      Arch                       Version                                       Repository                                  Size\n========================================================================================================================================================================\nInstalling:\n Percona-Server-shared-51                     x86_64                     5.1.73-rel14.12.624.rhel6                     percona-release-x86_64                     2.1 M\n\nTransaction Summary\n========================================================================================================================================================================\nInstall       1 Package(s)\n\nTotal download size: 2.1 M\nInstalled size: 5.9 M\nIs this ok [y\/N]: y\nDownloading Packages:\nPercona-Server-shared-51-5.1.73-rel14.12.624.rhel6.x86_64.rpm                                                                                    | 2.1 MB     00:00\nRunning rpm_check_debug\nRunning Transaction Test\nTransaction Test Succeeded\nRunning Transaction\n  Installing : Percona-Server-shared-51-5.1.73-rel14.12.624.rhel6.x86_64                                                                                            1\/1\n  Verifying  : Percona-Server-shared-51-5.1.73-rel14.12.624.rhel6.x86_64                                                                                            1\/1\n\nInstalled:\n  Percona-Server-shared-51.x86_64 
0:5.1.73-rel14.12.624.rhel6\n\nComplete!\n<\/em><\/pre>\n\n\n\n<p>Now that the perl-DBD-MySQL dependency was fixed, ran the backup manually on node-1 and it completed successfully.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# innobackupex --user=sstuser --password=xxx --defaults-file=\/etc\/my.cnf --slave-info  --tmpdir=\/tmp --stream=tar .\/   2&gt; \/tmp\/xtrabackup.log  |gzip &gt; backup.tar.gz\n<\/em><\/pre>\n\n\n\n<p>I hoped this would resolve the issue, but&#8230;<br>On node-1, cleared datadir and attempted restarting the node; it failed again, complaining as follows:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>2015-01-15 18:54:14 25385 [Note] WSREP: Requesting state transfer: success, donor: 0\n2015-01-15 18:54:16 25385 [Note] WSREP: (ce9728c5, 'tcp:\/\/0.0.0.0:4567') turning message relay requesting off\n2015-01-15 18:54:19 25385 [Note] WSREP: 0.0 (node-1.com): State transfer to 1.0 (node-2.com) complete.\n2015-01-15 18:54:19 25385 [Note] WSREP: Member 0.0 (node-1.com) synced with group.\nWSREP_SST: [ERROR] xtrabackup process ended without creating '\/var\/lib\/mysql\/\/xtrabackup_galera_info' (20150115 18:54:19.507)\n...\n...\nWSREP_SST: [ERROR] Cleanup after exit with status:32 (20150115 18:54:19.543)\nWSREP_SST: [INFO] Removing the sst_in_progress file (20150115 18:54:19.548)\n2015-01-15 18:54:19 25385 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '10.10.10.1' --auth 'sstuser:s3cret' --datadir '\/var\/lib\/mysql\/' --defaults-file '\/etc\/my.cnf' --parent '25385'  '' : 32 (Broken pipe)\n2015-01-15 18:54:19 25385 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.\n2015-01-15 18:54:19 25385 [ERROR] WSREP: SST failed: 32 (Broken pipe)\n<\/em><\/pre>\n\n\n\n<p>Looked further into the configuration and found that wsrep_sst_method was set differently on node-1 and node-2:<\/p>\n\n\n\n<p>Node-1:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>mysql&gt; show 
global variables like '%sst%';\n+---------------------------------+---------------+\n| Variable_name                   | Value         |\n+---------------------------------+---------------+\n| wsrep_sst_auth                  | xxxxxxxx      |\n| wsrep_sst_donor                 |               |\n| wsrep_sst_donor_rejects_queries | OFF           |\n| wsrep_sst_method                | xtrabackup-v2 |\n| wsrep_sst_receive_address       | AUTO          |\n+---------------------------------+---------------+\n<\/em><\/pre>\n\n\n\n<p>Node-2:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>$] grep wsrep_sst_method \/etc\/my.cnf\nwsrep_sst_method=xtrabackup\n<\/em><\/pre>\n\n\n\n<p>The wsrep_sst_method variable sets the method used for the State Snapshot Transfer (SST). Corrected it to wsrep_sst_method=xtrabackup-v2 on node-2 and the node came up clean.<\/p>\n\n\n\n<p>From the docs:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>The xtrabackup-v2 wsrep_sst_method is same as xtrabackup SST except that it uses newer protocol, hence is not compatible. This is the recommended option for PXC 5.5.34 and above.<\/em><\/pre>\n\n\n\n<p>There are other SST methods, such as rsync &amp; mysqldump, which you can choose depending on your preference. (Check references)<\/p>\n\n\n\n<p>We progressed further, but while adding a third node (node-3) we faced another issue and it did not join the cluster. 
We again checked the donor&#8217;s innobackup.backup.log file to trace the issue:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# vi innobackup.backup.log\n...\n...\ninnobackupex:  Using server version 5.6.21-70.1-56-log\n\ninnobackupex: Created backup directory \/tmp\/tmp.fFyzHDRH1f\n^Gxbstream: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)\ninnobackupex: 'xbstream -c' returned with exit code 1.\ninnobackupex: got a fatal error with the following stacktrace: at \/usr\/\/bin\/innobackupex line 4865\n        main::backup_file_via_stream('\/tmp\/tmp.HE5T8DeArs', 'backup-my.cnf') called at \/usr\/\/bin\/innobackupex line 4914\n        main::backup_file('\/tmp\/tmp.HE5T8DeArs', 'backup-my.cnf', '\/tmp\/tmp.HE5T8DeArs\/backup-my.cnf') called at \/usr\/\/bin\/innobackupex line 4938\n        main::write_to_backup_file('\/tmp\/tmp.HE5T8DeArs\/backup-my.cnf', '# This MySQL options file was generated by innobackupex.\\x{a}\\x{a}# T...') called at \/usr\/\/bin\/innobackupex line 3746\n        main::write_backup_config_file('\/tmp\/tmp.HE5T8DeArs\/backup-my.cnf') called at \/usr\/\/bin\/innobackupex line 3673\n        main::init() called at \/usr\/\/bin\/innobackupex line 1557\ninnobackupex: Error: Failed to stream '\/tmp\/tmp.HE5T8DeArs\/backup-my.cnf': 1 at \/usr\/\/bin\/innobackupex line 4865.\n<\/em><\/pre>\n\n\n\n<p>This time we confirmed that all the issues we faced earlier were resolved, but the SST was still not progressing.<br>Running the xtrabackup command manually did not cause any error and the backup finished correctly:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>[root@node-1 mysql]# innobackupex --user=root --password=$pw --defaults-file=\/etc\/my.cnf --slave-info  --tmpdir=\/tmp --stream=tar .\/   2&gt; \/tmp\/xtrabackup.log  |gzip &gt; backup.tar.gz\n[root@node-1 mysql]# vi \/tmp\/xtrabackup.log\n<\/em><\/pre>\n\n\n\n<p>Since the manual backup went well, it hinted that the issue was somewhere on the new node, even though 
innobackup.backup.log on the donor reported the errors.<\/p>\n\n\n\n<p>Rechecked the configuration file here and noted an important difference.&nbsp;On node-1 we had the following InnoDB settings while on node-3 we did not:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>innodb_log_file_size=48M\ninnodb_data_file_path=ibdata1:12M:autoextend\ninnodb_buffer_pool_size=400M\n<\/em><\/pre>\n\n\n\n<p>Here, the difference in the values of innodb_log_file_size &amp; innodb_data_file_path caused the xtrabackup restore to fail.<br>The innodb_log_file_size value is pretty much the default, but <span style=\"text-decoration: underline;\"><em>innodb_data_file_path has a default of &#8220;ibdata1:10M:autoextend&#8221;<\/em><\/span>, which was the differentiating factor that caused the SST (restore) to fail. Adding these parameters to my.cnf on node-3 fixed the issue and the node joined the cluster.<\/p>\n\n\n\n<p>So, here we faced three issues that caused the State Snapshot Transfer to fail and hence stopped the nodes from joining the cluster:<br>1. perl-DBD-MySQL was missing a dependency, which we could identify by following the logs on node-1.<br>2. wsrep_sst_method on node-2 was set to a value different from the recommended one (i.e. xtrabackup-v2).<br>3. 
Differing InnoDB-related configuration values between nodes can cause SST to fail.<\/p>\n\n\n\n<p>Correcting them made Galera happy&#8230;<br>Hope the above troubleshooting steps are helpful.&nbsp;Recently we saw Percona&#8217;s blog conveying that there can be multiple reasons for an SST to fail; this post practically agrees with that :).<\/p>\n\n\n\n<p><strong>References:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>http:\/\/www.percona.com\/doc\/percona-xtradb-cluster\/5.5\/wsrep-system-index.html#wsrep_sst_method<\/li>\n\n\n\n<li>http:\/\/www.percona.com\/doc\/percona-xtradb-cluster\/5.6\/manual\/state_snapshot_transfer.html<\/li>\n<\/ul>\n\n\n\n<p><strong>Also see:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"line-height: 1.5;\">http:\/\/www.percona.com\/blog\/2014\/12\/30\/diagnosing-sst-errors-with-percona-xtradb-cluster-for-mysql\/<\/span><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"A Percona XtraDB (Galera) Cluster node may fail to join the cluster due to many possible mistakes that cause SST to fail. It could be a configuration item or purely a setup requirement. 
In this article we will be troubleshooting step by step the SST issues faced.\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[8,377],"tags":[552,386,381,382,427,383,384,553,385,387],"class_list":{"0":"post-2003","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-mysql","7":"category-mysql-articles","8":"tag-error-wsrep","9":"tag-debug","10":"tag-galera","11":"tag-galera-cluster","12":"tag-mysql","13":"tag-percona-xtradb-cluster","14":"tag-sst-errors","15":"tag-sst-failed","16":"tag-state-snapshot-transfer","17":"tag-troubleshoot"},"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/posts\/2003","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/comments?post=2003"}],"version-history":[{"count":7,"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/posts\/2003\/revisions"}],"predecessor-version":[{"id":2820,"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/posts\/2003\/revisions\/2820"}],"wp:attachment":[{"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/media?parent=2003"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/categories?post=2003"},{"taxonomy":"post_tag","embeddable":true,"hre
f":"https:\/\/kedar.nitty-witty.com\/blog\/wp-json\/wp\/v2\/tags?post=2003"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}