From fcb5c2f5205a92faa6cd4992269fe6dca1805d15 Mon Sep 17 00:00:00 2001
From: "Suren A. Chilingaryan"
Date: Wed, 22 Dec 2021 00:12:11 +0100
Subject: Another MySQL replication failure is documented

---
 docs/troubleshooting.txt | 19 +++++++++++++++++++
 log.txt                  |  1 +
 2 files changed, 20 insertions(+)
 create mode 100644 log.txt

diff --git a/docs/troubleshooting.txt b/docs/troubleshooting.txt
index 459143e..315f9f4 100644
--- a/docs/troubleshooting.txt
+++ b/docs/troubleshooting.txt
@@ -373,8 +373,27 @@ Storage
    or again we can compare lvm volumes which are used by Gluster bricks and which are not. The latter ones
    should be cleaned up. Again there is the script.

+ - Status of RAID can be checked with the storcli utility
+     /opt/MegaRAID/storcli/storcli64 /c0/v0 show
+     /opt/MegaRAID/storcli/storcli64 /c0 show
+   To further check SMART attributes of a specific disk, one first needs to find the corresponding DID of the
+   disk. This is reported by storcli64 in JSON output mode
+     /opt/MegaRAID/storcli/storcli64 /c0 /eall /sall show j
+   Then, smartmontools can be used to extract SMART attributes from this disk
+     smartctl --all --device megaraid,<DID> /dev/sda
+
 MySQL
 =====
+ - MySQL may stop connecting to the master. While the correct username and password are set in the container
+   environment, the MySQL slave actually uses values stored somewhere in the database. It seems that occasionally
+   these stored values get corrupted, which breaks authentication and makes the slave report error 1045
+     Error connecting to master. retry-time: retries: 1, Error_code: 1045
+   The work-around is to set the correct username/password again (using the values from the environment):
+     STOP SLAVE;
+     CHANGE MASTER TO MASTER_USER='replication', MASTER_PASSWORD='...';
+     START SLAVE;
+   This should fix it, and the new username/password will be remembered across restarts...
+
  - MySQL may stop replicating from the master. There is some kind of deadlock in multi-threaded SLAVE SQL.
   This can be seen by executing (which should show a lot of slave threads waiting on coordinator to provide the load).

diff --git a/log.txt b/log.txt
new file mode 100644
index 0000000..209b924
--- /dev/null
+++ b/log.txt
@@ -0,0 +1 @@
+ - ipekatrin1: Replaced disk in section 9. LSI software reports all is OK, but the hardware LED indicates an error (red). The indicator is probably broken.
--
cgit v1.2.3
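
Note on the RAID item in docs/troubleshooting.txt: the step from storcli64 JSON output to a smartctl call can be scripted. The following is only a rough sketch, not part of the commit. It assumes the JSON has the layout "Controllers" -> "Response Data" -> "Drive Information" with "EID:Slt" and "DID" fields per drive, which should be verified against real output of the installed storcli version. The helper names (drive_ids, smartctl_commands) and the SAMPLE data are made up for illustration.

```python
import json
from typing import List, Tuple

# Hypothetical sample; the key names are an ASSUMPTION about what
# "storcli64 /c0 /eall /sall show j" prints. Check against real output.
SAMPLE = json.dumps({
    "Controllers": [{
        "Response Data": {
            "Drive Information": [
                {"EID:Slt": "8:0", "DID": 10, "State": "Onln"},
                {"EID:Slt": "8:1", "DID": 11, "State": "Onln"},
            ]
        }
    }]
})


def drive_ids(storcli_json: str) -> List[Tuple[str, int]]:
    """Return (enclosure:slot, DID) pairs found in the storcli JSON text."""
    doc = json.loads(storcli_json)
    pairs = []
    for controller in doc.get("Controllers", []):
        response = controller.get("Response Data", {})
        for drive in response.get("Drive Information", []):
            pairs.append((drive["EID:Slt"], int(drive["DID"])))
    return pairs


def smartctl_commands(storcli_json: str, device: str = "/dev/sda") -> List[str]:
    """Build one smartctl command line per drive, addressed by its DID."""
    return [
        f"smartctl --all --device megaraid,{did} {device}"
        for _slot, did in drive_ids(storcli_json)
    ]


if __name__ == "__main__":
    # Only prints the commands; running them is left to the operator.
    for command in smartctl_commands(SAMPLE):
        print(command)
```

In practice the JSON text would come from the storcli64 call quoted in the hunk instead of SAMPLE. The sketch prints the commands and does not execute them, so they can be reviewed first.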