MySQL 5.7 Parallel Replication: A Case Analysis of Hidden Bugs
Preface
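Most of our production MySQL servers run version 5.7.18. That release already fixes many bugs, but quite a few replication bugs remain, and those affecting Group Replication and parallel replication stand out in particular. Version 5.7.19 improves and fixes many of them. We therefore recommend not using MGR or parallel replication on versions earlier than 5.7.19; if you need these features, upgrade to 5.7.19 or later.
The problems I ran into were unexplained data synchronization failures, STOP SLAVE statements that could not be executed, and data inconsistency. Investigation showed that they were caused by bugs in this version, so we disabled parallel replication on the replicas already in production and upgraded the systems that had not yet gone live. The risk here is very high, so please take it seriously.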
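The mitigation described above can be sketched in a few lines of Python. This is only a sketch under my own assumptions: the helper names (parse_version, needs_mitigation, disable_parallel_apply_statements) are hypothetical, the 5.7.19 threshold is simply the one recommended in this article, and the statements switch the applier back to a single thread by setting slave_parallel_workers to 0. The functions only build the statements; running them against a server is left to whatever client you use.

```python
import re

# Assumption: the first release this article treats as safe.
FIXED_IN = (5, 7, 19)


def parse_version(version: str) -> tuple:
    """Turn a server version string such as '5.7.18-log' into (5, 7, 18)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def needs_mitigation(version: str) -> bool:
    """Return True for 5.7 releases older than 5.7.19."""
    parsed = parse_version(version)
    return parsed[:2] == (5, 7) and parsed < FIXED_IN


def disable_parallel_apply_statements() -> list:
    """Statements that return a replica to a single applier thread."""
    return [
        "STOP SLAVE SQL_THREAD",
        "SET GLOBAL slave_parallel_workers = 0",
        "START SLAVE SQL_THREAD",
    ]
```

Two caveats. On an affected replica the STOP SLAVE step may itself hang, which is one of the symptoms described above, so a restart may be unavoidable. The SET GLOBAL change is also lost on restart unless slave_parallel_workers = 0 is written to the configuration file as well.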
The replication bugs fixed in 5.7.19 are listed below:
Reference: https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-19.html
References: See also: Bug #84107.
Replication: In the case of delayed initialization of the Group Replication plugin, deployed in single-primary mode, secondaries were able to get writes through an asynchronous replication channel, which is not allowed in normal initialization of the Group Replication plugin. (Bug #26314756)
Replication: With GTIDs generated for incident log events, MySQL error code 1590 (ER_SLAVE_INCIDENT) could not be skipped using the --slave-skip-errors=1590 startup option on a replication slave. (Bug #26266758)
Replication: A USE statement that followed a SET GTID_NEXT statement sometimes had no effect. (Bug #26128931)
Replication: Groups can now contain members running different server versions to enable you to do online upgrades of a replication group. The rules for combining members in a group with different versions are:
If you have a group with 8.0 members, you cannot add a 5.7 member.
If you have a group with 5.7 members you can add an 8.0 member, but it remains in read-only mode. Writing to this member is dangerous while the group contains multiple server versions and should be avoided.
In a single-primary group, if the current primary leaves the group and a new primary must be elected, the primary is first chosen from the lower version members. If no lower version member is found, the primary is chosen from newer version members. (Bug #25876807)
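The three mixed-version rules above are easier to follow as a toy model. The sketch below is not Group Replication's actual code. Member, join_mode and elect_primary are hypothetical names, versions are reduced to (major, minor) tuples, and the tie-break by member name among equal versions is purely my assumption, since the release note does not say how ties are resolved.

```python
from typing import List, NamedTuple, Tuple


class Member(NamedTuple):
    name: str
    version: Tuple[int, int]  # (major, minor), for example (5, 7)


def join_mode(group: List[Member], candidate: Member) -> str:
    """Classify a joining member as 'rejected', 'read-only' or 'allowed'."""
    # Rule 1: an older member cannot join a group that has newer members.
    if any(m.version > candidate.version for m in group):
        return "rejected"
    # Rule 2: a newer member may join an older group but stays read-only.
    if any(m.version < candidate.version for m in group):
        return "read-only"
    # Same version everywhere: no version-based restriction applies.
    return "allowed"


def elect_primary(members: List[Member]) -> Member:
    """Rule 3: choose the new primary among the lowest-version members."""
    if not members:
        raise ValueError("no members left to elect from")
    lowest = min(m.version for m in members)
    candidates = [m for m in members if m.version == lowest]
    # Assumption: ties are broken by name; the release note gives no rule.
    return min(candidates, key=lambda m: m.name)
```

With a group of one 5.7 member and one 8.0 member, elect_primary returns the 5.7 member, which matches the statement that the primary is first chosen from the lower version members.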
Replication: When binlog_checksum=NONE was set on a MySQL server after startup, and then Group Replication was started, if an error occurred, the server remained in RECOVERING state and could not be shut down. (Bug #25793366, Bug #85667)
Replication: In a Group Replication setup where circular asynchronous replication was implemented between members of different replication groups, view change log events were repeatedly replicated between the groups with new generated GTIDs each time. The fix ensures that view change log events are ignored outside the named replication group where they occur, and never generate new GTIDs. (Bug #25674926)
References: See also: Bug #26049695, Bug #25928854, Bug #25721175.
Replication: When first starting the MySQL server following an installation from RPM, the password validation plugin is activated by default (true only for RPM installations). If binary logging was already enabled at this time, the activation was logged, even though plugin activations should not be recorded in the binary log. (Bug #25672750)
Replication: In a setup where single-primary Group Replication was combined with asynchronous replication, for example with S1 and S2 forming a group and with S2 and S3 functioning as master and slave, secondaries such as S2 were accepting transactions and these could then enter the group. The fix prevents secondaries creating an asynchronous replication channel when belonging to a single-primary group, and Group Replication cannot be started when asynchronous replication is running. (Bug #25574200, Bug #85047)
References: See also: Bug #86325, Bug #26078602.
Replication: In the event that a member failed to join a group the member was not stopping and continued to accept transactions. To avoid this set your members to have super_read_only=1 in the my.cfg file. Group Replication now checks for this setting upon successful startup and sets super_read_only=0. This ensures that members which do not successfully join a group cannot accept transactions. (Bug #25474736, Bug #84728)
Replication: If the binary log on a master server was rotated and a full disk condition occurred on the partition where the binary log file was being stored, the server could stop unexpectedly. The fix adds a check for the existence of the binary log when the dump thread switches to next binary log file. If the binary log is disabled, all binary logs up to the current active log are transmitted to slave and an error is returned to the receiver thread. (Bug #25076007)
Replication: Interleaved transactions could sometimes deadlock the slave applier when the transaction isolation level was set to REPEATABLE-READ. (Bug #25040331)
Replication: If a relay log index file named relay log files that did not exist, RESET SLAVE ALL sometimes did not fully clean up properly. (Bug #24901077)
Replication: The slave_skip_errors system variable did not permit error numbers larger than 3000. Thanks to Tsubasa Tanaka for the patch. (Bug #24748639, Bug #83184)
Replication: mysqlbinlog, if invoked with the --raw option, does not flush the output file until the process terminates. But if also invoked with the --stop-never option, the process never terminates, thus nothing is ever written to the output file. Now the output is flushed after each event. (Bug #24609402)
Replication: A memory leak in mysqlbinlog was fixed. The leak happened when processing fake rotate events, or when using --raw and the destination log file could not be created. The leak only occurred when processing events from a remote server. Thanks to Laurynas Biveinis for his contribution to fixing this bug. (Bug #24323288, Bug #82283)
Replication: A slave server could lose events not yet applied when MASTER_AUTO_POSITION=0, both replication threads were stopped, and the applier delay was changed using CHANGE MASTER TO MASTER_DELAY=N. (Bug #23203678, Bug #81232)
References: See also: Bug #25340185, Bug #84375.
Replication: Transmission of large GCS messages could take so long the sender appeared to have died. (Bug #22671846)
Replication: Multithreaded slaves could not be configured with small queue sizes using slave_pending_jobs_size_max if they ever needed to process transactions larger than that size. Any packet larger than slave_pending_jobs_size_max was rejected with the error ER_MTS_EVENT_BIGGER_PENDING_JOBS_SIZE_MAX, even if the packet was smaller than the limit set by slave_max_allowed_packet.
With this fix, slave_pending_jobs_size_max becomes a soft limit rather than a hard limit. If the size of a packet exceeds slave_pending_jobs_size_max but is less than slave_max_allowed_packet, the transaction is held until all the slave workers have empty queues, and then processed. All subsequent transactions are held until the large transaction has been completed. The queue size for slave workers can therefore be limited while still allowing occasional larger transactions. (Bug #21280753, Bug #77406)
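The soft-limit behaviour described in this fix comes down to a three-way decision per packet. The function below is a minimal sketch of that decision as the release note words it, not the server's implementation. classify_packet and its return labels are hypothetical, and how a packet exactly equal to slave_max_allowed_packet is handled is my assumption, because the note only says "less than".

```python
def classify_packet(size: int,
                    pending_jobs_size_max: int,
                    max_allowed_packet: int) -> str:
    """Decide how a multithreaded replica treats one packet after the fix."""
    # Anything above slave_max_allowed_packet is still refused.
    # Assumption: a packet exactly at the limit is accepted.
    if size > max_allowed_packet:
        return "reject"
    # Above the soft limit: held until all worker queues are empty, then
    # processed alone while later transactions wait behind it.
    if size > pending_jobs_size_max:
        return "wait-for-empty-queues"
    # Within the soft limit: queued to a worker as usual.
    return "queue"
```

Before 5.7.19 the second branch did not exist: a packet above slave_pending_jobs_size_max was rejected with ER_MTS_EVENT_BIGGER_PENDING_JOBS_SIZE_MAX even when it was below slave_max_allowed_packet.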
Replication: An incident event that broke replication was not written to the binary log with a GTID, so that it was not possible to skip the event using SET gtid_next=value. Instead, it was necessary to set the relay log file and relay log positions directly; this meant that, when autopositioning was enabled, it was necessary first to disable it, then to set the relay log file and position, and finally to re-enable autopositioning.
Now in such cases MySQL writes the incident event into the statement cache, so that a GTID is generated and written for it prior to flushing, and that the slave applier works with the change. Then users can skip the event using the SQL statement SET gtid_next=value, followed by BEGIN and COMMIT. (Bug #19594845)
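The skip procedure mentioned here (SET gtid_next, then BEGIN and COMMIT) can be written out as a small statement builder. This is a sketch with a hypothetical helper name, skip_gtid_statements. The surrounding STOP SLAVE, START SLAVE and the reset of GTID_NEXT to AUTOMATIC are not in the release note; I added them because committing an empty transaction is normally done with replication stopped and GTID_NEXT restored afterwards. The GTID in the usage note is a made-up example value.

```python
import re

# A single GTID has the form source_uuid:transaction_id.
GTID_PATTERN = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+$"
)


def skip_gtid_statements(gtid: str) -> list:
    """Build the statements that skip one transaction by committing an
    empty transaction under its GTID."""
    if not GTID_PATTERN.match(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE",                   # assumption: added for completeness
        f"SET GTID_NEXT = '{gtid}'",
        "BEGIN",
        "COMMIT",
        "SET GTID_NEXT = 'AUTOMATIC'",  # assumption: restore the default
        "START SLAVE",                  # assumption: added for completeness
    ]
```

For example, skip_gtid_statements("3e11fa47-71ca-11e1-9e33-c80aa9429562:23") returns six statements to run in order on the replica. Validating the GTID first keeps a malformed value from being pasted into the SET statement.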
Replication: In certain cases, the master could write to the binary log a last_committed value which was smaller than it should have been. This could cause the slave to execute in parallel transactions which should not have been, leading to inconsistencies or other errors. (Bug #84471, Bug #25379659)
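This is the bug closest to the inconsistencies I saw, so it is worth showing why an undersized last_committed value is dangerous. The sketch below is a deliberately simplified model of how a replica using logical-clock parallel apply decides whether a transaction may start. Txn and can_start are hypothetical names, and the rule is reduced to its core as I understand it: a transaction waits until every transaction whose sequence_number is at most its last_committed has committed. It illustrates the idea and is not the server's scheduler.

```python
from typing import NamedTuple, Set


class Txn(NamedTuple):
    sequence_number: int  # position of the transaction in the binary log
    last_committed: int   # newest transaction it depends on; 0 means none


def can_start(txn: Txn, committed: Set[int]) -> bool:
    """Return True once everything the transaction depends on has committed."""
    return all(n in committed for n in range(1, txn.last_committed + 1))


# Transaction 2 really depends on transaction 1.
correct = Txn(sequence_number=2, last_committed=1)
# The bug: the master records a last_committed that is too small.
too_small = Txn(sequence_number=2, last_committed=0)
```

With nothing committed yet, can_start(correct, set()) is False, so transaction 2 waits for transaction 1. For too_small it is True, so the replica may apply both at the same time even though they conflict, which is how the parallel applier ends up with inconsistent data.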
Replication: When using group_replication_ip_whitelist=AUTOMATIC, IPs in the private network are permitted automatically, but some class C IP addresses were not being permitted correctly. (Bug #84329, Bug #25503458)
Replication: When an existing GTID_NEXT transaction was assigned a conflicting GTID by the server, Group Replication generated an assert upon detecting two transactions with same GTID. This was because Group Replication generates the GTID after conflict detection, which is later than with master/slave replication. The fix relaxes some conditions to only be called when commit is done and a message has been added to alert you when a GTID has already been used. (Bug #84153, Bug #25232042)
Replication: The replication applier thread returns Error 3002 ER_INCONSISTENT_ERROR when there is a difference between an expected error number and the actual error number. It is now possible to ignore this error by using 3002 with slave_skip_errors. (Bug #83186, Bug #24753281)
Replication: MySQL lost its GTID position following a restart when a dump from mysqldump had been used to load data.
To keep this problem from occurring, the mysql.gtid_executed table is now excluded automatically from dumps made by mysqldump. (Bug #82848, Bug #24590891)
References: See also: Bug #87455, Bug #26643180.
Replication: Corruption of relay logs for one channel in multi-source replication caused good channels not to be initialized during a server restart. In addition, when run with --skip-slave-start=false, the server also failed to start slave threads for those channels which were in good condition, despite the fact that it should have started the slave threads for all good channels.
Now, regardless of any errors on other channels, the server attempts to create and initialize channels that are in good condition, and starts slave threads for the good channels if --skip-slave-start is disabled. As part of this fix, START SLAVE and STOP SLAVE, which are intended to operate on all channels, are also modified such that they continue executing on all good channels even if they find bad channels among them. (Bug #82209, Bug #24285104)
Replication: The SQL thread was unable to GTID skip a partial transaction. (Bug #81119, Bug #25800025)
Debian client packages were missing information about conflicts with akonadi-backend-mysql packages. (Bug #26002288)
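Summary
That is all for this article. I hope it serves as a useful reference for your study or work. If you have any questions, feel free to leave a comment. Thank you for your support of 毛票票.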