Troubleshooting High CPU Load in MySQL
High CPU load caused by MySQL
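This afternoon I ran into a case where MySQL was causing high load on a production server. The background is as follows:
A new MySQL instance had been created on a new server. MySQL was the only workload on that machine, yet the CPU load stayed high. The output of top looked like this: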
[dba_mysql@dba-mysql ~]$ top
top - 17:12:44 up 104 days, 20 min,  2 users,  load average: 1.06, 1.02, 1.00
Tasks: 218 total,   1 running, 217 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16318504k total,  7863412k used,  8455092k free,   322048k buffers
Swap:  5242876k total,        0k used,  5242876k free,  6226588k cached

   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
 75373 mysql     20   0  845m 699m  29m S 100.0  4.4 112256:10 mysqld
 43285 root      20   0  174m  40m  19m S   0.7  0.3 750:40.75 consul
116553 root      20   0  518m  13m 4200 S   0.3  0.1   0:05.78 falcon-agent
116596 nobody    20   0  143m 6216 2784 S   0.3  0.0   0:00.81 python
124304 dba_mysq  20   0 15144 1420 1000 R   0.3  0.0   0:02.09 top
     1 root      20   0 21452 1560 1248 S   0.0  0.0   0:02.43 init
The output shows that only one of the 8 cores (Cpu6) is at 100%, while the others are essentially idle. Sorted by CPU usage, the mysqld process is the one consuming that CPU.
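The "one core pegged, the rest idle" pattern can also be picked out programmatically. Below is a minimal Python sketch that scans the per-core lines of top output in the format shown above. The function name busy_cores and the 90% threshold are my own illustrative choices, not part of any tool.

```python
import re

# Matches per-core lines as printed by top above,
# e.g. "Cpu6  :100.0%us,  0.0%sy, ..."
CPU_LINE = re.compile(r"^Cpu(\d+)\s*:\s*([\d.]+)%us,\s*([\d.]+)%sy", re.MULTILINE)


def busy_cores(top_output, threshold=90.0):
    """Return {core_index: user + system percent} for cores at or above threshold.

    The threshold is an arbitrary illustrative value.
    """
    busy = {}
    for core, user, system in CPU_LINE.findall(top_output):
        load = float(user) + float(system)
        if load >= threshold:
            busy[int(core)] = load
    return busy


# Three of the per-core lines from the top output above.
sample = """\
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
"""

print(busy_cores(sample))  # {6: 100.0}
```

A single saturated core usually points at one busy thread rather than a generally overloaded server, which is why the load average sits near 1.00 here.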
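I had never seen this before. My first guess was a problem on the business side, for example slow queries hogging the CPU. So I logged in to MySQL and ran SHOW PROCESSLIST to look at the current sessions. Apart from a few UPDATE statements, no other SQL was running. I then checked the slow log. The statements recorded there all had short execution times; most were logged because they did not use an index, but they scanned very few rows, only a few hundred. So a business-level cause looked unlikely.
With the business side ruled out, I turned to the database itself and took a look at the buffer pool settings: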
mysql--dba_admin@127.0.0.1:(none) 17:20:35>>show variables like '%pool%';
+-------------------------------------+----------------+
| Variable_name                       | Value          |
+-------------------------------------+----------------+
| innodb_buffer_pool_chunk_size       | 5242880        |
| innodb_buffer_pool_dump_at_shutdown | ON             |
| innodb_buffer_pool_dump_now         | OFF            |
| innodb_buffer_pool_dump_pct         | 25             |
| innodb_buffer_pool_filename         | ib_buffer_pool |
| innodb_buffer_pool_instances        | 1              |
| innodb_buffer_pool_load_abort       | OFF            |
| innodb_buffer_pool_load_at_startup  | ON             |
| innodb_buffer_pool_load_now         | OFF            |
| innodb_buffer_pool_size             | 5242880        |
| thread_pool_high_prio_mode          | transactions   |
| thread_pool_high_prio_tickets       | 4294967295     |
| thread_pool_idle_timeout            | 60             |
| thread_pool_max_threads             | 100000         |
| thread_pool_oversubscribe           | 3              |
| thread_pool_size                    | 8              |
| thread_pool_stall_limit             | 500            |
+-------------------------------------+----------------+
17 rows in set (0.01 sec)
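Judging from this, the buffer pool is only 5M, which is clearly wrong. In production the buffer pool is normally 1G or more. So I checked the my.cnf configuration file and found that innodb_buffer_pool_size had been set to 0M when the instance started. Yes, you read that right: 0M. Another parameter deserves a mention here. Notice that innodb_buffer_pool_size is exactly equal to innodb_buffer_pool_chunk_size. A chunk is a block of memory: the buffer pool is allocated in units of chunks, and one buffer pool is made up of multiple chunks, so the buffer pool size has to be an integer multiple of the chunk size (more precisely, of innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances, which is the same thing here because innodb_buffer_pool_instances is 1).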
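A setting of 0M cannot be honored as written. MySQL raises innodb_buffer_pool_size to its minimum of 5M and keeps it at a multiple of the chunk size, which is why the effective value is 5M. The 5M chunk size is most likely a side effect of the same setting, since MySQL shrinks innodb_buffer_pool_chunk_size at startup when it would otherwise be larger than the buffer pool itself.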
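The rounding rule is easy to check by hand. Here is a minimal Python sketch of it, based on my reading of the MySQL documentation: the requested size is clamped to the 5M minimum and then rounded up to a multiple of innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances. The function name effective_buffer_pool_size is my own; this is not MySQL's actual code.

```python
# Documented minimum for innodb_buffer_pool_size (5M).
MIN_POOL_SIZE = 5 * 1024 * 1024


def effective_buffer_pool_size(requested, chunk_size, instances=1):
    """Round a requested size up to a multiple of chunk_size * instances.

    A sketch of the documented adjustment rule, not the server implementation.
    """
    unit = chunk_size * instances
    requested = max(requested, MIN_POOL_SIZE)
    chunks = -(-requested // unit)  # ceiling division
    return chunks * unit


# Values from this server: chunk size 5242880, one buffer pool instance.
print(effective_buffer_pool_size(0, 5242880))           # 5242880
print(effective_buffer_pool_size(1073741824, 5242880))  # 1074790400
```

With a 5M chunk, a request for exactly 1G needs 204.8 chunks, so it is rounded up to 205 chunks, slightly more than 1G.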
Since the buffer pool was so small, I changed it to 1G to see whether the problem would still occur:
mysql--dba_admin@127.0.0.1:(none) 17:20:41>>set global innodb_buffer_pool_size=1073741824;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql--dba_admin@127.0.0.1:(none) 17:23:34>>show variables like '%pool%';
+-------------------------------------+----------------+
| Variable_name                       | Value          |
+-------------------------------------+----------------+
| innodb_buffer_pool_chunk_size       | 5242880        |
| innodb_buffer_pool_dump_at_shutdown | ON             |
| innodb_buffer_pool_dump_now         | OFF            |
| innodb_buffer_pool_dump_pct         | 25             |
| innodb_buffer_pool_filename         | ib_buffer_pool |
| innodb_buffer_pool_instances        | 1              |
| innodb_buffer_pool_load_abort       | OFF            |
| innodb_buffer_pool_load_at_startup  | ON             |
| innodb_buffer_pool_load_now         | OFF            |
| innodb_buffer_pool_size             | 1074790400     |
| thread_pool_high_prio_mode          | transactions   |
| thread_pool_high_prio_tickets       | 4294967295     |
| thread_pool_idle_timeout            | 60             |
| thread_pool_max_threads             | 100000         |
| thread_pool_oversubscribe           | 3              |
| thread_pool_size                    | 8              |
| thread_pool_stall_limit             | 500            |
+-------------------------------------+----------------+
17 rows in set (0.00 sec)
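As shown above, the buffer pool is now about 1G. The value I set was 1073741824, but the effective value became 1074790400. The reason was covered earlier: the size is rounded up to a multiple of the chunk size, and 205 × 5242880 = 1074790400.
Checking CPU usage with top again at this point: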
[dba_mysql@dba-mysql ~]$ top
top - 22:19:09 up 104 days,  5:26,  2 users,  load average: 0.45, 0.84, 0.86
Tasks: 218 total,   1 running, 217 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.7%us,  0.0%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16318504k total,  8008140k used,  8310364k free,   322048k buffers
Swap:  5242876k total,        0k used,  5242876k free,  6230600k cached

   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
 43285 root      20   0  174m  40m  19m S   1.0  0.3 753:07.38 consul
116842 root      20   0  202m  17m 5160 S   1.0  0.1   0:21.30 python
 75373 mysql     20   0 1966m 834m  29m S   0.7  5.2 112313:36 mysqld
116553 root      20   0  670m  14m 4244 S   0.7  0.1   0:44.31 falcon-agent
116584 root      20   0  331m  11m 3544 S   0.7  0.1   0:37.92 python2.6
     1 root      20   0 21452 1560 1248 S   0.0  0.0   0:02.43 init
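CPU usage has dropped. To rule out a coincidence, I set the buffer pool back to the original 5M, and the problem reappeared. So enlarging the buffer pool really does fix it.
At this point the problem is solved, but it raises a question worth thinking about: why would a small buffer pool drive one CPU core to 100%?
One explanation I can think of is that a 5M buffer pool is simply too small, so business SQL has to go to disk constantly when reading data. Disk is slow, so the I/O load goes up and drags the CPU load up with it. That said, top reported 0.0%wa on every core, so I/O wait alone does not fully explain it. As for why only one core is busy while the others sit near 0, I still need to dig into that. If anyone knows the answer, please let me know.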
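One way to test the "too many disk reads" guess is to compare two InnoDB status counters from SHOW GLOBAL STATUS: Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that had to go to disk). The sketch below computes the miss ratio from them. The counter values are hypothetical placeholders, not measurements from this server, and the function name is my own.

```python
def buffer_pool_miss_ratio(read_requests, disk_reads):
    """Fraction of logical reads that missed the buffer pool and hit disk."""
    if read_requests == 0:
        return 0.0
    return disk_reads / read_requests


# Hypothetical counter values, NOT taken from the server in this post.
status = {
    "Innodb_buffer_pool_read_requests": 2_000_000,
    "Innodb_buffer_pool_reads": 500_000,
}

ratio = buffer_pool_miss_ratio(
    status["Innodb_buffer_pool_read_requests"],
    status["Innodb_buffer_pool_reads"],
)
print(f"miss ratio: {ratio:.1%}")  # miss ratio: 25.0%
```

Sampling the two counters before and after the buffer pool change, and comparing the deltas, would show whether the 5M setting really forced most reads to disk.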