关于Oracle listener日志解析利器的使用方法
前言
我们在处理数据库的问题的时候,有时在数据库端看不到异常的等待,但是应用却报连接异常。
这种情况,有可能是在压力还没有传递到数据库,在网络这层已经发生问题了。如连接风暴,此时的数据库listener的进程已经忙不过来,将连接丢弃。(在主机层面可以看到tcpListeneDrop增加)。
我们如果去看普通的listener日志,看到的东西是这样的:
…… 24-FEB-201709:11:03*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=sqlplus@mynwdb)(HOST=mynwdb)(USER=appl)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65184))*establish*mynwdb*0 24-FEB-201709:11:06*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=STANDARD)(HOST=mynwdb)(USER=appl)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65186))*establish*mynwdb*0 24-FEB-201709:11:10*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=STANDARD)(HOST=mynwdb)(USER=appl)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65188))*establish*mynwdb*0 24-FEB-201709:11:10*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=STANDARD)(HOST=mynwdb)(USER=appl)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65189))*establish*mynwdb*0 FriFeb2409:11:142017 24-FEB-201709:11:14*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65194))*establish*mynwdb*0 24-FEB-201709:11:14*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65195))*establish*mynwdb*0 24-FEB-201709:11:15*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65196))*establish*mynwdb*0 24-FEB-201709:11:15*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65197))*establish*mynwdb*0 24-FEB-201709:11:15*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65198))*establish*mynwdb*0 24-FEB-201709:11:15*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65199))*establish*mynwdb*0 24-FEB-201709:11:15*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65200))*establish*mynwdb*0 24-FEB-201709:11:15*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65201))*establish*mynwdb*0 24-FEB-201709:11:16*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65202))*establish*mynwdb*0 24-FEB-201709:11:16*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65203))*establish*mynwdb*0 24-FEB-201709:11:16*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65204))*establish*mynwdb*0 24-FEB-201709:11:16*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65205))*establish*mynwdb*0 24-FEB-201709:11:16*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65206))*establish*mynwdb*0 24-FEB-201709:11:17*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65207))*establish*mynwdb*0 24-FEB-201709:11:17*(CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr)))*(ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65208))*establish*mynwdb*0 ……
这样我们会看的比较晕,不知道里面的连接数变化。
有一个listener日志的解析利器,可以分析到这种类型的问题。下载见文末。
用法:lsnr_analyzer.pl-t‘timerexge'[-h]filename
如,我们要分析24-FEB-201709:30~09:59的连接情况,可以如下操作:
[ora@mynwdbtrace]$lsnr_analyzer.pl-t'24-FEB-201709:[3-5][0-9]'mynwdb_cut1.log 24-FEB-201709:30=>Total:37 24-FEB-201709:31=>Total:50 24-FEB-201709:32=>Total:17 24-FEB-201709:33=>Total:16 24-FEB-201709:34=>Total:19 24-FEB-201709:35=>Total:16 24-FEB-201709:36=>Total:17 24-FEB-201709:37=>Total:16 24-FEB-201709:38=>Total:13 24-FEB-201709:39=>Total:29 24-FEB-201709:40=>Total:43 24-FEB-201709:41=>Total:38 24-FEB-201709:42=>Total:17 24-FEB-201709:43=>Total:16 24-FEB-201709:44=>Total:42 24-FEB-201709:45=>Total:21 24-FEB-201709:46=>Total:16 24-FEB-201709:47=>Total:15 24-FEB-201709:48=>Total:12 24-FEB-201709:49=>Total:25 24-FEB-201709:50=>Total:47 24-FEB-201709:51=>Total:41 24-FEB-201709:52=>Total:21 24-FEB-201709:53=>Total:12 24-FEB-201709:54=>Total:17 24-FEB-201709:55=>Total:16 24-FEB-201709:56=>Total:20 24-FEB-201709:57=>Total:17 24-FEB-201709:58=>Total:17 24-FEB-201709:59=>Total:21 [ora@mynwdbtrace]$
可以看到,在09:31的时候,有个连接数的增加,到了一分钟50个连接。
我们可以加-h的参数,breakdown具体是哪些主机连接过来的:
[ora@mynwdbtrace]$lsnr_analyzer.pl-t'24-FEB-201709:[3-5][0-9]'-hmynwdb_cut1.log 24-FEB-201709:30=>Total:37 11.22.3.12337 24-FEB-201709:31=>Total:50 22.3.4.1111 11.22.3.12349 24-FEB-201709:32=>Total:17 10.8.1.2181 11.22.3.12316 24-FEB-201709:33=>Total:16 11.22.3.12316 24-FEB-201709:34=>Total:19 11.22.3.12319 24-FEB-201709:35=>Total:16 11.22.3.12316 24-FEB-201709:36=>Total:17 22.3.4.1111 11.22.3.12316 24-FEB-201709:37=>Total:16 12.5.6.2211 11.22.3.12315 24-FEB-201709:38=>Total:13 11.22.3.12313 24-FEB-201709:39=>Total:29 11.22.3.12329 24-FEB-201709:40=>Total:43 11.22.3.12343 24-FEB-201709:41=>Total:38 22.3.4.1111 11.22.3.12337 24-FEB-201709:42=>Total:17 12.5.6.2211 11.22.3.12316 24-FEB-201709:43=>Total:16 11.22.3.12316 24-FEB-201709:44=>Total:42 11.22.3.12342 24-FEB-201709:45=>Total:21 11.22.3.12321 24-FEB-201709:46=>Total:16 22.3.4.1111 11.22.3.12315 24-FEB-201709:47=>Total:15 12.5.6.2211 11.22.3.12314 24-FEB-201709:48=>Total:12 11.22.3.12312 24-FEB-201709:49=>Total:25 11.22.3.12325 24-FEB-201709:50=>Total:47 11.22.3.12347 24-FEB-201709:51=>Total:41 22.3.4.1111 11.22.3.12340 24-FEB-201709:52=>Total:21 12.5.6.2211 11.22.3.12320 24-FEB-201709:53=>Total:12 11.22.3.12312 24-FEB-201709:54=>Total:17 11.22.3.12317 24-FEB-201709:55=>Total:16 11.22.3.12316 24-FEB-201709:56=>Total:20 22.3.4.1111 11.22.3.12319 24-FEB-201709:57=>Total:17 12.5.6.2211 11.22.3.12316 24-FEB-201709:58=>Total:17 11.22.3.12317 24-FEB-201709:59=>Total:21 11.22.3.12321 [ora@mynwdbtrace]$
可以看到09:30的时候,是来自11.22.3.123有49个连接,来自22.3.4.111有1个连接。
注1:lsnr_analyzer.pl是用来分析已经establish的连接,如果是其他连接,不统计在内。
注2:lsnr_analyzer.pl的统计比较消耗cpu,30万行的listener日志分析大约1秒出结果,300万行的listener日志分析大约6秒出结果。所以如果listener日志比较大,建议tail剪切后统计。
对于已经出现tcpListenDrop的情况,我们分几种情况分析:
(1)如果是突然的连接风暴,需要控制前端的连接,如weblogic中间件逐台启动,而不是同时启动;
(2)如果一直连接数很高,导致listener无法处理,就需要拆分listener,如按照端口拆分。
点击下载:lsnr_analyzer.pl
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作能带来一定的帮助,如果有疑问大家可以留言交流,谢谢大家对毛票票的支持。