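While investigating the ccps "too many open files" issue in the O2 project, a number of questions came up, so I just ran the following test.

1. How to check the number of files a process currently has open (the number fluctuates in real time). ccps is used as the example below.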
1) Get the PID (process ID) of the program:
    ps -ef | grep ccps
Running it gives:
    [root@vvmocmp1 ccps]# ps -ef | grep ccps
    root 5661 1 0 20:33 pts/2 00:00:00 /bin/sh /opt/OC/ccps/jboss-4.2.3.GA/bin/run.sh -c all -g ccpsgroup -b 0.0.0.0
    root 5685 5661 94 20:33 pts/2 00:00:17 /usr/java/jdk1.6.0_13/bin/java -Dprogram.name=run.sh -server -Dcom.sun.management.jmxremote -Djava.awt.headless=true -Xms1024m -Xmx1024m -XX:PermSize=64m -XX:MaxPermSize=256m -Djava.util.logging.config.file=/opt/OC/ccps/jboss-4.2.3.GA/server/all/ccps/applicationContext/logging.properties -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n -Djava.net.preferIPv4Stack=true -Djava.endorsed.dirs=/opt/OC/ccps/jboss-4.2.3.GA/lib/endorsed -classpath /opt/OC/ccps/jboss-4.2.3.GA/server/all/ccps/applicationContext:/opt/OC/ccps/jboss-4.2.3.GA/bin/run.jar:/usr/java/jdk1.6.0_13/lib/tools.jar org.jboss.Main -c all -g ccpsgroup -b 0.0.0.0
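From the output above, the ccps process is owned by root and its PID is 5685. That is the java process; PID 5661 is the run.sh wrapper that launched it.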
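The PID lookup can also be scripted. Below is a minimal sketch, assuming a Linux /proc filesystem; the helper name `find_pids` is made up for illustration, and the pattern in the usage comment is taken from the java command line shown above.

```shell
#!/bin/sh
# find_pids PATTERN: print the PID of every process whose command line
# matches PATTERN (a grep basic regular expression).
# Hypothetical helper: it scans /proc/<pid>/cmdline directly, so no grep
# process ever shows up in the result, unlike `ps -ef | grep ...`.
# The calling shell is listed too if its own command line contains PATTERN.
find_pids() {
    pattern=$1
    for dir in /proc/[0-9]*; do
        # A process may exit while we scan; skip entries we cannot read.
        [ -r "$dir/cmdline" ] || continue
        # cmdline is NUL-separated, so turn the NULs into spaces first.
        if { tr '\0' ' ' < "$dir/cmdline"; } 2>/dev/null | grep -q -e "$pattern"; then
            echo "${dir#/proc/}"
        fi
    done
}

# Usage (matches the JVM, not the run.sh wrapper):
#   find_pids 'org.jboss.Main'
```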
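2) Use that PID to get, in real time, the number of files the process has open:
    ls -l /proc/5685/fd/ | wc -l
The result includes the "total" line that ls -l prints first, so it is one higher than the actual number of descriptors.
(Note: sources on the web say that lsof -p pid can be used to count the files a process has open, but in my test that number was not accurate. The /proc count is the one that tracks the real state: as soon as it exceeds the ulimit -n value, the "too many open files" error appears.)
The root user is subject to ulimit as well. Run ulimit -n in the current session to see the current user's limit; the default is 1024.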
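The same check can be wrapped in two small functions. This is only a sketch, assuming a Linux /proc filesystem and permission to read the target's fd directory (root, or the process owner); the names `count_fds` and `watch_fds` are made up for illustration.

```shell
#!/bin/sh
# count_fds PID: print the number of file descriptors PID has open.
# Plain `ls` (without -l) prints no "total" line, so the count is exact.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# watch_fds PID [INTERVAL]: print a timestamped count every INTERVAL
# seconds (default 1) until the process exits.
watch_fds() {
    while [ -d "/proc/$1" ]; do
        printf '%s open fds: %s\n' "$(date +%T)" "$(count_fds "$1")"
        sleep "${2:-1}"
    done
}

# Usage:
#   count_fds 5685
#   watch_fds 5685 2
```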
2. Check the current user's per-process limit on open files with ulimit -n (default 1024). ulimit -a shows all limits:
    [root@vvmocmp1 ~]# ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 1024
    max locked memory       (kbytes, -l) 32
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    stack size              (kbytes, -s) 10240
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 143359
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
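ulimit reports the limits of the current shell, while a process that is already running keeps the limits it inherited when it started. On kernels that provide /proc/PID/limits (Linux 2.6.24 and later; it may be missing on older systems), the limit in effect for a given process can be read directly. A minimal sketch, with the made-up helper names `soft_nofile` and `hard_nofile`:

```shell
#!/bin/sh
# The relevant line of /proc/PID/limits looks like:
#   Max open files            1024                 4096                 files
# so the soft limit is field 4 and the hard limit is field 5.

# soft_nofile PID: soft "Max open files" limit in effect for PID.
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# hard_nofile PID: hard "Max open files" limit in effect for PID.
hard_nofile() {
    awk '/^Max open files/ { print $5 }' "/proc/$1/limits"
}

# Usage:
#   soft_nofile 5685
#   hard_nofile 5685
```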
3. Change the user's maximum number of open files. Add the following lines to /etc/security/limits.conf:
    root soft nofile 500
    root hard nofile 500
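Notes: The purpose of the test is to make the ccps process, which runs as root, hit "too many open files", so the limit is deliberately set very low (500). The entries are for root because root owns ccps. After editing the file, log in to the server again as root; the change is then in effect, which you can verify with ulimit -n.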
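For a throwaway test, an alternative to editing limits.conf is to lower the limit only for the command being launched. This is a different technique from the one used in this test, shown only as a sketch: the function name `run_with_nofile` is made up, and the run.sh arguments in the usage comment are copied from the ps output above.

```shell
#!/bin/sh
# run_with_nofile LIMIT COMMAND [ARGS...]
# Run COMMAND with its open-files limit set to LIMIT. Without -S or -H,
# `ulimit -n` sets both the soft and the hard limit, mirroring the two
# limits.conf lines. The subshell keeps the change away from the caller,
# so nothing has to be undone afterwards.
run_with_nofile() {
    limit=$1
    shift
    (
        ulimit -n "$limit" || exit 1
        exec "$@"
    )
}

# Usage:
#   run_with_nofile 500 /opt/OC/ccps/jboss-4.2.3.GA/bin/run.sh -c all -g ccpsgroup -b 0.0.0.0
```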
4. Start ccps from the session opened by the re-login in step 3 (this is essential), so that ccps fails as soon as it has more than 500 files open at the same time.
    [root@vvmocmp1 ccps]# tail -f ./log/ccps.log
        at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:270)
        at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:224)
        at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
    Caused by: java.io.IOException: java.io.IOException: error=24, Too many open files
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
        ... 13 more
    [12-22 20:38:20,564] WARN [DefaultQuartzScheduler_Worker-8] SystemResourceMonitor.updateDiskInfoMap(826) | Can't execute "/bin/df -lP /var/opt/OC/ccps/cdr/local/" command
    java.io.IOException: Cannot run program "/bin/df": java.io.IOException: error=24, Too many open files
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
        at java.lang.Runtime.exec(Runtime.java:593)
        at java.lang.Runtime.exec(Runtime.java:431)
        at java.lang.Runtime.exec(Runtime.java:328)
        at com.hp.opencall.ccps.ccs.performance.SystemResourceMonitor.updateDiskInfoMap(SystemResourceMonitor.java:824)
At this point the ls -l count is around 500, whereas lsof -p reports more than 800. That is well above the limit, so the lsof figure is clearly not the number the limit is applied to.
    [root@vvmocmp1 ccps]# ls -l /proc/5685/fd/ | wc -l
    489
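One likely reason for the gap, not verified on this system: besides real descriptors, lsof -p also lists entries such as the current directory (cwd), the root directory (rtd), the program text (txt) and memory-mapped files (mem), none of which count against ulimit -n. In lsof's default output, the FD column of a real descriptor starts with a digit. The sketch below splits the output on that basis; the function name `count_lsof_fds` is made up for illustration.

```shell
#!/bin/sh
# count_lsof_fds: read `lsof -p PID` output on stdin and report how many
# rows are real descriptors (FD column, field 4, starts with a digit)
# and how many are other entries (cwd, rtd, txt, mem, ...).
# Assumes lsof's default column layout; the header row is skipped.
count_lsof_fds() {
    awk 'NR > 1 { if ($4 ~ /^[0-9]/) fd++; else other++ }
         END   { printf "descriptors=%d other=%d\n", fd, other }'
}

# Usage:
#   lsof -p 5685 | count_lsof_fds
```

If the explanation holds, the descriptors figure should be close to the /proc/5685/fd count, and the rest of the lsof total should fall under other.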
Note: once the test is finished, be sure to remove the two lines added to /etc/security/limits.conf above.