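Preface
- When CPU or memory load on a server climbs too high, a Java engineer needs to monitor the machine and pin down the cause, for example a thread deadlock or a memory overflow.
- This article records some commands for inspecting CPU, process/thread, and JVM heap and stack information on Linux, for later reference.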
top
- top is a commonly used Linux command for viewing CPU, process, and memory metrics:
[root@localhost data/project]$ top
top - 11:17:42 up 177 days, 19:34, 6 users, load average: 6.58, 10.04, 9.68
Tasks: 145 total, 1 running, 144 sleeping, 0 stopped, 0 zombie
%Cpu(s): 19.2 us, 10.0 sy, 0.0 ni, 70.5 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 32780448 total, 4776344 free, 26326688 used, 1677416 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 5989168 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12241 root 20 0 8525184 1.8g 14780 S 46.0 5.8 1124:54 java
......
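- Line 1, system information:
  - 11:17:42: current system time
  - up 177 days, 19:34: time since the system was booted
  - 6 users: number of logged-in users
  - load average: 6.58, 10.04, 9.68: average system load over the last 1, 5, and 15 minutes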
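A load average is easier to judge relative to the number of CPUs. The sketch below divides the 1-minute load by the CPU count; `load_per_cpu` is a hypothetical helper name, the CPU count of 4 is assumed for illustration, and reading the ratio as "close to or above 1 means the CPUs are saturated" is a common rule of thumb rather than something top reports.

```shell
# Hypothetical helper: 1-minute load average divided by the CPU count.
# $1 = load average, $2 = number of logical CPUs.
load_per_cpu() {
  awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# 6.58 is the 1-minute load from the sample output; 4 CPUs is an assumption.
load_per_cpu 6.58 4

# On a live Linux host the inputs could be read like this instead:
#   load_per_cpu "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
```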
- Line 2, task summary:
  - 145 total: 145 tasks on the system
  - 1 running
  - 144 sleeping
  - 0 stopped
  - 0 zombie
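- Line 3, CPU usage (percent):
  - us: time spent in user space, 19.2
  - sy: time spent in kernel space, 10.0
  - ni: time spent running user processes with a modified nice value, 0.0
  - id: idle time, 70.5
  - wa: time spent waiting for I/O, 0.0
  - hi: time spent servicing hardware interrupts, 0.0
  - si: time spent servicing software interrupts, 0.3
  - st: steal time, CPU taken from this virtual machine for other virtual machines, 0.0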
- Note: on a multi-CPU server, top shows the average across all CPUs by default; press 1 to show each CPU separately.
[root@localhost data/project]$ top
top - 11:17:42 up 177 days, 19:34, 6 users, load average: 6.58, 10.04, 9.68
Tasks: 145 total, 1 running, 144 sleeping, 0 stopped, 0 zombie
%Cpu0 : 21.6 us, 12.2 sy, 0.0 ni, 65.9 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu1 : 20.5 us, 9.6 sy, 0.0 ni, 69.6 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu2 : 25.6 us, 10.2 sy, 0.0 ni, 63.8 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu3 : 20.0 us, 8.6 sy, 0.0 ni, 71.0 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
......
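- Lines 4 and 5, physical memory and swap (in KiB):
  - total: total memory, 32780448
  - free: free memory, 4776344
  - used: used memory, 26326688
  - buff/cache: memory used by buffers and the page cache, 1677416
  - avail Mem: an estimate of the memory available for starting new applications without swapping, 5989168 (this is not the virtual memory size)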
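- Note: the system uses physical memory first and falls back to swap space on disk only when physical memory runs short; staying in physical memory is generally better for performance.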
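To track how close a host is to running out of physical memory, the available share can be computed from the same numbers. The sketch below reads `/proc/meminfo`-style input; `mem_available_pct` is a hypothetical helper name, and the sample input is built from the total and avail Mem values in the top output above.

```shell
# Hypothetical helper: percentage of memory still available, computed from
# /proc/meminfo-style input on stdin (MemTotal and MemAvailable, in kB).
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.1f\n", avail * 100 / total }'
}

# Sample input using the values from the top output above.
printf 'MemTotal: 32780448 kB\nMemAvailable: 5989168 kB\n' | mem_available_pct

# On a live Linux host: mem_available_pct < /proc/meminfo
```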
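- Line 6 onward, per-process columns:
  - PID: process ID
  - USER: user that owns the process
  - PR: scheduling priority
  - NI: nice value
  - VIRT: virtual memory used by the process
  - RES: resident (physical) memory used by the process
  - SHR: shared memory used by the process
  - S: process state: R = running, S = interruptible sleep, D = uninterruptible sleep, T = stopped, Z = zombie
  - %CPU: share of CPU time used by the process
  - %MEM: share of physical memory used by the process
  - TIME+: cumulative CPU time the process has used since it started
  - COMMAND: command that started the process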
- Note: press uppercase P to sort by CPU usage, uppercase M to sort by physical memory usage, and uppercase H to toggle the display of individual threads.
- Note: top -c prints the full command line of each process.
[root@localhost data/project]$ top -c
top - 14:05:54 up 177 days, 22:22, 6 users, load average: 9.02, 7.19, 6.26
Tasks: 146 total, 2 running, 144 sleeping, 0 stopped, 0 zombie
%Cpu(s): 24.2 us, 10.6 sy, 0.0 ni, 64.8 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 32780448 total, 3990592 free, 26324892 used, 2464964 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 5981456 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12241 root 20 0 8525184 1.8g 14780 S 45.0 5.9 1201:04 java -Xmx2024m -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -jar java-demo.jar
2 root 20 0 0 0 0 S 0.0 0.0 0:00.71 [kthreadd]
3 root 20 0 0 0 0 S 0.0 0.0 14:58.90 [ksoftirqd/0]
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/0:0H]
......
- Note: top -Hp pid shows all threads of one process; in this mode the PID column holds the thread ID in decimal.
[root@localhost data/project]$ top -Hp 12241
top - 15:25:41 up 177 days, 23:42, 9 users, load average: 5.26, 9.69, 9.37
Threads: 2468 total, 4 running, 2464 sleeping, 0 stopped, 0 zombie
%Cpu(s): 16.5 us, 19.4 sy, 0.0 ni, 63.1 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
KiB Mem : 32780448 total, 4689192 free, 25684352 used, 2406904 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 6612764 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13352 yocd 20 0 8525184 1.8g 14780 S 4.8 5.9 0:24.00 java
13422 yocd 20 0 8525184 1.8g 14780 S 4.8 5.9 8:09.07 java
13499 yocd 20 0 8525184 1.8g 14780 S 4.8 5.9 0:08.02 java
......
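Because top -Hp prints thread IDs in decimal while jstack prints them in hexadecimal (the nid field), a conversion step is needed between the two tools. A minimal sketch, where `tid_to_nid` is a hypothetical helper name and 12453 is an illustrative thread ID chosen to match the nid used in the jstack grep example later in this article:

```shell
# Hypothetical helper: convert a decimal thread ID, as shown in the PID column
# of top -Hp, to the hexadecimal form that jstack prints in its nid field.
tid_to_nid() {
  printf '0x%x\n' "$1"
}

tid_to_nid 12453   # prints 0x30a5
```

With a real process the two steps could be combined as `jstack 12241 | grep -A 10 "nid=$(tid_to_nid 12453)"`.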
jstack
- jstack prints the thread stacks of a JVM and is mainly used to diagnose problems such as thread deadlocks. Run jstack -help to see its usage:
[root@localhost data/project]$ jstack -help
Usage:
jstack [-l] <pid>
(to connect to running process)
jstack -F [-m] [-l] <pid>
(to connect to a hung process)
jstack [-m] [-l] <executable> <core>
(to connect to a core file)
jstack [-m] [-l] [server_id@]<remote server IP or hostname>
(to connect to a remote debug server)
Options:
-F to force a thread dump. Use when jstack <pid> does not respond (process is hung)
-m to print both java and native frames (mixed mode)
-l long listing. Prints additional information about locks
-h or -help to print this help message
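- The most common form is jstack pid. The options below are optional and usually not needed:
  - -F: force a stack dump when the pid does not respond
  - -m: mixed mode, print both Java and native (C/C++) frames
  - -l: long listing, print additional lock information such as the list of owned java.util.concurrent synchronizers; this can cause the JVM to pause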
[root@localhost data/project]$ jstack 12241
......
"logback-9" #23 daemon prio=5 os_prio=0 tid=0x00007ff018b20800 nid=0x2fff waiting on condition [0x00007fef9597d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000081d49d08> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"logback-8" #22 daemon prio=5 os_prio=0 tid=0x00007fefa4003000 nid=0x2ffe waiting on condition [0x00007fefc49b4000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000081d49d08> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"com.alibaba.nacos.client.Worker.longPolling.fixed-47.111.74.227_18848-04d07ec1-2439-4444-9498-ff5df4d8e368" #29 daemon prio=5 os_prio=0 tid=0x00007fef78003000 nid=0x3007 runnable [0x00007fef95177000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0x00000000a1c82f30> (a java.io.BufferedInputStream)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
- locked <0x00000000a1c82f88> (a sun.net.www.protocol.http.HttpURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
- locked <0x00000000a1c82f88> (a sun.net.www.protocol.http.HttpURLConnection)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.alibaba.nacos.client.config.impl.HttpSimpleClient.httpPost(HttpSimpleClient.java:119)
at com.alibaba.nacos.client.config.http.ServerHttpAgent.httpPost(ServerHttpAgent.java:143)
at com.alibaba.nacos.client.config.http.MetricsHttpAgent.httpPost(MetricsHttpAgent.java:64)
at com.alibaba.nacos.client.config.impl.ClientWorker.checkUpdateConfigStr(ClientWorker.java:386)
at com.alibaba.nacos.client.config.impl.ClientWorker.checkUpdateDataIds(ClientWorker.java:354)
at com.alibaba.nacos.client.config.impl.ClientWorker$LongPollingRunnable.run(ClientWorker.java:521)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
......
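- To find the stack of a specific thread: jstack <pid> | grep -C 10 <thread ID in hexadecimal>
- For example: jstack 12241 | grep -C 10 30a5
- Note: in the command above, -C 10 prints the 10 lines before and after each matching line, while -A 10 prints only the 10 lines after it.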
[root@localhost data/project]$ jstack 12241 | grep -C 10 30a5
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"ClientHouseKeepingService" #51 daemon prio=5 os_prio=0 tid=0x00007ff0185d3800 nid=0x30a7 in Object.wait() [0x00007fef7d4b5000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:552)
- locked <0x0000000087d87890> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:505)
"AsyncAppender-Dispatcher-Thread-10" #49 daemon prio=5 os_prio=0 tid=0x00007ff01a0f9800 nid=0x30a5 in Object.wait() [0x00007fef7ebf8000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at com.aliyun.openservices.shade.com.alibaba.rocketmq.logging.inner.LoggingBuilder$AsyncAppender$Dispatcher.run(LoggingBuilder.java:386)
- locked <0x0000000087c18538> (a java.util.ArrayList)
at java.lang.Thread.run(Thread.java:748)
"com.alibaba.nacos.client.Worker.longPolling.fixed-47.111.74.227_18848-04d07ec1-2439-4444-9498-ff5df4d8e368" #46 daemon prio=5 os_prio=0 tid=0x00007fef78005800 nid=0x3096 waiting on condition [0x00007fef7dcb7000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
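The difference between the grep context options is easiest to see on a tiny input. In this sketch five sample lines stand in for a thread dump; `sample` is a hypothetical helper, and -B (lines before the match) is shown alongside -A and -C for completeness.

```shell
# Five sample lines stand in for a thread dump.
sample() { printf '%s\n' one two three four five; }

sample | grep -A 1 three   # the match plus 1 line after it
sample | grep -B 1 three   # the match plus 1 line before it
sample | grep -C 1 three   # the match plus 1 line on each side
```

Since a thread's stack frames follow its header line, -A is usually enough when the pattern matches the header (the line containing nid=).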
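- Some of the markers that appear in a thread stack:
  - locked <address> target: the thread acquired the object's monitor with synchronized and is its owner.
  - waiting to lock <address> target: the thread failed to acquire the object's monitor with synchronized and is waiting in the entry set.
  - waiting on <address> target: the thread acquired the object's monitor with synchronized, then released it and is waiting in the wait set.
  - parking to wait for <address> target: the thread is parked (see the Unsafe.park and LockSupport.park frames in the dumps above) and is waiting for a java.util.concurrent synchronizer such as the ConditionObject shown there.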
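Before reading individual stacks, it can help to see how many threads are in each state. The following is a rough sketch, not a jstack feature: `count_thread_states` is a hypothetical helper that counts the java.lang.Thread.State lines of a dump read from stdin.

```shell
# Hypothetical helper: count threads per state in a jstack dump read from stdin.
count_thread_states() {
  awk '/java\.lang\.Thread\.State:/ { count[$2]++ }
       END { for (state in count) print state, count[state] }' | sort
}

# Sample lines copied from the dump shown earlier.
printf '%s\n' \
  '   java.lang.Thread.State: WAITING (parking)' \
  '   java.lang.Thread.State: TIMED_WAITING (parking)' \
  '   java.lang.Thread.State: RUNNABLE' |
  count_thread_states

# With a real process: jstack 12241 | count_thread_states
```

A large number of BLOCKED threads in such a summary would be a hint to look for "waiting to lock" entries in the full dump.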
jmap
- jmap shows how memory is allocated in a running process and can be used for JVM tuning and monitoring. Run jmap -help to see its usage:
[root@localhost data/project]$ jmap -help
Usage:
jmap [option] <pid>
(to connect to running process)
jmap [option] <executable <core>
(to connect to a core file)
jmap [option] [server_id@]<remote server IP or hostname>
(to connect to remote debug server)
where <option> is one of:
<none> to print same info as Solaris pmap
-heap to print java heap summary
-histo[:live] to print histogram of java object heap; if the "live"
suboption is specified, only count live objects
-clstats to print class loader statistics
-finalizerinfo to print information on objects awaiting finalization
-dump:<dump-options> to dump java heap in hprof binary format
dump-options:
live dump only live objects; if not specified,
all objects in the heap are dumped.
format=b binary format
file=<file> dump heap to <file>
Example: jmap -dump:live,format=b,file=heap.bin <pid>
-F force. Use with -dump:<dump-options> <pid> or -histo
to force a heap dump or histogram when <pid> does not
respond. The "live" suboption is not supported
in this mode.
-h | -help to print this help message
-J<flag> to pass <flag> directly to the runtime system
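- jmap options:
  - -heap: print a detailed summary of the Java heap
  - -histo: print a histogram of the objects in the Java heap
  - -clstats: print class loader statistics
  - -finalizerinfo: print information about objects awaiting finalization
  - -dump: dump the Java heap to a file in hprof binary format
  - -F: force mode, used with -dump or -histo when the pid does not respond; the live suboption is not supported in this mode
  - -J: pass a flag to the JVM that runs jmap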
- Example 1: jmap -heap pid
[root@localhost data/project]$ jmap -heap 12241
Attaching to process ID 12241, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.211-b12
using thread-local object allocation.
Garbage-First (G1) GC with 4 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 2122317824 (2024.0MB)
NewSize = 1363144 (1.2999954223632812MB)
MaxNewSize = 1272971264 (1214.0MB)
OldSize = 5452592 (5.1999969482421875MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 1048576 (1.0MB)
Heap Usage:
G1 Heap:
regions = 2024
capacity = 2122317824 (2024.0MB)
used = 656735728 (626.3119964599609MB)
free = 1465582096 (1397.688003540039MB)
30.94426859980044% used
G1 Young Generation:
Eden Space:
regions = 243
capacity = 802160640 (765.0MB)
used = 254803968 (243.0MB)
free = 547356672 (522.0MB)
31.764705882352942% used
Survivor Space:
regions = 8
capacity = 8388608 (8.0MB)
used = 8388608 (8.0MB)
free = 0 (0.0MB)
100.0% used
G1 Old Generation:
regions = 376
capacity = 495976448 (473.0MB)
used = 393543152 (375.31199645996094MB)
free = 102433296 (97.68800354003906MB)
79.3471451289558% used
59711 interned Strings occupying 6045272 bytes.
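The jmap -heap output is long, so for a quick check it can be reduced to one line per region. This is a sketch that assumes the layout shown above, where a section title ending in a colon is followed by a "% used" line; `heap_usage_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<section>: <percent used>" for each section of
# jmap -heap output read from stdin. Assumes section titles end with ":".
heap_usage_summary() {
  awk '/:$/ { section = $0
              sub(/^[ \t]+/, "", section)
              sub(/:$/, "", section) }
       /% used$/ { printf "%s: %.1f%%\n", section, $1 + 0 }'
}

# Sample lines copied from the output above.
printf '%s\n' \
  'G1 Heap:' \
  '   regions  = 2024' \
  '   30.94426859980044% used' \
  'G1 Old Generation:' \
  '   regions  = 376' \
  '   79.3471451289558% used' |
  heap_usage_summary

# With a real process: jmap -heap 12241 | heap_usage_summary
```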
- Example 2: jmap -clstats pid
[root@localhost data/project]$ jmap -clstats 12241
Attaching to process ID 12241, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.211-b12
finding class loader instances ..
......
jstat
- jstat is a practical command for monitoring class loading, GC, and other JVM statistics, and can be used for JVM tuning and monitoring.
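- jstat options:
  - -class: number of loaded classes, space they occupy, and so on
  - -compiler: JIT compilation statistics
  - -gc: garbage collection statistics
  - -gccapacity: capacities of the young generation, old generation, and permanent generation (metaspace in Java 8)
  - -gcnew: young generation statistics
  - -gcnewcapacity: young generation capacity statistics
  - -gcold: old generation statistics
  - -gcoldcapacity: old generation capacity statistics
  - -gcpermcapacity: permanent generation capacity statistics (Java 7 and earlier; Java 8 replaced it with -gcmetacapacity)
  - -gcutil: summary of heap usage as percentages of capacity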
[root@localhost data/project]$ jstat -gc 12241
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
0.0 8192.0 0.0 8192.0 304128.0 163840.0 214016.0 165812.3 156208.0 143758.2 17252.0 15374.6 7984 397.008 1 1.000 398.008
[root@localhost data/project]$ jstat -gcutil 12241
S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
0.00 100.00 43.92 78.05 92.03 89.12 8055 399.976 1 1.000 400.976
[root@localhost data/project]$ jstat -gccapacity 12241
NGCMN NGCMX NGC S0C S1C EC OGCMN OGCMX OGC OC MCMN MCMX MC CCSMN CCSMX CCSC YGC FGC
0.0 2072576.0 311296.0 0.0 8192.0 303104.0 0.0 2072576.0 215040.0 215040.0 0.0 1187840.0 156208.0 0.0 1048576.0 17252.0 8029 1
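- Output columns (capacities and used sizes are in KB):
  - S0: survivor space 0 usage as a percentage of its capacity
  - S0C: survivor space 0 capacity
  - S0U: survivor space 0 used size
  - S1: survivor space 1 usage as a percentage of its capacity
  - S1C: survivor space 1 capacity
  - S1U: survivor space 1 used size
  - E: Eden space usage as a percentage of its capacity
  - EC: Eden space capacity
  - EU: Eden space used size
  - O: old generation usage as a percentage of its capacity
  - OC: old generation capacity
  - OU: old generation used size
  - P: permanent generation usage as a percentage of its capacity (Java 7 and earlier; the Java 8 output above shows M for metaspace instead)
  - PC: permanent generation capacity (MC in the Java 8 output above)
  - PU: permanent generation used size (MU in the Java 8 output above)
  - YGC: number of young GCs so far
  - YGCT: total time spent in young GCs (seconds)
  - FGC: number of full GCs so far
  - FGCT: total time spent in full GCs (seconds)
  - GCT: total GC time, that is YGCT + FGCT
  - NGCMN: minimum young generation capacity
  - NGCMX: maximum young generation capacity
  - NGC: current young generation capacity
  - OGCMN: minimum old generation capacity
  - OGCMX: maximum old generation capacity
  - OGC: current old generation capacity
  - ...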
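The columns above can be combined into figures that are easier to watch over time. The sketch below takes jstat -gc output (a header line and a data line) and derives the used heap size and the average young GC pause; `gc_summary` is a hypothetical helper name, and looking columns up by header name is a design choice so the script does not depend on column order.

```shell
# Hypothetical helper: summarize jstat -gc output read from stdin.
# Line 1 is the header, line 2 the values; sizes are in KB, times in seconds.
gc_summary() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
       NR == 2 {
         used = $(col["S0U"]) + $(col["S1U"]) + $(col["EU"]) + $(col["OU"])
         ygc = $(col["YGC"])
         avg_ms = (ygc > 0) ? $(col["YGCT"]) * 1000 / ygc : 0
         printf "heap_used_kb=%.1f avg_young_gc_ms=%.1f\n", used, avg_ms
       }'
}

# Sample input copied from the jstat -gc output above.
printf '%s\n' \
  'S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT' \
  '0.0 8192.0 0.0 8192.0 304128.0 163840.0 214016.0 165812.3 156208.0 143758.2 17252.0 15374.6 7984 397.008 1 1.000 398.008' |
  gc_summary

# With a real process: jstat -gc 12241 | gc_summary
```

jstat also accepts an interval and a count after the pid, for example `jstat -gcutil 12241 1000 10` to print one sample per second, ten times.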
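Conclusion
- The commands above are only a small sample; the official documentation covers their usage in more detail.
- Finally, stay humble, keep learning, and improve together -_-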