Discussion:
HDFS DirectoryScanner is bothering me
Yen-Onn Hiu
2018-11-29 08:56:39 UTC
Permalink
hi all,

I am on HDFS version Hadoop 2.6.0-cdh5.8.0. I discovered that the
DirectoryScanner keeps causing Java GC pauses and slowing down the Hadoop nodes.
Digging into the log file, I found this:

2018-11-29 13:34:37,995 INFO
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool
BP-1850109981-192.168.1.1-1413178082983 Total blocks: 3896197, missing
metadata files:214, missing block files:214, missing blocks in memory:103,
mismatched blocks:0

From postings I found on the internet, this message comes from the
DirectoryScanner, which runs every 6 hours. This directory scan causes a GC
hiccup on all nodes and performance issues across the cluster.

Question: when I run hdfs dfsadmin -report, it does not report any corrupted
files. I also ran hdfs fsck / against the directory and it did not find any
problems. How can I find out what the missing block files, missing blocks in
memory, and missing metadata files are?
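
For reference, these are roughly the commands I ran (the datanode log path is just an example from my setup; adjust it to your install):

  hdfs dfsadmin -report | grep -iE 'missing|corrupt'
  hdfs fsck /
  # the DirectoryScanner summary quoted above comes from the datanode log, e.g.:
  grep DirectoryScanner /var/log/hadoop-hdfs/*datanode*.log*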


Thanks!
--
Hiu Yen Onn
Wei-Chiu Chuang
2018-12-04 12:34:36 UTC
Permalink
Do you have a heapdump? Without a heapdump it's not easy to definitively
point to the DirectoryScanner for GC issues.

That said, I did notice the DirectoryScanner holding the global lock for quite a
few seconds periodically, but that's unrelated to GC.
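
If you want to see whether it is the scanner thread holding the lock, a thread dump taken while the scan is running usually shows it (the PID lookup below is just one way of finding the datanode process):

  DN_PID=$(pgrep -f proc_datanode | head -1)
  jstack "$DN_PID" | grep -B 5 -A 20 DirectoryScanner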
Yen-Onn Hiu
2018-12-05 01:32:38 UTC
Permalink
hi Wei-Chiu,

Good to hear from you!

I have gathered some facts in the hope of showing you something solid.

[image: Screen Shot 2561-12-05 at 08.41.42.png]

[image: Screen Shot 2561-12-05 at 09.21.17.png]

These are the Graphite metrics I have gathered from the JVM metric
JvmMetrics/GcTimeMillis. From time to time, I can see a GC hiccup of 3 to 4
seconds on every datanode, every 6 hours (which matches the interval of
dfs.datanode.directoryscan.interval, defaulting to 21600 sec). Following each
hiccup, when I traced the log file I could clearly see this line:

INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool
BP-1850109981-192.168.1.1-1413178082983 Total blocks: 3894663, missing
metadata files:122, missing block files:122, missing blocks in memory:57,
mismatched blocks:0
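
For cross-checking without Graphite, the same counter can also be read straight from the datanode's JMX servlet (the hostname and HTTP port below are just my setup's defaults; adjust to your cluster):

  curl -s 'http://dn01.example.com:50075/jmx?qry=Hadoop:service=DataNode,name=JvmMetrics' \
    | grep -iE 'GcTimeMillis|GcCount'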

Looking at the GC log, which I have enabled, I can clearly see that a great
deal of time was spent on object copy.

[GC cleanup 4452M->2474M(8192M), 0.0124395 secs]

[Times: user=0.05 sys=0.00, real=0.01 secs]

2018-11-29T01:53:23.204+0700: 40161.800: [GC concurrent-cleanup-start]

2018-11-29T01:53:23.214+0700: 40161.810: [GC concurrent-cleanup-end,
0.0104121 secs]

2018-11-29T01:53:29.107+0700: 40167.702: [GC pause (G1 Evacuation Pause)
(young) 40167.702: [G1Ergonomics (CSet Construction) start choosing CSet,
_pending_cards: 65846, predicted base time: 30.37 ms, remaining time:
169.63 ms, target pause time: 200.00 ms]

40167.702: [G1Ergonomics (CSet Construction) add young regions to CSet,
eden: 131 regions, survivors: 31 regions, predicted young region time:
296.21 ms]

40167.702: [G1Ergonomics (CSet Construction) finish choosing CSet, eden:
131 regions, survivors: 31 regions, old: 0 regions, predicted pause time:
326.58 ms, target pause time: 200.00 ms]

40168.874: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason:
reclaimable percentage not over threshold, candidate old regions: 138
regions, reclaimable: 496648080 bytes (5.78 %), threshold: 10.00 %]

, 1.1718634 secs]

[Parallel Time: 1166.4 ms, GC Workers: 8]

[GC Worker Start (ms): Min: 40167703.2, Avg: 40167703.3, Max:
40167703.3, Diff: 0.1]

[Ext Root Scanning (ms): Min: 0.6, Avg: 0.8, Max: 1.0, Diff: 0.4,
Sum: 6.2]

[Update RS (ms): Min: 68.1, Avg: 68.2, Max: 68.3, Diff: 0.2, Sum:
545.9]

[Processed Buffers: Min: 24, Avg: 52.6, Max: 74, Diff: 50, Sum:
421]

[Scan RS (ms): Min: 23.9, Avg: 24.2, Max: 24.7, Diff: 0.8, Sum: 193.6]

[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
Sum: 0.0]

*[Object Copy (ms): Min: 1013.7, Avg: 1015.8, Max: 1020.4, Diff: 6.7,
Sum: 8126.1]*

[Termination (ms): Min: 51.8, Avg: 56.8, Max: 58.8, Diff: 7.0, Sum:
454.6]

[Termination Attempts: Min: 53698, Avg: 56791.9, Max: 58670, Diff:
4972, Sum: 454335]

[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum:
0.5]

[GC Worker Total (ms): Min: 1165.8, Avg: 1165.9, Max: 1166.0, Diff:
0.2, Sum: 9327.0]

[GC Worker End (ms): Min: 40168869.1, Avg: 40168869.2, Max:
40168869.2, Diff: 0.1]

[Code Root Fixup: 0.0 ms]

[Code Root Purge: 0.0 ms]

[Clear CT: 0.2 ms]

[Other: 5.2 ms]

[Choose CSet: 0.0 ms]

[Ref Proc: 2.4 ms]

[Ref Enq: 0.1 ms]

[Redirty Cards: 0.8 ms]

[Humongous Register: 0.2 ms]

[Humongous Reclaim: 0.0 ms]

[Free CSet: 0.8 ms]

[Eden: 524.0M(284.0M)->0.0B(356.0M) Survivors: 124.0M->52.0M Heap:
2809.1M(8192.0M)->2387.1M(8192.0M)]

[Times: user=8.90 sys=0.19, real=1.17 secs]


These are my Java opts:


-XX:+UseG1GC -Xmx8g -Xms8g -XX:MaxGCPauseMillis=200 -XX:ParallelGCThreads=8
-XX:+UnlockExperimentalVMOptions -XX:+ParallelRefProcEnabled
-XX:InitiatingHeapOccupancyPercent=50 -XX:G1MixedGCLiveThresholdPercent=70
-XX:G1HeapWastePercent=10 -Xloggc:/var/log/hadoop-hdfs/datanode-gc.log
-XX:G1LogLevel=finest -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintHeapAtGC
-XX:+PrintAdaptiveSizePolicy
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.port=8004
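
Besides the GC log, GC activity can also be sampled live with jstat while the scan runs (finding the PID via -Dproc_datanode is an assumption about how the daemon was started):

  # print GC utilisation of the datanode JVM every second
  DN_PID=$(pgrep -f proc_datanode | head -1)
  jstat -gcutil "$DN_PID" 1000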


Together with this, I have generated a heapdump. Since the dump file is too
big (357 MB), I cannot attach it to this email; however, I have attached some
screenshots. I am not sure whether my guess is correct: the big objects
referenced in the GC log could have been FsDatasetImpl and ReplicaMap. I am
just speculating about those Java objects.
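
For the record, a heapdump like mine can be taken with jmap along these lines (the PID lookup and output path are just examples):

  # assumes the datanode JVM has -Dproc_datanode on its command line
  DN_PID=$(pgrep -f proc_datanode | head -1)
  jmap -dump:live,format=b,file=/tmp/datanode.hprof "$DN_PID"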


[image: Screen Shot 2561-12-05 at 08.53.07.png]

[image: Screen Shot 2561-12-05 at 08.53.35.png]
[image: Screen Shot 2561-12-05 at 09.08.10.png]

[image: Screen Shot 2561-12-05 at 08.58.20.png]


Please let me know if there is anything I can do. I have gone through this link
(
https://blog.cloudera.com/blog/2016/12/hdfs-datanode-scanners-and-disk-checker-explained/)
many times to understand the behaviour of DirectoryScanner, BlockScanner, and
VolumeScanner. Here are some questions I still want answered.


1. > INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner:
BlockPool BP-1850109981-192.168.1.1-1413178082983 Total blocks: 3894663,
missing metadata files:122, missing block files:122, missing blocks in
memory:57, mismatched blocks:0

I would like to know what causes the missing metadata files, missing block
files, and missing blocks in memory reported in this line.


2. What is the effect of prolonging dfs.datanode.directoryscan.interval from 6
hours to 12 hours? Will it do any harm to the system?


3. Will block scanning fix this? It defaults to running only once every 3
weeks. Can I trigger the scan explicitly from the command line? (Some checks on
the current settings are sketched below.)
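
To make the questions above concrete, these are the kinds of checks that can be run (the block id and data directory below are made-up examples; the property names are from hdfs-default.xml):

  # effective values as seen from the local configuration
  hdfs getconf -confKey dfs.datanode.directoryscan.interval   # seconds; 21600 = 6 hours
  hdfs getconf -confKey dfs.datanode.scan.period.hours        # hours; 504 = 3 weeks (block scanner)

  # my understanding is that a healthy on-disk replica has a block file plus a .meta file,
  # so a block the scanner complains about can be looked up directly:
  find /data/*/dfs/dn/current -name 'blk_1073741825*'
  #   .../blk_1073741825            <- block data
  #   .../blk_1073741825_1001.meta  <- checksum metadata (genstamp 1001 is an example)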


Thank you so much for your attention.
--
Hiu Yen Onn