linux - Why does `change_protection` hog CPU while loading a large amount of data into RAM?


We have built an in-memory database that eats 100-150 GB of RAM in a single Vec, populated like this:

let mut result = Vec::with_capacity(a_very_large_number);
while let Ok(n) = reader.read(&mut buffer) {
    result.push(...);
}

perf top shows that most of the time is spent in the kernel's change_protection function:

Samples: 48K of event 'cpu-clock', Event count (approx.): 694742858
 62.45%  [kernel]              [k] change_protection
 18.18%  iron                  [.] database::database::init::h63748
  7.45%  [kernel]              [k] vm_normal_page
  4.88%  libc-2.17.so          [.] __memcpy_ssse3_back
  0.92%  [kernel]              [k] copy_user_enhanced_fast_string
  0.52%  iron                  [.] memcpy@plt

The CPU usage of this function grows as more data is loaded into RAM:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12383 iron      20   0  137g  91g 1372 D 76.1 37.9  27:37.00 iron

The code is running on an r3.8xlarge AWS EC2 instance, and transparent huge pages are disabled:

[~]$ cat /sys/kernel/mm/transparent_hugepage/defrag
madvise [never]
[~]$ cat /sys/kernel/mm/transparent_hugepage/enabled
madvise [never]

cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
stepping        : 4
microcode       : 0x428
cpu MHz         : 2500.070
cache size      : 25600 KB
physical id     : 0
siblings        : 16
core id         : 0
cpu cores       : 8
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm xsaveopt fsgsbase smep erms
bogomips        : 5000.14
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Kernel:

3.14.35-28.38.amzn1.x86_64 

The real question is: why is there overhead in this function at all?

This looks like an OS issue rather than something specific to that Rust code.

Most OSes (including Linux) use demand paging. By default, Linux does not allocate physical pages for newly allocated memory. Instead, it maps a single zero page with read-only permissions into the allocated memory (i.e., all the virtual memory pages point to a single physical memory page).

If you attempt to write to that memory, a page fault will happen, a new physical page will be allocated, and its permissions will be set accordingly.

I'm guessing you are seeing this effect in your program. If you try to do the same thing a second time, it should be faster. There are also ways to control this policy via sysctl: https://www.kernel.org/doc/Documentation/vm/overcommit-accounting.

I'm not sure why you disabled THP, but in this case large pages might help you, since the protection change would happen once per large page (2 MiB) instead of once per normal page (4 KiB).

