[Computer][db] Checking information that helps decide how much memory MongoDB needs

Default cache settings since MongoDB v3.4

  • WiredTiger internal cache size : the larger of 256 MB and "50% of (total RAM - 1 GB)".
  • On a 16 GB system, for example, the WiredTiger internal cache is 50% of 15 GB, that is, 7.5 GB.
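The default sizing rule above can be sketched as a small function. This is only an illustration of the arithmetic: the function name is made up for this post, and it assumes binary units (1 GB = 1024^3 bytes). That assumption is consistent with the 'maximum bytes configured' value of 8053063680 in the sample output further down.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Return the larger of 256 MiB and 50% of (total RAM - 1 GiB).

    Hypothetical helper that mirrors the rule described in the text;
    it does not query a real server.
    """
    half_of_ram_minus_1gib = (total_ram_bytes - GIB) // 2
    return max(256 * MIB, half_of_ram_minus_1gib)


if __name__ == "__main__":
    # 16 GiB machine: 50% of 15 GiB = 7.5 GiB
    print(default_wiredtiger_cache_bytes(16 * GIB))  # 8053063680
    # 1 GiB machine: the 256 MiB floor applies
    print(default_wiredtiger_cache_bytes(1 * GIB))   # 268435456
```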

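You can check the current settings with the db.serverStatus().wiredTiger.cache command. The meaning of each value is described on the serverStatus.wiredTiger page of the MongoDB Manual.

The 'maximum bytes configured' field in the output below is the 'wiredTiger.cache.maximum bytes configured' value referred to later.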

db.serverStatus().wiredTiger.cache
/* 1 */
{
    "application threads page read from disk to cache count" : 108427,
    "application threads page read from disk to cache time (usecs)" : 33973418,
    "application threads page write from cache to disk count" : 7870,
    "application threads page write from cache to disk time (usecs)" : 330718,
    "bytes allocated for updates" : 80145389,
    "bytes belonging to page images in the cache" : NumberLong(3854909989),
    "bytes belonging to the history store table in the cache" : 916,
    "bytes currently in the cache" : NumberLong(5526337681),
    "bytes dirty in the cache cumulative" : 889517611,
    "bytes not belonging to page images in the cache" : 1671427692,
    "bytes read into cache" : NumberLong(3606044753),
    "bytes written from cache" : 686646597,
    "cache overflow score" : 0,
    "checkpoint blocked page eviction" : 0,
    "checkpoint of history store file blocked non-history store page eviction" : 0,
    "eviction calls to get a page" : 4543,
    "eviction calls to get a page found queue empty" : 651,
    "eviction calls to get a page found queue empty after locking" : 50,
    "eviction currently operating in aggressive mode" : 0,
    "eviction empty score" : 0,
    "eviction gave up due to detecting an out of order on disk value behind the last update on the chain" : 0,
    "eviction gave up due to detecting an out of order tombstone ahead of the selected on disk update" : 0,
    "eviction gave up due to detecting an out of order tombstone ahead of the selected on disk update after validating the update chain" : 0,
    "eviction gave up due to detecting out of order timestamps on the update chain after the selected on disk update" : 0,
    "eviction passes of a file" : 259,
    "eviction server candidate queue empty when topping up" : 11,
    "eviction server candidate queue not empty when topping up" : 248,
    "eviction server evicting pages" : 0,
    "eviction server slept, because we did not make progress with eviction" : 1268,
    "eviction server unable to reach eviction goal" : 0,
    "eviction server waiting for a leaf page" : 598,
    "eviction state" : 64,
    "eviction walk most recent sleeps for checkpoint handle gathering" : 0,
    "eviction walk target pages histogram - 0-9" : 0,
    "eviction walk target pages histogram - 10-31" : 0,
    "eviction walk target pages histogram - 128 and higher" : 0,
    "eviction walk target pages histogram - 32-63" : 0,
    "eviction walk target pages histogram - 64-128" : 259,
    "eviction walk target pages reduced due to history store cache pressure" : 0,
    "eviction walk target strategy both clean and dirty pages" : 0,
    "eviction walk target strategy only clean pages" : 0,
    "eviction walk target strategy only dirty pages" : 259,
    "eviction walks abandoned" : 6,
    "eviction walks gave up because they restarted their walk twice" : 0,
    "eviction walks gave up because they saw too many pages and found no candidates" : 0,
    "eviction walks gave up because they saw too many pages and found too few candidates" : 0,
    "eviction walks reached end of tree" : 9,
    "eviction walks restarted" : 0,
    "eviction walks started from root of tree" : 6,
    "eviction walks started from saved location in tree" : 253,
    "eviction worker thread active" : 4,
    "eviction worker thread created" : 0,
    "eviction worker thread evicting pages" : 3840,
    "eviction worker thread removed" : 0,
    "eviction worker thread stable number" : 0,
    "files with active eviction walks" : 0,
    "files with new eviction walks started" : 9,
    "force re-tuning of eviction workers once in a while" : 0,
    "forced eviction - history store pages failed to evict while session has history store cursor open" : 0,
    "forced eviction - history store pages selected while session has history store cursor open" : 0,
    "forced eviction - history store pages successfully evicted while session has history store cursor open" : 0,
    "forced eviction - pages evicted that were clean count" : 0,
    "forced eviction - pages evicted that were clean time (usecs)" : 0,
    "forced eviction - pages evicted that were dirty count" : 4,
    "forced eviction - pages evicted that were dirty time (usecs)" : 38127,
    "forced eviction - pages selected because of a large number of updates to a single item" : 0,
    "forced eviction - pages selected because of too many deleted items count" : 133,
    "forced eviction - pages selected count" : 11,
    "forced eviction - pages selected unable to be evicted count" : 0,
    "forced eviction - pages selected unable to be evicted time" : 0,
    "hazard pointer blocked page eviction" : 0,
    "hazard pointer check calls" : 3851,
    "hazard pointer check entries walked" : 10177,
    "hazard pointer maximum array length" : 1,
    "history store score" : 0,
    "history store table insert calls" : 0,
    "history store table insert calls that returned restart" : 0,
    "history store table max on-disk size" : 0,
    "history store table on-disk size" : 12288,
    "history store table out-of-order resolved updates that lose their durable timestamp" : 0,
    "history store table out-of-order updates that were fixed up by reinserting with the fixed timestamp" : 0,
    "history store table reads" : 0,
    "history store table reads missed" : 0,
    "history store table reads requiring squashed modifies" : 0,
    "history store table truncation by rollback to stable to remove an unstable update" : 0,
    "history store table truncation by rollback to stable to remove an update" : 0,
    "history store table truncation to remove an update" : 0,
    "history store table truncation to remove range of updates due to key being removed from the data page during reconciliation" : 68461,
    "history store table truncation to remove range of updates due to out-of-order timestamp update on data page" : 0,
    "history store table writes requiring squashed modifies" : 0,
    "in-memory page passed criteria to be split" : 14,
    "in-memory page splits" : 7,
    "internal pages evicted" : 0,
    "internal pages queued for eviction" : 0,
    "internal pages seen by eviction walk" : 3490,
    "internal pages seen by eviction walk that are already queued" : 0,
    "internal pages split during eviction" : 0,
    "leaf pages split during eviction" : 471,
    "maximum bytes configured" : NumberLong(8053063680),
    "maximum page size at eviction" : 0,
    "modified pages evicted" : 3844,
    "modified pages evicted by application threads" : 0,
    "operations timed out waiting for space in cache" : 0,
    "overflow pages read into cache" : 248,
    "page split during eviction deepened the tree" : 0,
    "page written requiring history store records" : 0,
    "pages currently held in the cache" : 108693,
    "pages evicted by application threads" : 0,
    "pages evicted in parallel with checkpoint" : 246,
    "pages queued for eviction" : 25900,
    "pages queued for eviction post lru sorting" : 25694,
    "pages queued for urgent eviction" : 378,
    "pages queued for urgent eviction during walk" : 0,
    "pages queued for urgent eviction from history store due to high dirty content" : 0,
    "pages read into cache" : 108871,
    "pages read into cache after truncate" : 47,
    "pages read into cache after truncate in prepare state" : 0,
    "pages requested from the cache" : 34351374,
    "pages seen by eviction walk" : 52348,
    "pages seen by eviction walk that are already queued" : 1459,
    "pages selected for eviction unable to be evicted" : 0,
    "pages selected for eviction unable to be evicted as the parent page has overflow items" : 0,
    "pages selected for eviction unable to be evicted because of active children on an internal page" : 0,
    "pages selected for eviction unable to be evicted because of failure in reconciliation" : 0,
    "pages selected for eviction unable to be evicted because of race between checkpoint and out of order timestamps handling" : 0,
    "pages walked for eviction" : 380926,
    "pages written from cache" : 12981,
    "pages written requiring in-memory restoration" : 3826,
    "percentage overhead" : 8,
    "the number of times full update inserted to history store" : 0,
    "the number of times reverse modify inserted to history store" : 0,
    "tracked bytes belonging to internal pages in the cache" : 51625791,
    "tracked bytes belonging to leaf pages in the cache" : NumberLong(5474711890),
    "tracked dirty bytes in the cache" : 0,
    "tracked dirty pages in the cache" : 0,
    "unmodified pages evicted" : 0
}

Memory-related server status values [ref. 1]

  • wiredTiger.cache.maximum bytes configured : the current maximum cache size.
  • wiredTiger.cache.bytes currently in the cache
    • The size of the data currently in the cache.
    • Typically this is "80% of the cache size + the amount of dirty cache (content not yet written to disk)".
    • This value should not be greater than wiredTiger.cache.maximum bytes configured.
    • If it is greater than wiredTiger.cache.maximum bytes configured, that means you need to scale out.
  • wiredTiger.cache.tracked dirty bytes in the cache
    • The size of the dirty data in the cache.
    • It should be less than 5% of the cache size. If it is larger than that, you need to scale out.
    • Above 5%, WiredTiger becomes more aggressive about removing data from the cache.
      In some cases it may force a client to evict data from the cache before its write can complete successfully.
  • wiredTiger.cache.pages read into cache
    • The number of pages read into the cache.
    • Use its per-second average to judge how much data is coming into the cache.
    • This value can be an indicator of an issue for read-heavy applications.
    • If it is consistently a significant part of the cache size, adding memory may improve overall read performance.
  • wiredTiger.cache.pages written from cache
    • The number of pages written from the cache to disk.
    • It is especially large before checkpoints occur.
    • If this value keeps increasing, your checkpoints will keep getting longer.
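The checks in the list above can be sketched in Python against a dictionary shaped like the wiredTiger.cache output. This is a rough illustration, not an official tool: the function name and the returned keys are invented here, the 5% threshold is the rule of thumb from [ref. 1], and the sample numbers are copied from the output shown earlier. With PyMongo the dictionary could presumably be obtained from the serverStatus command, but the sketch deliberately avoids needing a live server.

```python
# Values copied from the sample db.serverStatus().wiredTiger.cache output.
SAMPLE_CACHE = {
    "maximum bytes configured": 8053063680,
    "bytes currently in the cache": 5526337681,
    "tracked dirty bytes in the cache": 0,
}


def cache_pressure(cache: dict, dirty_limit: float = 0.05) -> dict:
    """Summarize cache fill and dirty ratios (hypothetical helper).

    dirty_limit is the 5% rule of thumb from the text, not a server setting.
    """
    max_bytes = int(cache["maximum bytes configured"])
    used_bytes = int(cache["bytes currently in the cache"])
    dirty_bytes = int(cache["tracked dirty bytes in the cache"])

    dirty_ratio = dirty_bytes / max_bytes
    return {
        "fill_ratio": used_bytes / max_bytes,
        "dirty_ratio": dirty_ratio,
        # Cache holds more than the configured maximum: sign to scale out.
        "over_capacity": used_bytes > max_bytes,
        # Dirty data above the threshold: eviction becomes more aggressive.
        "dirty_over_limit": dirty_ratio > dirty_limit,
    }


if __name__ == "__main__":
    for name, value in cache_pressure(SAMPLE_CACHE).items():
        print(name, value)
```

For the sample output the cache is roughly 69% full with no dirty data, so neither warning flag is set.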

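Whether memory is large enough for the working set [ref. 2]

If a tool such as top shows that there is still spare memory while MongoDB is running, the working set fits in memory.

If there is no free memory, check the cache.pages counters.

When MongoDB is using almost all of the memory, high values for the two cache page counters given in [ref. 2] mean that the server is cycling data through the cache. That in turn means the working set is larger than memory.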
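Whether a counter is "high" is easier to judge as a per-second rate between two samples than as a raw total. The sketch below shows that calculation. The function name is made up, the two counter names are only assumed examples taken from the sample output above (the post itself does not say which two counters are meant), and the "later" sample values are invented for illustration.

```python
def per_second_rates(earlier: dict, later: dict, elapsed_seconds: float, keys) -> dict:
    """Return (later - earlier) / elapsed_seconds for each counter in keys.

    Hypothetical helper: 'earlier' and 'later' are two snapshots of the
    wiredTiger.cache section taken elapsed_seconds apart.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        key: (int(later[key]) - int(earlier[key])) / elapsed_seconds
        for key in keys
    }


if __name__ == "__main__":
    # Assumed counter names; both appear in the sample output.
    counters = ["pages read into cache", "unmodified pages evicted"]
    first = {"pages read into cache": 108871, "unmodified pages evicted": 0}
    # Invented second snapshot, taken 60 seconds later.
    second = {"pages read into cache": 109471, "unmodified pages evicted": 300}
    print(per_second_rates(first, second, 60, counters))
```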

Check the amount of disk reads.

  • A lot of read activity on the drive is generally a sign that the working set is larger than memory.
  • Because of how caching behaves, you may also see high cache eviction together with low I/O.

allowDiskUse

If an aggregate processes so much data that it runs out of memory, it fails with QueryExceededMemoryLimitNoDiskUseAllowed. In that case, use the allowDiskUse option.

With allowDiskUse, data is written to <dbPath>/_tmp.

Each pipeline stage may use only 100 MB of RAM. By default, exceeding this limit raises an error. $search is not subject to the 100 MB limit because it runs in a separate process.
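As a rough sketch, this is what an aggregate command document with allowDiskUse turned on looks like. The helper name, the "orders" collection, and the pipeline fields are all hypothetical; only the command keys (aggregate, pipeline, cursor, allowDiskUse) come from MongoDB's aggregate command.

```python
def build_aggregate_command(collection: str, pipeline: list, allow_disk_use: bool = True) -> dict:
    """Build an aggregate command document (hypothetical helper).

    The command name must be the first key, so "aggregate" comes first.
    """
    return {
        "aggregate": collection,
        "pipeline": pipeline,
        "cursor": {},  # use the default batch size
        "allowDiskUse": allow_disk_use,  # let stages spill to <dbPath>/_tmp
    }


if __name__ == "__main__":
    # Hypothetical collection and fields, for illustration only.
    pipeline = [
        {"$group": {"_id": "$userId", "total": {"$sum": "$amount"}}},
        {"$sort": {"total": -1}},
    ]
    print(build_aggregate_command("orders", pipeline))
```

With PyMongo, the same option can be passed directly as collection.aggregate(pipeline, allowDiskUse=True).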

See Also

  1. 쿠…sal: [Computer][db] Memory usage of MongoDB WiredTiger

References

  1. MongoDB 101: How to Tune Your MongoDB Configuration After Upgrading to More Memory - Percona Database Performance Blog, 2021-01-28
  2. How to Tell if Your MongoDB Server Is Correctly Sized For Your Working Set - Orange Matter
