Chapter 3: Memory Analysis — Deep Dive
Overview
Memory is the most misunderstood resource. "90% memory used" doesn't mean the server needs more RAM — most of it is probably cache. Understanding what MemAvailable vs MemFree means is crucial.
melisai's MemoryCollector (internal/collector/memory.go) reads 6 data sources and provides a complete memory picture.
Source File: memory.go
- Lines: 274
- Functions: 8
- Data Sources:
/proc/meminfo,/proc/vmstat,/proc/pressure/memory,/proc/buddyinfo,/sys/devices/system/node/
Function Walkthrough
Collect() — Orchestrating Memory Data
func (c *MemoryCollector) Collect(ctx context.Context, cfg CollectConfig) (*model.Result, error) {
data := &model.MemoryData{}
c.parseMeminfo(data) // Main memory breakdown
c.parseVmstat(data) // Page fault counters
// Sysctl settings:
data.Swappiness = readSysctlInt(c.procRoot, "sys/vm/swappiness")
data.OvercommitMemory = readSysctlInt(c.procRoot, "sys/vm/overcommit_memory")
data.OvercommitRatio = readSysctlInt(c.procRoot, "sys/vm/overcommit_ratio")
data.DirtyRatio = readSysctlInt(c.procRoot, "sys/vm/dirty_ratio")
data.DirtyBackgroundRatio = readSysctlInt(c.procRoot, "sys/vm/dirty_background_ratio")
c.parsePSI(data) // Pressure Stall Information
data.BuddyInfo = c.parseBuddyinfo() // Memory fragmentation
data.NUMANodes = c.parseNUMAStats() // Per-NUMA-node stats
return &model.Result{...}
}
Unlike CPU collection, memory doesn't need delta sampling — most metrics are instant snapshots.
parseMeminfo() — The Core Memory Breakdown
func (c *MemoryCollector) parseMeminfo(data *model.MemoryData) {
f, _ := os.Open(filepath.Join(c.procRoot, "meminfo"))
scanner := bufio.NewScanner(f)
for scanner.Scan() {
// "MemTotal: 16384000 kB"
parts := strings.SplitN(line, ":", 2)
key := strings.TrimSpace(parts[0])
valStr := strings.TrimSuffix(strings.TrimSpace(parts[1]), " kB")
val, _ := strconv.ParseInt(valStr, 10, 64)
valBytes := val * 1024 // kB → bytes
switch key {
case "MemTotal": data.TotalBytes = valBytes
case "MemFree": data.FreeBytes = valBytes
case "MemAvailable": data.AvailableBytes = valBytes
case "Cached": data.CachedBytes = valBytes
case "Buffers": data.BuffersBytes = valBytes
case "SwapTotal": data.SwapTotalBytes = valBytes
case "SwapFree": data.SwapUsedBytes = data.SwapTotalBytes - valBytes
case "Dirty": data.DirtyBytes = valBytes
case "HugePages_Total": data.HugePagesTotal = int(val)
case "HugePages_Free": data.HugePagesFree = int(val)
}
}
}
Critical distinction — MemFree vs MemAvailable:
┌─────────────────────────────────────────────────┐
│ Total Memory │
├────────┬───────────┬───────────┬────────────────┤
│ Active │ Cache │ Buffers │ Free │
│ Used │(reclaimable)│ │ │
├────────┴───────────┴───────────┼────────────────┤
│←─────── MemAvailable ─────────→│ │
│ │←── MemFree ───→│
└────────────────────────────────┴────────────────┘
- MemFree: Truly unused pages. On a healthy server, this is often near zero — and that's fine!
- MemAvailable: Free + reclaimable cache + reclaimable buffers. This is what your application can actually use.
Rule: Use MemAvailable, not MemFree, for capacity planning.
parseVmstat() — Page Faults
func (c *MemoryCollector) parseVmstat(data *model.MemoryData) {
// Reads /proc/vmstat
switch fields[0] {
case "pgmajfault": data.MajorFaults = val // Required disk I/O
case "pgfault": data.MinorFaults = val // Resolved in memory
}
}
- Minor fault: Page not in page table but IS in memory (e.g., shared library pages). Cost: ~1μs.
- Major fault: Page not in memory, must be read from disk. Cost: ~1-10ms (SSD) or ~5-20ms (HDD).
High major faults = the system is swapping or has insufficient cache.
parsePSI() — Memory Pressure
func (c *MemoryCollector) parsePSI(data *model.MemoryData) {
// /proc/pressure/memory
// "some avg10=0.00 avg60=0.15 avg300=0.08 total=142395"
// "full avg10=0.00 avg60=0.00 avg300=0.00 total=0"
switch {
case prefix == "some" && parts[0] == "avg10": data.PSISome10 = val
case prefix == "some" && parts[0] == "avg60": data.PSISome60 = val
case prefix == "full" && parts[0] == "avg10": data.PSIFull10 = val
case prefix == "full" && parts[0] == "avg60": data.PSIFull60 = val
}
}
Interpreting PSI: | Value | Meaning | |-------|---------| | some=0% | No memory pressure | | some=5% | Some tasks waiting for memory 5% of the time | | some=25% | Significant memory contention | | full=5% | ALL tasks blocked 5% of the time — severe | | full=25% | System is thrashing |
parseBuddyinfo() — Fragmentation
func (c *MemoryCollector) parseBuddyinfo() map[string][]int {
// /proc/buddyinfo
// "Node 0, zone DMA 1 0 1 0 2 1 1 0 1 1 3"
// Orders 0-10: 4KB 8KB 16KB 32KB 64KB 128KB 256KB 512KB 1MB 2MB 4MB
}
This data shows free blocks at each order (size). When large order counts are zero, the system can't allocate contiguous physical pages. This matters for: - HugePages (2MB blocks) - Network buffers (typically need 64KB+) - DMA allocations
parseNUMAStats() — NUMA Topology
func (c *MemoryCollector) parseNUMAStats() []model.NUMANode {
// Iterates /sys/devices/system/node/node*/
// Reads meminfo (MemTotal, MemFree) and numastat (numa_hit, numa_miss, numa_foreign)
}
NUMA ratios to watch:
- > 95%: Good — most allocations are local
- 80-95%: Acceptable — some remote access
- < 80%: Bad — process should be pinned to a NUMA node with
numactl
Sysctl Parameters Explained
| Parameter | Default | Meaning |
|---|---|---|
vm.swappiness |
60 | How aggressively to swap (0=avoid, 100=aggressive) |
vm.overcommit_memory |
0 | 0=heuristic, 1=always allow, 2=strict accounting |
vm.overcommit_ratio |
50 | When overcommit_memory=2: commit limit = RAM × ratio% + swap |
vm.dirty_ratio |
20 | % of RAM that can be dirty before synchronous write-back |
vm.dirty_background_ratio |
10 | % of RAM dirty before background write-back starts |
Common tuning:
- Database servers: swappiness=1 (avoid swap, let DB manage its cache)
- Web servers: dirty_ratio=5, dirty_background_ratio=2 (faster write-back)
- Elasticsearch: overcommit_memory=1 (avoid OOM on fork)
Diagnostic Examples
Healthy Server
{
"total_bytes": 68719476736,
"available_bytes": 45097156608,
"cached_bytes": 20000000000,
"swap_used_bytes": 0,
"major_faults": 12,
"psi_some_10": 0.0
}
Memory Leak
{
"total_bytes": 68719476736,
"available_bytes": 1073741824,
"cached_bytes": 500000000,
"swap_used_bytes": 8589934592,
"major_faults": 125000,
"psi_some_10": 32.5,
"psi_full_10": 8.2
}