The heap profile shows active memory, memory the runtime believes is in use by the go program (ie: hasn't been collected by the garbage collector). When the GC does collect memory the profile shrinks, but no memory is returned to the system. Your future allocations will try to use memory from the pool of previously collected objects before asking the system for more.
From the outside, this means that your program's memory use will either be increasing, or staying level. What the outside system presents as the "Resident Size" of your program is the number of bytes of RAM is assigned to your program whether it's holding in-use go values or collected ones.
The reason why these two numbers are often quite different are because:
The GC collecting memory has no effect on the outside view of the program
Alternatively, since you are using web-based profiling if you can access the profiling data through your browser at: http://10.10.58.118:8601/debug/pprof/ , clicking the heap link will show you the debugging view of the heap profile, which has a printout of a runtime.MemStats structure at the bottom.
The runtime.MemStats documentation (http://golang.org/pkg/runtime/#MemStats) has the explanation of all the fields, but the interesting ones for this discussion are:
HeapAlloc: essentially what the profiler is giving you (active heap memory)
Alloc: similar to HeapAlloc, but for all go managed memory
Sys: the total amount of memory (address space) requested from the OS
There will still be discrepancies between Sys, and what the OS reports because what Go asks of the system, and what the OS gives it are not always the same. Also CGO / syscall (eg: malloc / mmap) memory is not tracked by go.
As an addition to @Cookie of Nine's answer, in short: you can try the --alloc_space option.
go tool pprof use --inuse_space by default. It samples memory usage so the result is subset of real one.
By --alloc_space pprof returns all alloced memory since program started.
You can also use StackImpact, which automatically records and reports regular and anomaly-triggered memory allocation profiles to the dashboard, which are available in a historical and comparable form. See this blog post for more details Memory Leak Detection in Production Go Applications
I was always confused about the growing residential memory of my Go applications, and finally I had to learn the profiling tools that are present in Go ecosystem. Runtime provides many metrics within a runtime.Memstats structure, but it may be hard to understand which of them can help to find out the reasons of memory growth, so some additional tools are needed.
Run InfluxDB and Grafana within Docker containers:
docker run --name influxdb -d -p 8086:8086 influxdb
docker run -d -p 9090:3000/tcp --link influxdb --name=grafana grafana/grafana:4.1.0
Set up interaction between Grafana and InfluxDBGrafana (Grafana main page -> Top left corner -> Datasources -> Add new datasource):
Import dashboard #3242 from https://grafana.com (Grafana main page -> Top left corner -> Dashboard -> Import):
Finally, launch your application: it will transmit runtime metrics to the contenerized Influxdb. Put your application under a reasonable load (in my case it was quite small - 5 RPS for a several hours).
Memory consumption analysis
Sys (the synonim of RSS) curve is quite similar to HeapSys curve. Turns out that dynamic memory allocation was the main factor of overall memory growth, so the small amount of memory consumed by stack variables seem to be constant and can be ignored;
The constant amount of goroutines garantees the absence of goroutine leaks / stack variables leak;
The total amount of allocated objects remains the same (there is no point in taking into account the fluctuations) during the lifetime of the process.
The most surprising fact: HeapIdle is growing with the same rate as a Sys, while HeapReleased is always zero. Obviously runtime doesn't return memory to OS at all , at least under the conditions of this test:
HeapIdle minus HeapReleased estimates the amount of memory
that could be returned to the OS, but is being retained by
the runtime so it can grow the heap without requesting more
memory from the OS.
For those who's trying to investigate the problem of memory consumption I would recommend to follow the described steps in order to exclude some trivial errors (like goroutine leak).
Freeing memory explicitly
It's interesting that the one can significantly decrease memory consumption with explicit calls to debug.FreeOSMemory():
// in the top-level package
func init() {
go func() {
t := time.Tick(time.Second)
for {
<-t
debug.FreeOSMemory()
}
}()
}
In fact, this approach saved about 35% of memory as compared with default conditions.
I ran a pprof analysis. pprof is a tool that’s baked into the Go language that allows for analysis and visualisation of profiling data collected from a running application. It’s a very helpful tool that collects data from a running Go application and is a great starting point for performance analysis. I’d recommend running pprof in production so you get a realistic sample of what your customers are doing.
When you run pprof you’ll get some files that focus on goroutines, CPU, memory usage and some other things according to your configuration. We’re going to focus on the heap file to dig into memory and GC stats. I like to view pprof in the browser because I find it easier to find actionable data points. You can do that with the below command.
go tool pprof -http=:8080 profile_name-heap.pb.gz
pprof has a CLI tool as well, but I prefer the browser option because I find it easier to navigate. My personal recommendation is to use the flame graph. I find that it’s the easiest visualiser to make sense of, so I use that view most of the time. The flame graph is a visual version of a function’s stack trace. The function at the top is the called function, and everything underneath it is called during the execution of that function. You can click on individual function calls to zoom in on them which changes the view. This lets you dig deeper into the execution of a specific function, which is really helpful. Note that the flame graph shows the functions that consume the most resources so some functions won’t be there. This makes it easier to figure out where the biggest bottlenecks are.
Try GO plugin for Tracy. Tracy is "A real time, nanosecond resolution, remote telemetry" (...).
GoTracy (name of the plugin) is the agent with connect with the Tracy and send necessary information to better understand your app process. After importing plugin You can put telemetry code like in description below:
func exampleFunction() {
gotracy.TracyInit()
gotracy.TracySetThreadName("exampleFunction")
for i := 0.0; i < math.Pi; i += 0.1 {
zoneid := gotracy.TracyZoneBegin("Calculating Sin(x) Zone", 0xF0F0F0)
gotracy.TracyFrameMarkStart("Calculating sin(x)")
sin := math.Sin(i)
gotracy.TracyFrameMarkEnd("Calculating sin(x)")
gotracy.TracyMessageLC("Sin(x) = "+strconv.FormatFloat(sin, 'E', -1, 64), 0xFF0F0F)
gotracy.TracyPlotDouble("sin(x)", sin)
gotracy.TracyZoneEnd(zoneid)
gotracy.TracyFrameMark()
}
}