This document outlines best practices for writing high-performance Go code.
At the moment it is a collection of links to videos, slides, and blog posts
("awesome-go-performance"), but I would like it to evolve into a longer,
book-style format where the content lives here instead of in external links.
The links should be sorted into categories.

All the content will be licensed under CC-BY-SA.

## Optimization Workflow

* All optimizations should follow these steps:

    1. determine your performance goals and confirm you are not meeting them
    1. profile to identify the areas to improve. This can be CPU, heap allocations, or goroutine blocking.
    1. benchmark to determine the speedup your solution will provide, using
       the built-in benchmarking framework (<http://golang.org/pkg/testing/>)
    1. profile again afterwards to verify the issue is gone
    1. use <https://godoc.org/rsc.io/benchstat> or
       <https://github.com/codahale/tinystat> to verify that the before and
       after timings are 'sufficiently' different for an optimization to be
       worth the added code complexity.
    1. use <https://github.com/tsenart/vegeta> for load testing HTTP services
    1. make sure your latency numbers make sense: <https://youtu.be/lJ8ydIuPFeU>

The first step is important. It tells you when and where to start optimizing.
More importantly, it also tells you when to stop. Pretty much all
optimizations add code complexity in exchange for speed, and you can *always*
make code faster. It's a balancing act.
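
To make the benchmarking step of the workflow concrete, here is a minimal sketch that compares a baseline against a candidate optimization. The functions `concatNaive`, `concatBuilder`, and `bench` are hypothetical names invented for this illustration. Benchmarks normally live in a `_test.go` file as `func BenchmarkXxx(b *testing.B)` and are run with `go test -bench=.`; `testing.Benchmark` is used here only so that the sketch runs as a standalone program. Save the output of several runs before and after a change and feed both files to a tool such as benchstat, rather than trusting a single number.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// concatNaive is the (hypothetical) baseline: repeated += reallocates
// the string as it grows.
func concatNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatBuilder is the (hypothetical) candidate optimization: it appends
// into a single growing buffer.
func concatBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

// sink is package-level so the compiler cannot discard the results.
var sink string

// bench wraps f in the standard b.N loop used by the testing package.
func bench(f func([]string) string, parts []string) func(*testing.B) {
	return func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = f(parts)
		}
	}
}

func main() {
	testing.Init() // registers testing flags when not running under "go test"

	parts := make([]string, 100)
	for i := range parts {
		parts[i] = "x"
	}

	naive := testing.Benchmark(bench(concatNaive, parts))
	builder := testing.Benchmark(bench(concatBuilder, parts))
	fmt.Println("naive:  ", naive.String(), naive.MemString())
	fmt.Println("builder:", builder.String(), builder.MemString())
}
```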

The basic rules of the game are:

1. minimize CPU usage
    * do less work
    * this generally means "a faster algorithm"
    * but CPU caches and the hidden constants in O() can play tricks on you
1. minimize allocations (which leads to less CPU stolen by the GC)
1. make your data quick to access

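
As a small illustration of the "minimize allocations" rule, the sketch below counts allocations for two ways of filling a slice. `squaresGrow` and `squaresPrealloc` are hypothetical names made up for this example. `testing.AllocsPerRun` is a real function in the standard `testing` package. The exact counts depend on the Go version and on the slice growth policy, so treat the printed numbers as something to measure, not as a guarantee.

```go
package main

import (
	"fmt"
	"testing"
)

// squaresGrow appends into a nil slice, so the backing array is
// reallocated several times as the slice grows.
func squaresGrow(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// squaresPrealloc sizes the backing array once, up front.
func squaresPrealloc(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// sink keeps the results alive so the allocations are not optimized away.
var sink []int

func main() {
	const n = 1000
	grow := testing.AllocsPerRun(100, func() { sink = squaresGrow(n) })
	pre := testing.AllocsPerRun(100, func() { sink = squaresPrealloc(n) })
	fmt.Printf("grow: %.0f allocs/op, prealloc: %.0f allocs/op\n", grow, pre)
}
```

In a real benchmark the same information is available from `b.ReportAllocs()` or the `-benchmem` flag.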
## Introductory Profiling
Techniques applicable to source code in general

1. introduction to pprof
    * go tool pprof (and <https://github.com/google/pprof>)
1. Writing and running (micro)benchmarks
    * -cpuprofile / -memprofile / -benchmem
1. How to read pprof output
1. What are the different pieces of the runtime that show up
1. Macro-benchmarks (Profiling in production)
    * net/http/pprof

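
For the "profiling in production" item, one possible setup is to expose the `net/http/pprof` handlers on a dedicated mux instead of relying on the package's registration on `http.DefaultServeMux`. The helper names `newDebugMux` and `indexStatus` are hypothetical, and the `localhost:6060` address in the comment is only the conventional choice, not a requirement. To keep the sketch runnable without opening a port, `main` exercises the handler in-process with `httptest`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux returns a mux that serves the standard pprof endpoints.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// indexStatus requests the pprof index page in-process and returns the
// HTTP status code.
func indexStatus(h http.Handler) int {
	req := httptest.NewRequest(http.MethodGet, "/debug/pprof/", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newDebugMux()
	fmt.Println("pprof index status:", indexStatus(mux))

	// In a real service the mux would be served on an internal-only
	// address, for example:
	//   go http.ListenAndServe("localhost:6060", mux)
}
```

With the server running, `go tool pprof http://localhost:6060/debug/pprof/profile` collects a CPU profile from the live process.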
## Advanced Techniques
* Techniques specific to the architecture running the code
    * introduction to CPU caches
        * building intuition around cache-lines: sizes, padding, alignment
        * false-sharing
        * OS tools to view cache-misses
    * (also branch prediction)

* Comment about Jeff Dean's 2002 numbers (plus updates)
    * CPUs have gotten faster, but memory hasn't kept up

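
The padding idea behind avoiding false sharing can be sketched as follows. The 64-byte cache-line size is an assumption that holds for many current x86-64 and ARM64 parts but is not universal, and `paddedCounter` and `countInParallel` are hypothetical names. Each goroutine gets its own counter, and the padding keeps neighbouring counters from landing on the same cache line.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"unsafe"
)

// cacheLineSize is an assumption; check the target CPU before relying on it.
const cacheLineSize = 64

// paddedCounter occupies a full (assumed) cache line so that adjacent
// counters in a slice do not share one.
type paddedCounter struct {
	n uint64
	_ [cacheLineSize - 8]byte
}

// countInParallel starts one goroutine per counter, has each increment
// its own counter perWorker times, and returns the sum.
func countInParallel(workers, perWorker int) uint64 {
	counters := make([]paddedCounter, workers)
	var wg sync.WaitGroup
	for i := range counters {
		wg.Add(1)
		go func(c *paddedCounter) {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				atomic.AddUint64(&c.n, 1)
			}
		}(&counters[i])
	}
	wg.Wait()

	var total uint64
	for i := range counters {
		total += counters[i].n
	}
	return total
}

func main() {
	fmt.Println("struct size in bytes:", unsafe.Sizeof(paddedCounter{}))
	fmt.Println("total:", countInParallel(4, 100000))
}
```

Whether the padding helps is something to benchmark with and without the padding field on the target hardware.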
## Runtime
* cost of calls via interfaces (indirect calls on the CPU level)
* runtime.convT2E / runtime.convT2I
* type assertions vs. type switches
* defer
* special-case map implementations for ints, strings
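
The cost of calling through an interface can be measured with a sketch along these lines. `Shape`, `Square`, `sumDirect`, and `sumIface` are hypothetical names for this illustration. Newer compilers can inline or devirtualize some calls, so the gap (if any) varies by Go version and hardware; the point is the measurement technique, not a specific number.

```go
package main

import (
	"fmt"
	"testing"
)

// Shape is a hypothetical interface used to force dynamic dispatch.
type Shape interface{ Area() float64 }

// Square is a hypothetical concrete type that implements Shape.
type Square struct{ side float64 }

func (s Square) Area() float64 { return s.side * s.side }

// sumDirect calls Area on the concrete type, so the call can be resolved
// at compile time.
func sumDirect(xs []Square) float64 {
	total := 0.0
	for _, x := range xs {
		total += x.Area()
	}
	return total
}

// sumIface calls Area through the interface, which is an indirect call.
func sumIface(xs []Shape) float64 {
	total := 0.0
	for _, x := range xs {
		total += x.Area()
	}
	return total
}

// sink keeps the results alive so the loops are not optimized away.
var sink float64

func main() {
	testing.Init() // registers testing flags when not running under "go test"

	squares := make([]Square, 1024)
	shapes := make([]Shape, 1024)
	for i := range squares {
		squares[i] = Square{side: float64(i)}
		shapes[i] = squares[i] // converts the value to an interface
	}

	direct := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumDirect(squares)
		}
	})
	iface := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = sumIface(shapes)
		}
	})
	fmt.Println("direct:   ", direct)
	fmt.Println("interface:", iface)
}
```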
## Common gotchas with the standard library
* time.After() leaks until it fires
* Reusing HTTP connections...
* ....
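
For the `time.After()` item, a common workaround is to create the timer explicitly and stop it once it is no longer needed. The helper `recvWithTimeout` below is a hypothetical name. The leak applies to Go releases before 1.23; from Go 1.23 on, the runtime can collect unreferenced timers even if they were never stopped, so this pattern matters mostly for older toolchains or very hot loops.

```go
package main

import (
	"fmt"
	"time"
)

// recvWithTimeout waits for a value on ch for at most d. Unlike
// time.After, the timer is stopped as soon as the function returns,
// so it does not linger until it fires.
func recvWithTimeout(ch <-chan int, d time.Duration) (int, bool) {
	t := time.NewTimer(d)
	defer t.Stop()

	select {
	case v := <-ch:
		return v, true
	case <-t.C:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)
	ch <- 42

	fmt.Println(recvWithTimeout(ch, time.Second))         // value is already buffered
	fmt.Println(recvWithTimeout(ch, 10*time.Millisecond)) // channel is now empty, so this times out
}
```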
## Unsafe
* And all the dangers that go with it
* Common uses for unsafe
    * mmap'ing data files
    * speedy de-serialization

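
One small building block of "speedy de-serialization" is viewing a byte buffer as a string without copying it. The sketch below uses the header-reinterpretation idiom that works on older toolchains; on Go 1.20 and later, `unsafe.String(unsafe.SliceData(b), len(b))` expresses the same thing more directly. `bytesToString` is a hypothetical helper name. The danger is that the string shares memory with the slice: if the buffer is modified or reused afterwards, the "immutable" string changes underneath its users.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying the data.
// The caller must not modify b for as long as the returned string is
// in use.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte("payload read from disk")
	s := bytesToString(buf)
	fmt.Println(s, len(s) == len(buf))
}
```

Measure before reaching for this: the plain `string(b)` conversion is safe, and the compiler already avoids the copy in some cases.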
## Assembly
* Stuff about writing assembly code for Go
* brief intro to syntax
* calling convention
* using opcodes unsupported by the asm
* notes about why intrinsics are hard
## Alternate implementations

* Popular replacements for standard library packages:
    * encoding/json -> ffjson
    * net/http -> fasthttp
    * regexp -> ragel (or other regular expression package)
    * serialization
        * encoding/gob -> <https://github.com/alecthomas/go_serialization_benchmarks>
        * protobuf -> <https://github.com/gogo/protobuf>
        * all formats have trade-offs; choose one that matches what you need

## Tooling
Look at some more interesting/advanced tooling
* perf (perf2pprof)
* go-torch (+flamegraphs)