go-perfbook/TODO

* blog posts
- http://jmoiron.net/blog/go-performance-tales/
- use integer map keys if possible
- hard to compete with Go's map implementation; esp. if your data structure has lots of pointer chasing
- aes-ni instructions make string hashing much faster
- prefer structs to maps if you know the map keys (esp. coming from perl, etc)
- channels are useful, but slow; raw atomics can help with performance
- cgo has overhead
- profile before optimizing
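A minimal sketch of the "channels are useful, but slow; raw atomics can help" note above, assuming all that is needed is a shared counter. countAtomic and countChannel are hypothetical names, and any speed difference should be confirmed with a benchmark on the real workload:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomic has each worker bump a shared counter with atomic.AddInt64.
func countAtomic(workers, perWorker int) int64 {
	var total int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				atomic.AddInt64(&total, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&total)
}

// countChannel computes the same total by sending every increment
// through an unbuffered channel to a single receiver.
func countChannel(workers, perWorker int) int64 {
	ch := make(chan int64)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				ch <- 1
			}
		}()
	}
	// Close the channel once all senders are done so the range loop ends.
	go func() {
		wg.Wait()
		close(ch)
	}()
	var total int64
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	fmt.Println(countAtomic(4, 1000), countChannel(4, 1000))
}
```

Both functions return the same total; the channel version pays for a goroutine handoff on every increment, which is the cost the note refers to.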
- http://slideshare.net/cloudflare/go-profiling-john-graham-cumming ( https://www.youtu.be/_41bkNr7eik )
- don't waste programmer cycles saving the wrong CPU cycles (or memory allocations)
- bash$ time; time.Now()/time.Since(); pprof.StartCPUProfile/pprof.StopCPUProfile; go tool pprof http://.../profile
- bash$ ps; runtime.ReadMemStats(); pprof.WriteHeapProfile(); go tool pprof http://.../heap
- slice operations are sometimes O(n)
- https://golang.org/pkg/runtime/debug/
- sync.Pool (basically)
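A rough sketch of the in-process measurement calls listed above (time.Now/time.Since, pprof.StartCPUProfile/StopCPUProfile, runtime.ReadMemStats). work and profileWork are hypothetical stand-ins, and the profile is written to an in-memory buffer only to keep the example self-contained; a real program would usually write it to a file for go tool pprof:

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
	"time"
)

// work is a hypothetical stand-in for the code under investigation.
func work(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// profileWork times work(n) while a CPU profile is being collected and
// returns the elapsed wall time and the size of the collected profile.
func profileWork(n int) (time.Duration, int, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, 0, err
	}
	start := time.Now()
	work(n)
	elapsed := time.Since(start)
	pprof.StopCPUProfile() // flushes the profile into buf
	return elapsed, buf.Len(), nil
}

func main() {
	elapsed, size, err := profileWork(50000000)
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	// ReadMemStats gives a snapshot of allocator and gc statistics.
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("elapsed=%v profile=%dB heap=%dB\n", elapsed, size, m.HeapAlloc)
}
```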
- https://methane.github.io/2015/02/reduce-allocation-in-go-code
- 1. correctness is important
- 2. BenchmarkXXX with b.ReportAllocs() (or -benchmem when running)
- 3. GODEBUG=allocfreetrace=1 produces stack trace on every allocation
- strategies:
- avoid string concat; use []byte+append() (+strconv.AppendInt(), ...)
- benchcmp
- avoid time.Format
- avoid range when iterating strings ([]rune conversion + utf8 decoding)
- can append string to []byte
- write two versions, one for string, one for []byte (avoids conversion+copy (sometimes...))
- reuse existing buffers instead of creating new ones
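A small sketch of the "[]byte+append() (+strconv.AppendInt(), ...)" and "reuse existing buffers" strategies above. formatConcat and appendFormat are hypothetical names. Allocation counts would normally come from a BenchmarkXXX with b.ReportAllocs(); testing.AllocsPerRun is used here only so the sketch fits in one runnable file, and the numbers it reports depend on the compiler version:

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// Package-level sinks keep the compiler from optimizing the results away.
var (
	sinkString string
	sinkBytes  []byte
)

// formatConcat builds "id=<n>;" with string concatenation.
func formatConcat(id int64) string {
	return "id=" + strconv.FormatInt(id, 10) + ";"
}

// appendFormat appends the same text to dst and returns the extended
// slice, so a caller can reuse one buffer across many calls.
func appendFormat(dst []byte, id int64) []byte {
	dst = append(dst, "id="...) // a string can be appended to a []byte
	dst = strconv.AppendInt(dst, id, 10)
	return append(dst, ';')
}

func main() {
	buf := make([]byte, 0, 64)
	concatAllocs := testing.AllocsPerRun(1000, func() {
		sinkString = formatConcat(1234567)
	})
	appendAllocs := testing.AllocsPerRun(1000, func() {
		buf = appendFormat(buf[:0], 1234567) // reuse the same backing array
		sinkBytes = buf
	})
	fmt.Printf("concat: %.0f allocs/op, append: %.0f allocs/op\n",
		concatAllocs, appendAllocs)
}
```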
- http://bravenewgeek.com/so-you-wanna-go-fast/
- performance fast vs. delivery fast; make the right decision
- lock-free ring buffer vs. channels: faster except with GOMAXPROCS=1
- defer has a cost (allocation+cpu)
BenchmarkMutexDeferUnlock-8 20000000 96.6 ns/op
BenchmarkMutexUnlock-8 100000000 19.5 ns/op
- reflection+json
- ffjson avoids reflection
- msgp avoids json
- interfaces have dynamic dispatch which can't be inlined
- => use concrete types (+ code duplication)
- heap vs. stack; escape analysis
- lots of short-lived objects are expensive for the gc
- sync.Pool reuses objects only *between* gc runs (pooled objects may be dropped when the gc runs)
- you need your own free list to hold onto things across gc runs
(but now you're subverting the purpose of a garbage collector)
- false sharing
- custom lock-free data structures: fast but *hard*
- "Speed comes at the cost of simplicity, at the cost of development time, and at the cost of continued maintenance. Choose wisely."
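A minimal sketch of the sync.Pool note above, assuming the short-lived objects are scratch buffers. bufPool and render are hypothetical names. The pool only saves allocations between gc runs, so nothing here relies on a particular buffer surviving:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out *bytes.Buffer values; New is called when the pool is empty.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render builds a greeting using a pooled scratch buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold a previous caller's data
	buf.WriteString("hello, ")
	buf.WriteString(name)
	s := buf.String() // copy the result out before returning the buffer
	bufPool.Put(buf)
	return s
}

func main() {
	fmt.Println(render("gopher"))
}
```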
- https://software.intel.com/en-us/blogs/2014/05/10/debugging-performance-issues-in-go-programs
- http://blog.golang.org/profiling-go-programs
- https://medium.com/%40hackintoshrao/daily-code-optimization-using-benchmarks-and-profiling-in-golang-gophercon-india-2016-talk-874c8b4dc3c5
- If you're writing benchmarks, read http://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go
- cache line explanation: http://mechanitis.blogspot.com/2011/07/dissecting-disruptor-why-its-so-fast_22.html
- avoiding false sharing: http://www.drdobbs.com/parallel/eliminate-false-sharing/217500206
- how does this translate to go? http://www.catb.org/esr/structure-packing/
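One way the structure-packing question might translate to Go, as a sketch: field order affects padding, and unsafe.Sizeof shows the result. The padded and packed types are hypothetical. On common 64-bit platforms the sizes are 24 and 16 bytes; other architectures can differ:

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded interleaves small and large fields, so alignment padding is
// inserted after a and after c.
type padded struct {
	a bool
	b int64
	c bool
}

// packed orders fields from largest to smallest, so the two bools
// share the space after b.
type packed struct {
	b int64
	a bool
	c bool
}

// sizes reports the in-memory size of each layout in bytes.
func sizes() (paddedSize, packedSize uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}

func main() {
	p, q := sizes()
	fmt.Println("padded:", p, "packed:", q)
}
```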
- https://en.wikipedia.org/wiki/Amdahl%27s_law
- https://github.com/ardanlabs/gotraining/tree/master/topics/profiling
- https://github.com/ardanlabs/gotraining/tree/master/topics/benchmarking
- http://dave.cheney.net/2015/11/29/a-whirlwind-tour-of-gos-runtime-environment-variables
- https://github.com/davecheney/high-performance-go-workshop
- Mutex profile: https://rakyll.org/mutexprofile
cgo:
cgo has overhead
(which has only gotten more expensive over time) -- ~200 ns/call
(reduced in 1.8 to <100ns; still not free)
ssa backend means less difference in codegen
think hard about whether you really want cgo: http://dave.cheney.net/2016/01/18/cgo-is-not-go
https://www.youtube.com/watch?v=lhMhApWQp2E : cgo gophercon
cgo performance tracking bug: https://github.com/golang/go/issues/9704
videos:
https://gophervids.appspot.com/#tags=optimization
-- figure out which of these are specifically worth listing
"Profiling and Optimizing Go" (Uber)
https://www.youtube.com/watch?v=N3PWzBeLX2M
https://go-talks.appspot.com/github.com/davecheney/presentations/writing-high-performance-go.slide
https://www.youtube.com/watch?v=zWp0N9unJFc
Björn Rabenstein
https://docs.google.com/presentation/d/1Zu0BdbhMRar7ycEwDi8jepGokTXTDXlKFf7C13tusuI/edit
https://www.youtube.com/watch?v=ZuQcbqYK0BY
https://go-talks.appspot.com/github.com/mkevac/golangmoscow2016/gomeetup.slide
CppCon 2014: Chandler Carruth "Efficiency with Algorithms, Performance with Data Structures"
https://www.youtube.com/watch?v=fHNmRkzxHWs
Performance Engineering of Software Systems
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010/
https://talks.golang.org/2013/highperf.slide#1
Machine Architecture: Things Your Programming Language Never Told You
https://www.youtube.com/watch?v=L7zSU9HI-6I
7 Ways to Profile Go Applications
https://www.youtube.com/watch?v=2h_NFBFrciI
dotGo 2016 - Damian Gryski - Slices: Performance through cache-friendliness
https://www.youtube.com/watch?v=jEG4Qyo_4Bc
Performance Bugs
https://www.youtube.com/watch?v=89qiHoDjeDg
asm:
https://golang.org/doc/asm
https://goroutines.com/asm
http://www.doxsey.net/blog/go-and-assembly
https://www.youtube.com/watch?v=9jpnFmJr2PE
https://blog.gopheracademy.com/advent-2016/peachpy/
https://blog.sgmansfield.com/2017/04/a-foray-into-go-assembly-programming/
http://lemire.me/blog/2016/12/21/performance-overhead-when-calling-assembly-from-go/
minio posts + tooling
posts:
http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html
https://arxiv.org/abs/1509.05053 (array layouts for comparison-based searching)
http://grokbase.com/t/gg/golang-nuts/155ea0t5hf/go-nuts-after-set-gomaxprocs-different-machines-have-different-bahaviors-some-speed-up-some-slow-down
http://grokbase.com/t/gg/golang-nuts/14138jw64s/go-nuts-concurrent-read-write-of-different-parts-of-a-slice
Escape Analysis Flaws
https://docs.google.com/document/d/1CxgUBPlx9iJzkz9JWkb6tIpTe5q32QDmz8l0BouG0Cw/preview
https://hackernoon.com/optimizing-optimizing-some-insights-that-led-to-a-400-speedup-of-powerdns-5e1a44b58f1c
http://leto.net/docs/C-optimization.php
http://www.stochasticlifestyle.com/algorithm-efficiency-comes-problem-information/
tools:
https://godoc.org/github.com/aclements/go-perf
https://godoc.org/golang.org/x/perf/cmd/benchstat
https://github.com/rakyll/gom
https://github.com/tam7t/sigprof
https://github.com/aybabtme/dpprof
https://github.com/wblakecaldwell/profiler
https://github.com/MiniProfiler/go
https://perf.wiki.kernel.org/index.php/Main_Page
https://github.com/dominikh/go-structlayout
http://www.brendangregg.com/perf.html
https://github.com/davecheney/gcvis
https://github.com/pavel-paulau/gcterm
https://github.com/jonlawlor/benchls
pprof:
https://rakyll.org/pprof-ui/
https://rakyll.org/profiler-labels/
https://rakyll.org/custom-profiles/
trace:
https://making.pusher.com/go-tool-trace/
https://www.youtube.com/watch?v=mmqDlbWk_XA
https://www.youtube.com/watch?v=nsM_m4hZ-bA
https://blog.gopheracademy.com/advent-2017/go-execution-tracer/
papers:
https://www.akkadia.org/drepper/cpumemory.pdf
https://software.intel.com/sites/default/files/article/392271/aos-to-soa-optimizations-using-iterative-closest-point-mini-app.pdf
optimization guides:
http://developer.amd.com/resources/developer-guides-manuals/
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.uan0015b/index.html
https://www-ssl.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html
stackoverflow:
https://stackoverflow.com/questions/19397699/why-struct-with-padding-fields-works-faster/19397791#19397791
https://stackoverflow.com/questions/10017026/no-speedup-in-multithread-program/10017482#10017482
practice:
https://twitter.com/dgryski/status/584682584942194689
distributed system design: (out of scope for this book)
http://highscalability.com/blog/2010/12/20/netflix-use-less-chatty-protocols-in-the-cloud-plus-26-fixes.html
books:
Writing Efficient Programs
Algorithm Engineering: https://www.springer.com/gp/book/9783642148651
http://www.cs.tufts.edu/~nr/cs257/archive/don-knuth/empirical-fortran.pdf
Usborne: Programming Tricks and Skills
https://drive.google.com/file/d/0Bxv0SsvibDMTdElPMHF5NVpmU0U/view
Quotes: (Bumper Sticker Computer Science)
[The First Rule of Program Optimization] Don't do it.
[The Second Rule of Program Optimization---For experts only] Don't do it yet.
Michael Jackson
Michael Jackson Systems Ltd.
The fastest algorithm can frequently be replaced by one that is almost as fast and much easier to understand.
Douglas W. Jones
University of Iowa