Computers (not people!) are good at doing many things quickly. Very many things and very quickly. For example, here's a function that does something: it sleeps for a second:
func doSomething() {
time.Sleep(time.Second)
}
Doing many things
When you want to do something n
times you can do it like this:
func doManyThings(n int) {
for range n {
doSomething()
}
}
func main() {
doManyThings(10)
}
But how long does it take to do this something, let's say ten times?
$ time go run main.go
real 0m10.256s
As expected, around 10 seconds ... What if we wanted to do this something a million times? That would take at least a million seconds which is almost a dozen of days. Wait, what is this, computers are supposed to be fast!
Doing many things fast
You can make them faster by programming them to do things concurrently instead of sequentially - one after another - as above. And if you have multiple CPUs (which nowadays you almost certainly have) these things are done in parallel (definitely something a man should not try to simulate - read Johann Hari's book Lost Focus to learn more)!
You can run many things in Go concurrently by spawning new goroutines like this:
func doManyThings(n int) {
var wg sync.WaitGroup // create concurrency-safe counter
for range n {
wg.Add(1) // increment counter by one
go func() { // launch function on new goroutine and continue
defer wg.Done() // decrement counter on return from function
doSomething()
}()
}
wg.Wait() // wait till counter is 0 (all functions are done)
}
func main() {
doManyThings(1e6)
}
So how long does it take now to do million times something that takes one second? On my laptop it's less than three seconds:
$ time go run main.go
real 0m2.667s
That's much faster then twelve days!
Doing many things producing output fast
When doing something we often want to see some output. If nothing else we at least want to know whether the task was done without problems:
// Now doSomething returns an error value. It's nil when all is good.
func doSomething() error {
time.Sleep(time.Second)
if rand.Intn(100) == 0 { // circa one percent of tasks will fail
return errors.New("we got a problem")
}
return nil
}
Now, how do we get output from tasks being concurrently executing on goroutines? For this we use channels:
func doManyThings(n int) []error {
ch := make(chan error, n) // create channel of errors that can hold n items
for range n {
go func() {
ch <- doSomething() // send function's output to channel
}()
}
var errs []error
for range n {
err := <-ch // receive value from channel and assign it to err
if err != nil {
errs = append(errs, err)
}
}
return errs
}
func main() {
n := 1_000_000
errs := doManyThings(n)
fmt.Printf("Doing %d things there where %d errors.\n", n, len(errs))
}
As you can see we don't need a sync.WaitGroup
counter anymore. That's because we do n
loops (via for range n
) one more time to receive all the values from the channel ch
. This second loop effectively waits for all goroutines to complete - each goroutine sends exactly one value to the channel, so by reading n
values from the channel, we know that all n
goroutines have finished their work. The buffered channel (with capacity n
) ensures that all goroutines can send their results without blocking, even if we haven't started reading from the channel yet.