Go Concurrency With Goroutines and Channels

Some introductory notes on Go concurrency, goroutines, and channels, largely serving as a slightly-more-indepth supplement to a brief 2018 overview and example.

Goroutines

Normally, calling a function — foo(), for example — is a blocking operation. This means that program execution waits for it to finish before proceeding.

However, invoking a function with the go keyword — go foo(), for example — is non-blocking. When invoked as go foo(), Go runs foo() as a separate task managed by Go. The separate task is called a goroutine. The original Go task — the one Go creates when operating on a program’s main function — is called the main goroutine. In this case, when foo is invoked as go foo(), the main goroutine does not wait for foo() to finish; it proceeds, as foo() runs concurrently in a separate goroutine.

Channels

In Go, the chan keyword defines a channel. According to A Tour of Go, “Channels are a typed conduit through which you can send and receive values with the channel operator, <-.” A channel can transport data of only one type.

The <- operator indicates the channel direction: either send or receive. If no <- direction is specified, the channel is bi-directional.

For example:

chan Foo      // can be used to send & receive values of type Foo
chan<- string // send only; can be used to send strings
<-chan int    // receive only; can be used to receive ints

So, to elaborate:

ch <- s   // Send s to channel ch.
s := <-ch // Receive from ch, and assign value to s.

make is used to create a channel:

strCh := make(chan string) // a channel of strings
intCh := make(chan int)    // a channel of ints

In the case of the above channels, sends and receives block until the other side is ready. However, channels can also be buffered:

strCh := make(chan string, 100) // a buffered channel of length 100
intCh := make(chan int, 100)    // a buffered channel of length 100

Sends to a buffered channels block only when the buffer is full. Receives block only when the buffer is empty.

close is used to close a channel, indicating no more values will be sent. For example:

close(strCh) // close strCh
close(intCh) // close intCh

Receivers can check whether a channel is closed like so:

s, ok := <-ch

if ok {
  fmt.Println("s channel is not closed")
} else {
  fmt.Println("s channel is closed")
}

(Although, closing is really only necessary if the receiver must be explicitly told no more values will come, as might be necessary to terminate a loop, for example.)

Using channels to communicate between goroutines

Channels offer a mechanism through which separate goroutines can communicate, in effect offering a useful construct for concurrent programming. Performing multiple concurrent HTTP requests offers a common use case. A program that performs the HTTP requests serially — one at a time — is slower than one that performs the HTTP requests concurrently.

A non-concurrent Go program

For example, the following non-concurrent program — let’s call it fetch_urls.go — performs a series of HTTP requests, reports the time it took to perform each request, and reports the program’s total execution time:

package main

import (
	"fmt"
	"net/http"
	"time"
)

// main is the main goroutine
func main() {
	// Save the start time to a variable
	start := time.Now()

	// Create a slice of URLs to request
	urls := []string{
		"http://mikeball.info",
		"http://mikeball.me",
		"http://github.com/mdb",
	}

	// Call `fetch` with each URL in `urls` and print the results
	for _, url := range urls {
		fmt.Println(fetch(url))
	}

	// Print the total seconds spent in `main`
	fmt.Printf("Total time: %.2fs\n", time.Since(start).Seconds())
}

// fetch performs an HTTP request to the URL it's passed.
//
// If it does not encounter an error performing the HTTP
// request, returns a string like the following,
// reporting the total seconds required to perform the
// request, as well as the response status code:
//
// "http://foo.com  0.26s  200"
//
// If it encounters an error, it returns a string like the
// following:
//
// "http://foo.com request encountered error: some error"
func fetch(url string) string {
	start := time.Now()

	resp, err := http.Get(url)
	if err != nil {
		return fmt.Sprintf("%s request enountered error: %s", url, err.Error())
	}

	secs := time.Since(start).Seconds()

	// Return a summary string containing the URL, its request response time, and its HTTP status code
	return fmt.Sprintf("%s \t %.2fs \t %d", url, secs, resp.StatusCode)
}

When the program is run via go run fetch_urls.go, it prints the following:

go run fetch_urls.go
http://mikeball.info     0.96s   200
http://mikeball.me       0.35s   200
http://github.com/mdb    0.92s   200
Total time: 2.23s

Note that the program’s total execution time of 2.23s is the sum of the times consumed by each HTTP request.

A concurrent Go program

So, how might channels and goroutines be used to perform multiple concurrent HTTP requests? The following offers an example — similar to one I wrote about in 2018 — illustrating how the serial version of fetch_urls.go could be refactored to leverage concurrency:

package main

import (
	"fmt"
	"net/http"
	"time"
)

// main is the main goroutine
func main() {
	// Store the current time in a variable
	start := time.Now()

	// Create a channel of strings
	ch := make(chan string)

	urls := []string{
		"http://mikeball.info",
		"http://mikeball.me",
		"http://github.com/mdb",
	}

	// Call `fetch` in a new goroutine for each URL in `urls`
	for _, url := range urls {
		go fetch(url, ch)
	}

	// Receive and print each string sent to the `ch` channel from `fetch`
	for range urls {
		fmt.Println(<-ch)
	}

	// Print the total seconds spent in `main`
	// The individual request response times reported by `fetch` equal a sum greater than the total
	// seconds spent in `main`, thus illustrating that the `fetch` requests occurred concurrently.
	fmt.Printf("Total time: %.2fs\n", time.Since(start).Seconds())
}

// fetch performs an HTTP request to the URL it's passed.
//
// If it does not encounter an error performing the HTTP
// request, it sends a string like the following to the
// ch channel it's passed, reporting the total seconds
// required to perform the request, as well as the response
// status code:
//
// "http://foo.com  0.26s  200"
//
// If it encounters an error, it sends a string like the
// following too the ch channel it's passed:
//
// "http://foo.com request encountered error: some error"
func fetch(url string, ch chan<- string) {
	// Store the start time in a variable
	start := time.Now()

	resp, err := http.Get(url)
	if err != nil {
		ch <- fmt.Sprintf("%s request enountered error: %s", url, err.Error())

		return
	}

	// Store the seconds since start time in a variable
	secs := time.Since(start).Seconds()

	// Send a summary string to the `ch` channel containing the URL, its request response time, and its HTTP status code
	ch <- fmt.Sprintf("%s \t %.2fs \t %d", url, secs, resp.StatusCode)
}

Now, when the program is run via go run fetch_urls.go, it prints the following:

go run fetch_urls.go
http://mikeball.info     0.33s   200
http://mikeball.me       0.34s   200
http://github.com/mdb    0.93s   200
Total time: 0.93s

Note that the program’s total execution time of 0.93s is less than the sum of the times consumed by each HTTP request.

A real world example

A similar, real world example can be viewed in gossboss, a tool I recently wrote for collecting Goss test results from multiple Goss servers.

gossboss can be run as a server or used as a CLI. For example, gossboss healthzs reports the Goss test results from each Goss server --server specified:

gossboss healthzs \
  --server "http://some-goss-server/healthz" \
  --server "http://another-goss-server/healthz"

✔ http://some-goss-server/healthz
✔ http://another-goss-server/healthz

gossboss fetches the test results concurrently, as its gossboss.Client#CollectHealthzs method leverages a channel and goroutines:

func (c *Client) CollectHealthzs(urls []string) *Healthzs {
	hzs := &Healthzs{
		Summary: &Summary{
			Failed:  0,
			Errored: 0,
		},
	}
	ch := make(chan *Healthz)

	for _, url := range urls {
		go c.collectHealthz(url, ch)
	}

	// wait until all goss server test
	// results have been collected.
	for {
		hz := <-ch
		hzs.Healthzs = append(hzs.Healthzs, hz)

		if hz.Error == nil && hz.Result.Summary.Failed != 0 {
			hzs.Summary.Failed += hz.Result.Summary.Failed
		}

		if hz.Error != nil {
			hzs.Summary.Errored++
		}

		if len(hzs.Healthzs) == len(urls) {
			close(ch)
			break
		}
	}

	return hzs
}

For more insight on implementation details, check out github.com/mdb/gossboss. Do you have some ideas for how gossboss could be improved? Create a pull request.