How to encode/marshal a big slice into json in Golang
The problem
Say you have an io.Writer and you want to encode/marshal your very big slice of data into a JSON string and write it into that writer; it can be a file, an S3 upload stream, an HTTP response, or any other form of writer.
Marshaling a big chunk of data, let’s say 20GB, can be a heavy workload for the system, as it requires a significant amount of memory and processing power regardless of the language you use. The performance of the system will depend on several factors, such as specifications of the machine, the available memory and processing power, and the efficiency of the code used to perform the marshaling operation.
Assuming that the system has enough memory to handle the 20GB data, the marshaling operation can still take a considerable amount of time, especially if the data contains complex nested structures or arrays. It’s worth noting that the size of the resulting JSON string may be much larger than the original data, due to the overhead of JSON formatting and encoding. This can also impact the performance and memory usage of the system, especially when transmitting or storing the data.
I’m gonna show you a technique called streaming that improves performance and reduces memory usage.
Short answer
Streaming works like this: you manually create the JSON array structure in your io.Writer, then you iterate over your data, encode each element, write the JSON string to the writer, and manually take care of keeping the JSON array syntax valid.
This method will give you a lower transfer rate, since the encoding happens between each iteration and each write to the output is an IO-bound workload, but it only uses the amount of memory required for a single element of the array. Take a look at this example:
func encode(posts []Post, w io.Writer) {
	// Write the beginning of the JSON array
	w.Write([]byte("["))
	encoder := json.NewEncoder(w)

	// Encode the first element and write it to the response
	if len(posts) > 0 {
		encoder.Encode(posts[0])
	}

	// Encode and write the rest of the posts to the response while
	// being careful not to forget the commas between elements
	for i := 1; i < len(posts); i++ {
		w.Write([]byte(","))
		encoder.Encode(posts[i])
	}

	// Write the end of the JSON array
	w.Write([]byte("]"))
}
This method really shines when you plan to get your data from a channel. And you can reduce the effect of the blocking IO-bound workload by buffering the output, wrapping your response writer in a bufio.Writer. Go to the best solution for more details.
Always benchmark the solutions for your case. Depending on the size of your data, this solution might be slower due to the lower transfer rate.
Long answer
Say you have a large slice, or a stream of objects from a channel, and you need to encode them into JSON and write the result to a file or return it as an HTTP response. The following piece of code is a simplified, empty HTTP handler that needs to return the posts that were seeded by the init function.
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)

var posts []Post

type Post struct {
	Date string
}

func init() {
	// seed data (just for testing purposes)
	const count = 50_000_000
	posts = make([]Post, count)
	for i := 0; i < count; i++ {
		posts[i] = Post{Date: time.Now().String()}
	}
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func handler(w http.ResponseWriter, r *http.Request) {
	// Set the Content-Type header to application/json
	w.Header().Set("Content-Type", "application/json")

	// TODO: <---- need to return the posts in JSON format here
	// the encoded JSON response is going to be about 4GB
}
How do you do it?
The common solution
The most straightforward way to do it is to marshal the whole slice at once and then write it to the response, like this.
// This value allocates about 4GB of memory!!!
// The error is ignored for the sake of simplicity; please don't do that in real code.
jsonBytes, _ := json.Marshal(posts)

// Write the JSON response to the http.ResponseWriter
w.Write(jsonBytes)
Why is it bad?
First of all, jsonBytes requires 4GB of memory. In some cases, having that much memory available is simply not feasible.
I followed this post that shows how to generate a graph for transfer rate, but I changed it to chart the total data received instead, since the script doesn’t work well with short http calls.
$ time curl http://localhost:8080 2>&1 |tr -u '\r' '\n'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:03 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:04 --:--:-- 0
...
0 0 0 0 0 0 0 0 --:--:-- 0:00:43 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:44 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:45 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:46 --:--:-- 0
100 1400k 0 1400k 0 0 30517 0 --:--:-- 0:00:47 --:--:-- 297k
100 179M 0 179M 0 0 3835k 0 --:--:-- 0:00:48 --:--:-- 38.2M
100 340M 0 340M 0 0 7108k 0 --:--:-- 0:00:49 --:--:-- 72.3M
100 450M 0 450M 0 0 9222k 0 --:--:-- 0:00:50 --:--:-- 95.7M
100 975M 0 975M 0 0 19.1M 0 --:--:-- 0:00:51 --:--:-- 207M
100 1608M 0 1608M 0 0 30.9M 0 --:--:-- 0:00:52 --:--:-- 321M
100 2301M 0 2301M 0 0 43.4M 0 --:--:-- 0:00:53 --:--:-- 424M
100 2942M 0 2942M 0 0 54.4M 0 --:--:-- 0:00:54 --:--:-- 520M
100 3660M 0 3660M 0 0 66.5M 0 --:--:-- 0:00:55 --:--:-- 642M
100 3991M 0 3991M 0 0 71.9M 0 --:--:-- 0:00:55 --:--:-- 678M
________________________________________________________
Executed in 55.48 secs fish external
usr time 0.80 secs 0.48 millis 0.80 secs
sys time 1.74 secs 2.87 millis 1.74 secs
Here is the curl response and the chart for it. As you can see, the worst downside is that the API doesn’t return any response until the whole slice is converted to a string, and only then does it start a fast-paced stream. In my case it took 55.48 seconds to receive the whole response body, but for the first 46 seconds there was no incoming data.
An alternative solution
Well, you might think handing the writer directly to the encoder might solve this issue…
json.NewEncoder(w).Encode(posts)
But underneath, the Encode method uses a buffer that waits for the encoding process to complete before writing anything to the writer. So there is going to be no difference between this and the previous method.
A better solution
A possible solution is to iterate over the slice and create the JSON string for each item separately, manually writing each of them to the response writer one by one. By doing that, we make sure we’re not allocating any unnecessary memory, keeping the footprint as low as possible.
encoder := json.NewEncoder(w)

w.Write([]byte("[")) // <--- w is the http.ResponseWriter

if len(posts) > 0 {
	encoder.Encode(posts[0])
}

for i := 1; i < len(posts); i++ {
	w.Write([]byte(","))
	encoder.Encode(posts[i])
}

w.Write([]byte("]"))
What’s the catch?
The downside is that the transfer rate is significantly lower than the common solution’s, because writing to IO is slow and we are doing it many more times than before.
Every time we write something to an output writer, the operating system normally needs to perform several steps to get the data to the underlying medium, which can be disk or network. These steps can include copying the data from user space to kernel space, allocating memory for the data in kernel space, and scheduling disk I/O operations to write the data to disk.
If data is written in small chunks, each of these steps may need to be repeated for every chunk, which can result in a lot of overhead and slow performance.
$ time curl http://localhost:8080 2>&1 |tr -u '\r' '\n'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 64.7M 0 64.7M 0 0 51.6M 0 --:--:-- 0:00:01 --:--:-- 52.6M
100 161M 0 161M 0 0 71.6M 0 --:--:-- 0:00:02 --:--:-- 72.3M
100 258M 0 258M 0 0 79.4M 0 --:--:-- 0:00:03 --:--:-- 79.9M
100 368M 0 368M 0 0 86.6M 0 --:--:-- 0:00:04 --:--:-- 87.1M
100 471M 0 471M 0 0 89.7M 0 --:--:-- 0:00:05 --:--:-- 92.9M
100 578M 0 578M 0 0 92.5M 0 --:--:-- 0:00:06 --:--:-- 102M
100 689M 0 689M 0 0 95.0M 0 --:--:-- 0:00:07 --:--:-- 105M
100 786M 0 786M 0 0 95.2M 0 --:--:-- 0:00:08 --:--:-- 105M
100 892M 0 892M 0 0 96.4M 0 --:--:-- 0:00:09 --:--:-- 104M
100 1019M 0 1019M 0 0 99.4M 0 --:--:-- 0:00:10 --:--:-- 109M
100 1120M 0 1120M 0 0 99.5M 0 --:--:-- 0:00:11 --:--:-- 108M
100 1247M 0 1247M 0 0 101M 0 --:--:-- 0:00:12 --:--:-- 111M
100 1377M 0 1377M 0 0 103M 0 --:--:-- 0:00:13 --:--:-- 118M
100 1506M 0 1506M 0 0 105M 0 --:--:-- 0:00:14 --:--:-- 122M
100 1636M 0 1636M 0 0 107M 0 --:--:-- 0:00:15 --:--:-- 123M
100 1765M 0 1765M 0 0 108M 0 --:--:-- 0:00:16 --:--:-- 128M
100 1891M 0 1891M 0 0 109M 0 --:--:-- 0:00:17 --:--:-- 128M
100 2025M 0 2025M 0 0 110M 0 --:--:-- 0:00:18 --:--:-- 129M
100 2157M 0 2157M 0 0 112M 0 --:--:-- 0:00:19 --:--:-- 130M
100 2281M 0 2281M 0 0 112M 0 --:--:-- 0:00:20 --:--:-- 128M
100 2386M 0 2386M 0 0 112M 0 --:--:-- 0:00:21 --:--:-- 124M
100 2519M 0 2519M 0 0 113M 0 --:--:-- 0:00:22 --:--:-- 125M
100 2651M 0 2651M 0 0 114M 0 --:--:-- 0:00:23 --:--:-- 125M
100 2780M 0 2780M 0 0 114M 0 --:--:-- 0:00:24 --:--:-- 124M
100 2921M 0 2921M 0 0 115M 0 --:--:-- 0:00:25 --:--:-- 128M
100 3057M 0 3057M 0 0 116M 0 --:--:-- 0:00:26 --:--:-- 134M
100 3199M 0 3199M 0 0 117M 0 --:--:-- 0:00:27 --:--:-- 135M
100 3339M 0 3339M 0 0 118M 0 --:--:-- 0:00:28 --:--:-- 137M
100 3462M 0 3462M 0 0 118M 0 --:--:-- 0:00:29 --:--:-- 136M
100 3600M 0 3600M 0 0 119M 0 --:--:-- 0:00:30 --:--:-- 135M
100 3716M 0 3716M 0 0 118M 0 --:--:-- 0:00:31 --:--:-- 131M
100 3845M 0 3845M 0 0 119M 0 --:--:-- 0:00:32 --:--:-- 129M
100 3984M 0 3984M 0 0 119M 0 --:--:-- 0:00:33 --:--:-- 129M
100 4039M 0 4039M 0 0 120M 0 --:--:-- 0:00:33 --:--:-- 131M
________________________________________________________
Executed in 33.67 secs fish external
usr time 5.96 secs 0.38 millis 5.96 secs
sys time 10.48 secs 1.37 millis 10.48 secs
Despite the IO-bound workload blockers we face, the chart shows a huge improvement over waiting for the big data chunk to be encoded, and we finish the transfer faster than before: almost 33 seconds instead of 55. But there is a way to improve this further.
Best way
Remember when I said every attempt at writing to an output takes time? What if we buffer our output and write bigger chunks each time?
// Wrap the http.ResponseWriter with a bufio.Writer with a size of 64KB
buffer := bufio.NewWriterSize(w, 65536)

encoder := json.NewEncoder(buffer)

buffer.WriteByte('[')

if len(posts) > 0 {
	encoder.Encode(posts[0])
}

for i := 1; i < len(posts); i++ {
	buffer.WriteByte(',')
	encoder.Encode(posts[i])
}

buffer.WriteByte(']')

// Flush the bufio.Writer to ensure all data is written to the http.ResponseWriter
buffer.Flush()
What we did here is wrap the response writer in a buffered IO writer and use that as the output; fewer, larger writes mean far less per-write overhead. Here is the best result I got, with the buffer set to 64KB.
$ time curl http://localhost:8080 2>&1 |tr -u '\r' '\n'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 50.1M 0 50.1M 0 0 205M 0 --:--:-- --:--:-- --:--:-- 212M
100 272M 0 272M 0 0 218M 0 --:--:-- 0:00:01 --:--:-- 220M
100 500M 0 500M 0 0 223M 0 --:--:-- 0:00:02 --:--:-- 223M
100 736M 0 736M 0 0 226M 0 --:--:-- 0:00:03 --:--:-- 227M
100 975M 0 975M 0 0 229M 0 --:--:-- 0:00:04 --:--:-- 230M
100 1198M 0 1198M 0 0 228M 0 --:--:-- 0:00:05 --:--:-- 229M
100 1437M 0 1437M 0 0 230M 0 --:--:-- 0:00:06 --:--:-- 233M
100 1664M 0 1664M 0 0 229M 0 --:--:-- 0:00:07 --:--:-- 232M
100 1904M 0 1904M 0 0 230M 0 --:--:-- 0:00:08 --:--:-- 233M
100 2144M 0 2144M 0 0 232M 0 --:--:-- 0:00:09 --:--:-- 233M
100 2382M 0 2382M 0 0 232M 0 --:--:-- 0:00:10 --:--:-- 236M
100 2601M 0 2601M 0 0 231M 0 --:--:-- 0:00:11 --:--:-- 232M
100 2831M 0 2831M 0 0 231M 0 --:--:-- 0:00:12 --:--:-- 233M
100 3062M 0 3062M 0 0 231M 0 --:--:-- 0:00:13 --:--:-- 231M
100 3294M 0 3294M 0 0 231M 0 --:--:-- 0:00:14 --:--:-- 229M
100 3518M 0 3518M 0 0 230M 0 --:--:-- 0:00:15 --:--:-- 227M
100 3732M 0 3732M 0 0 229M 0 --:--:-- 0:00:16 --:--:-- 226M
100 3962M 0 3962M 0 0 229M 0 --:--:-- 0:00:17 --:--:-- 226M
100 4038M 0 4038M 0 0 229M 0 --:--:-- 0:00:17 --:--:-- 225M
________________________________________________________
Executed in 17.59 secs fish external
usr time 0.88 secs 0.43 millis 0.88 secs
sys time 1.62 secs 1.53 millis 1.62 secs
As you can see, by streaming the data one element at a time we got the full transfer done in 17.59 seconds, plus a reduction in memory usage. Isn’t that good?
How to decide the buffer size?
The default buffer size is 4KB, which can make the transfer faster or slower depending on the elements you have in the array; it isn’t optimized for your case. The effective buffer size will differ from case to case, and it’s hard to say which size works best without testing. So don’t rush: run tests and compare the results.
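As a starting point for such testing, here is a rough, hypothetical timing harness (timeEncode is my own name). It writes to io.Discard, so it only captures encoding and buffer-copy overhead; numbers against a real file or network writer will differ, which is exactly why you should measure against your actual target:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"time"
)

// timeEncode measures how long it takes to stream n small elements
// through a bufio.Writer of the given size into a discard sink.
func timeEncode(n, bufSize int) time.Duration {
	w := bufio.NewWriterSize(io.Discard, bufSize)
	enc := json.NewEncoder(w)

	start := time.Now()
	w.WriteByte('[')
	for i := 0; i < n; i++ {
		if i > 0 {
			w.WriteByte(',')
		}
		enc.Encode(i)
	}
	w.WriteByte(']')
	w.Flush()
	return time.Since(start)
}

func main() {
	// Try a few candidate buffer sizes and compare.
	for _, size := range []int{4 << 10, 16 << 10, 64 << 10, 256 << 10} {
		fmt.Printf("%7d bytes: %v\n", size, timeEncode(1_000_000, size))
	}
}
```

For a server, Go's own testing/benchmark tooling (go test -bench) against the real handler gives more trustworthy numbers than a one-off loop like this.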