Structure Data with Protocol Buffers in GoLang

When building distributed services, you’re communicating between the services over a network.

To send data (such as your structs) over a network, you need to encode the data in a format to transmit, and lots of programmers choose JSON.

When you’re building public APIs or you’re creating a project where you don’t control the clients, JSON makes sense because it’s accessible—both for humans to read and computers to parse.

But when you're building private APIs or projects where you do control the clients, you can use a mechanism for structuring and transmitting data that, compared to JSON, helps you:

be more productive
create faster services
have more features
have fewer bugs

So what is this mechanism?

Protocol buffers

Protocol buffers (also known as protobuf) are Google's language-neutral, platform-neutral, extensible mechanism for structuring and serializing data.

The advantages of using protobuf:

Guarantees type-safety;
Prevents schema-violations;
Enables fast serialization;
Offers backward compatibility.

Yeah, I heard you:

What's backward compatibility?

In software, backward compatibility means that new versions or updates can still work with older versions or systems. For example, when you upgrade your phone's operating system, backward compatibility ensures that your existing apps keep working without any issues. The new fits with the old, so older components and systems can stay in use alongside newer ones.

What can you do with Protobuf?

define your data structure,
compile protobuf into code in many languages,
read and write structured data to and from different data streams.

Protocol buffers are good for communicating between two systems (such as microservices), which is why Google used protobuf when building gRPC, its high-performance remote procedure call (RPC) framework.

Why Use Protocol Buffers?

Protobuf offers all kinds of useful features:

Consistent schemas
- Define your data schemas once and share them across services, for example from a central repository (such as one named "structs") that all your microservices depend on. This ensures a consistent data model throughout your system.
Versioning for free
- Fields in messages are numbered, which keeps messages backward compatible as you roll out new features and changes. You can also mark retired fields as reserved, so the compiler reports an error if anyone tries to use them.
Less boilerplate
- The protobuf libraries handle encoding and decoding for you, which means you don't have to handwrite that code yourself.
Extensibility
- The compiler supports extensions that generate custom code during compilation, such as common methods shared across multiple structs.
Language agnosticism
- Protobuf is implemented in many languages.
Performance
- Protobuf is highly performant, has smaller payloads, and can serialize up to six times faster than JSON.

Protobuf Vs JSON

Key differences between Protobuf and JSON

When choosing between JSON and Protobuf, consider the specific needs of your project.

JSON is often preferred for its ease of use, human readability, and broad compatibility, making it a good choice for web APIs and configurations.

Protobuf, on the other hand, offers advantages in performance, efficiency, and type safety, making it better suited for internal microservices communication, especially in performance-critical applications.

Aspect	JSON	Protobuf
Format	Text-based, human-readable	Binary, not human-readable
Size	Larger due to text format, which increases payload	Smaller, efficient encoding, which reduces payload
Speed	Generally slower serialization/deserialization due to parsing text	Faster serialization/deserialization due to compact binary format
Compatibility	Broadly supported across many programming languages and systems	Requires specific support; while widely supported, it's not as universal as JSON
Schema	Schema-less, flexible structure	Requires predefined schema, which enforces structure and data types
Versioning	Less formal support for versioning; changes in data structure can lead to issues	Strong support for backward and forward compatibility through numbered fields
Ease of Use	Easy to use directly with minimal setup, great for debugging	Requires initial setup to define schema, less straightforward for beginners
Interoperability	Excellent for APIs where interoperability with web technologies is crucial	Ideal for internal communication where efficiency is critical and environments are controlled
Type Safety	Less type-safe, relies on runtime interpretation	Highly type-safe, with compile-time checks
Tooling and Support	Extensive tooling is available due to its ubiquity in web development	Good tooling, especially within systems designed for high efficiency, but can be more complex to set up

Benchmark Results between Protobuf and JSON

Here's a hypothetical example illustrating the kind of differences you might see in a benchmark comparing JSON and Protobuf:

Metric	JSON	Protobuf	Improvement with Protobuf
Serialization Time (ms)	2.0	0.5	4x faster
Deserialization Time (ms)	2.5	0.6	~4.2x faster
Payload Size (KB)	10	3	~3.3x smaller
CPU Usage during Serialization	High	Low	More efficient
Memory Usage during Serialization	Moderate	Low	More efficient

Serialization/Deserialization Speed

Protobuf is generally much faster than JSON for both serialization and deserialization. Depending on the data and the implementation, Protobuf can outperform JSON by a factor of roughly 3x to 6x in speed. This is because Protobuf uses a binary format and a predefined schema, so the parser doesn't have to scan text or work out field names and types at run time.

Payload Size

Protobuf payloads are significantly smaller than JSON, often by 50% to 80%. This reduction in payload size leads to lower bandwidth usage and can be critical in network-constrained environments or for mobile applications where data usage is a concern.
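To make the size difference concrete, here is a minimal sketch that encodes the same small record as JSON and as hand-written protobuf-style bytes, using only the Go standard library. The Record type and the encodeWire helper are assumptions made for illustration. Real projects would use the generated code rather than encoding by hand.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// Record is a hypothetical record with a byte payload and an offset.
type Record struct {
	Value  []byte `json:"value"`
	Offset uint64 `json:"offset"`
}

// encodeWire hand-encodes r the way the protobuf wire format lays out a
// message declared as `bytes value = 1; uint64 offset = 2;`.
// It is an illustration only, not a replacement for the protobuf library.
func encodeWire(r Record) []byte {
	var b []byte
	if len(r.Value) > 0 {
		b = append(b, 1<<3|2) // tag: field 1, wire type 2 (length-delimited)
		b = binary.AppendUvarint(b, uint64(len(r.Value)))
		b = append(b, r.Value...)
	}
	if r.Offset != 0 {
		b = append(b, 2<<3|0) // tag: field 2, wire type 0 (varint)
		b = binary.AppendUvarint(b, r.Offset)
	}
	return b
}

// jsonSize returns the number of bytes in the JSON encoding of r.
func jsonSize(r Record) int {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return len(b)
}

func main() {
	r := Record{Value: []byte("hello"), Offset: 150}
	fmt.Println("JSON bytes:", jsonSize(r))
	fmt.Println("wire bytes:", len(encodeWire(r)))
}
```

For this input the JSON form is 33 bytes, because field names are spelled out and the byte slice is base64-encoded, while the hand-encoded form is 10 bytes. The exact ratio depends heavily on your data, so treat this as a feel for the mechanism rather than a benchmark.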

System Resource Utilization

Protobuf uses fewer CPU resources than JSON due to its binary format and efficient parsing and serialization mechanisms. This can lead to lower server costs and better scalability in distributed systems.

These results are illustrative and can vary based on the complexity of the data structure, the programming language, and the specific implementation. In general, Protobuf's advantages in speed and efficiency make it a preferred choice for internal service communication in distributed systems where performance and resource utilization are critical concerns. However, for public APIs or scenarios where human readability and ease of debugging are paramount, JSON might still be the preferred format.

Install the Protocol Buffer Compiler

To install the compiler, go to the Protobuf releases page on GitHub and download the relevant release for your computer. I'm using Ubuntu Linux.

Download and install it from your terminal like so:

$ wget https://github.com/protocolbuffers/protobuf/\
releases/download/v25.3/protoc-25.3-linux-x86_64.zip
$ unzip protoc-25.3-linux-x86_64.zip -d /usr/local/protobuf

Then add the binary to your PATH env var using your shell’s configuration file. If you’re using ZSH for instance, run something like the following to update your configuration:

$ echo 'export PATH="$PATH:/usr/local/protobuf/bin"' >> ~/.zshenv

At this point, the protobuf compiler is installed on your machine. To test the installation, run protoc --version.

$ protoc --version
Output:
libprotoc 25.3

If you do see errors, don't worry: installation problems are usually few, and searching for the error message will typically turn up an answer right away.

Define Your Domain Types as Protocol Buffers

In the previous tutorial, we defined our Record type in Go as this struct:

type Record struct {
    Value  []byte `json:"value"`
    Offset uint64 `json:"offset"`
}

Let's turn that into a protobuf message.

Create an api/v1 directory and, inside it, a file called log.proto:

syntax = "proto3";
package log.v1;
option go_package = "github.com/user/api/log_v1";

message Record {
    bytes value = 1;
    uint64 offset = 2;
}

Protobuf messages are equivalent to Go structs: each field has a type, a name, and a field number.
Use the repeated keyword to define a slice of some type. For example, repeated Record records would make the records field a []*Record in the generated Go code.
The field numbers identify your fields in the marshaled binary format, so you shouldn't change them once your messages are in use in your projects.
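To see why the field numbers matter, here is a small sketch of a decoder that reads protobuf-style bytes by hand using the standard library. The Record stand-in and the decodeRecord function are assumptions for illustration, and the sketch only handles the two wire types this message needs. The point is that the bytes carry field numbers, not field names, so a decoder matches on numbers and can skip the ones it doesn't know.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Record is a hand-written stand-in for the generated type.
type Record struct {
	Value  []byte
	Offset uint64
}

// decodeRecord walks the (tag, value) pairs in b. Each tag packs a field
// number and a wire type. Fields with unknown numbers are skipped, which
// is what lets an old reader accept messages from a newer writer.
func decodeRecord(b []byte) (Record, error) {
	var r Record
	for len(b) > 0 {
		tag, n := binary.Uvarint(b)
		if n <= 0 {
			return r, errors.New("bad tag")
		}
		b = b[n:]
		field, wireType := tag>>3, tag&7
		switch wireType {
		case 0: // varint
			v, n := binary.Uvarint(b)
			if n <= 0 {
				return r, errors.New("bad varint")
			}
			b = b[n:]
			if field == 2 {
				r.Offset = v
			}
		case 2: // length-delimited
			l, n := binary.Uvarint(b)
			if n <= 0 || uint64(len(b)-n) < l {
				return r, errors.New("bad length")
			}
			if field == 1 {
				r.Value = append([]byte(nil), b[n:n+int(l)]...)
			}
			b = b[n+int(l):]
		default:
			return r, fmt.Errorf("unsupported wire type %d", wireType)
		}
	}
	return r, nil
}

func main() {
	// Field 1 = "hi", field 2 = 7, plus a field 3 this decoder has never heard of.
	msg := []byte{0x0A, 0x02, 'h', 'i', 0x10, 0x07, 0x18, 0x01}
	r, err := decodeRecord(msg)
	fmt.Println(string(r.Value), r.Offset, err)
}
```

If you renumbered offset from 2 to 3 after data had been written, existing bytes tagged with field 2 would no longer land in the right place. Note that the real protobuf library keeps unknown fields around instead of dropping them as this sketch does.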

Compile Protocol Buffers

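To compile protobuf into Go code, you need a language-specific plugin and runtime. The compiler itself doesn't know how to generate code for every language. For Go, the protoc-gen-go plugin generates the code, and the google.golang.org/protobuf module provides the runtime that the generated code depends on.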

Add the following two lines to your ~/.bashrc file (or your shell's equivalent configuration file), then reload it by running source ~/.bashrc:

export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin

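Install the code generator plugins, then add the protobuf runtime to your module (run go get from inside your Go module):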
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
go get google.golang.org/protobuf

Let's compile your protobuf by running this command at the root of the project:

protoc api/v1/*.proto \
--go_out=. \
--go_opt=paths=source_relative \
--proto_path=.

Now look at the api/v1 directory and you’ll see a new file called log.pb.go. Open it up to see the Go code that the compiler generated from your protobuf code.

If you get the following error, it usually means that the directory go install writes to (here $GOPATH/bin) is not on your PATH. This Stack Overflow question covers the fix:
Error: "protoc-gen-go: program not found or is not executable"

Link: https://stackoverflow.com/questions/57700860/error-protoc-gen-go-program-not-found-or-is-not-executable#:~:text=Go%201.17%2B,file.go
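If you'd rather check your setup from Go than guess, a quick sketch like the following reports which tools can't be found on your PATH. The missingTools helper is a hypothetical name. It relies only on exec.LookPath from the standard library.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns the names from tools that cannot be found on PATH.
func missingTools(tools ...string) []string {
	var missing []string
	for _, t := range tools {
		if _, err := exec.LookPath(t); err != nil {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	missing := missingTools("protoc", "protoc-gen-go")
	if len(missing) == 0 {
		fmt.Println("protoc and protoc-gen-go are on PATH")
		return
	}
	fmt.Println("not found on PATH:", missing)
}
```

If protoc-gen-go shows up as missing, the PATH line for $GOPATH/bin most likely hasn't been loaded in your current shell.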

Work with the Generated Code

Although the generated code in log.pb.go is a lot longer than your handwritten code in log.go, you use it as if you had handwritten it. For example, you create instances using the & operator (or the new built-in function) and access fields using a dot.
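Here is a rough sketch of what working with the generated type feels like. The Record below is a hand-written stand-in so the example runs on its own. The real struct in log.pb.go carries extra internal fields and more methods. The getter mirrors the nil-safe shape that protoc-gen-go emits, which is an assumption you can confirm by opening your own log.pb.go.

```go
package main

import "fmt"

// Record is a hand-written stand-in for the struct in log.pb.go.
type Record struct {
	Value  []byte
	Offset uint64
}

// GetOffset mirrors the shape of a generated getter: it is safe to call
// on a nil *Record and returns the zero value in that case.
func (r *Record) GetOffset() uint64 {
	if r != nil {
		return r.Offset
	}
	return 0
}

func main() {
	// Create an instance with & and read fields with a dot.
	rec := &Record{Value: []byte("hello"), Offset: 3}
	fmt.Println(string(rec.Value), rec.Offset)

	// new works too and gives you a zero-valued record.
	empty := new(Record)
	fmt.Println(empty.Offset)

	// The getter works even on a nil pointer; missing.Offset would panic.
	var missing *Record
	fmt.Println(missing.GetOffset())
}
```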

The compiler generates various methods on the struct, but the only methods you'll use directly are the getters. They are useful when you have multiple messages with the same getter(s) and you want to abstract those method(s) into an interface.

For example, imagine building an e-commerce shop that sells books and games. Every item has a different price, and you want to find the total of the items in the user's cart. You'd make a Pricer interface (the abstraction) and a Total function that takes in a slice of Pricer interfaces and returns their total cost. Here's what the code would look like:

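// book.go
type Book struct {
    Price uint64
}

// GetPrice returns the book's price.
func (b *Book) GetPrice() uint64 { return b.Price }

// game.go
type Game struct {
    Price uint64
}

// GetPrice returns the game's price.
func (g *Game) GetPrice() uint64 { return g.Price }

// calculate-price-service.go
type Pricer interface {
    GetPrice() uint64
}

// Total sums the prices of all the items.
func Total(items []Pricer) uint64 {
    var total uint64
    for _, item := range items {
        total += item.GetPrice()
    }
    return total
}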

This way, you can pass any item to the Total function as long as it implements the GetPrice method; that's how interfaces work in Go.

But what if you want to change the price of everything in your inventory: books, games, and others?

If we just had setters, we could use an interface like the following to set the price on the different kinds of items in your inventory:

type PriceAdjuster interface {
    SetPrice(price uint64)
}
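As a sketch of how that could look, the following example writes the setters by hand on stand-in Book and Game types. The AdjustablePricer interface and the Discount function are hypothetical names introduced here, not something the compiler generates for you.

```go
package main

import "fmt"

// Book and Game are hand-written stand-ins for generated message types.
type Book struct{ Price uint64 }
type Game struct{ Price uint64 }

func (b *Book) GetPrice() uint64      { return b.Price }
func (b *Book) SetPrice(price uint64) { b.Price = price }
func (g *Game) GetPrice() uint64      { return g.Price }
func (g *Game) SetPrice(price uint64) { g.Price = price }

type Pricer interface {
	GetPrice() uint64
}

type PriceAdjuster interface {
	SetPrice(price uint64)
}

// AdjustablePricer is a hypothetical interface combining both abstractions.
type AdjustablePricer interface {
	Pricer
	PriceAdjuster
}

// Discount lowers every item's price by percent (capped at 100).
func Discount(items []AdjustablePricer, percent uint64) {
	if percent > 100 {
		percent = 100
	}
	for _, item := range items {
		item.SetPrice(item.GetPrice() * (100 - percent) / 100)
	}
}

func main() {
	book, game := &Book{Price: 1000}, &Game{Price: 5000}
	Discount([]AdjustablePricer{book, game}, 10)
	fmt.Println(book.Price, game.Price)
}
```

With setters in place, one function can adjust every kind of item, the same way Total works with every kind of item through the getters.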

When the compiled code isn’t quite what you need, you can extend the code and add your customization to it.