Deep dive into gRPC: Client implementation in Java, and Why we lose initial requests at the server side if the Deadline is less than a second?

Vikesh Yadav

7 min read · Aug 16, 2020

This is a follow-up to my previous article. Please go through it first for a better understanding.

Points covered in this article:

The architecture of gRPC
Types of RPC
Channel
Types of stub
Deadline
Java Client example with Deadline
Why we lose initial requests at the server if the Deadline is less than a second?

The architecture of gRPC.

The gRPC architecture is divided into three layers: Surface (which contains the Surface API and Filters), Transport, and IO-Manager.

The Surface API exposes the underlying C-core APIs, since applications do not talk to the core APIs of gRPC directly.
Filters extend the core functionality of gRPC, for example:
auth_filter: this filter performs authentication for RPC calls.
deadline_filter: this filter implements the deadline for RPC calls.
client_channel_filter: this is a very important filter that does name resolution, load balancing, etc. We will discuss this filter in detail.
Transport implements the wire protocol. By default it uses HTTP/2, but other transports such as QUIC and Cronet can be used.
The IO-Manager layer provides the abstraction for gRPC reads and writes (i.e. endpoints).
This layer also provides network utilities, timers, and a bunch of other primitives that are platform specific (Windows-based or POSIX-based implementations).

Types of RPC.

There are four types of RPC: Unary, Client Streaming, Server Streaming, and Bidirectional Streaming. We are going to discuss the first two.
Unary RPC is used where the client sends exactly one message and receives exactly one response, just like a normal function call.
Client Streaming is where the client writes a sequence of messages and sends them to the server using the provided stream; the server replies with a single response once the client has finished writing.
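To make the difference between the two call shapes concrete, here is a minimal sketch in plain Java. It does not use the gRPC library at all; the class `RpcShapes` and its methods are hypothetical names that only model who sends how many messages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, gRPC-free model of the two call shapes discussed above.
class RpcShapes {

    // Unary: one request in, one response out, like a normal function call.
    static String unaryCall(Function<String, String> handler, String request) {
        return handler.apply(request);
    }

    // Client streaming: the client writes many messages and then closes
    // the stream; the server answers once with a single response.
    static final class ClientStream {
        private final List<String> received = new ArrayList<>();
        private final Function<List<String>, String> handler;

        ClientStream(Function<List<String>, String> handler) {
            this.handler = handler;
        }

        void send(String message) {      // zero or more times
            received.add(message);
        }

        String closeAndReceive() {       // exactly once
            return handler.apply(received);
        }
    }

    public static void main(String[] args) {
        System.out.println(unaryCall(name -> "Hello " + name, "gRPC"));

        ClientStream stream =
                new ClientStream(msgs -> "received " + msgs.size() + " messages");
        stream.send("a");
        stream.send("b");
        stream.send("c");
        System.out.println(stream.closeAndReceive());
    }
}
```

With real gRPC the same shapes come from the generated stubs, but the message counts on each side are the same as in this model.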

Let’s discuss Channel.

A Channel is a communication pipe or tunnel between the client and the target.
A call is bound to a Channel. The relation is many-to-one, meaning multiple calls can share the same Channel.
Client and server perform the operations (ops) of a call in batches; they can perform all the ops in a single batch or split them into multiple batches, depending on the RPC.
A Completion Queue is a queue that keeps track of these batches: once a batch is done with its ops, a notification is posted to the Completion Queue so that the next batch can be executed.
The operations exchanged between client and server are as follows:
Send initial metadata. Exactly once per Call by server and client
Receive initial metadata. Exactly once per Call by server and client
Send message. Zero or more by server and client (only one message is sent for Unary RPC)
Receive message. Zero or more by server and client (only one message is received for Unary RPC)
Send trailing metadata. Exactly once per Call by server and client
Receive trailing metadata. Exactly once per Call by server and client
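The batch and Completion Queue mechanics can be sketched roughly as follows. This is a simplified model in plain Java, not the C-core API: the `Op` names just mirror the list above, and `startBatch` and `next` are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of batches and a completion queue; not the C-core API.
class CompletionQueueSketch {
    enum Op {
        SEND_INITIAL_METADATA, SEND_MESSAGE, SEND_TRAILING_METADATA,
        RECV_INITIAL_METADATA, RECV_MESSAGE, RECV_TRAILING_METADATA
    }

    private final BlockingQueue<String> completions = new LinkedBlockingQueue<>();
    final List<Op> wireLog = new ArrayList<>();

    // Performs every op of the batch, then posts the batch's tag to the queue.
    void startBatch(String tag, List<Op> ops) {
        wireLog.addAll(ops);      // stand-in for doing the ops on the wire
        completions.add(tag);     // notify: this batch has finished
    }

    // Blocks until some batch has completed and returns its tag.
    String next() {
        try {
            return completions.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        CompletionQueueSketch cq = new CompletionQueueSketch();
        // A unary call split into two batches on the client side.
        cq.startBatch("send", Arrays.asList(Op.SEND_INITIAL_METADATA, Op.SEND_MESSAGE));
        System.out.println("completed: " + cq.next());
        cq.startBatch("recv", Arrays.asList(
                Op.RECV_INITIAL_METADATA, Op.RECV_MESSAGE, Op.RECV_TRAILING_METADATA));
        System.out.println("completed: " + cq.next());
        System.out.println("ops performed: " + cq.wireLog.size());
    }
}
```

In this sketch the ops run synchronously inside `startBatch`; in the real core they complete asynchronously, which is why a dedicated thread waits on the queue.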
Channel-Creation
- A Completion Queue gets created, and a thread waits on this queue indefinitely
- The client calls the API to create a Channel to the server, specifying the server URI
- The client channel filter chooses a resolver based on the URI scheme of the server
- The resolver returns a list of addresses; these are the server addresses
- The client channel filter then initializes the LB policy. Currently there are 3 LB policies: Pick First (used by default), Round Robin, and gRPC LB
- This is called client channel load balancing
- The client channel filter then creates the Channel with all the backends returned by the resolver

The client created a channel with all the servers returned by the resolver
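The resolver and load-balancing steps of channel creation can be pictured with a small sketch. It is plain Java with hypothetical names (`resolve`, `pickFirst`, `roundRobin`) and made-up backend addresses. It also simplifies Pick First by assuming the first address is reachable, whereas the real policy tries the addresses in order until one connects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of name resolution plus two LB policies.
class LoadBalancingSketch {
    interface Picker {
        String pick();   // returns the backend address for the next call
    }

    // Stand-in resolver: a real one would look the target up (e.g. via DNS)
    // according to its URI scheme. The addresses below are made up.
    static List<String> resolve(String target) {
        return Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080");
    }

    // Pick First: every call goes to the same backend
    // (simplified: assumes the first address is reachable).
    static Picker pickFirst(List<String> addresses) {
        return () -> addresses.get(0);
    }

    // Round Robin: calls rotate over all backends.
    static Picker roundRobin(List<String> addresses) {
        AtomicInteger next = new AtomicInteger();
        return () -> addresses.get(
                Math.floorMod(next.getAndIncrement(), addresses.size()));
    }

    public static void main(String[] args) {
        List<String> backends = resolve("dns:///my-service.example:8080");
        Picker first = pickFirst(backends);
        Picker rr = roundRobin(backends);
        for (int i = 0; i < 4; i++) {
            System.out.println("pick_first=" + first.pick()
                    + "  round_robin=" + rr.pick());
        }
    }
}
```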

Channel creation happens lazily: the Channel is not connected immediately, but only when a Call (request) is made to the server (please keep this in mind).
Once the server acknowledges the channel creation by the client, the client creates Sub-Channels.
A Channel is a collection of Sub-Channels; it is the Sub-Channels that actually map the calls to the server.
Channel creation can take up to 3 seconds (please keep this in mind; we have a use case to discuss).
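The lazy behaviour is easy to picture with a small sketch. It is a plain Java model, not the gRPC channel itself: `LazyChannelSketch` and its `connector` are hypothetical, and the connector stands in for the expensive work of name resolution and connection setup.

```java
import java.util.function.Supplier;

// Hypothetical model of a channel that connects only on the first call.
class LazyChannelSketch {
    private final Supplier<String> connector;  // expensive connection setup
    private String connection;                 // null until the first call
    int connectAttempts = 0;

    LazyChannelSketch(Supplier<String> connector) {
        this.connector = connector;            // note: nothing is connected here
    }

    boolean isConnected() {
        return connection != null;
    }

    synchronized String call(String request) {
        if (connection == null) {              // the first call pays for the setup
            connectAttempts++;
            connection = connector.get();
        }
        return connection + " handled " + request;
    }

    public static void main(String[] args) {
        LazyChannelSketch channel = new LazyChannelSketch(() -> "conn-1");
        System.out.println("connected after creation: " + channel.isConnected());
        System.out.println(channel.call("request-1"));
        System.out.println(channel.call("request-2"));
        System.out.println("connect attempts: " + channel.connectAttempts);
    }
}
```

Only the first call triggers the connector; later calls reuse the established connection, which is why the setup cost lands entirely on the earliest requests.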

Types of Stub

On the client side, the client has an object known as a stub. There are three types of stub: Blocking, Non-Blocking, and Listenable Future.
With a Blocking Stub, the client sends a request to the server and waits for the response.
A Non-Blocking Stub makes calls asynchronously, and the response is returned asynchronously from the server. It is the stub to use for streaming calls such as Client-side streaming RPCs.
With a Listenable Future Stub, the client doesn't wait for the response; the call returns a ListenableFuture, and the client should add a callback to it to get the response.
Refer here for more info
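The three calling styles can be compared side by side in a small sketch. It is plain Java and uses `CompletableFuture` from the JDK as a stand-in for the `ListenableFuture` that the real future stub returns; `serve`, `blockingCall`, `asyncCall` and `futureCall` are hypothetical names, not generated stub methods.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical model of the three stub styles, without the gRPC library.
class StubStylesSketch {
    // Pretend remote method shared by the three styles.
    static String serve(String request) {
        return "Hello " + request;
    }

    // Blocking style: the caller's thread waits for the response.
    static String blockingCall(String request) {
        return serve(request);
    }

    // Non-blocking style: the response is delivered to observer callbacks.
    static CompletableFuture<Void> asyncCall(
            String request, Consumer<String> onNext, Runnable onCompleted) {
        return CompletableFuture.supplyAsync(() -> serve(request))
                .thenAccept(onNext)
                .thenRun(onCompleted);
    }

    // Future style: returns at once with a handle to the eventual response.
    static CompletableFuture<String> futureCall(String request) {
        return CompletableFuture.supplyAsync(() -> serve(request));
    }

    public static void main(String[] args) {
        System.out.println("blocking: " + blockingCall("gRPC"));

        asyncCall("gRPC",
                r -> System.out.println("async: " + r),
                () -> System.out.println("async: completed")).join();

        CompletableFuture<String> future = futureCall("gRPC");
        future.thenAccept(r -> System.out.println("future: " + r)).join();
    }
}
```

The `join()` calls in `main` are only there so the program does not exit before the asynchronous callbacks have run.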

What is Deadline?

A Deadline lets a client specify how long it is willing to wait for an RPC to complete. The deadline is an absolute point in time rather than a duration. If the RPC is not completed by the Deadline, it is terminated with a DEADLINE_EXCEEDED error.

How is the deadline propagated across RPCs?

A deadline is calculated before an RPC call is made to the server
deadline = current timestamp + deadline duration
When the current timestamp becomes greater than the deadline, the RPC gets terminated
Because the deadline travels with the call, the server can always tell whether an RPC is still valid
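The arithmetic above fits in a few lines. The following is a minimal sketch in plain Java; `DeadlineSketch` is a hypothetical class, and the current time is passed in explicitly so the behaviour is easy to follow. In grpc-java the deadline is normally set on the stub with `withDeadlineAfter(5, TimeUnit.MILLISECONDS)`.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of an absolute deadline; not the io.grpc.Deadline class.
class DeadlineSketch {
    private final long deadlineNanos;   // absolute point on a monotonic clock

    private DeadlineSketch(long deadlineNanos) {
        this.deadlineNanos = deadlineNanos;
    }

    // deadline = current timestamp + deadline duration
    static DeadlineSketch after(long duration, TimeUnit unit, long nowNanos) {
        return new DeadlineSketch(nowNanos + unit.toNanos(duration));
    }

    // True once the current timestamp has reached or passed the deadline.
    boolean isExpired(long nowNanos) {
        return nowNanos - deadlineNanos >= 0;
    }

    // Time still available, i.e. what a downstream hop may spend.
    long remainingMillis(long nowNanos) {
        return TimeUnit.NANOSECONDS.toMillis(Math.max(0, deadlineNanos - nowNanos));
    }

    public static void main(String[] args) {
        long now = System.nanoTime();
        DeadlineSketch d = after(5, TimeUnit.MILLISECONDS, now);
        long twoMs = TimeUnit.MILLISECONDS.toNanos(2);
        long sixMs = TimeUnit.MILLISECONDS.toNanos(6);
        System.out.println("expired at start: " + d.isExpired(now));
        System.out.println("left after 2 ms: " + d.remainingMillis(now + twoMs) + " ms");
        System.out.println("expired after 6 ms: " + d.isExpired(now + sixMs));
    }
}
```

Because the deadline is stored as an absolute point, each hop only needs the current time to know how much budget is left.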

Java Client example with Deadline for the server that we had created in the previous article.

One more class, GrpcClient.java, is added in the same package as the server (check the image for the structure). Get the code from this GitHub repository.

Project structure after adding gRPC Java Client

Let's discuss the gRPC client. In order to create a gRPC client, we need to create the channel first by providing the host and the port. After that, we are going to create a Blocking and a Listenable Future Stub (explained earlier) to explain how they work.

Once the client stub is created, we can make an RPC call and get the response from the server. We have used a 5 ms deadline for the RPC call.

If the server is not able to reply within 5 ms, the client terminates the request stream and a DEADLINE_EXCEEDED exception is thrown. If the server does not enforce the deadline itself, it will still process the request.
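The client-side effect of a 5 ms deadline can be imitated without gRPC. The sketch below is plain Java: `callWithDeadline` is a hypothetical helper, the executor thread plays the role of the server, and the returned strings stand in for the response and the DEADLINE_EXCEEDED status. It shows only the client's view of the call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model of a call with a deadline, seen from the client.
class DeadlineCallSketch {

    // Returns the response, or "DEADLINE_EXCEEDED" if the handler is too slow.
    static String callWithDeadline(Callable<String> handler, long deadlineMillis) {
        ExecutorService server = Executors.newSingleThreadExecutor();
        Future<String> response = server.submit(handler);
        try {
            return response.get(deadlineMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            response.cancel(true);               // the client gives up on the call
            return "DEADLINE_EXCEEDED";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "CANCELLED";
        } catch (ExecutionException e) {
            return "UNKNOWN: " + e.getCause();
        } finally {
            server.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Handler needs 200 ms but the deadline is 5 ms.
        System.out.println(callWithDeadline(() -> {
            Thread.sleep(200);
            return "slow reply";
        }, 5));
        // Handler answers well within a 2 s deadline.
        System.out.println(callWithDeadline(() -> "fast reply", 2000));
    }
}
```

The first call ends with DEADLINE_EXCEEDED because the handler needs far longer than 5 ms; the second returns its reply normally.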

Response from both Blocking and Future stubs

Exception when Deadline exceeded

Why don't gRPC servers receive the first few requests when the Deadline is short?

Scenario: We had to implement a gRPC client with a 5 ms Deadline. We noticed that the server was not getting some of the initial requests sent by the client.

The server didn't receive any request, yet the client made requests and threw DEADLINE_EXCEEDED exceptions. So what happened to those requests?

Read the error message carefully: it says waiting_for_connection. Let's dig into it and explain what is going on.

The client sent a request to the server to establish a channel.
As we know, channel creation is lazy: the channel is not created until we make an RPC call.
The client starts sending a request (at this point the channel is not yet established), so the request waits for the channel to get established.
Meanwhile the deadline expires, so the client terminates the RPC and throws a DEADLINE_EXCEEDED exception.
So the RPC call never reached the server stub, and hence we miss the initial requests at the server side.
To confirm this, let's look at the tcpdump capture in Wireshark. The IP ending with .224 is the client stub and the one ending with .41 is the server stub.

We can see that the first few attempts failed to establish the channel; hence any RPC call made during this time will not reach the server.
We can also see that it took 2.72 seconds to establish the channel, after which a successful RPC call was made and reached the server.
Look at this screenshot: here the first call, with a 1-second deadline, got served by the server, and a later request throws the DEADLINE_EXCEEDED exception with the message [remote_addr=/127.0.0.1:8080]. The remote address in the message means that the channel is established.
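The whole sequence can be replayed with a small deterministic model. It is plain Java with a simulated clock, not the gRPC library. `LostRequestSketch` is hypothetical, and it assumes that the connection attempt keeps running in the background after the first RPC fails and that the server answers instantly once it is reached. The 2720 ms setup time is the 2.72 seconds seen in the capture.

```java
// Hypothetical model: lazy connection setup versus a short deadline.
class LostRequestSketch {
    final long connectMillis;      // time the channel needs to become ready
    long connectStartedAt = -1;    // -1: connection not requested yet (lazy)
    int receivedByServer = 0;

    LostRequestSketch(long connectMillis) {
        this.connectMillis = connectMillis;
    }

    // Issues a call at simulated time nowMillis and returns its status.
    String call(long nowMillis, long deadlineMillis) {
        if (connectStartedAt < 0) {
            connectStartedAt = nowMillis;   // the first call triggers the connection
        }
        long readyAt = connectStartedAt + connectMillis;
        if (readyAt > nowMillis + deadlineMillis) {
            // The deadline expires while the call is still queued on the
            // client, so nothing is ever sent to the server.
            return "DEADLINE_EXCEEDED (waiting_for_connection)";
        }
        receivedByServer++;
        return "OK";
    }

    public static void main(String[] args) {
        LostRequestSketch channel = new LostRequestSketch(2720);
        System.out.println("t=0ms    " + channel.call(0, 5));
        System.out.println("t=1000ms " + channel.call(1000, 5));
        System.out.println("t=3000ms " + channel.call(3000, 5));
        System.out.println("requests seen by server: " + channel.receivedByServer);

        LostRequestSketch patient = new LostRequestSketch(2720);
        System.out.println("5s deadline, t=0ms: " + patient.call(0, 5000));
    }
}
```

In this model the first two calls are lost on the client while the channel is still connecting, the third reaches the server, and a first call with a deadline longer than the setup time succeeds.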