This blog post describes the project I worked on for my Fall 2020 Deep Reinforcement Learning class. The project explored learning RL policies from pixels, with the policies trained to solve the OpenAI Fetch robotic environments. I wanted to explore possible answers to the following questions:

  1. What are effective RL algorithms for solving these OpenAI environments?
  2. What are the differences in the learning dynamics of state-based and image-based learning?

Before we dive in, the TL;DR of this blog post would be:

  1. My experiences training the Asymmetric Actor Critic

In the previous blog post we went through the process of quantizing our neural net and were successfully able to run inference in 8 bits with next to no loss in accuracy. I bet the question of “can we do better” must have come to your mind. Well, the same came to me and I tried to figure out the limits of how much we could quantize our network.

As always, this blog post comes with the code linked in a Google Colab notebook, and I encourage everyone to look through the code as it really is a very simple…

Update: The blog post on quantization-aware training is online and linked here. With it we can train and quantize our model to run in 4 bits!

Hello, I wanted to share how I was able to run neural network inference using fixed point (8-bit) arithmetic. As of today, PyTorch allows only 32-bit or 16-bit floating point training and inference. …
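As a rough illustration of what 8-bit fixed point inference builds on, the sketch below maps a list of floats to unsigned 8-bit integers with a scale and a zero point, then maps them back. The function names and the choice of an affine (asymmetric) scheme are assumptions made for this example and are not necessarily what the post itself uses.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using an affine scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range must include 0.0 so that zero maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(qmin - lo / scale))
    q = [min(qmax, max(qmin, int(round(v / scale)) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [scale * (x - zero_point) for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical layer weights
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
```

Each restored value differs from the original by at most about one quantization step (`scale`), which is why 8-bit inference can stay close to the floating point result.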

This is an excerpt from the Object detection software wiki. I am currently in the process of implementing some of the most cutting edge object detection algorithms here and am going to blog my experience implementing them.

Today we talk about how to train neural networks much faster than the norm by using two techniques: cyclic learning rates and superconvergence.

The training methodology for the project seeks to make training easy and fast. Recently, techniques like superconvergence and cyclic learning rates have greatly reduced the time models take to converge.
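To make the idea of a cyclic learning rate concrete, here is a minimal sketch of the triangular schedule, where the learning rate ramps linearly from a lower bound to an upper bound and back again. The function name and the default values for `base_lr`, `max_lr` and `step_size` are assumptions chosen for illustration, not settings taken from the project.

```python
import math


def triangular_lr(step, base_lr=1e-4, max_lr=1e-2, step_size=100):
    """Learning rate at a given training step for a triangular cyclic schedule.

    One full cycle lasts 2 * step_size steps: step_size steps up, step_size down.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    # x is 1 at the start and end of a cycle and 0 at its midpoint.
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Learning rates over one and a half cycles, sampled every 50 steps.
schedule = [triangular_lr(s) for s in range(0, 301, 50)]
```

In a training loop, the value returned for the current step would be assigned to the optimizer's learning rate before each update.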

The object detection software seeks to use the…

How many times have you implemented a tree or a graph in C++ to solve a tricky algorithmic problem?

Or how many times have you implemented a new data structure like an exciting Segment Tree or a Trie to brush up on those computer science skills? I’m guessing quite a few.

Learning these data structures and algorithms is very important. However, one thing that tutorial sites don’t teach is how to persist that data structure.

Persisting a data structure means keeping the data structure synced with your main memory, so if your application crashes, you don’t lose your data.

This post talks about how TensorFlow executes your machine learning models. We shall briefly go over the components of the TensorFlow graph, and then delve into how this graph is executed across single and multiple devices.

The TensorFlow graph has the following properties. Each node has zero or more inputs, and represents the instantiation of an operation.

Values that flow along the edges of the graph are known as tensors. These tensors undergo various transformations as they pass through the nodes.
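The node-and-edge picture above can be sketched in a few lines of plain Python. This is not the TensorFlow API: the `Node` class and `run` function are hypothetical names for a toy dataflow graph in which each node wraps an operation and the values passed between nodes play the role of tensors.

```python
class Node:
    """A graph node: an operation plus the nodes that feed it."""

    def __init__(self, name, op, inputs=()):
        self.name = name
        self.op = op                # callable applied to the input values
        self.inputs = list(inputs)  # zero or more upstream nodes


def run(node, cache=None):
    """Evaluate a node by first evaluating everything it depends on."""
    cache = {} if cache is None else cache
    if node.name not in cache:
        args = [run(upstream, cache) for upstream in node.inputs]
        cache[node.name] = node.op(*args)
    return cache[node.name]


# Two constant nodes (zero inputs), one add node, one reduction node.
a = Node("a", lambda: [1.0, 2.0])
b = Node("b", lambda: [3.0, 4.0])
add = Node("add", lambda x, y: [p + q for p, q in zip(x, y)], [a, b])
total = Node("total", lambda x: sum(x), [add])

result = run(total)
```

The cache ensures each node runs once even when several downstream nodes consume its output, which is the same reason a graph executor tracks which nodes have already produced their values.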

Tensors are arbitrary dimensionality arrays, where the underlying element type is inferred during graph construction time. This is…

Welcome to the explanation of one of the most important algorithms in distributed systems. Paxos is a protocol for reaching consensus in distributed systems. It was first described by Leslie Lamport in 1989 (and published in 1998) and continues to shape the distributed systems of today.

What is Paxos

The goal of using Paxos is to create a replicated state machine. A state machine could be an application, a database or a program. We want the same application to run concurrently on different servers. To get a replicated state machine, we need a replicated log. A log lists the actions that the state machine needs to take to arrive at various states…
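The sketch below illustrates why a replicated log is enough to get a replicated state machine. It does not implement Paxos itself: it simply assumes every replica has already agreed on the same log, and shows that applying that log in order leaves all replicas in the same state. The class name and the `set`/`delete` operations are assumptions made for this example.

```python
class KeyValueStateMachine:
    """A deterministic state machine driven entirely by log entries."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries applied so far

    def apply_log(self, log):
        """Apply any entries not yet applied, strictly in log order."""
        for op, key, value in log[self.applied:]:
            if op == "set":
                self.state[key] = value
            elif op == "delete":
                self.state.pop(key, None)
            self.applied += 1


# The log that the consensus protocol is assumed to have agreed on.
log = [("set", "x", 1), ("set", "y", 2), ("delete", "x", None)]

replicas = [KeyValueStateMachine() for _ in range(3)]
for replica in replicas:
    replica.apply_log(log)
```

Because the state machine is deterministic, the hard part is getting every server to agree on the contents and order of the log, which is the problem Paxos addresses.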

This post talks about how the DHT is implemented for hydra. We are assuming that the reader is already aware of the Kademlia protocol. If not, read this to get up to date.


We set out to construct the following components:

  1. An in-memory data structure that is fault tolerant. It tolerates failures by maintaining a transaction log and taking periodic snapshots. We shall go into the exact meaning of these two later in this post.
  2. A concurrent data structure that supports multiple simultaneous requests.
  3. An in-memory cache to keep track of dead nodes. This cache helps reduce unnecessary network requests.
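The first two components can be sketched together as a small key-value store. This is a minimal sketch under stated assumptions, not hydra's implementation: the `DurableDict` class, the file names and the JSON encoding are all hypothetical. Every write is appended to a transaction log before it touches memory, a snapshot periodically captures the full state so the log can be truncated, and a lock serializes concurrent requests.

```python
import json
import os
import tempfile
import threading


class DurableDict:
    """In-memory dict backed by a snapshot file plus a transaction log."""

    def __init__(self, directory):
        self._lock = threading.Lock()
        self._snapshot_path = os.path.join(directory, "snapshot.json")
        self._log_path = os.path.join(directory, "transactions.log")
        self._data = {}
        self._recover()

    def _recover(self):
        # Load the last snapshot, then replay writes logged after it.
        if os.path.exists(self._snapshot_path):
            with open(self._snapshot_path) as f:
                self._data = json.load(f)
        if os.path.exists(self._log_path):
            with open(self._log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self._data[key] = value

    def put(self, key, value):
        with self._lock:
            # Log first, so a crash after this point cannot lose the write.
            with open(self._log_path, "a") as f:
                f.write(json.dumps([key, value]) + "\n")
                f.flush()
                os.fsync(f.fileno())
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def snapshot(self):
        with self._lock:
            tmp_path = self._snapshot_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(self._data, f)
            os.replace(tmp_path, self._snapshot_path)  # atomic swap
            open(self._log_path, "w").close()          # truncate the log


directory = tempfile.mkdtemp()
store = DurableDict(directory)
store.put("node-1", "alive")
store.snapshot()
store.put("node-2", "alive")

recovered = DurableDict(directory)  # simulates a restart after a crash
```

Replaying the log is safe even if a crash happens between writing the snapshot and truncating the log, because re-applying the same writes produces the same state.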


Karanbir Chahal
