## fix blank plot when using plotly in jupyter lab

this helped: https://github.com/plotly/plotly.py/issues/2508#issuecomment-907338746

## pylance stuck on “Searching for source files”

While using vscode, I noticed that pylance wasn’t running even after forcing a pylance server restart. I checked the output logs and it was stuck at the “Searching for source files” line.

Solution: add pyrightconfig.json to the project root dir. Here is a link to github …

## paper review: “Graph Attention Networks”

arxiv: https://arxiv.org/abs/1710.10903

Key points:

- introduces the “graph attention network (GAT)”, which consists of “graph attention layers” that bring the “self-attention” idea from transformers into graph neural networks
- the graph attention layers work by calculating weights of a node’s neighbors from the features …
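To make the neighbor weighting concrete, here is a minimal single-head sketch in plain Python. It follows the scoring form from the GAT paper (a LeakyReLU over a learned vector applied to the concatenated projected features, then a softmax over the neighborhood), but the function names, the list-based data layout, and the toy inputs are assumptions made for illustration. The output nonlinearity and multi-head attention are left out.

```python
import math


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def leaky_relu(x, slope=0.2):
    # 0.2 is the negative slope used in the GAT paper.
    return x if x > 0 else slope * x


def attention_weights(i, neighbors, projected, a):
    """Softmax-normalized attention of node i over its neighbors.

    `neighbors` must be non-empty; here it is assumed to include i itself.
    """
    # Raw score for each neighbor j: LeakyReLU(a . [Wh_i || Wh_j])
    scores = [
        leaky_relu(sum(w * x for w, x in zip(a, projected[i] + projected[j])))
        for j in neighbors
    ]
    # Subtract the max before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_layer(features, adjacency, W, a):
    """One attention head: project features, then mix neighbors by attention."""
    projected = [matvec(W, h) for h in features]
    out = []
    for i, neighbors in enumerate(adjacency):
        alphas = attention_weights(i, neighbors, projected, a)
        dim = len(projected[i])
        out.append([
            sum(alpha * projected[j][k] for alpha, j in zip(alphas, neighbors))
            for k in range(dim)
        ])
    return out


# Toy graph: 3 nodes, 2 features each, neighborhoods include self-loops.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1], [0, 1, 2], [1, 2]]
W = [[1.0, 0.0], [0.0, 1.0]]      # identity projection, for readability
a = [0.5, -0.5, 1.0, 0.25]        # arbitrary attention vector (length 2 * dim)
new_features = gat_layer(features, adjacency, W, a)
```

With an all-zero attention vector every neighbor gets the same weight, so the layer reduces to plain neighborhood averaging. That makes a handy sanity check.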

## fix “Certificate verification failed” error in apt update of docker container

Background: While working on a raw docker base image that didn’t even have basic tools installed, I tried to run apt update, but a “Certificate verification failed” error came up. This error occurred even after setting the “http_proxy” and “https_proxy” environment variables.

Solution: …

## fixing “Could not handshake: An unexpected TLS packet was received” error while apt update in docker container behind corporate proxy

Background: I was trying to build a docker image from a very raw apache spark base image. Since it didn’t have basic packages such as vim, ssh, or wget, I entered a running container of this image and typed apt …

## python interrupt, sigterm, sigkill, exception handling experiments

When thinking about exceptions and interrupts at the same time, things can get confusing, so here I write down some simple experiments that I did to clear up the confusing concepts.
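As one possible experiment of this kind (a sketch written for this summary, not necessarily one of the experiments from the full post), the snippet below sends a signal to the current process and records how Python surfaces it. It assumes a POSIX system, Python 3.8+, and that it runs in the main thread. The names `TerminationRequested`, `run_experiment`, and `can_install_handler` are made up for illustration.

```python
import signal


class TerminationRequested(Exception):
    """Raised by our SIGTERM handler so that except/finally blocks get to run."""


def _on_sigterm(signum, frame):
    raise TerminationRequested(f"received signal {signum}")


def run_experiment(sig):
    """Send `sig` to this process and record how it shows up in Python code."""
    # Install the handlers explicitly so the result does not depend on how
    # the process was launched (SIGINT may be ignored for background jobs).
    old_int = signal.signal(signal.SIGINT, signal.default_int_handler)
    old_term = signal.signal(signal.SIGTERM, _on_sigterm)
    events = []
    try:
        signal.raise_signal(sig)
        events.append("no exception")
    except KeyboardInterrupt:
        # SIGINT becomes a KeyboardInterrupt through default_int_handler.
        events.append("KeyboardInterrupt")
    except TerminationRequested:
        # SIGTERM only becomes an exception because of our custom handler.
        events.append("TerminationRequested")
    finally:
        events.append("finally ran")
        signal.signal(signal.SIGINT, old_int)
        signal.signal(signal.SIGTERM, old_term)
    return events


def can_install_handler(sig):
    """Return True if a Python-level handler can be registered for `sig`."""
    try:
        previous = signal.signal(sig, lambda signum, frame: None)
    except (OSError, ValueError):
        return False
    signal.signal(sig, previous)
    return True


print(run_experiment(signal.SIGINT))
print(run_experiment(signal.SIGTERM))
# SIGKILL cannot be caught, so no cleanup code runs when it is delivered.
print(can_install_handler(signal.SIGKILL))
```

The main takeaway is that SIGINT and SIGTERM can both be turned into ordinary exceptions, so `finally` blocks still run, while SIGKILL never reaches Python code at all.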


## Properly setting dataloader and callback for validation in pytorch DDP

pytorch distributed data parallel (DDP) is very useful and relatively well supported for creating a distributed training setup. However, the provided documentation and tutorials are mostly about the “training” part and don’t say much about validation callbacks that run during training.

It is easy to think that just using DistributedSampler for the validation dataloader would do all the work for you, like it does for the training dataloader, but it doesn’t. There are two main problems.


## add tokens to huggingface tokenizer

Here’s sample code showing how to load a huggingface tokenizer, add some custom tokens, and save it.
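A minimal sketch of those three steps follows. It is not the original post's code. The checkpoint name `bert-base-uncased`, the token strings, the output directory, and the helper names are all placeholder assumptions, and `demo` needs the `transformers` package plus network access to download the tokenizer.

```python
def add_tokens_and_save(tokenizer, new_tokens, save_dir):
    """Add `new_tokens` to `tokenizer`, save it, and return how many were added."""
    # add_tokens skips tokens already in the vocabulary and returns the
    # number of tokens that were actually added.
    num_added = tokenizer.add_tokens(list(new_tokens))
    # save_pretrained writes the vocabulary and added tokens to save_dir.
    tokenizer.save_pretrained(save_dir)
    return num_added


def demo(model_name="bert-base-uncased", save_dir="./tokenizer_with_custom_tokens"):
    """Hypothetical end-to-end run: load, extend, save, and reload a tokenizer."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    num_added = add_tokens_and_save(tokenizer, ["<custom_a>", "<custom_b>"], save_dir)
    print(f"added {num_added} tokens, vocabulary size is now {len(tokenizer)}")

    # Reload from disk to confirm the saved tokenizer knows the new tokens.
    reloaded = AutoTokenizer.from_pretrained(save_dir)
    print(reloaded.tokenize("hello <custom_a> world"))
```

If the extended tokenizer is used with a model, the model’s embedding matrix also has to grow to match, typically via `model.resize_token_embeddings(len(tokenizer))`.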

## pytorch implementation of sinusoidal position encoding

There are existing sinusoidal position encoding modules out there, but the ones that I came across mostly assume that positions increment from 0 to the sequence length. For example, when a token embedding sequence of shape (B, L, D_token) is given, the sinusoidal position encoding module takes this tensor as input, manually creates a (B, L) tensor where each row is (0, 1, 2, 3, …, L-1), and then applies sinusoidal encoding to it.
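The alternative is to pass the positions in explicitly rather than generating 0 to L-1 inside the module. The sketch below shows that idea in framework-free Python so the arithmetic is easy to follow. It is not the post’s pytorch module. The function names are hypothetical, and the base of 10000 with interleaved sin/cos channels is the usual transformer convention, assumed here.

```python
import math


def encode_position(pos, dim, base=10000.0):
    """Sinusoidal encoding of one (arbitrary) position into `dim` values."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    encoding = []
    for i in range(0, dim, 2):
        # Channels i and i + 1 share one frequency: 1 / base^(i / dim)
        angle = pos / (base ** (i / dim))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


def sinusoidal_encoding(positions, dim):
    """Map a (B, L) nested list of positions to a (B, L, dim) nested list."""
    return [[encode_position(p, dim) for p in row] for row in positions]


# Positions do not have to be 0..L-1: they can skip, repeat, or be unordered.
positions = [[0, 7, 3], [10, 10, 2]]
encoded = sinusoidal_encoding(positions, dim=4)
```

Because the encoding depends only on the position value, two tokens given the same position receive identical encodings, wherever they sit in the sequence.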


## relu, gelu, swish, mish activation function comparison

RELU (arxiv paper from 2018) arxiv: https://arxiv.org/abs/1803.08375

f(x) = max(0, x)

GELU (2016): although the gelu paper predates the relu paper linked above, relu itself was in use well before either paper, and gelu’s popularity in the DL literature came after relu’s, thanks to characteristics that compensate for the drawbacks of relu. Like relu, gelu has no upper bound and is bounded …
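For a quick side-by-side feel of the four functions in the title, here is a small scalar sketch using only the standard library. The relu formula is the one given above. The gelu, swish, and mish definitions are the commonly cited ones (exact erf-based gelu, swish with beta = 1 by default, mish as x · tanh(softplus(x))), assumed here rather than taken from the rest of the post.

```python
import math


def relu(x):
    return max(0.0, x)


def gelu(x):
    # Exact form: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(x):
    # Split by sign so exp() never overflows for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x, beta=1.0):
    return x * sigmoid(beta * x)


def softplus(x):
    # log(1 + exp(x)) written in a numerically stable form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    return x * math.tanh(softplus(x))


# Print a small comparison table over a few sample inputs.
print(f"{'x':>6} {'relu':>8} {'gelu':>8} {'swish':>8} {'mish':>8}")
for x in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0):
    print(f"{x:6.1f} {relu(x):8.4f} {gelu(x):8.4f} {swish(x):8.4f} {mish(x):8.4f}")
```

The table makes the shared shape visible. All four are unbounded above and close to the identity for large positive inputs, while gelu, swish, and mish dip slightly below zero for small negative inputs where relu is exactly zero.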