Error fix
Fix blank plot when using Plotly in JupyterLab
This helped: https://github.com/plotly/plotly.py/issues/2508#issuecomment-907338746
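I can't reproduce the linked comment here, but a common workaround for blank Plotly figures in JupyterLab (not necessarily the exact fix in that comment) is to set the default renderer explicitly:

```python
import plotly.io as pio
import plotly.express as px

# Force a renderer that works in JupyterLab; "iframe" writes the figure
# to an HTML file and embeds it, sidestepping the front-end extension.
pio.renderers.default = "iframe"

fig = px.scatter(x=[0, 1, 2], y=[2, 1, 3])
fig.show()
```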
While using VS Code, I noticed that Pylance wasn't running even after forcing a Pylance server restart. I checked the output logs and it was stuck at a certain line. Solution: add a pyrightconfig.json to the project root directory. Here is a link to GitHub. Read more…
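The config contents aren't shown in this excerpt; a minimal pyrightconfig.json might look like the sketch below. The include/exclude paths and venv settings are placeholders, not the post's actual values:

```json
{
  // placeholder paths for illustration, not the post's actual values
  "include": ["src"],
  "exclude": ["**/node_modules", "**/__pycache__"],
  "venvPath": ".",
  "venv": ".venv"
}
```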
arxiv: https://arxiv.org/abs/1710.10903 Key points: introduces the "graph attention network (GAT)", which consists of "graph attention layers" that incorporate the "self-attention" idea from transformers into graph neural networks. The graph attention layers work by calculating the weights of a node's neighbors from their features. Read more…
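As a rough sketch of that idea (my own single-head illustration, not the paper's reference implementation; it assumes the adjacency matrix already contains self-loops):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head GAT layer: each neighbor's weight is computed from features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)   # shared projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)    # attention vector

    def forward(self, h, adj):
        # h: (N, in_dim) node features, adj: (N, N) adjacency with self-loops
        z = self.W(h)                                     # (N, out_dim)
        N = z.size(0)
        # e_ij = LeakyReLU(a^T [z_i || z_j]) for every node pair
        pairs = torch.cat([z.unsqueeze(1).expand(N, N, -1),
                           z.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)
        # mask out non-neighbors, then normalize over each node's neighborhood
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)                  # attention weights
        return alpha @ z                                  # weighted neighbor sum
```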
Background: While working on a raw Docker base image that didn't even have basic tools installed, I tried to call apt update, but the following error came up. This error occurred even after setting the "http_proxy" and "https_proxy" environment variables. Solution: Read more…
Background: While trying to build a Docker image from a very raw Apache Spark base image, since it didn't have any basic packages such as vim, ssh, wget, etc., I entered a running container of this image and typed apt… Read more…
When thinking about both exceptions and interrupts at the same time, things can get confusing, so here I write down some simple experiments that I did to clear up some confusing concepts.
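The post's actual experiments are behind the link; assuming a Python context (my guess from the rest of this blog), one classic point of confusion is that an interrupt is not an Exception:

```python
# KeyboardInterrupt (raised on SIGINT / Ctrl-C) is NOT a subclass of
# Exception -- both derive from BaseException -- so a broad
# `except Exception` will not swallow an interrupt.
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True

try:
    raise KeyboardInterrupt
except Exception:
    print("caught as Exception")        # never reached
except BaseException:
    print("caught as BaseException")    # this branch runs
```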
PyTorch Distributed Data Parallel (DDP) is very useful and relatively well supported for creating a distributed training setup. However, the provided documentation and tutorials mostly cover the "training" part and don't say much about validation callbacks that run during training. It is easy to think that just using DistributedSampler for the validation dataloader would do all the work for you, like it did for the training dataloader, but it doesn't. There are two main problems.
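The two problems themselves are spelled out behind the link; as a sketch of the setup being discussed (my own minimal example, not the post's code), this is the naive pattern, with one known pitfall noted in a comment:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Assumes torch.distributed has already been initialized (e.g. via torchrun).
val_set = TensorDataset(torch.randn(103, 8))  # size not divisible by world size

# Naive approach: reuse DistributedSampler on the validation set.
val_sampler = DistributedSampler(val_set, shuffle=False)
val_loader = DataLoader(val_set, batch_size=16, sampler=val_sampler)

# Known pitfall: by default DistributedSampler pads the dataset with repeated
# samples so every rank gets the same number of items, which silently
# duplicates some validation examples; and each rank only ever sees its own
# shard, so per-rank metrics still have to be aggregated across processes.
```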
Here’s a sample code of how to load huggingface tokenizer, add some custom tokens and save it.
There are existing sinusoidal position encoding modules out there, but the ones I encountered mostly assume the position increments from 0 to the sequence length. For example, given a token embedding sequence of shape (B, L, D_token), the sinusoidal position encoding module takes this tensor as input, internally creates a (B, L) tensor where each row is (0, 1, 2, 3, …, L-1), and then applies the sinusoidal encoding to it.
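For reference, a minimal sketch of that default behavior, using the standard sin/cos formulation from "Attention Is All You Need" (my own example; it assumes an even embedding dimension):

```python
import torch

def sinusoidal_encoding(B, L, D):
    """Standard sinusoidal encoding over positions 0..L-1; returns (B, L, D).
    Assumes D is even so sin/cos halves line up."""
    pos = torch.arange(L, dtype=torch.float32).unsqueeze(1)   # (L, 1)
    i = torch.arange(0, D, 2, dtype=torch.float32)            # (D/2,)
    angles = pos / torch.pow(10000.0, i / D)                  # (L, D/2)
    pe = torch.zeros(L, D)
    pe[:, 0::2] = torch.sin(angles)   # even dims get sin
    pe[:, 1::2] = torch.cos(angles)   # odd dims get cos
    return pe.unsqueeze(0).expand(B, L, D)                    # broadcast over batch

x = torch.randn(2, 10, 16)            # (B, L, D_token)
x = x + sinusoidal_encoding(2, 10, 16)
```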
ReLU (2018) arxiv: https://arxiv.org/abs/1803.08375 f(x) = max(0, x). GELU (2016): despite being introduced earlier than ReLU (going by these paper dates), its popularity in the DL literature came after ReLU, due to characteristics that compensate for ReLU's drawbacks. Like ReLU, GELU has no upper bound and is bounded Read more…
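As a quick numeric comparison of the two activations (standard PyTorch calls, my own illustration rather than anything from the post):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-3.0, -1.0, 0.0, 1.0, 3.0])

# ReLU: f(x) = max(0, x) -- hard zero for all negative inputs.
print(F.relu(x))   # tensor([0., 0., 0., 1., 3.])

# GELU: x * Phi(x), where Phi is the standard normal CDF -- smooth, and it
# lets small negative values through slightly instead of clamping them to 0.
print(F.gelu(x))   # approx tensor([-0.0040, -0.1587, 0.0000, 0.8413, 2.9960])
```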