fix blank plot when using plotly in jupyter lab
this helped: https://github.com/plotly/plotly.py/issues/2508#issuecomment-907338746
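The gist is to pin plotly's renderer instead of relying on auto-detection; a minimal sketch of that idea ("iframe" is one renderer that works in Jupyter Lab, not necessarily the exact fix from the linked comment):

```python
import plotly.io as pio
import plotly.express as px

# Pin the renderer explicitly; "iframe" writes each figure to a small
# standalone HTML file and embeds it, which sidesteps the blank-plot
# problem caused by a missing/broken Jupyter Lab extension.
pio.renderers.default = "iframe"

fig = px.scatter(x=[1, 2, 3], y=[4, 1, 3])
fig.show()
```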
While using VS Code, I noticed that Pylance wasn't running even after forcing a Pylance server restart. I checked the output logs and it was stuck at one line. Solution: add a pyrightconfig.json to the project root dir. Here is a link to GitHub explaining what this file is. Populate the exclude field with Read more…
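As a rough sketch, a pyrightconfig.json with the exclude field populated looks like this (the paths below are placeholders, not from the original post; list whatever directories were hanging indexing):

```json
{
    "exclude": [
        "**/node_modules",
        "**/__pycache__",
        "data/**"
    ]
}
```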
arxiv: https://arxiv.org/abs/1710.10903 key points introduce the “graph attention network (GAT)”, which consists of “graph attention layers” that bring the “self-attention” idea from transformers to graph neural networks. The graph attention layers work by calculating weights for a node’s neighbors from the neighbors’ features, taking the other neighbors’ existence into account. Read more…
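For reference, the attention from the paper, for node $i$ with neighborhood $\mathcal{N}_i$, shared weight matrix $\mathbf{W}$, and attention vector $\mathbf{a}$:

```latex
e_{ij} = \mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[\mathbf{W}h_i \,\|\, \mathbf{W}h_j\right]\right), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}, \qquad
h_i' = \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}\, \mathbf{W}h_j\Big)
```

The softmax over all $k \in \mathcal{N}_i$ is exactly where the other neighbors’ existence enters each weight.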
When thinking about both exceptions and interrupts at the same time, things can get confusing, so here I write down some simple experiments I did to clear up some confusing concepts.
PyTorch distributed data parallel (DDP) is very useful and relatively well supported for creating a distributed training setup. However, the provided documentation and tutorials are mostly about the “training” part and don’t say much about the validation callbacks that run during training. It is easy to think that just using DistributedSampler for the validation dataloader would do all the work for you, like it did for the training dataloader, but it doesn’t. There are two main problems.
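For context, this is the naive setup in question; a minimal sketch with a dummy dataset (note that DistributedSampler needs an initialized process group, and that by default it pads the dataset with duplicated samples so every rank gets the same count):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Dummy validation set standing in for a real one (101 samples is
# deliberately not divisible by typical world sizes).
val_dataset = TensorDataset(torch.randn(101, 8), torch.randint(0, 2, (101,)))

# The "naive" approach: shard validation exactly like training.
# DistributedSampler reads world size / rank from the process group,
# so torch.distributed.init_process_group must have been called.
val_sampler = DistributedSampler(val_dataset, shuffle=False)
val_loader = DataLoader(val_dataset, batch_size=32, sampler=val_sampler)
```

Each rank now evaluates only its own (padded) shard, which is where the two problems discussed in the post come from.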
Here’s some sample code showing how to load a huggingface tokenizer, add some custom tokens, and save it.
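A minimal sketch along those lines (“bert-base-uncased” and the token strings are just placeholder choices):

```python
from transformers import AutoTokenizer

# Load a pretrained tokenizer from the hub
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Add custom tokens; add_tokens returns how many were actually added
# (tokens already in the vocab are skipped)
num_added = tokenizer.add_tokens(["[CUSTOM1]", "[CUSTOM2]"])
print(f"added {num_added} tokens, new vocab size: {len(tokenizer)}")

# If the tokenizer feeds a model, remember to resize its embeddings:
# model.resize_token_embeddings(len(tokenizer))

# Save the updated tokenizer to a directory
tokenizer.save_pretrained("./my_tokenizer")
```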
arxiv: https://arxiv.org/abs/2111.15664 Key Points a visual document understanding model that does OCR + the downstream task in one step with a single end-to-end model. Outputs are generative, and formatted to be convertible to JSON, which makes this architecture highly compatible with various downstream tasks. Presents SynthDoG, a synthetic document image generator used in Read more…
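To illustrate the “convertible to JSON” point, a toy sketch with hypothetical field tokens (not the paper’s exact vocabulary): the decoder emits a flat token sequence with XML-like field markers, which is then parsed mechanically.

```python
import re

# Hypothetical generative output for a receipt-style document
sequence = "<s_name>Latte</s_name><s_price>4.50</s_price>"

# Each <s_field>...</s_field> pair becomes a JSON key/value
fields = dict(re.findall(r"<s_(\w+)>(.*?)</s_\1>", sequence))
print(fields)  # {'name': 'Latte', 'price': '4.50'}
```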
arxiv: https://arxiv.org/abs/1910.13461 key points propose an autoregressive model named BART, which is architecturally similar to a standard transformer encoder + decoder. Examine 5 pretraining tasks, and experiment to find which pretraining task is most helpful. Test BART performance with large-scale pretraining on downstream tasks. Model Architecture This work introduces BART, which is fundamentally Read more…
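As a toy sketch of one of those pretraining corruptions, text infilling (the paper replaces sampled spans, with lengths drawn from Poisson(λ=3), by a single mask token; the code below is a simplified illustration, not the paper’s implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def text_infill(tokens, mask_token="<mask>", lam=3.0, corrupt_prob=0.15):
    """Replace random spans with a SINGLE mask token each.

    Span lengths are drawn from Poisson(lam); a length-0 span just
    inserts a mask without removing anything, as in the paper.
    """
    out, i = [], 0
    while i < len(tokens):
        if rng.random() < corrupt_prob:
            span = rng.poisson(lam)
            out.append(mask_token)  # whole span collapses to one mask
            i += span               # span == 0 -> pure insertion
        else:
            out.append(tokens[i])
            i += 1
    return out

print(text_infill("the quick brown fox jumps over the lazy dog".split()))
```

The model then has to predict both the content and the length of each masked span, which is what distinguishes infilling from plain token masking.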
arxiv: https://arxiv.org/abs/2107.14795 Key points developing upon the Perceiver idea, Perceiver IO proposes a Perceiver-like structure where the output size can be much larger while still keeping the overall complexity linear. (Check out the summary on Perceiver here.) Same as with Perceiver, this work uses a latent array to store input information and runs it through multiple self Read more…
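A back-of-envelope sketch of why the cost stays linear, with $M$ input elements, $N$ latents ($N \ll M$), $L$ latent self-attention blocks, and $O$ output queries:

```latex
\underbrace{O(MN)}_{\text{encode: inputs} \to \text{latents}}
\;+\; \underbrace{O(LN^{2})}_{\text{latent self-attention}}
\;+\; \underbrace{O(ON)}_{\text{decode: queries} \to \text{outputs}}
```

The quadratic term only touches the fixed latent size $N$, so the total is linear in both the input size $M$ and the output size $O$.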
arxiv: https://arxiv.org/abs/2103.14030 Key points multi-scale feature extraction, which could be thought of as an adoption of the FPN idea. Restrict transformer operations to within each window rather than the entire feature map → allows keeping the overall complexity linear instead of quadratic. Apply shifted windows to allow inter-window interaction. Fuse relative position information in Read more…
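For reference, the complexity comparison from the paper, for an $h \times w$ feature map with $C$ channels and window size $M$:

```latex
\Omega(\text{MSA}) = 4hwC^{2} + 2(hw)^{2}C, \qquad
\Omega(\text{W-MSA}) = 4hwC^{2} + 2M^{2}hwC
```

With $M$ fixed (7 in the paper), W-MSA grows linearly in the number of patches $hw$, while global MSA grows quadratically.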