There are existing sinusoidal position encoding modules out there, but the ones that I confronted were mostly assuming the position to be incrementing from 0 to the size of sequence. For example, when a token embedding sequence with shape of (B, L, D_token) is given then the sinusoidal position encoding module will take this tensor as input and manually create a tensor (B,L) where the values for each row is (0,1,2,3, …., L-1) and then apply sinusoidal encoding on this.

But I wanted a sinusoidal position encoding module that can handle when position values are not incremental. For example, when I have a token embedding tensor (B,L,D_token), I also have a position array shape (B,L) where the position values are not incremental (e.g. (0,1,2,3,0,1,2,0,1,0,1,2,3,4,5,…) ).

To handle such cases, I coded my own sinusoidal position encoding module in pytorch where the input will be a tensor containing position integer values.

class SinusoidalPositionEncoding(torch.nn.Module):
def __init__(self, dim, max_period=5000):

assert dim % 2 == 0
self.dim = dim

self.max_period = max_period

super().__init__()

w_arr = torch.arange(0, self.dim // 2)
w_arr = 1 / (max_period) ** (w_arr * 2 / dim)

self.register_buffer("w_arr", w_arr)

def forward(self, x):
"""
assume x has shape (B,T) where B=batch size, T=token size(or sequence length)
and values of x are integers >=0.
"""

_x = torch.unsqueeze(x, -1)  # (B,T,1)

v = _x * self.w_arr  # (B,T,dim//2)

sin = torch.sin(v)
sin = torch.unsqueeze(sin, -1)  # (B,T,m,1)

cos = torch.cos(v)
cos = torch.unsqueeze(cos, -1)  # (B,T,m,1)

m = torch.cat([sin, cos], -1)  # (B,T,m,2)

b, t, _, _ = m.shape

y = m.reshape(b, t, -1)  # (B,T,dim) where 2m=dim

return y


If you are just using implicit incremental positions, then you don’t have to use this and just use one from any framework that goes with that approach.

Categories: deep learningpytorch