Oct 8, 2020
Hello! Sorry if my statement there was not clear :) What I mean is the case where a token is duplicated in a sequence, e.g. "I was eating an apple, and all of a sudden my apple pc crashed!" The token "apple" appears twice in the sequence, and it clearly has a different meaning each time. In that case, when you generate embeddings using the code provided in my post, the dictionary holding the embeddings will contain only one "apple" entry, because each token is used as the key and a later occurrence overwrites an earlier one. That entry will hold the embedding of the last "apple" occurrence in the sequence, namely the one referring to the Apple PC.
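To make the overwrite concrete, here is a minimal sketch. It does not use the code from my post or a real model: the `vectors` list is a made-up stand-in where each token's "embedding" is just its position, and the names `by_token` and `by_position` are only for illustration. It shows the token-keyed dictionary losing the first "apple", and one possible workaround of keying by position as well.

```python
# Toy stand-in for contextual embeddings (an assumption for illustration):
# each token's "embedding" is simply its position in the sequence.
tokens = ["i", "was", "eating", "an", "apple", "and", "my", "apple", "pc", "crashed"]
vectors = [[float(i)] for i in range(len(tokens))]

# Keyed by the token string: the second "apple" overwrites the first.
by_token = {}
for tok, vec in zip(tokens, vectors):
    by_token[tok] = vec

# Keyed by (position, token): both occurrences are kept.
by_position = {
    (i, tok): vec for i, (tok, vec) in enumerate(zip(tokens, vectors))
}

# Collect every "apple" entry that survived in the position-keyed dictionary.
apple_entries = [key for key in by_position if key[1] == "apple"]

print(by_token["apple"])  # embedding of the last "apple" only
print(apple_entries)      # one key per occurrence
```

If you need both meanings, a plain list of (token, embedding) pairs works just as well as the position-keyed dictionary; the point is only that the token string alone cannot be the key.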