Andreas Pogiatzis
1 min read · Oct 8, 2020


Hello! Sorry if my statement there was not clear :) What I mean is the case where a token appears more than once in a sequence, e.g. "I was eating an apple, and all of a sudden my apple pc crashed!" The token "apple" appears twice in that sequence, and it clearly has a different meaning each time. When you generate embeddings using the code provided in my post, the dictionary holding the embeddings is keyed by token, so it will contain only one "apple" entry, and that entry will hold the embedding of the last "apple" occurrence in the sequence, namely the one referring to the Apple PC. The embedding of the first "apple" (the fruit) is overwritten.
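To make the overwrite concrete, here is a minimal sketch. It is not the code from the post: the token list and the one-dimensional dummy vectors are hypothetical stand-ins for what a tokenizer and model would return, and `by_token` / `by_position` are illustrative names only. It shows the collision and one possible way around it, which is keying by position as well as by token.

```python
# Hypothetical stand-ins: in practice the tokens and vectors come from the model.
tokens = ["i", "was", "eating", "an", "apple", ",", "and", "my", "apple", "pc", "crashed"]
vectors = [[float(i)] for i in range(len(tokens))]  # one dummy 1-d "embedding" per position

# Keying by token string: the later "apple" silently replaces the earlier one.
by_token = {}
for token, vector in zip(tokens, vectors):
    by_token[token] = vector

# Keying by (position, token) keeps one entry per occurrence.
by_position = {
    (i, token): vector
    for i, (token, vector) in enumerate(zip(tokens, vectors))
}

# Positions at which "apple" occurs in the sequence.
apple_positions = [i for i, token in enumerate(tokens) if token == "apple"]

print(by_token["apple"])                                 # vector of the last "apple"
print([by_position[(i, "apple")] for i in apple_positions])  # both occurrences
```

With the position in the key, both "apple" embeddings survive and can be compared. A plain list of `(token, vector)` pairs would work just as well if you do not need lookups.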


Written by Andreas Pogiatzis

☰ PhD Candidate @ UoG ● Combining Cyber Security with Data Science ● Writing to Understand
