Andreas Pogiatzis
Jun 11, 2019


Thank you!

Well, if you don’t want to train it from scratch, then you would have to go along with the existing vocabulary of BERT. The default BERT model has a vocabulary of 30,522 tokens, which is significantly smaller than, let’s say… GloVe, but keep in mind that BERT uses the WordPiece technique for its vocabulary. This means that words are split into sub-word pieces, which allows it to cover words that are not in the vocabulary in the first place.
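For example, here is a minimal sketch (assuming the Hugging Face transformers library, which is not mentioned in the original reply) of how the WordPiece tokenizer breaks an unseen word into pieces that are already in the vocabulary:

```python
# Minimal sketch: inspect BERT's WordPiece vocabulary and tokenization.
# Assumes the Hugging Face `transformers` package is installed.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# The default uncased BERT vocabulary size.
print(tokenizer.vocab_size)              # 30522

# A word that is not a single vocabulary entry gets split into
# known sub-word pieces (continuation pieces are prefixed with '##').
print(tokenizer.tokenize("embeddings"))  # ['em', '##bed', '##ding', '##s']
```

Because any out-of-vocabulary word can be decomposed into such pieces (down to single characters if necessary), the relatively small vocabulary still covers arbitrary input text.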

Additionally, a good resource for understanding BERT is this:

Hope this helps. :)
