Jun 11, 2019
Thank you!
Well, if you don’t want to train it from scratch, then you would have to go along with the existing vocabulary of BERT. The default BERT model has a vocabulary of 30,522 tokens, which is significantly smaller than, let’s say… GloVe, but keep in mind that BERT uses the WordPiece technique for its vocabulary. This means that words are split into pieces, which allows it to cover words that are not in the vocabulary in the first place.
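As a quick illustration, here is a minimal sketch using the Hugging Face transformers library (the word used below is just an example, not something from the original discussion):

```python
from transformers import BertTokenizer

# Load the default (uncased) BERT WordPiece vocabulary.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(len(tokenizer.vocab))  # 30522 entries

# A word missing from the vocabulary is split into known pieces,
# where the "##" prefix marks a continuation of the previous piece.
print(tokenizer.tokenize("embeddings"))
# e.g. ['em', '##bed', '##ding', '##s']
```

So even if a word isn’t in the vocabulary as a whole, BERT can still represent it through its pieces.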
Additionally, a good resource for understanding BERT is this:
Hope this helps. :)