Thanks for your questions!
So basically BERT is a language model that can be fine-tuned for many NLP tasks. The extra inputs, such as the segment (sentence) and positional embeddings, matter mainly when fine-tuning on those tasks (e.g. SQuAD).
In this case, these extra inputs are simply ignored, since I only want to extract word features from BERT.
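For reference, here is a minimal sketch of this kind of feature extraction using the Hugging Face transformers library; the library and model name are illustrative rather than exactly what I used:

```python
import torch
from transformers import BertTokenizer, BertModel

# Illustrative model choice; any pre-trained BERT checkpoint works similarly.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()  # no fine-tuning, just feature extraction

sentence = "BERT features can be reused downstream."
inputs = tokenizer(sentence, return_tensors="pt")

# Only input_ids and the attention mask are passed here; the segment
# (token type) and positional inputs fall back to their defaults,
# i.e. they are effectively ignored for single-sentence feature extraction.
with torch.no_grad():
    outputs = model(input_ids=inputs["input_ids"],
                    attention_mask=inputs["attention_mask"])

# One contextual vector per (sub)word token: shape (1, seq_len, hidden_size)
word_features = outputs.last_hidden_state  # outputs[0] in older versions
print(word_features.shape)
```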
I have used these extracted features in a downstream, feature-based setup and it seems to work, so I hope this answers your question.
Best,
Antreas