I am working on a text classification problem where each text can be made up of two or more sentences. My feeling is that I should obtain a BERT embedding for each sentence separately, pass each one through a neural network to obtain per-sentence features, concatenate all of those features, and finally classify with a binary classifier.
I was planning to do this in PyTorch. Based on the task at hand, my intuition is that obtaining a BERT embedding for each sentence separately will be more beneficial.
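Here is a rough, untested sketch of what I have in mind (the model name, feature size, number of sentences, and layer choices are just placeholders, not a working implementation):

```python
import torch
import torch.nn as nn
from transformers import BertModel


class MultiSentenceClassifier(nn.Module):
    """Encode each sentence with BERT separately, extract per-sentence
    features, concatenate them, and classify (rough sketch only)."""

    def __init__(self, num_sentences=2, bert_name="bert-base-uncased", feat_dim=128):
        super().__init__()
        self.num_sentences = num_sentences
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        # One small feed-forward feature extractor per sentence position.
        self.sentence_ffns = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, feat_dim), nn.ReLU())
             for _ in range(num_sentences)]
        )
        # Binary classifier over the concatenated per-sentence features.
        self.classifier = nn.Linear(num_sentences * feat_dim, 2)

    def forward(self, input_ids_list, attention_mask_list):
        # Each list holds one tensor per sentence, shape (batch_size, seq_len).
        feats = []
        for i in range(self.num_sentences):
            outputs = self.bert(
                input_ids=input_ids_list[i],
                attention_mask=attention_mask_list[i],
            )
            pooled = outputs[1]  # pooled output for this sentence
            feats.append(self.sentence_ffns[i](pooled))
        combined = torch.cat(feats, dim=-1)  # concatenate sentence features
        return self.classifier(combined)    # logits for binary classification
```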
I have the following doubts:

- Is there any example that takes such an approach?
- Should my model inherit from `nn.Module` (PyTorch) or from `BertPreTrainedModel` (Hugging Face `transformers`)?
- How can I implement this concatenation logic, building the per-sentence sub-networks in the `__init__` method and passing the input sentences through the corresponding network one by one in the `forward` method (roughly as in the sketch above)?
- What is the line `pooled_output = outputs[1]` doing in `BertPreTrainedModel`? Is it obtaining a reference to the embedding corresponding to the `[CLS]` token? (See the snippet below for what I would check.)
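For reference, this is the kind of small snippet I would use to compare `outputs[1]` with the raw `[CLS]` hidden state (untested; the model name is just an assumption):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

enc = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**enc)

last_hidden = outputs[0]        # (batch, seq_len, hidden_size)
pooled = outputs[1]             # (batch, hidden_size) -- the "pooled" output
cls_hidden = last_hidden[:, 0]  # hidden state of the [CLS] token itself

# Is the pooled output the [CLS] hidden state as-is, or a transformed version of it?
print(pooled.shape, cls_hidden.shape)
print(torch.allclose(pooled, cls_hidden))
```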