Hi,
Can we use ElectraForMaskedLM for causal language modeling?
It seems BertForMaskedLM was split into BertLMHeadModel and BertForMaskedLM: one model for causal LM and one for masked LM.
https://github.com/huggingface/transformers/pull/4874
Can ELECTRA’s masked LM class be applied to causal LM?
Or is there a separate class?
Or is it not possible to use ELECTRA as a causal LM in the first place?
Thank you in advance.