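\[
P(\cdot \mid x; \theta_{\text{PLM}}, W) = \mathrm{softmax}(h \cdot W) = \mathrm{softmax}\big(\text{PLM}(x; \theta_{\text{PLM}}) \cdot W\big), \quad \text{where } h \in \mathbb{R}^{\text{hidden\_size}} \text{ and } W \in \mathbb{R}^{\text{hidden\_size} \times \#\text{classes}}.
\]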
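As a rough illustration of the classification head above, the following sketch computes softmax(h · W) in plain Python. The names `softmax` and `classify` and the toy values for `h` and `W` are assumptions made for this example; in practice `h` would come from the pre-trained language model rather than being written by hand.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(h, W):
    """Return P(. | x) = softmax(h . W).

    h: list of length hidden_size, standing in for PLM(x; theta_PLM).
    W: hidden_size x num_classes matrix, given as a list of rows.
    """
    num_classes = len(W[0])
    logits = [
        sum(h[k] * W[k][c] for k in range(len(h)))
        for c in range(num_classes)
    ]
    return softmax(logits)


# Toy values (hypothetical): hidden_size = 3, two classes.
h = [0.5, -1.0, 2.0]
W = [[0.1, 0.3],
     [0.2, -0.4],
     [-0.5, 0.6]]
probs = classify(h, W)
```

The result `probs` has one entry per class and sums to 1, so it can be read directly as the predicted class distribution.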
Evaluations
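\[
\begin{aligned}
D &= \{(x_i, s_i, e_i)\}_{i=1}^{N}, \quad \text{where } x_i \text{ is the input and } s_i \text{ is the start index of the span with end index } e_i. \\
L_{\text{span}}(\theta_{\text{PLM}}, S, E) &= -\sum_{i=1}^{N} \log P(s_i \mid x_i; \theta_{\text{PLM}}, S) - \sum_{i=1}^{N} \log P(e_i \mid x_i; \theta_{\text{PLM}}, E) \\
P(s_i \mid x_i; \theta_{\text{PLM}}, S) &= \exp(S^{\top}
\end{aligned}
\]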
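The span loss above sums the negative log-probabilities of the gold start and end indices. The sketch below shows one way to compute it, under an assumption the text here does not confirm: since the definition of P(s_i | x_i) is cut off after exp(S^T, the code assumes it is a softmax over token positions of the scores S · h_t, where h_t is the hidden state of token t (the end probability is treated the same way with E). The function names and toy values are hypothetical.

```python
import math


def log_softmax(scores):
    """Log of a numerically stable softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(z - m) for z in scores))
    return [z - lse for z in scores]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def span_loss(batch, S, E):
    """Sum over examples of -log P(s_i | x_i) - log P(e_i | x_i).

    batch: list of (token_states, start, end), where token_states holds
           one hidden vector per token (standing in for the PLM output).
    S, E:  start and end weight vectors of length hidden_size.
    ASSUMPTION: probabilities are softmaxes over token positions.
    """
    loss = 0.0
    for token_states, start, end in batch:
        start_logp = log_softmax([dot(S, h_t) for h_t in token_states])
        end_logp = log_softmax([dot(E, h_t) for h_t in token_states])
        loss -= start_logp[start] + end_logp[end]
    return loss


# Toy batch (hypothetical values): one input with 4 tokens, hidden_size = 2,
# gold span running from token 1 to token 2.
token_states = [[0.1, 0.2], [1.0, -0.5], [0.3, 0.8], [-0.4, 0.0]]
S = [0.7, -0.2]
E = [-0.1, 0.9]
loss = span_loss([(token_states, 1, 2)], S, E)
```

With all-zero `S` and `E` every position is equally likely, so the loss for a single 4-token input reduces to 2 · log 4, which is a convenient sanity check.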