Proceedings of NAACL-HLT 2019
We present an adaptation of RNN sequence models to the problem of multi-label classification for text, where the target is a set of labels, not a sequence. Previous such RNN models define probabilities for sequences but not for sets; attempts to obtain a set probability are afterthoughts of the network design, including pre-specifying the label order, or relating the sequence probability to the set probability in ad hoc ways.
Our formulation is derived from a principled notion of set
probability, as the sum of probabilities of corresponding
permutation sequences for the set. We provide a new training
objective that maximizes this set probability, and a new
prediction objective that finds the most probable set on a test
document. These new objectives are theoretically appealing
because they give the RNN model freedom to discover the best label
order, which is often the natural one (but differs across
documents).
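
To make the set-probability definition concrete, the following is a minimal brute-force sketch, not the procedure used in this work. It assumes a hypothetical three-label vocabulary `LABELS`, a hypothetical end-of-sequence symbol `STOP`, and a toy function `uniform_step` standing in for the RNN's per-step distribution (uniform over labels not yet emitted, plus `STOP`). The names `sequence_prob` and `set_prob` are illustrative only.

```python
from itertools import combinations, permutations

LABELS = ("a", "b", "c")   # hypothetical label vocabulary
STOP = "<stop>"            # hypothetical end-of-sequence symbol


def uniform_step(prefix, symbol):
    """Toy stand-in for an RNN step: uniform over unused labels plus STOP."""
    candidates = [l for l in LABELS if l not in prefix] + [STOP]
    return 1.0 / len(candidates) if symbol in candidates else 0.0


def sequence_prob(seq, step_prob):
    """Probability of emitting the labels of `seq` in order, then stopping."""
    p = 1.0
    for i, label in enumerate(seq):
        p *= step_prob(seq[:i], label)
    return p * step_prob(seq, STOP)


def set_prob(label_set, step_prob):
    """Set probability: sum of sequence probabilities over all orderings."""
    return sum(sequence_prob(perm, step_prob)
               for perm in permutations(sorted(label_set)))


# Under this toy model, the set probabilities over all subsets sum to one.
all_subsets = [set(c) for r in range(len(LABELS) + 1)
               for c in combinations(LABELS, r)]
total = sum(set_prob(s, uniform_step) for s in all_subsets)
```

Enumerating every ordering costs factorial time in the size of the label set, so this sketch is only practical for very small sets.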
We develop efficient procedures to tackle the computational
difficulties involved in training and prediction. Experiments on
benchmark datasets demonstrate that our approach outperforms
state-of-the-art methods for this task.