Adapting RNN Sequence Prediction Model to Multi-label Set Prediction

Kechen Qin, Cheng Li, Virgil Pavlu, and Javed Aslam.

Proceedings of NAACL-HLT 2019

Abstract

We present an adaptation of RNN sequence models to the problem of multi-label classification for text, where the target is a set of labels, not a sequence. Previous such RNN models define probabilities for sequences but not for sets; attempts to obtain a set probability are afterthoughts of the network design, including pre-specifying the label order, or relating the sequence probability to the set probability in ad hoc ways.

Our formulation is derived from a principled notion of set probability, as the sum of probabilities of corresponding permutation sequences for the set. We provide a new training objective that maximizes this set probability, and a new prediction objective that finds the most probable set on a test document. These new objectives are theoretically appealing because they give the RNN model freedom to discover the best label order, which often is the natural one (but different among documents). We develop efficient procedures to tackle the computation difficulties involved in training and prediction. Experiments on benchmark datasets demonstrate that we outperform state-of-the-art methods for this task.
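
The notion of set probability described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-label vocabulary, the `BASE` scores, and the function `next_token_probs`, which is a hand-written stand-in for an RNN decoder step rather than the model from the paper. The sketch computes a set's probability by summing the probabilities of all its permutation sequences, and finds the most probable set by brute-force enumeration. It does not show the efficient training and prediction procedures the paper develops.

```python
from itertools import combinations, permutations

STOP = "<stop>"
LABELS = ("a", "b", "c")                 # hypothetical label vocabulary
BASE = {"a": 2.0, "b": 1.0, "c": 0.5}    # made-up label scores


def next_token_probs(prefix):
    """Toy stand-in for one decoder step (an assumption, not the paper's RNN).

    Returns a distribution over the labels not yet emitted plus STOP.
    """
    scores = {}
    for lab in LABELS:
        if lab in prefix:
            continue
        # Depend on the previous label so that label order matters.
        boost = 1.5 if prefix and prefix[-1] < lab else 1.0
        scores[lab] = BASE[lab] * boost
    scores[STOP] = 1.0 + len(prefix)
    total = sum(scores.values())
    return {tok: s / total for tok, s in scores.items()}


def sequence_prob(seq):
    """Probability of emitting the labels in `seq` in order, then STOP."""
    prob, prefix = 1.0, ()
    for tok in tuple(seq) + (STOP,):
        prob *= next_token_probs(prefix)[tok]
        prefix = prefix + (tok,)
    return prob


def set_prob(label_set):
    """Set probability: sum of sequence probabilities over all orderings."""
    return sum(sequence_prob(perm) for perm in permutations(sorted(label_set)))


def most_probable_set():
    """Brute-force prediction: score every subset and keep the best one."""
    subsets = [
        frozenset(c)
        for r in range(len(LABELS) + 1)
        for c in combinations(LABELS, r)
    ]
    return max(subsets, key=set_prob)
```

Because every sequence of distinct labels maps to exactly one set, the set probabilities in this toy model sum to one across all subsets. The enumeration grows factorially with the size of the set and exponentially with the number of labels, so it is only workable for very small examples.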