Zhu, Jun and Xing, Eric and Zhang, Bo
Learning sparse Markov networks based on the maximum margin principle remains an open problem in structured prediction. In this paper, we propose the Laplace max-margin Markov network ($\mathrm{LapM^3N}$) and a general class of Bayesian M$^3$N (BM$^3$N), of which the $\mathrm{LapM^3N}$ is a special case that enjoys a sparse representation. The BM$^3$N is built on a novel \textit{Structured Maximum Entropy Discrimination} (SMED) formalism, which offers a general framework for combining Bayesian learning and max-margin learning of log-linear models for structured prediction, and which subsumes the unsparsified M$^3$N as a special case. We present an efficient iterative learning algorithm based on variational approximation and existing convex optimization methods employed in M$^3$N. We show that our method outperforms competing ones on both synthetic and real OCR data.