NeurIPS 2020

Cross-lingual Retrieval for Iterative Self-Supervised Training


Meta Review

The paper proposes a novel approach for unsupervised parallel corpus mining and unsupervised machine translation, improving on the SoTA on both tasks by significant margins. Experiments are conducted on the Tatoeba retrieval task and on a 25-language translation task built from a combination of a few academic benchmark datasets. Careful experiments demonstrate how using parallel data from just one language pair significantly improves the cross-lingual embedding alignment in a multilingual de-noising auto-encoder. All reviewers support acceptance, as does the AC. Please make sure to incorporate the clarifications from the author response in the final version of the paper.