Extract from Rob Robinson’s article “Machine Translation: The Importance of Document-Level Evaluation”
Research suggests that when entire documents are evaluated, human translations are rated as more adequate and more fluent than machine translations. Human raters assessing adequacy and fluency show a stronger preference for human over machine translation when evaluating whole documents than when evaluating isolated sentences. This suggests that machine translation evaluation needs to move away from an approach in which each sentence is considered in isolation.