Microsoft has significantly improved the quality of its production translation models by adopting new artificial intelligence (AI) technology across its translation software and services.
With its XYZ-code initiative, the software giant aims to eventually combine AI models for text, vision, audio, and language. Z-code supports the development of AI systems that can understand, see, hear, and speak.
The updated Z-code models now power Microsoft Translator and other Azure AI services. Using Nvidia GPUs and the Triton Inference Server, the company is deploying and scaling the models quickly.
Furthermore, Microsoft Translator is the first machine translation provider to run Z-code Mixture of Experts models live in a customer-facing environment.
Unlike Microsoft's previous AI models, the Z-code models use a Mixture of Experts (MoE) architecture, in which different parts of the model learn different tasks independently. This lets the models learn to translate between multiple language pairs simultaneously.
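To illustrate the core idea, here is a minimal, hypothetical sketch of MoE routing in plain Python. A gating function scores each expert for a given input, and only the top-scoring expert processes it, so most of the model's parameters stay idle per example. The class names, the toy experts, and the dot-product gate are all illustrative assumptions, not Microsoft's implementation.

```python
import math

def softmax(scores):
    """Normalize raw gate scores into a probability distribution."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class MoELayer:
    """Toy Mixture of Experts layer with top-1 routing.

    Each 'expert' is just a function here; a gate picks one expert
    per input, so only a fraction of the parameters run per example.
    """
    def __init__(self, experts, gate_weights):
        self.experts = experts            # list of callables
        self.gate_weights = gate_weights  # one weight vector per expert

    def forward(self, x):
        # Gate: score each expert with a dot product, then softmax.
        scores = [sum(w_i * x_i for w_i, x_i in zip(w, x))
                  for w in self.gate_weights]
        probs = softmax(scores)
        # Top-1 routing: only the highest-probability expert runs.
        best = max(range(len(probs)), key=lambda i: probs[i])
        return self.experts[best](x), best

# Two toy "experts": one doubles the input, one negates it.
experts = [lambda x: [2 * v for v in x], lambda x: [-v for v in x]]
gates = [[1.0, 0.0], [0.0, 1.0]]  # each gate favors one input dimension
moe = MoELayer(experts, gates)

out, chosen = moe.forward([3.0, 1.0])  # first dimension dominates -> expert 0
```

Production MoE systems use learned gates, many experts per layer, and top-k routing with load balancing across experts; the sketch above only shows the routing principle that makes such models cheap to run relative to their total parameter count.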
In addition, the newly introduced Z-code MoE models take advantage of transfer learning techniques that allow knowledge to be shared among similar languages, such as English and French. During training, the models also use both parallel and monolingual data, which enables high-quality machine translation even for languages with little training data.
The company then partnered with Nvidia to build faster engines for GPU deployment of the new Z-code MoE models. For its part, Nvidia developed a custom CUDA kernel implementation that leverages the FasterTransformer and CUTLASS libraries to run MoE layers on a single V100 GPU.
Customers using Microsoft’s Document Translation feature can now access the new Z-code models by invitation. The feature translates entire documents, or even volumes of documents, in a variety of file formats while preserving their original formatting.