The Translation Memory Product That Microsoft Didn’t Want Us to Release

Written By:

Martin Laplante

I was cleaning up some old project files when I came across a product that we had worked on ten years ago and that I had completely forgotten about. It was an awesome translation memory for SharePoint on premise. It was never released, essentially at the request of Microsoft.

It's been a while, and maybe it's time to tell the story.

Image credit: Vecteezy

SharePoint Machine Translation Service: A Forgotten Feature

At the time, SharePoint had a Machine Translation Service.  It still does on premise but it's been mostly deprecated in SharePoint Online.  Our products used it, and still do on premise, to translate documents, classic pages, and other things.

How MTS Secretly Relied on Microsoft Research

The Machine Translation Service (MTS) relied in the background on the Microsoft Translator API, which at that time was run and hosted by Microsoft Research in Building 99 of the Microsoft campus. MTS had its own API, which exposed some configuration parameters, most of which were not documented at all.

Based on some names and parameters of MTS functions that were named in an obscure document, not in the API reference, we guessed at some possible functionality that we could take advantage of.  At the time, most translation engines were statistical ones, but Microsoft Research also had some neural translation engines.  

Unlocking Hidden Parameters in Microsoft MTS

Because we knew how to configure those engines when using the Microsoft Translator API, we guessed that the MTS might work in a similar way if we provided a magic value to the undocumented "Category" parameter. We tried it and it worked! Instantly, the free MTS was using the same engine that Microsoft normally charged extra for, at no cost. I checked with my contacts at Microsoft Research, who confirmed that this was legitimate: there was no billing mechanism built into MTS, so it really was free. So we implemented it for our clients.

Building a SharePoint Translation Memory on Premise

One other feature of the Microsoft Translator API at the time was the Translation Memory. In your user account, you could attach a list of words and phrases and their desired translations, and it would use that stored list to override the translation that the Microsoft Translator API would otherwise produce. Microsoft implemented it via a slightly different API, the Collaborative Translation Framework (CTF) and the Translator Web Widget.
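To make the idea concrete, here is a minimal sketch of how a translation memory override works in general. It is not the CTF API or anything Microsoft shipped; the names make_translator and toy_engine, and the sample entry, are made up for illustration.

```python
from typing import Callable, Dict


def make_translator(engine: Callable[[str], str],
                    memory: Dict[str, str]) -> Callable[[str], str]:
    """Wrap a translation engine so stored entries override its output."""
    def translate(text: str) -> str:
        # An exact match in the translation memory wins over the engine.
        if text in memory:
            return memory[text]
        # Otherwise fall back to whatever the engine produces.
        return engine(text)
    return translate


def toy_engine(text: str) -> str:
    # Hypothetical stand-in for a real machine translation engine.
    return f"[fr] {text}"


# One stored phrase and its desired translation (illustrative only).
memory = {"Site Contents": "Contenu du site"}
translate = make_translator(toy_engine, memory)

print(translate("Site Contents"))  # comes from the memory
print(translate("Home"))           # comes from the engine
```

A real translation memory would also handle partial and fuzzy matches; the sketch only covers the exact-match override described above.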

Early experiments showed that we could, in fact, specify a replacement translation by using undocumented functions that the MTS seemed to pass through to the Microsoft Translator API, and the MTS would use it.  Not only that, our tests showed that it applied to all users in that SharePoint farm.  So we set about building this capability, a translation management system with translation memory, built on the SharePoint Server platform.

When a Joke Translation Went Too Far

Then, at some point during the testing, something strange happened. One of the terms that I had replaced with a joke translation during the initial testing now came back with that joke translation, even before I had set it up on the farm I was testing. Hang on, this wasn't even the same server farm. Continued testing confirmed it: we could change the translation of a term on one server farm and that translation would appear on another farm.

How did it know?

The Discovery: One Translator Account for Every SharePoint User

I talked with my Microsoft Research contacts, and a quick call with Microsoft was scheduled. The worst-case scenario was confirmed. All SharePoint installations, everywhere on earth, were using a single user account through a compatibility layer between the old API, which the MTS was using, and the latest version of the Microsoft Translator API. Every SharePoint user on earth was potentially seeing our joke translations, which luckily were nonsense phrases, not common ones. You can imagine what a security nightmare that would have turned out to be. Microsoft politely requested that we not release this product, and of course we had no intention of doing so.
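The failure mode is easy to reproduce in miniature. The sketch below is a toy model, not Microsoft's architecture: the classes, the account names, and the "shared-compat-account" value are all hypothetical. It only shows why keying a translation memory on one shared account leaks overrides between farms, while per-farm accounts keep them isolated.

```python
from typing import Dict, Optional


class TranslationMemoryStore:
    """Toy central store: overrides are keyed by account, then by source text."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, str]] = {}

    def add(self, account: str, source: str, translation: str) -> None:
        self._entries.setdefault(account, {})[source] = translation

    def lookup(self, account: str, source: str) -> Optional[str]:
        return self._entries.get(account, {}).get(source)


class Farm:
    """Toy SharePoint farm that talks to the store under some account."""

    def __init__(self, store: TranslationMemoryStore, account: str) -> None:
        self.store = store
        self.account = account

    def set_override(self, source: str, translation: str) -> None:
        self.store.add(self.account, source, translation)

    def translate(self, source: str) -> str:
        override = self.store.lookup(self.account, source)
        # Fall back to a placeholder "engine" output when nothing is stored.
        return override if override is not None else f"[mt] {source}"


store = TranslationMemoryStore()

# Hypothetical shared account: every farm goes through the same identity.
farm_a = Farm(store, "shared-compat-account")
farm_b = Farm(store, "shared-compat-account")
farm_a.set_override("xyzzy test phrase", "joke translation")
leaked = farm_b.translate("xyzzy test phrase")

# Per-farm accounts: the same override stays on the farm that created it.
farm_c = Farm(store, "farm-c-account")
farm_d = Farm(store, "farm-d-account")
farm_c.set_override("xyzzy test phrase", "joke translation")
isolated = farm_d.translate("xyzzy test phrase")

print(leaked)    # farm_b sees farm_a's override
print(isolated)  # farm_d falls back to the engine
```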

Shutting Down the Translation Memory Project

We used it only for the rest of that day, to restore the translation memory to its original empty state, and we mothballed the entire project. I don't know when or whether Microsoft patched the vulnerability, but less than two years later the entire CTF API was retired, and everything related to translation memory was removed from the Microsoft Translator API, officially to comply with GDPR. An upgrade of the MTS a few years ago properly integrated it into the Azure infrastructure, and the exploit is no longer possible.

Clunky Successors

Microsoft later introduced a clunky mechanism for overriding translations, but it took the form of tags within the page or document to be translated, and by then SharePoint no longer allowed strange tags to be easily entered into pages. Go figure. The successor of the Microsoft Translator API, Azure AI Translator, got a new but still clunky mechanism last year, using CSV files that you must send to an Azure Blob container every time you translate a document.

Looking Back

If, ten years ago, you saw some strange joke translations in your translated SharePoint pages or documents when you used Variations, that might have been me.  Sorry.
