Markdown Translator: Multilingual Documentation in Seconds
by Denis Augsburger
Translating Markdown files is a common need in technical documentation and headless content management systems, where you want to reach audiences that speak different languages. I'm going to show you how to translate Markdown quickly and easily without compromising on quality. If you'd like a head start, just sign up and try out the Markdown Translator.
Markdown Translation Tool
More and more tools use Markdown to structure their content. Some examples are:
- Docusaurus, GitBook for documentation
- Hugo, Jekyll, GatsbyJS as static site generators (SSG)
- Contentful, Strapi, Squarespace as content management systems (CMS)
Depending on the project, you need to generate multilingual content and update it regularly. The traditional translation process can be time-consuming, and waiting on (human) translations can block your release cycles. That is why we were looking for a fast and reliable solution.
Common Challenges
We tried out several translation tools by pasting Markdown into them, but we were not satisfied with the results. Common problems we encountered:
- Broken Markdown Syntax
- Translation of things that should not be translated, like code snippets and emojis
- Inconsistent styles in translation results
- Setup/Installation necessary
Let's take a look at how a simple Markdown file is translated from English to German when you paste it directly into DeepL or Google Translate, and compare the results to the Simpleen Markdown Translator.
The file contains headers, a code block, a list and an emoji.
## Setup
Install the CLI to **translate** files from source to target path.
```shell
yarn add simpleen
yarn run simpleen init
```
You can search for files in `./blog/posts/en/*.md` and translate them to `./blog/posts/$locale/$FILE.md`.
## Additional support :smile:
- PO-Files
- JSON
- YAML
DeepL Markdown Example
With DeepL the result looks like the following.
## Einrichtung
Installieren Sie die CLI zum **Übersetzen** von Dateien vom Quell- in den Zielpfad.
``Shell
yarn add simpleen
yarn run simpleen init
```
Sie können nach Dateien in `./blog/posts/de/*.md` suchen und sie in `./blog/posts/$locale/$FILE.md` übersetzen.
## Zusätzliche Unterstützung :smile:
- PO-Dateien
- JSON
- YAML
As you can see, the code snippet is broken because the fenced code block now starts with two backticks instead of three. The language name Shell is also capitalized now. The list, the emoji and the paths are handled correctly in this simple case, and the bold text is marked correctly as well.
Google Translate Markdown Example
Let's compare this with Google Translate:
## Einrichten
Installieren Sie die CLI, um Dateien von der Quelle in den Zielpfad zu übersetzen.
`` `Shell
Garn hinzufügen einfach
Garn laufen einfach init
`` `
Sie können nach Dateien in "./blog/posts/en / *. Md" suchen und diese in ". / Blog / posts / $ locale / $ FILE.md" übersetzen.
## Zusätzliche Unterstützung: smile:
- PO-Dateien
- JSON
- YAML
The result with Google Translate is worse than with DeepL. The code snippet is broken because the backticks of the code fence are separated by a space. The content of the code block is also translated, which is not desirable. The paths are split apart and marked with quotes instead of backticks. The emoji is broken as well.
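Breakage like this can be caught automatically before a translation is published. The following is a minimal Python sketch, not part of Simpleen or of the tools compared above; the function names `extract_code_blocks` and `code_blocks_preserved` are made up for illustration. It assumes that every fenced code block has to come out of the translation unchanged, including its language name.

```python
import re

# An opening or closing fence: up to 3 spaces of indentation,
# then at least three backticks or tildes, then an optional info string.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def extract_code_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (info_string, body) for every properly closed fenced code block."""
    blocks = []
    fence = None  # marker that opened the current block, None outside a block
    info = ""
    body: list[str] = []
    for line in markdown.splitlines():
        m = FENCE.match(line)
        if fence is None:
            if m:
                fence, info, body = m.group(1), m.group(2).strip(), []
        elif (m and m.group(1)[0] == fence[0]
              and len(m.group(1)) >= len(fence)
              and not m.group(2).strip()):
            # Closing fence: same character, at least as long, no info string.
            blocks.append((info, "\n".join(body)))
            fence = None
        else:
            body.append(line)
    return blocks


def code_blocks_preserved(source: str, translated: str) -> bool:
    """True if both documents contain exactly the same fenced code blocks."""
    return extract_code_blocks(source) == extract_code_blocks(translated)


tick3 = "`" * 3  # built at runtime so this example contains no literal fence
source = "\n".join([tick3 + "shell", "yarn add simpleen", tick3])
broken = "\n".join(["``Shell", "yarn add simpleen", tick3])  # opening fence lost a backtick

print(code_blocks_preserved(source, source))  # True
print(code_blocks_preserved(source, broken))  # False
```

A check like this only reports that something went wrong. It does not repair the output, so it is best used as a safety net after translation.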
Simpleen Markdown Translator
Let's see how Simpleen handles the same Markdown example compared to DeepL and Google Translate (in this case, Simpleen uses DeepL for the translation itself).
## Einrichtung
Installieren Sie die CLI, um Dateien vom Quell- in den Zielpfad **zu übersetzen**.
```shell
yarn add simpleen
yarn run simpleen init
```
Sie können nach Dateien in `./blog/posts/de/*.md` suchen und sie in `./blog/posts/$locale/$FILE.md` übersetzen.
## Zusätzliche Unterstützung :smile:
- PO-Dateien
- JSON
- YAML
Because we love Markdown, we wanted to deliver better and more consistent results with an online translator that lets you translate Markdown into many languages.
Simpleen provides better results because we handle Markdown differently than other services. Instead of just treating Markdown as plain text or converting it to HTML, which is what most MT services support, we go deeper to understand the whole document structure of your Markdown files.
Furthermore, Simpleen understands the most common Markdown extensions and flavors and applies the provided styles from your file to the translation result. For example, if you use two spaces at the end of a line to break a line, we also use two spaces in the translated result.
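To make the line break example concrete, here is a small Python sketch of the idea. It is not Simpleen's implementation; `detect_hard_break` and `join_with_hard_breaks` are hypothetical helpers, and the sketch assumes the translation arrives as a list of lines that should be joined with hard breaks. It looks for the hard break marker the source uses (two trailing spaces or a trailing backslash, both valid in CommonMark) and reuses it in the translated text.

```python
def detect_hard_break(markdown: str, default: str = "  ") -> str:
    """Return the hard line break marker used in the source text."""
    for line in markdown.splitlines():
        if line.endswith("\\"):
            return "\\"  # backslash style
        if line.strip() and line.endswith("  "):
            return "  "  # two trailing spaces
    return default  # no hard break found, fall back to the default


def join_with_hard_breaks(lines: list[str], marker: str) -> str:
    """Join translated lines using the same hard break marker as the source."""
    return (marker + "\n").join(lines)


source = "Roses are red,  \nviolets are blue."
translated_lines = ["Rosen sind rot,", "Veilchen sind blau."]

result = join_with_hard_breaks(translated_lines, detect_hard_break(source))
print(repr(result))  # 'Rosen sind rot,  \nVeilchen sind blau.'
```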
Supported Flavors & Extensions
Markdown comes in different flavors, each supporting slightly different syntax for writing documentation, blog posts and more. The most common flavors, which Simpleen supports for translation, are:
- CommonMark
- GFM (GitHub Flavored Markdown)
with the following extensions:
- Emojis (:smile: or 👍)
- Footnotes (partial)
- Frontmatter
- Math
CommonMark is a Markdown flavor that many frameworks and libraries support or build upon, for example GatsbyJS with its remark transformer. Many headless content management systems support CommonMark as well.
Better Style Support
There are different valid ways to mark your headers, bold text, lists and more. Simpleen detects your style and reproduces it consistently in the translated Markdown file. For example, if you use a dash for your lists:
My shopping list:
- Dictionary
- Paper
- Pencil
then this Markdown example is translated to German like this:
Meine Einkaufsliste:
- Wörterbuch
- Papier
- Bleistift
If you use a star for your list instead, it is translated to:
Meine Einkaufsliste:
* Wörterbuch
* Papier
* Bleistift
Both results are valid in most Markdown flavors, but we want to consistently apply the styles from the provided Markdown file. As a result, you can use the translated Markdown file directly in your Markdown documentation tool. It also means that an editor or human translator is not confused by differing styles during post-editing.
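The list example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Simpleen works internally; `detect_bullet` and `apply_bullet` are hypothetical names. The sketch assumes a simple document: it works line by line and does not skip code blocks or thematic breaks such as `* * *`, which a real implementation would have to handle through a Markdown parser.

```python
import re

# Leading indentation, one bullet character, then the whitespace after it.
BULLET = re.compile(r"^(\s*)([-*+])(\s+)")


def detect_bullet(markdown: str, default: str = "-") -> str:
    """Return the bullet character of the first list item found in the source."""
    for line in markdown.splitlines():
        m = BULLET.match(line)
        if m:
            return m.group(2)
    return default


def apply_bullet(markdown: str, bullet: str) -> str:
    """Rewrite every list item so that it uses the given bullet character."""
    return "\n".join(
        BULLET.sub(lambda m: m.group(1) + bullet + m.group(3), line)
        for line in markdown.splitlines()
    )


source = "My shopping list:\n* Dictionary\n* Paper\n* Pencil"
raw_translation = "Meine Einkaufsliste:\n- Wörterbuch\n- Papier\n- Bleistift"

print(apply_bullet(raw_translation, detect_bullet(source)))
# Meine Einkaufsliste:
# * Wörterbuch
# * Papier
# * Bleistift
```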
Translate .md/.mdx Files
A Markdown file contains multiple parts that need to be localized. Other parts, such as code segments and frontmatter fragments (metadata), need to be excluded from translation.
Not translated:
- Code Fences (triple backticks)
- Emojis
- Frontmatter
- Math Expressions
- MDX (not yet supported, drop us a line if you would like to use it)
Translated:
- Headers (atx, setext)
- Paragraphs with bold, italic styles, links, images
- List Items
- Table Headers
- Table Entries
- ToDo List Entries
- Footnotes (partial, #fn-1 instead of ^1)
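The split between translated and untranslated parts can be illustrated with a simplified Python sketch. It is an assumption-laden illustration rather than Simpleen's actual implementation: `split_segments` and `translate_markdown` are hypothetical names, the `translate` argument stands in for any machine translation call, and only frontmatter and code fences are protected. Inline elements such as emojis, inline code or math would need a real Markdown parser.

```python
from typing import Callable


def split_segments(markdown: str) -> list[tuple[str, str]]:
    """Split a document into ("keep", text) and ("translate", text) segments."""
    segments: list[tuple[str, list[str]]] = []
    state = "text"  # "text", "frontmatter" or "code"
    fence = ""

    def push(kind: str, line: str) -> None:
        # Extend the current segment, or start a new one when the kind changes.
        if segments and segments[-1][0] == kind:
            segments[-1][1].append(line)
        else:
            segments.append((kind, [line]))

    for i, line in enumerate(markdown.split("\n")):
        stripped = line.strip()
        if state == "text":
            if i == 0 and stripped == "---":
                state = "frontmatter"  # frontmatter must start on the first line
                push("keep", line)
            elif stripped.startswith(("```", "~~~")):
                state, fence = "code", stripped[:3]
                push("keep", line)
            else:
                push("translate", line)
        elif state == "frontmatter":
            push("keep", line)
            if stripped == "---":
                state = "text"
        else:  # inside a code fence
            push("keep", line)
            if stripped.startswith(fence):
                state = "text"
    return [(kind, "\n".join(chunk)) for kind, chunk in segments]


def translate_markdown(markdown: str, translate: Callable[[str], str]) -> str:
    """Apply `translate` to translatable segments only and reassemble the file."""
    return "\n".join(
        translate(text) if kind == "translate" else text
        for kind, text in split_segments(markdown)
    )


tick3 = "`" * 3  # built at runtime so this example contains no literal fence
doc = "\n".join([
    "---", "title: Setup", "---",
    "## Setup",
    tick3 + "shell", "yarn add simpleen", tick3,
    "Done.",
])

# str.upper is a stand-in for a real translation function.
print(translate_markdown(doc, str.upper))
```

With `str.upper` as the stand-in, only the header and the final paragraph change, while the frontmatter and the shell commands stay exactly as written.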
We have plans to improve the Markdown translation tool even more. Quick Roadmap:
- We want to handle internal links correctly (adapt to translated result)
- Handle footnote links
- Adapt the Simpleen CLI to support Markdown files in your local project
You can use and try it out directly by signing up for an account. If you need help or you want to provide us with some feedback, please contact us.