The Popularity of PDF Documents
You want to share a text-based document, but you don’t want the content to be altered or re-appropriated. Which document format do you use? Most likely, you go with PDF.
PDF stands for Portable Document Format and was created precisely for secure data distribution. Data compression makes PDF files lightweight and an ideal format to send by email or upload to the web. The type of content shared in PDFs varies. In most cases, a PDF contains text, images, tables, links, buttons, and other resources.
Given its practicality, PDF has become one of the most popular file formats worldwide, and naturally, translation agencies receive requests to translate PDF files on a daily basis. Yet translating a PDF is not as simple as translating a Word document, because PDF was designed fundamentally for content distribution, not for editing.
Different Types of PDF Documents
Translating a PDF document is a team effort between professional translators and graphic designers. In the translation industry, the designers in charge of document structure and layout constitute what is known as a Desktop Publishing (DTP) team. They manipulate the document to ensure the translated piece resembles the original.
The amount of DTP work required for each document depends on numerous factors. First, you need to know what kind of PDF file you are dealing with. Here we share a simple way to find out:
- If you move your mouse over the text in the PDF document and the cursor remains an arrow, then the file is not editable (typically a scanned PDF).
- If a text cursor appears as you move through the document and you can highlight text and images, then it is an editable PDF (a PDF with live text).
But what does this mean? Each PDF type requires different preparation work before translation. Let’s see how to handle each case.
Scanned PDF
Scanned PDFs are not editable because they are images. The fastest way to access the content (other than manually typing it!) is through OCR (Optical Character Recognition) software which can convert the scanned information into editable text. However, these programs are not foolproof! Conversions are often not 100% accurate and you may find strange characters, broken lines, additional spacing, and many other formatting issues in the new editable file.
Two of the best-known OCR programs are ABBYY’s FineReader and Trans PDF. To use these programs, you will need a paid license.
PDF with Live Text
In this case, you can easily extract the text, so there is no need for additional software. Simply select the text and paste it into a Word document. However, bear in mind that when copying text from PDF to Word you may also encounter formatting issues, such as changes in font type and size or unnecessary line breaks.
Preparing PDF Documents for Translation
Now you have the text you need to translate. So what’s next? Before getting started with translation, somebody will need to fix all the formatting issues we mentioned above. This is the kind of work that a DTP team should look after. Otherwise, the translated piece will retain the same problems.
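As a rough illustration of the cleanup involved, the sketch below tidies text copied from a PDF using only Python's standard library. The function name `clean_extracted_text` and the specific rules (rejoin words hyphenated across a line break, unwrap hard line breaks inside a paragraph, collapse extra spaces) are assumptions made for this example. It is not a description of how any DTP team or translation tool actually works, and it does nothing for fonts or layout.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy text copied from a PDF (hypothetical helper, illustrative only)."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    # Rejoin words split by a hyphen at a line break: "trans-\nlation" -> "translation".
    # Limitation: a genuine hyphen at a line end ("well-\nknown") is also removed,
    # so the result still needs a human check.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Assume one or more blank lines mark a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for paragraph in paragraphs:
        # split() with no arguments drops line breaks and runs of spaces alike.
        joined = " ".join(paragraph.split())
        if joined:
            cleaned.append(joined)

    return "\n\n".join(cleaned)


sample = "Translating a PDF  is a team\neffort between trans-\nlators and designers.\n\n\nSecond   paragraph."
print(clean_extracted_text(sample))
```

A script like this only handles the mechanical part of the job. Deciding whether a line break is intentional, or whether a hyphen belongs to the word, still calls for a person reviewing the file.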
Once the file is clean, it should be imported into a translation tool so language professionals can start the translation process. Some translation tools, such as memoQ or Trados Studio, feature PDF to Word filters and native integrations with OCR software. Still, it is always recommended that a DTP expert prepare the file for translation.
Post-translation Adjustments
After the translation is complete, another round of DTP work follows. You may wonder why. One of the most typical post-translation issues DTP experts have to deal with is the expansion and contraction of documents: the increase or decrease in the total number of characters when translating from one language into another, and the effect this has on lines, paragraphs, formatting, etc.
For example, a document in Spanish is typically 20–30% longer than its equivalent in English, mainly because expressing ideas in Spanish requires more words. In such cases, the DTP team will need to work out a new layout for the translated document.
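The expansion or contraction described above can be estimated by comparing character counts before and after translation. The following is a minimal sketch in Python. The function name `expansion_percent` is made up for this example, and counting raw characters (spaces included) is a simplifying assumption.

```python
def expansion_percent(source_text: str, translated_text: str) -> float:
    """Return the change in character count as a percentage of the source length.

    Positive values mean the translation expanded; negative values mean it contracted.
    """
    source_len = len(source_text)
    if source_len == 0:
        raise ValueError("source text is empty")
    # Multiply before dividing so whole-number ratios stay exact.
    return 100 * (len(translated_text) - source_len) / source_len


# Placeholder strings standing in for a 100-character source and a 125-character translation.
print(expansion_percent("a" * 100, "b" * 125))  # 25.0
```

A figure like this can be worked out per paragraph or per text box as well as for the whole document, which helps show where the layout is most likely to overflow.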
Dealing with Embedded Images
If the PDF file is loaded with images, maintaining document layout and formatting requires extra effort from DTP teams, especially if there are images with embedded text! Unless the client can provide the original images in high resolution, the only alternative is to work with the images available in the PDF provided. The problem is that these images are usually compressed to reduce the file’s size and make it easier to share.
If you only need to distribute the translated PDF online, then recreating the images will not be a major problem. However, this approach is not recommended if the PDF will go to print. Image resolution will end up being too low to produce the desired quality in the printed document.
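To see why a compressed image may be fine on screen but not in print, it helps to work out its effective resolution at the printed size. The sketch below is illustrative only. The function names are hypothetical, and the 300 dpi threshold is a common print rule of thumb assumed here, not a figure taken from this article.

```python
def effective_dpi(pixel_width: int, print_width_inches: float) -> float:
    """Dots per inch an image yields when printed at the given width."""
    if print_width_inches <= 0:
        raise ValueError("print width must be positive")
    return pixel_width / print_width_inches


def is_print_ready(pixel_width: int, print_width_inches: float, min_dpi: float = 300.0) -> bool:
    """Check the image against a minimum dpi (300 is an assumed rule of thumb)."""
    return effective_dpi(pixel_width, print_width_inches) >= min_dpi


# A 600-pixel-wide image printed 4 inches wide gives 150 dpi, below the assumed threshold.
print(effective_dpi(600, 4.0))   # 150.0
print(is_print_ready(600, 4.0))  # False
```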
Oceans’ Translation & DTP Services
Did you know that Ocean Translations has a team of experienced DTP experts? You can simply send your PDF files to our team and receive a fully translated PDF document back, with no noticeable difference between the original PDF and the newly translated one.