Two More Bilingual Data Resources for Your CAT Tool

Several months ago I wrote a blog post listing a couple of online data resources for your CAT tool. Although there is an abundance of resources online, most of them are pricey and some are not a good fit for the specialized translator. For this reason, I am always looking for online resources that I don’t have to pay for or download. Whenever I am asked to download a program, I do get quite wary. Many of these sites are not registered sites, and others, as in the case of E-Type, which appeared to be a legitimate program, are actually computer viruses (a dangerous one, if I may add). I found one website that resembles the resources I discussed in my original post and could quite possibly be the sixth bilingual data resource you work with in your translations. The other one I found is not necessarily a TM but rather a cloud translation tool, which I found quite convenient and resourceful as well.

Glosbe

I came across Glosbe as another online translation memory tool. Not only is it an online dictionary, but it also acts pretty much like MyMemory and Linguee, which I personally love and use often. It is a relatively new website (early 2012) and, like the other two, it searches phrases and gives you samples of compound words and context using the desired words. It is what a true TM is supposed to be: a collection of human-translated segments or clusters of text that is not based on Google Translate but rather on translators’ contributions. You can also contribute your own translations by uploading a TMX file.
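
For readers curious about what is inside a TMX file, here is a minimal Python sketch that pulls source/target segment pairs out of one. The sample content, the function name read_tmx_pairs, and the language codes are my own assumptions for illustration; this is not how Glosbe or any particular CAT tool processes uploads, just a way to see that a TMX file is plain XML holding aligned, human-translated segments.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A tiny, made-up TMX document used only for illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Translation memory</seg></tuv>
      <tuv xml:lang="es"><seg>Memoria de traducción</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>CAT tool</seg></tuv>
      <tuv xml:lang="es"><seg>Herramienta TAO</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx_pairs(tmx_text, source_lang, target_lang):
    """Return a list of (source, target) segment pairs from TMX text."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):  # one <tu> is one translation unit
        segments = {}
        for tuv in tu.findall("tuv"):  # one <tuv> per language
            # Older TMX files may use a plain "lang" attribute instead.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segments[lang] = "".join(seg.itertext()).strip()
        if source_lang in segments and target_lang in segments:
            pairs.append((segments[source_lang], segments[target_lang]))
    return pairs


if __name__ == "__main__":
    for source, target in read_tmx_pairs(SAMPLE_TMX, "en", "es"):
        print(source, "->", target)
```

A real TMX export will usually carry regional codes such as en-US or es-ES, so the language arguments would need to match whatever your file actually uses.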

Ebiwrite

I would describe this cloud translation tool as a mix between Evernote and Dropbox. It is a web-based translation tool in which you can translate directly online, save and tag your translations according to subject, and access them online whenever you need to. I do love the idea of tagging my files; after many translations your files do get quite cluttered, and sometimes I forget what I actually named a file. This feature helps you find them quickly (at least it does in Evernote). Since your files are in the cloud, as with Dropbox, you can access them by logging into your account, share them with other Ebiwrite users, build your own dictionaries, and personalize your account. Given that it is web-based, it can be accessed from your smartphone, so you can type in sample translations to be edited later. This is a paid feature, but they have free plans as well.

OK, there you have it. I did not intend for this post to be a follow-up to my original post, but somehow the two turned out to be quite related. I hope you find these resources as useful as I have.

STAR TRANSIT and TRADOS Compared

It has been some time since my last post, but I have been quite busy with a project that seemed eternal to me. Yet through the experience I was able to gather quite a bit of information to use in this blog post. Given the highly technical subject matter of the project, I had to research CAT tools that would help me get it done effectively and efficiently. I generally don’t use CAT tools (my specialty is marketing), but this job warranted it. There were two that I looked into for this project: STAR TRANSIT and TRADOS. I ended up using the latter, but for those not familiar with TRANSIT (as I was not), here is a brief overview and comparison of both programs.

It is no surprise that agencies now require you to know the TM software TRADOS for most projects (although this job did not come through an agency); it is the one most requested, followed by Wordfast. But what has made TRADOS the leader? Its large capacity for storing information, its large terminology base, its easy integration with Word and PowerPoint, its easy conversion of dates and measurements, and its ease of use have made this software the industry leader in CAT tools. TRADOS is particularly designed for highly technical translations: because of its large storage capacity and its ability to populate repetitive content, you can run through these translations quite efficiently and accurately. In addition, its terminology feature can surely cut a large part of your research in half. It must be said, though, that you have to build these terminology databases and translation memories through your own translation work; once you have these memories stored, however, you can populate the term, phrase, or sentence again and save time in the long run. So I still had to do quite a bit of research and keep many specialized glossaries and dictionaries handy during the translation process, but once a term is saved in memory, it populates easily in your translation.
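
To give a rough idea of how a translation memory "populates" repetitive content, here is a small Python sketch. It is only an illustration under my own assumptions: the function name best_match, the 0.75 threshold, and the sample sentences are made up, and the similarity measure comes from Python's standard difflib module, which stands in for whatever matching TRADOS actually uses internally.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Look up the stored source segment most similar to `segment`.

    `memory` maps previously translated source segments to their
    translations. Returns (stored_source, stored_target, score), or
    None when nothing reaches the threshold (an assumed cut-off).
    """
    best = None
    best_score = 0.0
    for source, target in memory.items():
        # ratio() is 1.0 for identical strings and lower for partial matches.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best, best_score = (source, target), score
    if best is not None and best_score >= threshold:
        return best[0], best[1], best_score
    return None


if __name__ == "__main__":
    # A made-up memory built from earlier translation work.
    memory = {
        "Tighten the bolt to 25 Nm.": "Apriete el perno a 25 Nm.",
        "Remove the access panel.": "Retire el panel de acceso.",
    }
    print(best_match("Tighten the bolt to 30 Nm.", memory))
    print(best_match("Welcome to our store.", memory))
```

An exact repetition scores 1.0 and can be reused as is, while a near match like the first example is offered as a suggestion for the translator to edit (here, the torque value). A sentence with nothing similar in the memory returns None and still has to be translated from scratch.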

STAR TRANSIT, on the other hand, is a very complex and difficult program to use, yet it has the added benefit of being able to import large amounts of translation memory from other systems (including TRADOS). You can also use STAR’s own existing translation memory (at a price), which can be useful if the information they have is relevant to the subject you are translating, so you have to see how extensive and valuable it would be to you. It also has an extensive dictionary (TermStar), which can be incorporated into the CAT tool or used as a stand-alone feature. Yet its setup, layout, and overall function are not as friendly as those of TRADOS. It also has a feature that I find very annoying, the pop-up window: whenever it finds a word in the memory or suggests one for you, it pops up on-screen. This feature can be turned off, but as you are translating you want to be efficient, so you end up letting the pop-up suggest and populate the text, while in TRADOS it is populated automatically. Overall, it is designed for very precise, technical translations (aviation and mechanics), and in essence these fields are the biggest users of this translation memory.

Furthermore, unlike TRADOS, where you see the source text and target text side by side, TRANSIT uses a split screen, with the source text in the upper half of the screen and the target text in the lower half. To me, this could possibly lead to missing text in the translation. Yet for editing large amounts of text, I do see the reasoning behind the split screen: it becomes easier to correct text line by line when you are translating intricate technical material that needs to be precise and accurate. Personally, I find it more intuitive to have both texts side by side. For this reason, should I ever do a technical translation again, I would lean towards TRADOS.

5 Bilingual Data Resources That Work With Your CAT Tool

The use of translation memories has become somewhat of a standard in our industry. If you work for agencies, they now expect you to have at least a working command of TRADOS. Although I have used TRADOS and see that productivity increases with it, I don’t personally own a copy of the software yet. I won’t be discussing TM programs per se; I believe too many articles and blogs have already discussed their pros and cons. Rather, I will discuss resources that work well alongside a TM program or CAT tool, or that can be used as standalone resources. They are large amounts of bilingual data (corpora) extracted from websites in all kinds of subject areas. The reason I love these resources is that they are not the result of machine translation but rather large sources of data that have been translated by translators in a specialized field, so the result is an accurate match for what you search for. The following list covers just a few that I have in my arsenal of resources and have used so far. They are in no particular order.

Linguee
www.linguee.com

This is the corpus that I use the most and like the most. I find it the most complete, and the site is continuously improving. It started with an English-German language pairing, then added Spanish and Portuguese, and now French. It gathers large bodies of text from websites across the internet and matches them with your search. It shows several contexts in which your query appears, along with a link to the website each one came from (these can also be downloaded and aligned for your TM). On the sidebar there is a dictionary that helps you understand the phrase, and the site also allows you to add information to improve the corpus itself. Of all the TMs I use, I find this one the most useful and most complete. You may even download their dictionaries as a GTL file.

My Memory
mymemory.translated.net

Established in 1999, My Memory is another resource that I use extensively. It gives you both results from the internet that were translated by human translators and, when none are available, machine-translated text (from Google Translate). Like Linguee, it tells you where the source came from, but here you can also rate and improve the entry. Furthermore, you can use their existing memory by submitting your document to their system and getting back a TMX file: it searches the memory for you so you can work with the matches in your CAT tool. You may also contribute your own memory to improve the site. They do protect the identity of the material and only use the memory they need. It is available in many language combinations, and you can search either by phrase or by word. It does not have a dictionary entry like Linguee does, but it is another good resource. It is free to join, and given the massive amount of information they can provide, that is not bad at all.

TAUS Data Association
www.tausdata.org

This is a paid subscription service that allows you to download translation memories directly into your CAT tool. Through their website you can browse their extensive catalog; some memories are public and others are for paying members only. The data is categorized by subject, so you have access to a vast variety of subjects, and through the website www.taustracker.com you will soon be able to access directories from a specific translation memory.

Finally, there is always Google, and there is a bilingual search engine built on it, too.

2lingual
2lingual.com

Powered by Google, and now Bing, this bilingual search engine provides side-by-side samples of websites in the language pair you are looking for, searched by word, phrase, or keyword. It is just another free resource.

And there you have them. These are just a few that I have found useful. There are many others, of course: just google “translation memory” or “bilingual data” and hundreds of pages will appear. However, you have to select the ones that are most useful to you. This is my sample.