Machine translation and context

February 6th, 2012

A sentence I’ve just translated is an excellent example of the advantages and disadvantages of machine translation. The original sentence said this:

Los rankings se basan en indicadores sociales y económicos.

Google Translate offered this:

The rankings are based on social and economic indicators.

This is a good example of how machine translation can speed up the translation process. The translation is almost perfect. I say almost, because the translation doesn’t quite work in my context. The reader is left asking “Which rankings?”.

The original sentence is actually talking about rankings in general, rather than any specific rankings. Unfortunately Spanish does not make this distinction in the use of articles, so the word “los” is needed whether talking about rankings in general or specific rankings referred to earlier in the text. Google Translate works sentence by sentence, so it has no way of knowing whether the word “the” should appear at the beginning of the English translation.

Another similar problem comes up when I translate biographical texts. Imagine a sentence in Spanish that says the following:

Nació en Tolosa en 1960, pero desde 1970 vive en Roma.

Is Tolosa referring to the city of Toulouse in the Languedoc region (Tolosa is the traditional Spanish spelling of the city) or the small town in the Basque Country? OK, so I’ve deliberately come up with an ambiguous place name, but the other problem in this example does occur more often: is the text talking about a male or a female? Google Translate has no way of knowing, since it only looks at the context of the sentence. It normally chooses a sex seemingly at random. In this particular example it has produced a translation that does not specify the sex of the person:

Born in Toulouse in 1960, but since 1970 living in Rome.

Although it has avoided assigning a sex to the person the text is talking about, the translation is unacceptable and would need considerable editing. By changing the sentence slightly I can force Google Translate to assign a sex:

Nació en Tolosa en 1960, pero desde 1970 vive con sus padres en Roma.

Born in Toulouse in 1960, but since 1970 living with his parents in Rome.

The pronoun “his” is used, but the person we’re talking about could just as well be female. As additional evidence that Google Translate doesn’t use context I will now ask it to translate the following two sentences together:

Julia es una ilustradora francesa. Nació en Tolosa en 1960, pero desde 1970 vive con sus padres en Roma.

Google Translate provides the following translation:

Julia is a French illustrator. Born in Toulouse in 1960, but since 1970 living with his parents in Rome.

Google Translate still uses the word “his”, yet to any human translator it is blatantly obvious, thanks to the context of the first sentence, that the correct pronoun is “her”.

Running Dragon 11 on 64-bit Windows XP

June 6th, 2011

Nuance say Dragon is not compatible with 64-bit Windows XP. The installer aborts if you try to install it on this system. But there’s a workaround! Simply follow the instructions found here, but omit the part telling you to change the properties of Audio.exe to run in XP compatibility mode, since you’re already running XP (the blog was written to explain how to run Dragon in 64-bit Vista).

Many thanks to Larry Henry for explaining this workaround.

Invertext glossaries

May 20th, 2011

For professional translations, visit www.timtranslates.com.

Today I stumbled across this little beauty of a webpage. It contains a bilingual Spanish<>English glossary with 6,000 definitions in the fields of banking, stock markets, accounting, money and currencies, corporate banking, retail banking and money laundering. I’ve not really tested it yet, but it looks promising.

The glossary is published by Ediciones Verba, who have also produced what looks like a very interesting set of technical dictionaries. I’d be very interested in hearing from anyone who has bought the dictionaries.

I’m a little hesitant about buying them because I wonder whether at some point they’ll put everything online. Also, the dictionaries are not cheap (though given the number of entries, that’s not surprising). I do hope that at some point they will produce digital versions, because that’s what most translators prefer searching in. I’d rather select the terminology I’m looking for and activate a search on my computer or online than get a bulky paper dictionary and look for it manually. In fact, I’d be willing to pay more for an electronic dictionary than for a paper version.

MEPs against flying in “business class”

April 18th, 2011

Today I sent the following letter to the MEPs from Catalonia and the Balearic Islands who broke party discipline in the vote on MEPs’ “business class” flights. The MEPs in question are Rosa Estaràs (PP) from the Balearic Islands, and Oriol Junqueras (ERC), Raül Romeva (ICV) and Ramon Tremosa (CDC).

Dear MEPs

I just wanted to congratulate you on breaking party discipline in the vote on the use of business class by MEPs. Society’s reaction to this vote shows that your votes reflect society’s point of view. Members of the European, Catalan, Balearic and Spanish parliaments should vote according to their conscience more often, instead of blindly following their party’s official line.

If we had an open-list system in elections, I would take this gesture into account when voting in the next elections. Since we do not have such a system, all I can say is that I will take into account any decision by a party to remove from its list somebody who has voted according to their conscience.

Yours sincerely

Timothy Barton

The CCOO guide and its manipulation of the IEC

March 18th, 2011

Here we have yet another of the many guides to non-sexist language. These guides are written by people who, when asked “com estan els teus pares?” (“how are your parents?”, literally “how are your fathers?”), must reply “I only have one father”. We normal people know that, depending on the context, masculine plural nouns sometimes refer only to men and sometimes refer to men and women.

The worst thing about the CCOO guide is the way it manipulates the Gramàtica de la llengua catalana that is being prepared by the Institut d’Estudis Catalans (IEC). Arguing that masculine plural nouns such as “els treballadors” (“the workers”) should not be used to refer to both men and women, the CCOO guide quotes a passage from this grammar:

Almost all nominal categories have gender and number, and only nouns, possessives and pronouns show morphological peculiarities. Nouns, because, unlike the other nominal categories, they have inherent gender: nouns are lexically masculine or feminine, as opposed to the other nominal categories, which can be masculine or feminine depending on the context in which they appear. Possessives, because they also have the grammatical category of person, and pronouns, because they have person and case.

From this passage, the CCOO guide concludes: “So it must be clear that the masculine gender is being used, and not the neuter.”

What manipulation! This chapter is talking about the genders of words. It says that every noun has a gender: masculine or feminine. But here it says nothing about the relationship between gender and sex. For that, the authors of this guide should have read section 10.1.2 (“The relationship between gender and sex in nouns that designate sexed beings”) of the same document:

In nouns in which gender establishes oppositions of sex, the masculine is the semantically unmarked form. The unmarked nature of the masculine can easily be seen in the extensive value of this gender in plural contexts. Indeed, a masculine plural noun (such as els avis or els llops) can designate a sexually heterogeneous group of males and females, but a feminine noun must necessarily refer to a group of females (such as les àvies or les llobes). The unmarked nature of the masculine can also be seen in the fact that agreement with words of different genders is made in the masculine (El meu pare i la meva àvia eren rossellonesos; tots els avantpassats meus són italians menys ells dos).

The authors of this guide (by the way, when I wrote “els autors”, the masculine plural, you all understood that it could include women, didn’t you?) cannot use the IEC’s grammar to justify the use of double forms such as “els autors i les autores”, because this grammar makes it very clear that these forms are unnecessary.

Bizarre dream

March 3rd, 2011

I had a rather bizarre dream last night: I was working as a primary-school teacher. I popped out for lunch, then was waiting on Knavesmire Road in York for a tram back to work. (In real life there are no trams in York.) A tram finally arrived but not all of us got on as it was full. I pleaded with one of the people to let me on, saying “Think of the poor children waiting for their teacher”. I got the tram to Fulford. Then suddenly my job had changed and I ended up being told “you’re reading the news today” by Núria Soler (who in real life normally reads the lunchtime news on Catalan TV). I told her I couldn’t because I was wearing jeans, but she said it didn’t matter because I was sat behind the table so the jeans wouldn’t be seen. Then I popped out for a bit and got distracted by some bizarre cross-bred animals in the street (I can’t remember what, but I think one was a cat-pig hybrid), before I heard the music that comes on during the countdown to the news and I realised I had to dash back. Then I woke up.

Should Catalan be banned from football press conferences?

February 23rd, 2011

It seems some people think that football coaches shouldn’t be allowed to speak Catalan in press conferences. Two Catalan coaches — Raül Agné (Girona FC) and Pep Guardiola (FC Barcelona) — have recently been criticised for speaking Catalan in their press conferences.

The criticism would perhaps be justified if the coaches spoke exclusively in Catalan. But both Agné and Guardiola reply to journalists in both Catalan and Spanish, depending on the language in which the question is asked. Pep Guardiola also answers questions in English and Italian.

On 12 February, Raül Agné was interrupted while replying in Catalan to a question in Catalan, and was asked to speak in Spanish. He replied in Spanish to the journalist who interrupted him, telling him that he was answering the question in Catalan, and that afterwards he would gladly speak in Spanish. When the journalists continued to insist that Agné speak in Spanish, the coach left the press conference. The incident can be seen here.

On 16 February, following the wonderful spectacle between Arsenal and Barça, Pep Guardiola courteously answered the questions posed to him in the post-match press conference, responding in the language in which each journalist spoke to him. But according to the presenter on the Madrid region’s public broadcaster, TeleMadrid: “Guardiola spoke…in all languages except Spanish. Listen carefully: he spoke English, French and Catalan…He’ll speak everything except Spanish.” They went on to say that they would return to the press conference when Guardiola had a moment of “Hispanic lucidity”. The clip in question can be seen here.

Anybody who has ever listened to a press conference by Pep Guardiola will know that the TeleMadrid presenter was lying. As I have already said, Pep Guardiola replies to all the questions in the language in which they are posed, provided that he speaks the language. This includes replying in Spanish to Spanish journalists. This clip includes an audio recording of the same press conference, and Guardiola can be heard replying to various questions in Spanish. Indeed he answered more questions in Spanish than in English, and he did not answer any in French (as far as I’m aware, he doesn’t even speak French). So the TeleMadrid presenter, like so many other Spanish journalists, was using pure lies to attack the Catalan language. When Catalan people see these kinds of clips from the Spanish media, is it any surprise that so many wish to separate from Spain?

As far as I’m aware, the approach adopted by Guardiola is the same as that adopted by all other managers. José Mourinho can often be heard answering questions in various languages in a press conference, and I’ve never heard anyone complain, especially not anyone on TeleMadrid. As Raül Agné seemed to imply, the journalists’ problem is not that Guardiola was not speaking Spanish, but rather that he was speaking Catalan. In fact, the only language I seem to hear people deliberately avoiding speaking is Catalan: not once have I heard Johan Cruijff reply to a journalist in Catalan, even though he has lived so long in Catalonia. TeleMadrid don’t seem to mind that though.

This situation does not only occur in football: a few years ago a colleague posted a link in Catalan on the Traducción en España distribution list. Links are often posted on the list in English, and nobody ever complains, but one member of the list — who on her website claimed she translated from Catalan — accused the poster of “navel gazing”, which is rather ironic considering the poster was sharing a resource with other colleagues. Fortunately most members of the list did not share the unhappy translator’s viewpoint.

May I suggest to TeleMadrid that if they want to broadcast Guardiola’s press conferences, or those of any other football coach apart from monolingual Spanish coaches (or perhaps they think that’s what the other coaches should aspire to become?), they should hire interpreters. They should easily be able to find an interpreter who can interpret from both Catalan and English if they think it’s unfair that they have to hire an interpreter just for a “pointless language” like Catalan.

If TeleMadrid do not want to hire an interpreter, they should simply record the press conference and get somebody to translate it afterwards.

Also, I would like the male presenter to explain why he thinks that Guardiola can only show his “Hispanic lucidity” by speaking Spanish and not Catalan. Catalan is an official language of Catalonia, so I presume he considers that Catalonia is not, or should not be, part of Spain? If that is what he believes, he is entitled to his opinion, and it should be respected. But if he believes Catalonia is or should be part of Spain then perhaps he (as well as the television station for which he works and anyone else who believes that Catalonia is part of Spain) should start to treat the Catalan and Spanish languages with equal respect.

Finding acronyms in MS Word

December 21st, 2010

To find acronyms written in all caps in MS Word, use the following search line (including the angle brackets) with wildcards activated:

<[A-Z]{2,}>

To find plural acronyms, change it to this:

<[A-Z]{2,}s>

And to find acronyms with an apostrophe followed by an s, change it to this:

<[A-Z]{2,}[’']s>
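If you would rather run the same three searches on plain text outside Word, the patterns carry over to Python's re module almost unchanged. This is only a rough sketch: it assumes that \b is a close enough stand-in for Word's < and > word markers, and the pattern names and the sample sentence are my own inventions for illustration. Word's own matching may differ in edge cases.

```python
import re

# Rough Python counterparts of the three Word wildcard searches above.
# Assumption: \b approximates Word's < (start of word) and > (end of word).
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")
PLURAL_ACRONYM = re.compile(r"\b[A-Z]{2,}s\b")
POSSESSIVE_ACRONYM = re.compile(r"\b[A-Z]{2,}[’']s\b")

# Made-up sample text with a plural, a plain and two possessive acronyms.
sample = "The NGOs met at the UN. The EU’s reply cited the WHO's data."

print(ACRONYM.findall(sample))             # ['UN', 'EU', 'WHO']
print(PLURAL_ACRONYM.findall(sample))      # ['NGOs']
print(POSSESSIVE_ACRONYM.findall(sample))  # ['EU’s', "WHO's"]
```

Note that in Python the first pattern also picks up the acronym part of “EU’s” and “WHO's”, because the apostrophe counts as a word boundary.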

UKIP and the smoking ban

October 26th, 2010

UKIP’s policy on smoking in pubs is another reason not to vote for them (as if there weren’t enough already). Open this and scroll down to “Pubs and the smoking ban”.

UKIP’s proposed policy is pretty much identical to the current Spanish law. They propose allowing pubs to have separate smoking areas, but make the following exception:

Where it is not possible, desirable or affordable to construct contained smoking rooms, UKIP would give pub landlords and club managers the right to declare their pub or club ‘all smoking’ (or ‘all non-smoking’). However, the pub or club must show clear warnings and notices outside the building highlighting this special status.

(You should take the phrase “all smoking” literally. Everyone would be smoking: some actively, some passively.)

There is a similar exemption under Spanish law for bars that are smaller than 100 sq m (just over 1000 sq feet). Unfortunately the vast majority of bars fall into this category, and I have often found myself in an area where it is impossible to find a smoke-free bar. The ambiguity of UKIP’s proposal means that the vast majority of pubs would probably be exempt. Also, they make no mention of how employees working in pubs would be protected from the lethal effects of passive smoking.

So if you want to go back to the bad old days of not being able to breathe fresh air in the pub, and having to wash clothes you’ve only been wearing for an hour in the pub, and thousands of bar staff being forced to work in highly toxic air, vote UKIP. But if you want clean air in the pub, think again before voting UKIP.

Don’t judge a book by its inside cover

October 12th, 2010

Don’t judge a book by its cover, goes the saying. If that includes the inside cover, never could this be more true than with the 2009 edition of the Diccionario politécnico de las lenguas española e inglesa. This very good dictionary has been spoilt by a spelling mistake on the inside cover.

Inside cover

I suspect spelling mistakes on the front and inside covers are more often than not the fault of the publisher, rather than the author. It’s not the first time I’ve come across one. In the Humanities library at the university where I teach there is an English dictionary with “Lenguage Dictionary” on the front cover (I seem to have lost the photo I used to have). And then there was the infamous case of the Catalan translation of the Asterix book Le ciel lui tombe sur la tête, which erroneously appeared as “El cel s’ens cau al damunt”. Like the French tomber, caure is not a reflexive verb. In Spanish, however, they use the reflexive verb caerse, hence the mistake. But the mistake is twofold, since even if Catalan did use the reflexive form caure’s, the correct combination of the pronouns se and ens is se’ns, not s’ens.

The English title of this technical dictionary is also spelt incorrectly inside the book.

Front cover

Despite these mistakes, this is a very good dictionary and I would highly recommend it to anybody translating technical texts between Spanish and English. I just wonder how many people have read the online fragment provided on the publisher’s website and judged it by its inside cover.