Need help creating Ultimate Automatic Translator

Forum for English and all other languages.

Moderators: kokoyaya, Beaumont

Runa27
Guest

Post by Runa27 »

Hello everyone. Recently, while I was talking to a friend (on another forum) who was German and learning English (I'm American, a native English-speaker with an interest in several languages, by the way), we were having a good laugh over how bad the German-English translators seemed to be (I would put something "complex" into a translator site, then back-translate it and see how funny it got). Then we had a fascinating idea (well, I sort of hit on the idea itself first - she's the one who encouraged me and said "But that'll take MASSIVE amounts of work!", so I think she deserves just as much credit!):

Why not create an automatic translator that took into account that certain words have specific connotations, multiple meanings, multiple spellings, spellings close to other words, could be used in different contexts, have similar meanings to other words, etc.?

For instance, "unwanted" and "undesirable" have the same basic denotation (definition), but can mean two VERY different things - "unwanted" implies more that something simply isn't wanted, whereas "undesirable" implies that something really, really, really isn't wanted.

And "may" can have several different meanings as well - as in, "May I...?" (asking permission), "May you ___." (expressing a hope that the thing described afterwards happens), and of course, the merry month of May (Mai in German, mayo in Spanish, etc.). Similarly, "can" overlaps with one of the meanings of "may", but it can also mean a metal container, or even "to fire someone from a job" (additionally, "fire" can mean to terminate someone's employment, to use a projectile weapon, to bake something such as ceramics in an oven, or of course, "flames")!

This is the single biggest cause that I see for the inaccuracies so often found in automatic translators - the multiple meanings of words, the different contexts they can be used in, even the types of words they can be ("fire" can be a noun or a verb; so can "may" or "can"). The translation pages/programs I've used in the past almost always fail to take these things into account, and therefore, they provide some VERY screwed-up translations.
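
One way to picture this is a lexicon that stores several senses per word instead of a single translation. The sketch below is only an illustration: the `Sense` class, the `LEXICON` table, the `senses_for` helper and the Spanish glosses are all made up for this example and are not taken from any existing translator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sense:
    pos: str      # part of speech: "noun", "verb" or "modal"
    gloss: str    # short English explanation of this sense
    spanish: str  # one plausible Spanish rendering (illustrative only)


# Hypothetical mini-lexicon covering a few of the words discussed above.
LEXICON = {
    "may": [
        Sense("modal", "asking or giving permission", "poder"),
        Sense("noun", "the fifth month of the year", "mayo"),
    ],
    "can": [
        Sense("modal", "to be able to", "poder"),
        Sense("noun", "metal container", "lata"),
        Sense("verb", "to dismiss from a job (slang)", "despedir"),
    ],
    "fire": [
        Sense("noun", "flames", "fuego"),
        Sense("verb", "to dismiss from a job", "despedir"),
        Sense("verb", "to shoot a weapon", "disparar"),
        Sense("verb", "to bake ceramics", "cocer"),
    ],
}


def senses_for(word: str, pos: Optional[str] = None) -> List[Sense]:
    """Return every known sense of a word, optionally filtered by part of speech."""
    senses = LEXICON.get(word.lower(), [])
    if pos is None:
        return list(senses)
    return [s for s in senses if s.pos == pos]


if __name__ == "__main__":
    # Show that a single English word fans out into several candidates.
    for sense in senses_for("fire"):
        print(f"fire ({sense.pos}): {sense.gloss} -> {sense.spanish}")
```

A translator built on a table like this would still need some way to pick among the candidates; the point here is only that the choice exists and can be shown to the user.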

Why did this bug me so much? Well, it's mostly the fact that something like this:

"My baby is a basket case, all decked out in leather and lace." (those are lyrics from a rock song, by the way)

Becomes something like this once back-translated:

"My baby is an arm amputee missing both legs, all decorated in leather and shoelaces."

(Yes, that's roughly the same as I once got in the back-translation when I tried those lyrics in an English-German translator)

And I'd like to be a part of something that, you know, didn't do that? ;)

So anyway, I was wondering if people fluent in other languages, or people good with JavaScript, would be willing to help out with this kind of translator?

I was thinking that every translation it brought up would have footnotes that explained certain parts and how they could be translated differently.

Like, in the lyrics "My baby is a basket case, all decked out in leather and lace", it would note that baby and lace have multiple meanings, etc. And of course, give you a back-translation, so you could see if anything turned out really, really squirrely like that.
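
The back-translation check could be sketched roughly as follows. This assumes the forward and backward translations have already been produced by some other tool (no translation API is assumed here), and the function names `tokenize` and `lost_in_round_trip` are hypothetical. The two strings are the lyric and the back-translation quoted earlier in this post.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z']+", text.lower())


def lost_in_round_trip(original: str, back_translation: str) -> List[str]:
    """List the words of the original that no longer appear after back-translation."""
    surviving = set(tokenize(back_translation))
    return [word for word in tokenize(original) if word not in surviving]


original = "My baby is a basket case, all decked out in leather and lace."
round_trip = ("My baby is an arm amputee missing both legs, "
              "all decorated in leather and shoelaces.")

print(lost_in_round_trip(original, round_trip))
# ['a', 'basket', 'case', 'decked', 'out', 'lace']
```

The words that go missing ("basket case", "decked out", "lace") are exactly the ones that would deserve a footnote. A plain word comparison like this is crude, since a correct paraphrase would also be flagged, so it is a hint for the user rather than a verdict.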

What do you guys think? I'm especially interested right now in an English-Spanish and Spanish-English translator, as Spanish is the only foreign language whose grammar I have much familiarity with, it's fairly simple (relatively speaking; no language is 100% easy of course, but Spanish has been the easiest for me to learn so far), and I'm planning to take it in college this fall anyway. ;)

Plus, with so many native speakers of the language, especially with how many of them are bilingual, it makes sense for the beta version to be Spanish-English/English-Spanish, because it'll be a little easier to test it.


-Runa27
SubEspion
Membre / Member
Posts: 3705
Joined: 02 May 2003 22:53
Location: If only you knew!

Post by SubEspion »

Runa27 wrote: Why not create an automatic translator that took into account that certain words have specific connotations, multiple meanings, multiple spellings, spellings close to other words, could be used in different contexts, have similar meanings to other words, etc.?
Because in a sentence, there are so many possible meanings! There are special sayings and special connotations. I think it is important to distinguish between an automatic translator and a human. Human beings are able to understand metaphors and insinuations. I don't think that there will ever be a real ultimate automatic translator ;)

Anyway, I am sure that the other members will give you better answers! Welcome to the forum!

:hello:
pc2
Membre / Member
Posts: 5299
Joined: 18 Feb 2005 13:21
Location: Rio de Janeiro, Brasil

Post by pc2 »

salutations,

we have a good knowledge of Visual Basic and Assembly.
in VB, we've done many translators from Esperanto to a constructed language, just to take advantage of VB's power.
we should say that a German translator would need a very big database, a morphological analyzer, a verb conjugator, a lexicon, a syntactic function analyzer, and many other things.
it would be a really big job, and don't expect it to be finished quickly. let's say it would take months, or even years.
the translator function would not only have to recognize the sentence's meaning, but also recognize what the English sentence wants to say, and English syntax does not help very much.
not even the expensive and professional translators, like L&H Power Translator, translate perfectly.
besides, the translators do search for different meanings of words; for example, "like" in "this is like the other one" would translate differently from "like" in "we like to eat".
you would have to formulate good and accurate translation algorithms. and what about recognizing what's a verb, a noun, etc.?
developing an ultimate automatic translator is not a one- or two-person job. it's a job for a development team, from developing the database (with a good number of words, let's say 500 thousand of them, including the translation fields, like science, etc.), the translation algorithms, and all the words' inflections, to developing the translator interface (i.e. the program that will ask for the user's input and return the translation).
but anyway, if you're gonna build a translator, good luck.
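
to make the "like" example above concrete, here is a very rough sketch of deciding whether "like" is a verb or a preposition by looking at the word before it. the rule, the `SUBJECT_PRONOUNS` set and the function names are assumptions invented for this illustration; a real translator would need a proper part-of-speech analyzer, not a one-word lookup.

```python
from typing import List

# Hypothetical rule: "like" directly after one of these pronouns is a verb.
SUBJECT_PRONOUNS = {"i", "you", "we", "they"}


def tag_like(tokens: List[str], index: int) -> str:
    """Guess whether "like" at tokens[index] is a verb or a preposition."""
    if tokens[index].lower() != "like":
        raise ValueError("expected the token 'like'")
    previous = tokens[index - 1].lower() if index > 0 else ""
    return "verb" if previous in SUBJECT_PRONOUNS else "preposition"


def tag_like_in(sentence: str) -> str:
    """Find the first "like" in a sentence and tag it."""
    tokens = sentence.split()
    position = [t.lower() for t in tokens].index("like")
    return tag_like(tokens, position)


print(tag_like_in("this is like the other one"))  # preposition
print(tag_like_in("we like to eat"))              # verb
```

once the tag is known, the translator can look up the right entry (for Spanish, something like "como" for the preposition and "gustar" for the verb). the rule breaks easily, for example on "people like to eat", which is part of why this is a job for a team.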

best regards,
Please correct our French if necessary.
Paulo Marcos -- & -- Claudio Marcos
Brasil/Brazil/Brésil
kokoyaya
Admin
Posts: 31645
Joined: 10 Oct 2002 14:12
Location: Moissac (82)

Post by kokoyaya »

Personally, I will not reopen the debate on automatic translation on this forum for the 100th time :-?
Runa27
Guest

Post by Runa27 »

kokoyaya wrote: I personally will not re-open for the 100th time on this forum the debate on automatic translation
It wasn't meant to be a debate on the use of automatic translators. I just wanted to create a program or page that would be a better translation tool. I guess the user input actually makes it more of a "mechanical" translator than an "automatic" one, but the idea is a tool that makes translating a little easier and more accurate.

I do realize that nothing's better than a native speaker to help with translation; however, if one can get MOST of it translated with fair accuracy (as opposed to the ludicrously inaccurate translations provided by most online translator pages), then the rest of the translation can be completed more quickly.

Also, what if I wanted to have a story of mine translated into, say, Romanian? I don't speak enough Romanian to do this; even if I asked a native speaker, chances are there are going to be certain lines that will give them trouble. However, if most of the grunt-work, as it were, were done beforehand (and done better than most free online translators can right now), you could ask a native speaker to help you fine-tune it more easily. It would be less time-consuming, in other words, and because it would also be more accurate (because of user input and all), it would make translation more efficient. Plus, it's easier to ask for such editing help if you can put your request in the language, too (again, this would be easier to do with the kind of page or program I'm thinking of, because the user input would allow you to express yourself a little more coherently).


-Runa27