Help translate movie credits from various languages into Eng
Moderators: kokoyaya, Beaumont
Hello!
I am working on a project to try to make it easier to find videos in libraries. We are trying to teach a computer to analyze some free-text statements from movie credits found in library records (e.g., "directed by Clint Eastwood" --> "directed by [= director] + Clint Eastwood"). Some of these statements aren't in English, and I was hoping that some of you reading this forum might be willing to help out with translating a few. There are credits in many languages, from Arabic to Chinese to Spanish to Urdu.
If you would like to help, please go to our web form at http://olac-annotator.org/ and choose a language.
If you can help with Bambara, Basque, Burmese, Modern Greek, Hindi, Khmer, Lao, Oriya, Pashto, Kazakh, Sindhi, Tagalog, Tajik, Uighur, Uzbek, or Wolof, please contact me. My contact information is at http://olac-annotator.org/#/about.
Thank you very much!
Kelley McGrath
Re: Help translate movie credits from various languages into
Hi Kelley, it would help if we had an idea of the global volume to translate, and more info on the project itself, as we are here to provide free help for short texts only, or it would be unfair competition to the professional translators.
I had a look at Thai, and the phonetic transcription being used makes it hard to understand (there is no standard transcription for Thai, so all the different transcription systems are useless).
Time is an illusion. Lunchtime doubly so.
Re: Help translate movie credits from various languages into
Hi Beaumont,
Thank you for your questions.
RE: project and volume of credits
This is part of a larger project that is trying to build better displays and provide more effective search options for people looking for film and video in library catalogs. Library catalogs have trouble providing the kind of functionality that people have come to expect from e-commerce sites like Amazon. Part of the problem is that the form of the data in library records was designed in the late 1960s. These records have lots of free-text statements like "directed by Steven Spielberg" and we want to map these to a form that a computer can manipulate as data. Our main goal with this part of the project is to compile a set of correct answers for machine learning and evaluation. By training the computer to be sufficiently accurate on a known subset of records, we hope to then be able to apply that code to a larger set of records rather than having people interpret the text. I have tried to explain this in more detail at http://olac-annotator.org/#/about.
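To make the mapping concrete, here is a minimal Python sketch of turning a credit statement into a role plus a name. It is only an illustration: the project itself relies on machine learning trained on annotated examples, whereas this sketch swaps in a small hand-written phrase table. The names `ROLE_PHRASES`, `Credit`, and `parse_credit` are hypothetical and do not come from the project.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical phrase-to-role table, for illustration only.
# A real system would need far more variants, in many languages.
ROLE_PHRASES = {
    "directed by": "director",
    "produced by": "producer",
    "written by": "writer",
}


class Credit(NamedTuple):
    phrase: str  # the credit wording that was recognized
    role: str    # the normalized role it maps to
    name: str    # the remaining text, assumed to be the person's name


def parse_credit(statement: str) -> Optional[Credit]:
    """Split a free-text credit into (phrase, role, name), or return None."""
    text = statement.strip()
    for phrase, role in ROLE_PHRASES.items():
        # Match the phrase at the start of the statement, ignoring case.
        match = re.match(rf"{re.escape(phrase)}\s+(.+)", text, flags=re.IGNORECASE)
        if match:
            return Credit(phrase, role, match.group(1).strip())
    return None  # wording not covered by the table


print(parse_credit("directed by Steven Spielberg"))
```

Statements whose wording is not in the table come back as `None`, which is where variant phrasings, errors, and non-English credits would need annotated examples instead of fixed rules.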
There are actually several thousand credits needing annotation on our site, including many in English. For example, there are several hundred in the Thai file. However, there are not nearly so many unique translations, as terms like "kamkap kansadǣng" appear numerous times. We need multiple variations on the same type of credit for the machine-learning aspect of the project.
I was not intending for anyone on this forum to do all or even very many translations. Even one or two would be helpful. Or we have some ongoing volunteers who try to do five or ten per week. Someone had suggested to me that perhaps your group would be a good place to find people who might be willing to help with the translation part of the project. However, it might not actually be a good fit. I do not have any funds to pay professional translators to help with this, which is why we decided to try crowdsourcing. Perhaps you (or someone on the forum) might have a suggestion for a better place to recruit volunteers.
Many of these translations don't require very advanced knowledge of the language in question. The problem is that we can't predict in advance which ones are going to be problems. Some of them are grammatically incorrect or have spelling errors that make them hard to interpret. Some of them lack context. Some of them use unusual constructions. Another challenge is that film, like many areas, has some specialized vocabulary, which has caused problems even for some native speakers of various languages who have been helping.
RE: transliteration
Unfortunately, we were not able to import the original texts for non-Roman scripts into our website even when they were available. The romanization systems that are supposed to be used in these records can be found at http://www.loc.gov/catdir/cpso/roman.html, but you are likely to encounter errors or old data (e.g., U.S. libraries switched from Wade-Giles to Pinyin for Chinese, but some records never got updated).
Kelley
Re: Help translate movie credits from various languages into
Hi Beaumont
Beaumont wrote: Hi Kelley, it would help if we had an idea of the global volume to translate, and more info on the project itself, as we are here to provide free help for short texts only, or it would be unfair competition to the professional translators.
I had a look at Thai, and the phonetic transcription being used makes it hard to understand (there is no standard transcription for Thai, so all the different transcription systems are useless).
Isn't most Thai written using the American tourist transcription, even though it lacks tone marks like those in Chinese pinyin?
My brother tried to help the Danish university create a transcription that carried the tone marks too. I guess it helped, but my brother also says that Thailand lacks an official transcription system, and that many of the existing ones are still quite bad or insufficient.
Cheers
solbjerg
Re: Help translate movie credits from various languages into
Yes, many people have worked on new transliteration systems or on ways to improve the existing ones. However, most Thai people themselves are not able to read transliterated Thai, whatever the system, so in the end it is always pretty useless as a communication tool. As for family names or city names, the transcriptions still vary a lot. My wife's maiden name was spelt differently on her ID card and on her passport!
Time is an illusion. Lunchtime doubly so.
Re: Help translate movie credits from various languages into
Hi Beaumont
Beaumont wrote: Yes, many people have worked on new transliteration systems or on ways to improve the existing ones. However, most Thai people themselves are not able to read transliterated Thai, whatever the system, so in the end it is always pretty useless as a communication tool. As for family names or city names, the transcriptions still vary a lot. My wife's maiden name was spelt differently on her ID card and on her passport!
My brother and his wife (we write her name as Nidnoi) have translated official forms for many years and consequently prefer to ask people how they would like their names transcribed.
Cheers
solbjerg
Re: Help translate movie credits from various languages into
solbjerg wrote: My brother and his wife (we write it as Nidnoi) has for many years translated official forms and consequently prefer to ask the person how they like their name transscribed.
They could start by transcribing พร (a very common name or part of a name) as Pon or Pohn or Phon or whatever, instead of the all too common "Porn"!
Time is an illusion. Lunchtime doubly so.
- Maïwenn
- Moderator, Arts & Literature
- Posts: 17522
- Joined: 14 Nov 2003 17:36
- Location: O Breiz ma bro
Re: Help translate movie credits from various languages into
No, Thailand wouldn't be Thailand without all the Porn's salons and Porn's shops.
Penn ar Bed
The end of the land
The beginning of a world
Re: Help translate movie credits from various languages into
Since this thread popped up again, I have a couple questions about Thai movie credits if you don't mind. Pardon the unhelpful transliteration--I have no control over it.
1. Does
Sahamongkon Fīm ʻIntœ̄nēchannǣn rūamkap Wœ̄kph̨ōi ʻEnthœ̄tēnmēn sanœ̄ Hūa Fīm Thāi Fīm sāng
mean something like Sahamongkon Fīm ʻIntœ̄nēchannǣn presents in association with Wœ̄kph̨ōi ʻEnthœ̄tēnmēn and how does Hūa Fīm Thāi Fīm sāng fit in?
2. How would you translate this one?
Sahamongkhon Fim sanœ̄ phāpphayon rak thǣ doi Chœ̄t Songsī
3. How about this one?
Samākhom Wāngph̄æn Khr̨ōpkhrūa h̄æng Prathēt Thai nai phrarāchūpatham nai Somdet Phra Sīnakharintharā B̨ōrommarātchachonnanī rūamm̄ư kap Kōngkāmmarōk læ Sūn Pongkan læ Khūapkhum Rōkʻēt Krom Khūapkhum Rōktitt̨ō, Krasūang Sāthāranasuk
I hope this isn't too many questions and thanks in advance for any insight.
Kelley
Re: Help translate movie credits from various languages into
kelley2 wrote: Sahamongkon Fīm ʻIntœ̄nēchannǣn rūamkap Wœ̄kph̨ōi ʻEnthœ̄tēnmēn sanœ̄ Hūa Fīm Thāi Fīm sāng
mean something like Sahamongkon Fīm ʻIntœ̄nēchannǣn presents in association with Wœ̄kph̨ōi ʻEnthœ̄tēnmēn and how does Hūa Fīm Thāi Fīm sāng fit in?
'Intœ̄nēchannǣn' is probably 'International' and 'Enthœ̄tēnmēn' is probably 'Entertainment', which already gives an idea of how ridiculous the transcription is. So: 'Sahamongkon Film(?) International, together with Woekphoi(?) Entertainment, presents "Hūa Fīm Thāi Fīm sāng"' (sorry, I have no clue what that last part means).
kelley2 wrote: 2. How would you translate this one?
Sahamongkhon Fim sanœ̄ phāpphayon rak thǣ doi Chœ̄t Songsī
'phapayon' is 'movie', 'rak theu' is 'I love you'... so maybe 'Sahamongkhon Film presents the movie "I love you" by Chet(?) Songsi(?)'.
kelley2 wrote: Samākhom Wāngph̄æn Khr̨ōpkhrūa h̄æng Prathēt Thai nai phrarāchūpatham nai Somdet Phra Sīnakharintharā B̨ōrommarātchachonnanī rūamm̄ư kap Kōngkāmmarōk læ Sūn Pongkan læ Khūapkhum Rōkʻēt Krom Khūapkhum Rōktitt̨ō, Krasūang Sāthāranasuk
Sorry, I have no idea. It's too much guesswork, especially as it all seems to be proper names. I just recognize 'Somdet Phra Sīnakharintharā B̨ōrommarātchachonnanī', who is the Princess Mother, mother of the current king.
Time is an illusion. Lunchtime doubly so.
Re: Help translate movie credits from various languages into
Thank you for giving it a try despite the disorienting transliteration. There were a couple of new pieces of info in your reply that may be helpful.
Kelley