Cannot translate 'sixteen' on home page

1 Anneli
Jul 12, 2007, 11:15 am

Finnish:
http://fi.librarything.com/
Jäsenet ovat luetteloineet yli sixteen miljoonaa kirjaa.

Swedish:
http://se.librarything.com/
Medlemmar har katalogiserat fler än sixteen miljoner böcker.

French:
http://www.librarything.fr/
Les membres ont catalogué plus de sixteen millions de livres.

Dutch:
http://www.librarything.nl/
Leden hebben al meer dan sixteen miljoen boeken ingevoerd.

There are other untranslatable things on other pages, but has anyone time to look at them?

2 boekerij
Jul 12, 2007, 12:33 pm

>1 Anneli:

This has been predicted and a handy/necessary solution has been proposed, too, but hasn't rolled out--yet.

Cf. e.g. : Translating LibraryThing? (General Talk) : Tranlator's blues, i.a. Message 7 (Oct 7, 2006)

Alas, most of the problems (and proposed solutions) pointed out in the latter topic's Message 8 (Nov 10, 2006) (!) haven't been solved, or even addressed--yet.

So, alas, it seems that it was indeed a waste to try and care by pointing LT at present and future problems, then as well as now.

You might think that this is rather disappointing. I'd agree.

3 timspalding
Jul 12, 2007, 6:05 pm

This piece is now translatable. It's translated elsewhere (see the top of the page if you're not logged in). It involves a translated piece within another translated piece. I'm worried that this approach will break down with some languages, as "sixteen" may have to agree in gender with the books.
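
A minimal sketch of what "a translated piece within another translated piece" could look like. Everything here is an assumption for illustration: the table, the function names, and the English key for the outer sentence are hypothetical; only the Finnish sentence comes from message 1.

```python
# Hypothetical phrase table: English phrase -> Finnish phrase.
# The English outer key is assumed; the Finnish sentence is from message 1.
TRANSLATIONS = {
    "Members have cataloged over %s million books.":
        "Jäsenet ovat luetteloineet yli %s miljoonaa kirjaa.",
    "sixteen": "kuusitoista",
}


def translate(phrase: str) -> str:
    """Return the stored translation, falling back to the English phrase."""
    return TRANSLATIONS.get(phrase, phrase)


def render_count_sentence(number_word: str) -> str:
    """Translate the inner piece, then drop it into the translated outer piece."""
    outer = translate("Members have cataloged over %s million books.")
    return outer % translate(number_word)
```

If the inner piece has no entry, the English word leaks into the sentence, which is the symptom reported in message 1. The sketch also stores exactly one form per number word, so it has no way to make the number agree with the noun.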

4 boekerij
Jul 13, 2007, 11:46 am

>3 timspalding:

I am afraid that I do not get this.

Though context is cardinal, if not vital indeed, in this specific case--i.e. : natural numbers--AFAIK there is no danger at all.

Your concern in this matter is all the more baffling because, when Thingamabrarians point at real and acute problems, those are neglected in full.

Take, for example, untranslatable (English-language) homographs such as "read != read". Cf. i.a. : Translating LibraryThing? (General Talk) : Tranlator's blues Message 7 : "6. Untranslatable translatables : homographs (to be split!)".

This example affects each and every language. It ensures that the user experience will be disappointing. LT has known about this from the very beginning--and, alas, doesn't seem to care.

Therefore, one might wonder if LibraryThing is explicitly buggy and erroneous by design.

5 royalhistorian
Jul 13, 2007, 12:50 pm

Quote #4 Therefore, one might wonder if LibraryThing is explicitly buggy and erroneous by design.

Hm, I don't see those bugs and errors. Everything has been working fine since the bugs from the new features were solved. And those bugs are in fact more important than problems with languages, especially when dealing with a server that couldn't keep up with all the requests.

Now that the new server is up and running and everything is running smoothly again, the team has the time to get to these problems.

6 timspalding
Edited: Jul 13, 2007, 1:11 pm

>Therefore, one might wonder if LibraryThing is explicitly buggy and erroneous by design.

Pardon me for noticing this, but you've made similar claims in the Dutch language group. The people you disagree with are not merely wrong, they're actively trying to hurt the site. From looking at these individuals' libraries and their contributions, I'm sure you're wrong. They may be idiots. They may be fools. They may be unworthy to tie your sandal. But they're not actively trying to hurt anyone.

The same is true here. We are not evil, but merely (if you choose to believe it) incompetent. We feel that translation is difficult, and that we have to continually make choices with scarce resources.

Homophones present a problem because the database stores translations by means of (basically)
English phrase -> other languages' phrase

If "read" as an imperative and an adjective are spelled the same, there's a problem. Probably the answer is to have some sort of code in the text, like "read {{imperative}}", which shows in the translation page but not when the text is printed. But that would require running a regular expression on EVERY SINGLE PIECE OF DATA printed to the screen, to parse out curly brackets. That's a sledgehammer to swat a fly. Put another way, buy me another web server and I'll do it.
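
A rough sketch of the curly-bracket idea, assuming a hypothetical in-memory table keyed by the tagged English phrase. The names `SENSE_MARKER`, `strip_markers`, and `display` and the Dutch entries are made up for illustration and are not LibraryThing's actual code.

```python
import re

# Matches a sense marker such as " {{imperative}}", plus any leading space.
SENSE_MARKER = re.compile(r"\s*\{\{[^{}]*\}\}")

# Hypothetical Dutch entries; keys are English phrases, optionally tagged.
TRANSLATIONS_NL = {
    "read {{imperative}}": "lees",
    "read {{adjective}}": "gelezen",
}


def strip_markers(text: str) -> str:
    """Remove every {{...}} marker so it never reaches the printed page."""
    return SENSE_MARKER.sub("", text)


def display(phrase: str, lang: str = "en") -> str:
    """Return the text to print for a (possibly tagged) English phrase."""
    if lang == "nl" and phrase in TRANSLATIONS_NL:
        return TRANSLATIONS_NL[phrase]
    # English, or no translation yet: print the phrase without its marker.
    return strip_markers(phrase)
```

The cost complained about above is visible in `display`: any string that falls through to the last line goes through the regular expression, whether or not it carries a marker.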

Another way would be to allow users to mark a piece of text as homophonic. This would split it into two, and there would be a field to record homophony, and that field would affect how things worked and blah blah blah blah. Four hours' work, at least.

LibraryThing is not perfect. Making it takes effort and resources and time, and it is made by people. That means that sometimes a small bug will persist for quite some time.

Ultimately, while I understand that confusion between the two meanings of "read" in Dutch is a problem, I also note that there is no Dutch alternative to LibraryThing. Why? Because it probably wouldn't be a viable business. LibraryThing has a Dutch version (and a Welsh one, etc.) because adding them took less effort than building such a site from scratch.

There is some data. Alternatives have popped up in France and Germany and, quite frankly, they haven't done very well. A Swedish one, introduced long before LibraryThing, never got off the ground. Even where there's been some success, LibraryThing's service is larger than the native one. Yes, the native one has no language problems, but language problems don't negate everything else good about LibraryThing. For starters, the presence of so many non-Dutch users makes LibraryThing more fun for Dutch people too.

7 timspalding
Jul 13, 2007, 1:03 pm

I don't have much in the way of specifics, but SOME languages treat SOME numbers as declinable: Lat. Unus una unum. Gk. Eis mia hen.

8 xtien
Jul 14, 2007, 12:41 am

>but merely (if you choose to believe it) incompetent.

I do not think so. Creating a website, or a software system, is an iterative process. Once you have a full blown system, you regret many of the architectural choices you have made earlier on. Yet, earlier on you couldn't possibly have known better.
At this time, features are added one by one while, I assume, modules of the system are rewritten from scratch one by one. At some point, there will be a new architecture altogether, and the current drawbacks will vanish, only to be replaced by new user wishes and potential shortcomings at a higher level.

I think Tim's team is doing a great job, given the limited number of people they have and the speed LT is growing, and the number of features and fixes they add every week.

9 timspalding
Jul 14, 2007, 12:44 am

Thanks. Speaking of root-up changes, the whole way we handle Amazon images is changing.

10 LA2
Jul 14, 2007, 3:40 am

>Homophones present a problem because the database stores translations by means of (basically) English phrase -> other languages' phrase

"So, don't do that then."

11 timspalding
Jul 14, 2007, 4:43 am

The advantage of that approach is that if someone translates "show me more" on page 1, it can also be used on 2, 3, 4, 5, 6, and 7. Treating every piece of text as its own little island would multiply the total amount of text enormously, and piss people off. "I just translated that" would be a familiar refrain.

12 kantelier
Jul 14, 2007, 5:48 am

I've seen one other cataloging site that allowed users to translate. But it offered all phrases in one big spreadsheet and would use it only if about 70% or so was translated. That is a big, discouraging threshold. I like Tim's approach much better, though sometimes I wish I could have such a spreadsheet too, to look up how certain concepts were translated in other situations. Or not just change the most prominent occurrence of some concept, but change it consistently everywhere.

13 Anneli
Jul 14, 2007, 7:50 am

Tim wrote:
"Alternatives have popped up in France and Germany and, quite frankly, they haven't done very well. A Swedish one, introduced long before LibraryThing, never got off the ground. Even where there's been some success, LibraryThing's service is larger than the native one. Yes, the native one has no language problems, but language problems don't negate everything else good about LibraryThing. For starters, the presence of so many non-Dutch users makes LibraryThing more fun for Dutch people too."

I agree that LibraryThing is the biggest and the most beautiful and I am very satisfied with it, but I despair with the translations (and sometimes with the characters ä, ö and å). I don't doubt Tim's or Abby's or other admins' good will and I hope that they don't doubt mine, either.

I think that most people use the English interface regardless of their mother tongue - so do I. So the whole translation business is marginal or maybe not even worthwhile. For some unfathomable reason I got entangled with the Finnish translation (I don't need it!) and now I would very much like the translations to be good. But it is not possible.

LibraryThing is for people who read. If the translations sound like they were compiled by illiterates or translation machines, it might harm the cause, wouldn't it?

In message 11 Tim wrote:
"The advantage of that approach is that if someone translates "show me more" on page 1, it can also be used on 2, 3, 4, 5, 6, and 7. Treating every piece of text as its own little island would multiply the total amount of text enormously, and piss people off. "I just translated that" would be a familiar refrain."

I understand that it is convenient in some cases, but not always. I wish you would see that this system causes unsolvable problems for many languages. It doesn't always work.

14 MMcM
Jul 14, 2007, 1:54 pm

I do not understand the finest points of how the translation system works. But the basic idea of using English phrases as keys instead of message ids is evident. This achieves the two important goals of rapid editing of the site and approximately zero overhead rendering English. I have to believe that it's possible to fix this without having to "do it right."

Isolates could be marked at the page level. I assume that the translation engine receives the target page name. For that page, list those phrases that do not share translations with any other page. Either in the source, or in a separate database that maybe users extend. It means more work for the translators, but it's simple. Problem is, though, it still messes up if the same English occurs twice on the same page requiring two different senses. Unless some sort of source locator is also available.
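
Roughly, the page-level idea might look like the sketch below. The page names, both tables, and the Dutch strings are assumptions made up for illustration; nothing here is known about the real engine.

```python
from typing import Dict, Tuple

# Shared table: one translation reused on every page (hypothetical entries).
SHARED: Dict[str, str] = {
    "show me more": "toon meer",
    "read": "gelezen",
}

# Isolates: (page name, English phrase) -> translation used on that page only.
ISOLATES: Dict[Tuple[str, str], str] = {
    ("add_books", "read"): "lees",
}


def translate(phrase: str, page: str) -> str:
    """Prefer a page-specific isolate, then the shared entry, then English."""
    if (page, phrase) in ISOLATES:
        return ISOLATES[(page, phrase)]
    return SHARED.get(phrase, phrase)
```

The lookup key is only (page, phrase), so two different senses of the same English text on one page still collide, which is the weakness noted above.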

The sense distinguishing marker could go in the markup instead of the text. I assume there is some way in which pieces of the page source can be marked as separately translated, rather than one long fragment. Something in the PHP source. Can't something else (like an attribute) go there and get passed to the translation engine? If that engine works with a stream instead of a bunch of separate calls, then insert the {{'s at that time when not English.
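
As a sketch of the attribute idea, under the assumption that the call site can pass a sense alongside the phrase: the `sense` parameter, the `TABLES` layout, and the Dutch entries are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical per-language tables keyed by English phrase plus optional marker.
TABLES: Dict[str, Dict[str, str]] = {
    "nl": {
        "read {{imperative}}": "lees",
        "read {{adjective}}": "gelezen",
        "show me more": "toon meer",
    },
}


def translate(phrase: str, sense: Optional[str] = None, lang: str = "en") -> str:
    """Look up a phrase; the sense only affects the key, never the output."""
    if lang == "en":
        # English output needs no lookup and no marker handling at all.
        return phrase
    # Build the "{{...}}" key only when a sense was supplied by the markup.
    key = (phrase + " {{" + sense + "}}") if sense else phrase
    return TABLES.get(lang, {}).get(key, phrase)
```

Because the marker lives in an argument rather than in the text, nothing has to be parsed out of the English string, and an untranslated phrase falls back to clean English.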

Perhaps even these are on the order of half a day.

15 LA2
Jul 14, 2007, 8:49 pm

re 11: Tim, surely you can come up with something smarter than that. For starters, look at what the MediaWiki software does, as Wikipedia doesn't have this kind of problem and has far more languages than LT. Instead of parsing regexps, the database could have a separate comment column for each English phrase. Perhaps we could "combine and separate" phrases just like we do with book titles and authors?

16 LA2
Jul 17, 2007, 1:53 pm

Some general experience from internationalization is summarized in this recent posting to wikitech-l:
http://lists.wikimedia.org/pipermail/wikitech-l/2007-July/032345.html

17 timspalding
Jul 19, 2007, 9:17 pm