Update patch for Google Translate (corrects "by design" bugs)

deavidsedice - May 22, 2007 - 17:48
Project: Google Translate
Version: 5.x-1.3
Component: Code
Category: feature request
Priority: normal
Assigned: Omar
Status: needs work
Description

Hi,

I've seen this module, and I like it. But it needs a better method to send pages to Google and read them back. I worked on this today, and now it's working fine.

So far, I have corrected these bugs:

- Problems when logging in to the website. Now it is stable.
- Pages were not translated when sending a POST request. Now they are translated, if possible.
- A bug with redirection: "header('location:http://...')"

The only file modified is "gtrans.module". I attached the entire tar.gz.

Please review my code and tell me what you think ;)

Attachment: gtrans-deavidpatch.tar_.gz (9.92 KB)

#1

Omar - May 23, 2007 - 07:06
Version: HEAD » 5.x-1.3
Category: task » feature request
Priority: critical » normal
Assigned to: Anonymous » Omar

Good job Deavid.
I'll test the functionality and RTBC.
Please try to do "diff" in the future instead of packing all the files in a .rar then a .tar.gz ;)
Thanks once again.

#2

deavidsedice - May 23, 2007 - 07:37

Ok, I'll do it ;)

It was easier to pack everything in a tar.gz, but you're right; my KDE did something strange with the tar. Sorry!
___________________________

I'm very interested in extending this module. I'm working on it.

The major feature I'm trying to create is to split the HTML page into lines of plain text, send them to Google for translation, and finally store the translated strings in a new table. On later requests, these strings will be translated via the DB, without Google (much faster).
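
A minimal sketch of the caching idea described above, not the actual patch. The function name, the cache key format, and the use of a plain PHP array in place of the proposed database table are all assumptions made for illustration; `$translator` stands for whatever code actually contacts Google.

```php
<?php
// Hypothetical helper: look a string up in a translation cache and only
// call the (slow) remote translator when the string has not been seen yet.
// A plain array stands in for the proposed database table.
function example_translate_cached($text, $lang, array &$cache, $translator) {
  // Key on target language plus a hash of the source string.
  $key = $lang . ':' . md5($text);
  if (!isset($cache[$key])) {
    // Cache miss: ask the translator once and remember the answer.
    $cache[$key] = call_user_func($translator, $text, $lang);
  }
  return $cache[$key];
}
```

With a real table, the array lookup and assignment would become a SELECT and an INSERT keyed on the same two values.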

If I can do that, I'll try to auto-translate the Drupal strings (es.po, ca.po), because there are a lot of modules that don't have a Spanish translation, and I'd like to have them translated.

I'm thinking about translating only nodes, or integrating the module with i18n/localizer.

I don't know if I will be able to code this, but I'll try.

#3

Omar - May 23, 2007 - 08:56
Status: needs review » needs work

Deavid,
Your patch leads to a WSOD.
I tried with replacing gtrans.module on a working site and with a fresh installation: same result.
Please test further before submitting the patch.

#4

deavidsedice - May 23, 2007 - 09:28

I'm sorry. On my computer it works very well (a standard install of a LAMP system). I'm thinking about it. It could be gzip/zlib compatibility.
Please check the watchdog table to see what the problem is. (If you can't because my patch causes a WSOD, comment out all the code, or temporarily rename the module, to prevent the broken code from loading.)

I've modified the file to avoid using gzopen/gzwrite, but I haven't tested it yet (I need to be at home to test it).

Now I've made a diff against the original .module. I hope it works.

Attachment: gtrans-deavid.diff (11.21 KB)

#5

deavidsedice - May 23, 2007 - 13:17

Finally, I've found the problem: you're using a server under Windows, and my patch tries to write to /tmp/ (a directory that exists on UNIX/Linux).

I replaced that with a call to:
http://api.drupal.org/api/5/function/file_directory_temp

And now it should work.
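
A minimal sketch of the portable temp-path approach, not the code from the attached diff. `file_directory_temp()` is the Drupal function linked above; the helper name and the fallback to PHP's built-in `sys_get_temp_dir()` (so that the snippet also runs outside Drupal) are assumptions for illustration.

```php
<?php
// Hypothetical helper: build a path for a temporary file without
// hard-coding /tmp/.
function example_temp_file_path($filename) {
  // Inside Drupal 5, ask Drupal for its configured temp directory;
  // outside Drupal, fall back to PHP's own notion of the temp directory.
  $dir = function_exists('file_directory_temp')
    ? file_directory_temp()
    : sys_get_temp_dir();
  // basename() drops any directory part, so the file stays inside $dir.
  return rtrim($dir, '/\\') . '/' . basename($filename);
}
```

For example, `example_temp_file_path('xx.gtrans_auxdata.gz')` gives a path inside whatever temp directory the host provides.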

I'm testing it under Linux, and I repeat: it works.
I can't test it on Windows, because I don't have any servers with that OS.

Attachment: gtrans.diff (11.07 KB)

#6

Omar - May 24, 2007 - 06:39

Deavid,
I've tested your module in two environments: My laptop and my testing server: http://owahab.com
My laptop is: Debian Etch, Apache 2.2, MySQL 5.0, PHP 5.2.
Testing server is: CentOS 4.4, Apache 1.3.37, MySQL 4.1, PHP 4.4.6.
Still convinced I'm using Windows? :)

I think I will keep working on this issue; I was already planning to change the way the module interacts with the Google Translation service.
I will give it more eyeballs very soon.

Thanks for your persistence.

#7

deavidsedice - May 24, 2007 - 10:49

You don't use Windows?
oops...

OK, now I have no idea what the problem is. I hope that my last patch works well on your servers.

My servers always run: Debian Sarge/Etch/Sid GNU/Linux, MySQL 4.x/5.x, PHP 4.x, Apache 1.x/2.0, Drupal 4.7/5.1.

But I'm testing this patch on standard hosting provided by ServeisWeb. It runs GNU/Linux and PHP4.
I'll test the patch on my other servers.

Yesterday I worked on other interesting features. For example, my site can now use Google Translate to translate modules into Spanish. The resulting translations are stored in the Drupal locale module tables and will be reused in the future. And with the locale module, I can export the resulting translation to a .po file, to load it into another website.

I think it is more important to learn how to create fully working patches first.
I'll send the other interesting features later. ;)

Thank you for your patience.

#8

armand0 - May 25, 2007 - 09:42

No, it doesn't work for me either.
I tried the original module and it didn't translate, and when I logged off, the site stayed blank.
Then I read what you have posted here and tried the patch, and that didn't work either. It showed me a white screen when I saved the languages configuration.

#9

deavidsedice - May 25, 2007 - 13:39

I've tested my patch again on my machine, and it works. What's wrong?

Here is what I do to test my patch:

1) I have an AMD64 with Debian Sid fully updated. I'm using Konqueror and Firefox/Iceweasel as browsers.
2) Install the following packages:
apache, php4, php4-gd, php4-mysql, mysql-client-5.0, mysql-server-5.0
(These packages should be sufficient to run the patched gtrans. I have others, such as php4-cli, php4-common, php4-mcrypt, php4-imagick, etc.)
3) Install Drupal 5.1 in /var/www/ and configure it as usual: sites, files, etc.
4) Download the original gtrans module (the 5.x-1.3 version) from the project page:
http://drupal.org/project/gtrans

5) Untar it into the /modules directory.
6) Enable it in the admin pages and test it. It should work.
NOTE: Enable clean URLs for your site. I've noticed a "404 page not found" error (with the patched gtrans.module) when they are not enabled. Sorry.
7) Open a console with permission to write in that directory.
# go to the folder
cd /var/www/modules
# download my last patch
wget http://drupal.org/files/issues/gtrans.diff
#patch the original file
patch gtrans.module gtrans.diff
8) Try to translate your page now. It should work too.
9) Log in to your admin account, if you wish.
10) Try to create some nodes, navigate, change parameters. It should work.

I've seen another bug. Sometimes the page loads only partially. I don't know why. When that happens, the user must refresh the page.

PS: Please check that you have the same file. The md5sum of gtrans.module after patching is "429b024bf93469bed653eb656fc96d4d".

I'll test it later on a third server to see if it leads to a WSOD. (I want to see it too!)

I'm testing the patch at http://deavid.no-ip.info. This is my test server, and it is usually down.
I've created an account on the site for anyone who wants to test the patch while logged in.
User: drupal
Pass: test

This user can create articles/stories.

#10

armand0 - May 31, 2007 - 22:50

OK, I did everything you described in the previous message.
Even so, it doesn't work for me. If I try with Firefox, as soon as I try to use the module it shows me a white page. And if I try with IE7, it shows me a page with only the following text:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

I looked in the "tmp" directory, and the files that are supposed to be sent to Google don't appear.
With the previous version (although it didn't work) the files did appear, for example: "xx.gtrans_auxdata.gz"

Although the site is on a Linux server, I access it (to configure or update it) from a Windows PC. But I don't think that interferes.

The site is configured with Spanish as the default language. And I configured the gtrans module for English and Spanish, with Spanish as the site's original language.
I also tried changing those settings, with no effect.
And the watchdog reports don't mention anything about the problem.

The details of my Drupal installation and server are as follows:
Drupal: 5.1
Configuration file: Protected
Database: MySQL 4.1.21
GD library: bundled (2.0.28 compatible)
Unicode library: PHP Mbstring Extension
Drupal core update status: Up to date
Database schema: Up to date
Module update status: Up to date
PHP: 4.4.4
Web server: Apache/1.3.37 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7a PHP-CGI/0.1b
File system: Writable (public download method)

#11

ddyrr - June 3, 2007 - 07:41

I have the exact same problem as armand0, except my site is originally in English. I tried going to the gtrans site and translating my page from there to make sure there were no problems with my site, preventing it from being translated, and it translated fine. So there is something in the module that does not work with my site. I tried all these different updates, but I still get the same result. I would LOVE to have this working, it's such a great tool!

#12

ddyrr - June 3, 2007 - 21:37
Title: Update patch for Google Translate (corrects "by design" bugs) » Figured it out!

In the patch file, I found that these statements do not work on my server:

<?php
$content_array = explode("\r\n", $content);
$content = $content_array[1];
?>

When I take those out, I get a little bit of junk at the top and bottom, but it works. To get rid of the junk, I used this:

<?php
$content = substr($content, strpos($content, "<!DOCTYPE"), strpos($content, "</html>") - strpos($content, "<!DOCTYPE") + 7);
?>
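
A slightly more defensive variant of the same trimming idea, offered as a sketch rather than a tested fix for the module. The function name is hypothetical. It differs from the one-liner above in two ways: it looks for the last `</html>` instead of the first, and it leaves the content alone when a marker is missing instead of passing `FALSE` offsets to `substr()`.

```php
<?php
// Hypothetical helper: keep only the part of $content that runs from the
// doctype to the closing html tag, dropping any junk before or after.
function example_extract_html($content) {
  $start = strpos($content, '<!DOCTYPE');
  if ($start === FALSE) {
    // No doctype found: return the input unchanged rather than guess.
    return $content;
  }
  // Use the last closing tag so text after an early match is not cut off.
  $end = strrpos($content, '</html>');
  if ($end === FALSE || $end < $start) {
    // No closing tag after the doctype: keep everything from the doctype on.
    return substr($content, $start);
  }
  return substr($content, $start, $end - $start + strlen('</html>'));
}
```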

#13

armand0 - June 4, 2007 - 05:35

Yes, your modification worked perfectly for me with English and Spanish. Out of curiosity I added one more language, German, to the gtrans config, and then it showed me white pages again. So I will remove it and keep just the two languages.
Thank you

#14

xeladrane - November 30, 2007 - 21:38

Very interesting discussion. I too find that the two lines you mention above, i.e. exploding into the array, are causing it to fail. I assume that $content contains more than the expected two "\r\n" line breaks (after Google's translation there should be one just before and one just after the actual page we want), possibly in the page text itself. Displaying $content_array[1] then ends up with a piece that is less than 30 characters long (mostly an empty string, it seems), hence the watchdog message that is found.

Not sure of the logic in using explode() in the original code though...

Like the sound of your efforts, deavidsedice, I'll be keeping an eye out for it!

I can't make it work properly still though, in the logs it keeps coming up with "content moved", or is just unbearably slow to render. I seem to get only part of a page returned, I can click and view source and it's only the first couple of lines, and then try again a bit later and it's got a bit more. How that works I have no idea!

I suspect that this is due to translate.google.com now redirecting to an IP address. I have added a watchdog call as below:

// now prepare content
$orig_content=$content;
$content_array = explode("\r\n", $content);
$content = $content_array[1];
if (strlen($content) < 30) {
  watchdog('user', t('Attempt failed to translate page').' '.$_SERVER['REQUEST_URI'].' '.t('to').' '.$languages[$hl].': Content I got for http://'.$url.'/'.$uri.' was: '.$content.' and the unprepared string is: '.$orig_content);
}
// Dev patch as per http://drupal.org/node/146017
//$content = substr($content,strpos($content,"<!DOCTYPE"),strpos($content,"</html>")-strpos($content,"<!DOCTYPE")+7);

This tells me the following:

302 Moved

The document has moved here.

Where "here" links to the ip address that the google translate redirected to, which varies. Has this thrown a spanner in the works?
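
A minimal sketch of how such a redirect could be detected before the body is processed. It assumes the module has the raw HTTP response (status line, headers, blank line, body) in one string, which the thread does not confirm; the function name is hypothetical.

```php
<?php
// Hypothetical helper: return the Location target of a 3xx response,
// or NULL when the response is not a redirect.
function example_redirect_target($raw_response) {
  // Headers end at the first blank line.
  $parts = explode("\r\n\r\n", $raw_response, 2);
  $header_lines = explode("\r\n", $parts[0]);
  $status_line = array_shift($header_lines);
  if (!preg_match('#^HTTP/\d\.\d\s+(\d{3})#', $status_line, $m)) {
    return NULL; // Not something that looks like an HTTP response.
  }
  $code = (int) $m[1];
  if ($code < 300 || $code >= 400) {
    return NULL; // Not a redirect.
  }
  foreach ($header_lines as $line) {
    // Header names are case-insensitive.
    if (stripos($line, 'Location:') === 0) {
      return trim(substr($line, strlen('Location:')));
    }
  }
  return NULL;
}
```

A caller could log the target or repeat the request against it, instead of treating the short "302 Moved" page as the translated content.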

_________________Update___________________

I replaced the URL with $url = '64.233.179.104';

It appears that there is a conflict with JSTools - that's what has been inserting unexpected "\r\n" line breaks in my pages. So I re-wrote the array part and added a watchdog as follows:

$content_array = explode("\r\n", $content);
$content="<!--";
foreach ($content_array as $i) {
  $content .= $i;
  $content .= "###";
}
$content .= "-->";
watchdog('user', t('Attempted to translate page').' '.$_SERVER['REQUEST_URI'].' '.t('to').' '.$languages[$hl].': Content I got was: '.$content);

The <!-- seemed to escape the problems at the top of the page, and solved the very slow load issue. However, I now get every piece of content on the page first in English and then in Italian (which I'm translating into), which is really weird, and of course I'm still getting the dodgy characters at the bottom.

My original idea was to use the following approach to cut out the first array term and the 3 dodgy ones at the end, but include all of the genuine page item terms, but it didn't appear to work:

$content_array = explode("\r\n", $content);
$content="";
for ($i = 1; $i < count($content_array) - 3; $i++) {
  $content .= $content_array[$i];
}

Any ideas? I have to stop for today so will keep an eye out. I think the issue is something to do with the IE fix tweaks for JScript Tools and combined with how this module copes with the stripping of google link modifications around this.
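
One possible explanation, which is only a guess and is not confirmed anywhere in this thread: the junk at the top and bottom, and the extra "\r\n" pieces, would match an HTTP/1.1 response sent with chunked transfer encoding, where each piece of the body is preceded by its length in hexadecimal and the body ends with a zero-length chunk. If that assumption holds, decoding by chunk size is safer than splitting on "\r\n", because line breaks inside the page no longer matter. The function name below is hypothetical.

```php
<?php
// Hypothetical helper: decode a body sent with "Transfer-Encoding: chunked".
// $body must start at the first chunk-size line (headers already removed).
function example_decode_chunked($body) {
  $decoded = '';
  $pos = 0;
  $len = strlen($body);
  while ($pos < $len) {
    $eol = strpos($body, "\r\n", $pos);
    if ($eol === FALSE) {
      break; // Malformed input: no chunk-size line.
    }
    // The size is hexadecimal; anything after ';' is a chunk extension.
    $size_parts = explode(';', substr($body, $pos, $eol - $pos));
    $size_hex = trim($size_parts[0]);
    if (!preg_match('/^[0-9a-fA-F]+$/', $size_hex)) {
      break; // Not a chunk-size line: stop rather than guess.
    }
    $size = hexdec($size_hex);
    if ($size == 0) {
      break; // The zero-length chunk marks the end of the body.
    }
    $decoded .= substr($body, $eol + 2, $size);
    // Skip the chunk data and the CRLF that follows it.
    $pos = $eol + 2 + $size + 2;
  }
  return $decoded;
}
```

If the response is not chunked, this function returns an empty string, so it should only be applied when the response headers say `Transfer-Encoding: chunked`.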

#15

spatz4000 - November 30, 2007 - 20:26
Title: Figured it out! » Update patch for Google Translate (corrects "by design" bugs)

#16

deavidsedice - December 10, 2007 - 07:52

Another idea:

It is possible to create a database table with all the strings that we want to translate, and associate a "translate ID" with each of them.

Then, when we want to load our page, we can send Google a specially created page instead.

<div id="121">This will be the string 121 to translate</div>
<div id="157">And this will be the string 157</div>

Google should translate it and return the divs with their IDs.
Then, with an "ereg", we can retrieve the translated strings and put them into the database.

In this way, we can build a table with all translations cached, with no need to depend on Google Translate for each click anymore.

This will avoid the WSOD.
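
A minimal sketch of the round trip described above: building the page of numbered divs and reading the strings back out. It is an illustration only, with hypothetical function names. It uses `preg_match_all()` rather than `ereg()`, since the ereg functions were removed in PHP 7, and it assumes that the translation service returns the `<div id="...">` wrappers unchanged, which nobody in this thread has verified.

```php
<?php
// Hypothetical helper: wrap each string in a div whose id is its translate ID.
function example_build_translation_page(array $strings) {
  $html = '';
  foreach ($strings as $id => $text) {
    $html .= '<div id="' . (int) $id . '">'
      . htmlspecialchars($text, ENT_QUOTES) . "</div>\n";
  }
  return $html;
}

// Hypothetical helper: read the strings back, keyed by translate ID.
function example_parse_translation_page($html) {
  $result = array();
  // Non-greedy match so each div is captured separately; /s lets the
  // text span several lines.
  if (preg_match_all('#<div id="(\d+)">(.*?)</div>#s', $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
      $result[(int) $match[1]] = htmlspecialchars_decode($match[2], ENT_QUOTES);
    }
  }
  return $result;
}
```

The array returned by the parser maps each translate ID to its string, which is the shape needed to fill the proposed table.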

#17

Omar - December 10, 2007 - 09:13

deavidsedice,
Could you provide a patch that illustrates your ideas?

#18

deavidsedice - December 12, 2007 - 10:30

I'm busy this month, but if I can code something like that, I'll send the patch as a new issue (feature).

In the past, I coded something along the lines of sending strings to Google, but that used (I don't remember exactly) Drupal's "t" function. I patched Drupal core to do this, and I don't like that approach. (And it corrupts the Drupal translation.)
