Changed google URL + response in UTF-8

khorpyakov - May 15, 2009 - 07:29
Project: Translation Framework
Version: 6.x-1.x-dev
Component: Code
Category: feature request
Priority: minor
Assigned: Unassigned
Status: needs work
Description

I recently noticed that Google's response for Russian translations comes back in the old KOI8-R character encoding. This patch modifies the request headers so that the response is always in UTF-8.
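The patch itself changes the request headers and is only attached here, not shown. As a complementary fallback on the receiving side (a different technique from the patch), the response body could be converted to UTF-8 based on the charset named in its Content-Type header. This is a minimal sketch, assuming PHP's iconv extension is available; example_response_to_utf8() is a hypothetical name, not part of the module.

```php
<?php
/**
 * Hypothetical helper (not part of the module): convert a response body to
 * UTF-8 using the charset declared in its Content-Type header.
 */
function example_response_to_utf8($body, $content_type) {
  // Pull the charset out of a header value such as "text/html; charset=KOI8-R".
  if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._-]+)/i', $content_type, $matches)) {
    $charset = strtoupper($matches[1]);
    if ($charset !== 'UTF-8') {
      $converted = iconv($charset, 'UTF-8', $body);
      // iconv() returns FALSE on failure; keep the original body in that case.
      return $converted === FALSE ? $body : $converted;
    }
  }
  // No charset declared, or already UTF-8: return the body unchanged.
  return $body;
}
```

With drupal_http_request(), this would presumably be called with $result->data and the response's Content-Type header value, assuming the response object exposes that header.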

Also, when the input is a single word, the new URL provided by Google returns multiple translation variants, encoded as comma-separated arrays in square brackets.

Attachment                              Size
google_translation_module_head.patch    2.67 KB

#1

darren.ferguson - May 15, 2009 - 13:00

Can you re-create this patch against the current DRUPAL-6--1 tag in CVS? The version it was made against is older than what is currently in that branch.

#2

khorpyakov - May 19, 2009 - 10:26
Attachment                              Size
google_translation_module_head.patch    3.94 KB

#3

EvanDonovan - July 22, 2009 - 18:47
Status: active » needs review

I have tested the patch in #2 and can confirm that it works now when I try to translate content using Google. The old URL wasn't working at all.

#4

EvanDonovan - July 22, 2009 - 21:01
Priority: minor » normal
Status: needs review » needs work

Note that when I used the old URL, I wasn't getting back a response at all.

While this patch works (i.e., the content is translated), the $result->data returned by drupal_http_request() is in JSON format. This means it looks odd in the browser, because all the JSON escape sequences are still in it.

It looks like there are two alternatives:

1) Make the HTTP request in JavaScript and then send the result (i.e., the finished translation) to Drupal, so it can be parsed properly in one go using PHP's json_decode() function. I haven't tried this.

2) Use drupal_http_request but then parse the JSON, or at least do str_replace to clean up the response. This is what I've done, but my technique is rather poor. I would appreciate any suggestions for improvement.

In _google_translation_split_string:

<?php
if ($result->code == 200) {
  // Undo the escape sequences in the response body one by one.
  $translation_decoded = str_replace('\x3c', '<', $result->data); // \x3c is the escape for <
  $translation_decoded = str_replace('\x3e', '>', $translation_decoded); // \x3e is the escape for >
  $translation_decoded = str_replace('\"', '"', $translation_decoded); // escaped double quotes - this line is causing some weird output
  $translated_string .= $translation_decoded;
}
?>

In google_translation_postprocess:

<?php
function google_translation_postprocess($translate) {
  // Strip leading and trailing quotes (added by the JSON response).
  $translate->translation = trim($translate->translation, '"');
  return $translate;
}
?>
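One possible improvement over the str_replace() calls above would be to let json_decode() undo all the escaping, and strip the surrounding quotes, in one step. The catch is that \x3c and \x3e are JavaScript-style escapes that JSON does not allow, so they have to be rewritten as \u003c and \u003e first. This is only a sketch: example_decode_js_string() is a hypothetical name, and it assumes the response body is a single double-quoted string, which may not hold for every response.

```php
<?php
/**
 * Hypothetical helper (not part of the module): decode a double-quoted,
 * JavaScript-style string literal such as "\x3cb\x3eHi\x3c/b\x3e".
 */
function example_decode_js_string($raw) {
  // Walk over every backslash escape. \xNN is not valid JSON, so rewrite it
  // as \u00NN; every other escape (\", \\, \n, ...) is passed through as-is.
  $json = preg_replace_callback(
    '/\\\\(?:x([0-9a-fA-F]{2})|.)/s',
    function ($m) {
      return isset($m[1]) ? '\\u00' . $m[1] : $m[0];
    },
    trim($raw)
  );
  // json_decode() removes the surrounding quotes and resolves the escapes.
  $decoded = json_decode($json);
  // Anything that did not decode to a plain string is treated as a failure.
  return is_string($decoded) ? $decoded : NULL;
}
```

With something like this, the trim() in google_translation_postprocess would no longer be needed, since the quotes are consumed by the decoder rather than stripped afterwards.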

#5

EvanDonovan - July 22, 2009 - 21:49
Priority: normal » minor

Never mind. The original Google Translate URL is working for me now. drupal_http_request() was acting flaky for me before, I guess.

I do have a serious issue with the output it's giving me, which I'll post as a separate issue.

 
 
