Investigate including all content type fields in the Calais processing flow. Could simply be a matter of combining all text-based fields into one big string and sending it off. Putting this here so I don't forget to address it.
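
As a rough illustration of the "combine all text-based fields into one big string" idea, here is a minimal standalone sketch. It is not the module's code: the function name example_combine_text_fields and the plain array standing in for a node's fields are assumptions made for the example.

```php
<?php
// Hypothetical helper: join every non-empty text field into one string.
// $fields is a plain array standing in for a node's text-based fields.
function example_combine_text_fields(array $fields) {
  $parts = array();
  foreach ($fields as $value) {
    if (!is_string($value)) {
      continue; // Skip non-text values such as numbers or nested arrays.
    }
    $text = trim(strip_tags($value));
    if ($text !== '') {
      $parts[] = $text;
    }
  }
  return implode("\n", $parts);
}

$fields = array(
  'body' => '<p>Boston is the capital of Massachusetts.</p>',
  'field_summary' => 'A short summary.',
  'field_rating' => 5,
  'field_empty' => '   ',
);
echo example_combine_text_fields($fields), "\n";
```

Only the two string fields with real content end up in the result; the number and the whitespace-only field are skipped.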

Comments

bacchus101’s picture

So am I correct in reading this that a custom CCK body field will not be processed by Calais?

I was using Calais, but needed to change to a custom CCK field for the body so that I could manipulate the output via Contemplate. Now I am no longer getting term extraction.

febbraro’s picture

Assigned: Unassigned » febbraro

No, this currently works to the best of my knowledge. This is just a placeholder to make sure I test a whole bunch of CCK field configurations, etc. But if things broke for you, something may be happening.

Can you try printing out $node->body in calais_process_node in the calais.module file? That will tell us whether it is getting the body from the CCK field.

bacchus101’s picture

Thanks for your quick reply. I ended up getting it sorted out by using the default body field and making some adjustments with contemplate. Once back to the standard body, Calais immediately started providing terms.

Before that I had a custom multi-row text field that was substituting for the body within the custom content type and Calais was not pulling the terms from that specific field (nor for instance does it pull from the image field description.)

I'd try to help debug (if there is indeed an issue beyond my installation) but I am not sure how to implement the debug code that you need from above. I'm less of a coder and more of a high functioning copy and paster. ;)

febbraro’s picture

That sounds like a resounding "not working well for CCK fields". I'll address this asap and try to get a new release out in the coming days.

febbraro’s picture

Status: Active » Needs review
Attachment (new): 1010 bytes

bacchus101: Here is a quick patch that puts all CCK fields into the content that is submitted to Calais. I'm thinking we may want to allow configuration of *which* fields get submitted, but for now they will all get in there.

bacchus101’s picture

It doesn't seem to be completing the patch.

XXXX@XXX:/var/www/drupal/sites/all/modules/opencalais# patch < calais-cck-fields.patch --verbose --binary
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: calais.module
|===================================================================
|RCS file: /cvs/drupal-contrib/contributions/modules/opencalais/calais.module,v
|retrieving revision 1.3.2.16.2.12
|diff -u -p -r1.3.2.16.2.12 calais.module
|--- calais.module      31 Oct 2008 21:30:15 -0000      1.3.2.16.2.12
|+++ calais.module      17 Dec 2008 23:04:49 -0000
--------------------------
Patching file calais.module using Plan A...
Hunk #1 FAILED at 120.
Hunk #2 succeeded at 143 (offset -6 lines).
1 out of 2 hunks FAILED -- saving rejects to file calais.module.rej
done

..and this ends up in the rej file.

*************** function calais_process_node(&$node, $pr
*** 111,119 ****
      $node->vid = db_result(db_query("SELECT vid FROM {node} WHERE nid=%d", $node->nid));
    }
  
- 
    $date = format_date($node->created, 'custom', 'r');
-   $body = filter_xss($node->body, array());
    $node_settings = calais_get_node_settings($node);
    $calais = new Calais($node_settings);
    $keywords = $calais->analyzeXML($node->title, $body, $date);
--- 120,128 ----
      $node->vid = db_result(db_query("SELECT vid FROM {node} WHERE nid=%d", $node->nid));
    }
  
+   $loaded_node = node_build_content($node);
+   $body = strip_tags(drupal_render($loaded_node->content));
    $date = format_date($node->created, 'custom', 'r');
    $node_settings = calais_get_node_settings($node);
    $calais = new Calais($node_settings);
    $keywords = $calais->analyzeXML($node->title, $body, $date);

Am I approaching this wrong? Thanks.

febbraro’s picture

Attachment (new): 826 bytes

Oops. Sorry, I patched my dev version and not the version from the latest release. Give this a whirl.

bacchus101’s picture

Alright, patch applied successfully.

I am still testing things out, but thus far I see "no joy" with term extraction from CCK fields.

I created a new content type and kept the body field and added a text field. I then created two identical articles with the content type, one pasted only into the body and the other pasted only into the text field. The one with the body field suggested and chose terms for Calais upon publication, the other did not.

I don't want to tie you up with this as I have a vast array of 3rd party modules that could be causing unforeseen issues, unless of course it is an area you feel like actively pursuing.

febbraro’s picture

No worries, I appreciate you testing the patches out and this will ultimately make this module much better for everyone else, so it is no hassle at all.

In calais_process_node where the patch was applied, line 106 or so, can you print out what $body is before it gets sent for processing? It might also help to print out what $node->content is as well. $node->content is supposed to hold an array of all of the CCK fields for that node and should contain all the data you are looking to get processed.

Let me know what turns up.

bacchus101’s picture

Attachments (new): 2.06 KB, 3.67 KB, 4.73 KB

I attached a file showing both the working and non-working versions of the article (with data in the body field and data instead in the CCK field.)

febbraro’s picture

So the return of

strip_tags(drupal_render($loaded_node->content));

Is empty?

bacchus101’s picture

This is where my lack of knowledge will handicap this investigation. I am not sure how to find out what that statement is returning.

febbraro’s picture

That statement is in the function calais_process_node in the file calais.module around line 110 or so. It was added by that patch in #7.

You can do something like this....

$body = strip_tags(drupal_render($loaded_node->content));
print_r($body);
// or, if you have the Devel module installed: dvm($body);

Let me know if that helps.

bacchus101’s picture

When I add print under that statement, it outputs the text from the CCK field in question upon updating the node.

febbraro’s picture

Ok, that is good, that means that it is grabbing the text and attempting to send it to Calais for processing.

Can you now print out the return from Calais itself?

opencalais/includes/Calais.inc around line 81 is

    $ret = drupal_http_request($uri, $headers, 'POST', $data_enc);
    print_r($data);
    print_r($ret);

Thanks. I really need to make this a debug option for folks, this is not the first time :)
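
One way such a debug option could look is sketched below. This is only an illustration, not code from the module: the helper name example_calais_debug and its $enabled flag are made up for the example. In a Drupal 6 module the flag could plausibly come from variable_get() and the message could go to watchdog() instead of error_log().

```php
<?php
// Hypothetical debug helper; the name and the $enabled flag are illustrative.
function example_calais_debug($label, $value, $enabled) {
  if (!$enabled) {
    return ''; // Debugging is off: do nothing.
  }
  // print_r() with TRUE returns the dump as a string instead of echoing it.
  $message = $label . ': ' . print_r($value, TRUE);
  // Write to the PHP error log so the page output is not disturbed.
  error_log($message);
  return $message;
}

// Usage: dump the text that is about to be submitted.
example_calais_debug('body', 'Boston is the capital of Massachusetts.', TRUE);
```

Sending the dump to a log rather than printing it keeps the request and response data out of the rendered page, which matters once the option is left on for non-developers.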

bacchus101’s picture

Here it is:

Array ( [licenseID] => xxxxxxxxxxxxxxx [content] => Thu, 18 Dec 2008 14:49:23 -0500 Boston is the capital and largest city of the Commonwealth of Massachusetts, and is one of the oldest cities in the United States. The largest city in New England, Boston is considered the economic and cultural center of the region, and is sometimes regarded as the unofficial "Capital of New England." Boston city proper had a 2007 estimated population of 608,352, making it the twenty-first largest in the country. Boston is also the anchor of a substantially larger metropolitan area called Greater Boston, home to 4.4 million people and the tenth-largest metropolitan area in the country. Greater Boston as a commuting region includes parts of Rhode Island, New Hampshire and Maine; it includes 7.4 million people, making it the fifth-largest Combined Statistical Area in the country. www.boston.com  [paramsXML] =>       Drupal   ) stdClass Object ( [request] => POST /enlighten/rest/ HTTP/1.0 Host: api.opencalais.com User-Agent: Drupal (+http://drupal.org/) Content-Length: 1931 Content-Type: application/x-www-form-urlencoded 
licenseID=pkp95spkgs3n8rz82yzj6wxd&content=%3CDOCUMENT%3E%3CTITLE%3EtestTwo%3C%2FTITLE%3E%3CDATE%3EThu%2C+18+Dec+2008+14%3A49%3A23+-0500%3C%2FDATE%3E%3CBODY%3E%0A++++%0A++++++++++++%0A++++++++++++++++++++Boston+is+the+capital+and+largest+city+of+the+Commonwealth+of+Massachusetts%2C+and+is+one+of+the+oldest+cities+in+the+United+States.+The+largest+city+in+New+England%2C+Boston+is+considered+the+economic+and+cultural+center+of+the+region%2C+and+is+sometimes+regarded+as+the+unofficial+%22Capital+of+New+England.%22+Boston+city+proper+had+a+2007+estimated+population+of+608%2C352%2C+making+it+the+twenty-first+largest+in+the+country.+Boston+is+also+the+anchor+of+a+substantially+larger+metropolitan+area+called+Greater+Boston%2C+home+to+4.4+million+people+and+the+tenth-largest+metropolitan+area+in+the+country.+Greater+Boston+as+a+commuting+region+includes+parts+of+Rhode+Island%2C+New+Hampshire+and+Maine%3B+it+includes+7.4+million+people%2C+making+it+the+fifth-largest+Combined+Statistical+Area+in+the+country.%0Awww.boston.com%0A++++++++%0A++++++++%0A%0A%3C%2FBODY%3E%3C%2FDOCUMENT%3E&paramsXML=++++%3Cc%3Aparams+xmlns%3Ac%3D%22http%3A%2F%2Fs.opencalais.com%2F1%2Fpred%2F%22+xmlns%3Ardf%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%22%3E%0A++++%3Cc%3AprocessingDirectives+c%3AcontentType%3D%22TEXT%2FXML%22+%0A++++++++++++++++++++++++++++c%3AoutputFormat%3D%22XML%2FRDF%22%0A++++++++++++++++++++++++++++c%3AcalculateRelevanceScore%3D%22true%22%3E%0A++++%3C%2Fc%3AprocessingDirectives%3E%0A++++%3Cc%3AuserDirectives+c%3AallowDistribution%3D%22true%22+%0A++++++++++++++++++++++c%3AallowSearch%3D%22true%22+++++++++++%0A++++++++++++++++++++++c%3AexternalID%3D%221229637901%22+%0A++++++++++++++++++++++c%3Asubmitter%3D%22Drupal%22%3E%0A++++%3C%2Fc%3AuserDirectives%3E%0A++++%3Cc%3AexternalMetadata%3E%0A++++++%3Cc%3Acaller%3EDrupal%3C%2Fc%3Acaller%3E%0A++++%3C%2Fc%3AexternalMetadata%3E%0A++++%3C%2Fc%3Aparams%3E [data] => testTwo2008-12-18 14:49:23  Drupal  
Drupal8b49730c-8253-1350-732e-fbf955c836a1digestalg-1|y9FjrBZKymDx/CqNtRbvT1o5nWY=|Mur4R7ohnQqJGx9F8SC+l45jqjoDIPOzLj783fjhVMInf0c3Lrol0g==CalaisBusiness_Finance1.000New England[ cities in the United States. The largest city in ]New England[, Boston is considered the economic and cultural] cities in the United States. The largest city in New England, Boston is considered the economic and cultural27211[ sometimes regarded as the unofficial "Capital of ]New England[." Boston city proper had a 2007 estimated] sometimes regarded as the unofficial "Capital of New England." Boston city proper had a 2007 estimated410110.321Rhode Island[Boston as a commuting region includes parts of ]Rhode Island[, New Hampshire and Maine; it includes 7.4]Boston as a commuting region includes parts of Rhode Island, New Hampshire and Maine; it includes 7.4769120.093Massachusetts,United States42.3-71.8Boston[ ]Boston[ is the capital and largest city of the] Boston is the capital and largest city of the1216[United States. The largest city in New England, ]Boston[ is considered the economic and cultural center]United States. The largest city in New England, Boston is considered the economic and cultural center2856[as the unofficial "Capital of New England." ]Boston[ city proper had a 2007 estimated population of]as the unofficial "Capital of New England." Boston city proper had a 2007 estimated population of4246[it the twenty-first largest in the country. ]Boston[ is also the anchor of a substantially larger]it the twenty-first largest in the country. Boston is also the anchor of a substantially larger5386[metropolitan area in the country. Greater ]Boston[ as a commuting region includes parts of Rhode]metropolitan area in the country. 
Greater Boston as a commuting region includes parts of Rhode72260.816Commonwealth Day[ Boston is the capital and largest city of the ]Commonwealth[ of Massachusetts, and is one of the oldest] Boston is the capital and largest city of the Commonwealth of Massachusetts, and is one of the oldest167120.330New Hampshire,United States43.654083-71.56421Maine,United States44.693165-69.33462United States40.423-98.73722Maine[parts of Rhode Island, New Hampshire and ]Maine[; it includes 7.4 million people, making it the]parts of Rhode Island, New Hampshire and Maine; it includes 7.4 million people, making it the80150.093United States[and is one of the oldest cities in the ]United States[. The largest city in New England, Boston is]and is one of the oldest cities in the United States. The largest city in New England, Boston is237130.330Massachusetts[capital and largest city of the Commonwealth of ]Massachusetts[, and is one of the oldest cities in the United]capital and largest city of the Commonwealth of Massachusetts, and is one of the oldest cities in the United183130.330Rhode Island,United States41.7-71.5www.boston.com[Combined Statistical Area in the country. ]www.boston.com[ </BODY></DOCUMENT>]Combined Statistical Area in the country. www.boston.com  </BODY></DOCUMENT>910140.093New Hampshire[ commuting region includes parts of Rhode Island, ]New Hampshire[ and Maine; it includes 7.4 million people,] commuting region includes parts of Rhode Island, New Hampshire and Maine; it includes 7.4 million people,783130.093  [headers] => Array ( [Connection] => close [X-Lighty-Magnet-Uri-Path] => /enlighten/rest/ [X-Mashery-Responder] => proxyworker-i-fc7bde95.mashery.com [Content-Type] => text/xml; charset=utf-8 [X-Aspnet-Version] => 2.0.50727 [Server] => Microsoft-IIS/6.0 [Cache-Control] => private [Date] => Thu, 18 Dec 2008 22:05:02 GMT [X-Powered-By] => ASP.NET [Accept-Ranges] => bytes [Content-Length] => 22885 ) [code] => 200 ) 
bacchus101’s picture

I am not sure what happened between point A and point B, but it is now pulling the terms for some CCK content types.

Here is one that is not:

Array ( [licenseID] => xxxxxxxxxxxxxxx [content] => Wed, 17 Dec 2008 14:33:48 -0500 Description:  Anthem by ShiraGirl, featuring Devyn Simone from The Real World Brooklyn. Filmed in Brooklyn at Public Assembly on September 19th, 2008. Imagine Chicks on Speed for the new generation " complete with their own unique, electro-punk style. Enter: Shiragirl " the fiercest girl band to hit the music scene in awhile. But classifying them as just a "girl band" is like calling Elvis just a pop star or Michael Jackson just a poor white guy with a bad nose " there is so much more to the story. Starting three years ago, Shiragirl has covered a lot of ground in the underground music scene; crashing every Warped Tour being just one of their many accomplishments. Shiragirl consists of Shira (vocals), DJ Lava, Lux, Flash, and Marisa and currently auditioning for a keyboardist to add to their already unique sound. Shiragirl's self-proclaimed philosophy of "make your own" hits the punk rock nail right on the head. They pride themselves in making their own sound, stage, and even tour bus (they painted it bright pink). They are rowdy; they are feministy; and they are just so goddamn trendy. Just like their musical predecessors (the aforementioned Chicks on Speed), Shiragirl has broken down so many walls between genres that they’ve become a huge melodious fusion of hip-hop, punk, electronica, metal, whatever; you name it they've got it. It's as if they've stolen thousands of car parts from famous musicians and have stuck them all together to create the ultimate work of art (or possibly a hot pink tour bus). But redefining the term D.I.Y. or providing a stage at Warped tour for so many female indie artists is just a taste of what Shira strives to accomplish. Whether it be her music, her stage presence or even her Warped Tour stage Shira refuses to allow anyone or any label to control her destiny. 
She is determined to expand and explore with her music as she proves with her latest line up and stage performance. "This is Shiragirl." Shira says shortly after opening for none other than the original riot grrl and queen of rock, Joan Jett at the Starland Ballroom in Sayreville, NJ. When asked what it felt like to be opening for a legend such as Joan Jett, Shira said, " It was an incredible honor to open up for Joan Jett! An original riot grrrl! We got contacted by Blackheart Records and asked to play the show. I'll never forget looking over to the side of the stage and seeing her watching... and my mind went black hahaha.... but it was an awesome experience and I feel so lucky to have gotten that chance." Delivering what this writer believes is her most daring and energentic stage show yet, Shira gives us a taste of just what she'll be flaunting at this year's Warped Tour. Backed by a solid and experienced line up Shira now explores the stage freedom she has held back so often in the past. The result, 30 minutes of turbo charged Punk-rap-electronica at its best, flowing along so rapidly and smoothly when you finally take a moment and step back you realize you have entered a new sound and world you have never experienced before.  
[paramsXML] =>       Drupal   ) stdClass Object ( [request] => POST /enlighten/rest/ HTTP/1.0 Host: api.opencalais.com User-Agent: Drupal (+http://drupal.org/) Content-Length: 4588 Content-Type: application/x-www-form-urlencoded licenseID=pkp95spkgs3n8rz82yzj6wxd&content=%3CDOCUMENT%3E%3CTITLE%3EAnthem+-+ShiraGirl+Featuring+Devyn+Simone%3C%2FTITLE%3E%3CDATE%3EWed%2C+17+Dec+2008+14%3A33%3A48+-0500%3C%2FDATE%3E%3CBODY%3E%0A++++%0A++++++++++++%0A++++++++++++++++++++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++%0A++++++++++++++++%0A++++++++%0A%0A%0A++++++Description%3A%26nbsp%3B%0A++++%0A++++++++++++%0A++++++++++++++++++++Anthem+by+ShiraGirl%2C+featuring+Devyn+Simone+from+The+Real+World+Brooklyn.+Filmed+in+Brooklyn+at+Public+Assembly+on+September+19th%2C+2008.+%0AImagine+Chicks+on+Speed+for+the+new+generation+%22+complete+with+their+own+unique%2C+electro-punk+style.+Enter%3A+Shiragirl+%22+the+fiercest+girl+band+to+hit+the+music+scene+in+awhile.+But+classifying+them+as+just+a+%22girl+band%22+is+like+calling+Elvis+just+a+pop+star+or+Michael+Jackson+just+a+poor+white+guy+with+a+bad+nose+%22+there+is+so+much+more+to+the+story.%0A++++Starting+three+years+ago%2C+Shiragirl+has+covered+a+lot+of+ground+in+the+underground+music+scene%3B+crashing+every+Warped+Tour+being+just+one+of+their+many+accomplishments.+Shiragirl+consists+of+Shira+%28vocals%29%2C+DJ+Lava%2C+Lux%2C+Flash%2C+and+Marisa+and+currently+auditioning+for+a+keyboardist+to+add+to+their+already+unique+sound.%0A++++Shiragirl%27s+self-proclaimed+philosophy+of+%22make+your+own%22+hits+the+punk+rock+nail+right+on+the+head.+They+pride+themselves+in+making+their+own+sound%2C+stage%2C+and+even+tour+bus+%28they+painted+it+bright+pink%29.+They+are+rowdy%3B+they+are+feministy%3B+and+they+are+just+so+goddamn+trendy.+Just+like+their+musical+predecessors+%28the+aforementioned+Chicks+on+Speed%29%2C+Shiragirl+has+broken+down+so+many+walls+between+genres+t
hat+they%E2%80%99ve+become+a+huge+melodious+fusion+of+hip-hop%2C+punk%2C+electronica%2C+metal%2C+whatever%3B+you+name+it+they%27ve+got+it.+It%27s+as+if+they%27ve+stolen+thousands+of+car+parts+from+famous+musicians+and+have+stuck+them+all+together+to+create+the+ultimate+work+of+art+%28or+possibly+a+hot+pink+tour+bus%29.%0A++++But+redefining+the+term+D.I.Y.+or+providing+a+stage+at+Warped+tour+for+so+many+female+indie+artists+is+just+a+taste+of+what+Shira+strives+to+accomplish.+Whether+it+be+her+music%2C+her+stage+presence+or+even+her+Warped+Tour+stage+Shira+refuses+to+allow+anyone+or+any+label+to+control+her+destiny.+She+is+determined+to+expand+and+explore+with+her+music+as+she+proves+with+her+latest+line+up+and+stage+performance.+%22This+is+Shiragirl.%22+Shira+says+shortly+after+opening+for+none+other+than+the+original+riot+grrl+and+queen+of+rock%2C+Joan+Jett+at+the+Starland+Ballroom+in+Sayreville%2C+NJ.%0A++++When+asked+what+it+felt+like+to+be+opening+for+a+legend+such+as+Joan+Jett%2C+Shira+said%2C+%22+It+was+an+incredible+honor+to+open+up+for+Joan+Jett%21+An+original+riot+grrrl%21+We+got+contacted+by+Blackheart+Records+and+asked+to+play+the+show.+I%27ll+never+forget+looking+over+to+the+side+of+the+stage+and+seeing+her+watching...+and+my+mind+went+black+hahaha....+but+it+was+an+awesome+experience+and+I+feel+so+lucky+to+have+gotten+that+chance.%22%0A++++Delivering+what+this+writer+believes+is+her+most+daring+and+energentic+stage+show+yet%2C+Shira+gives+us+a+taste+of+just+what+she%27ll+be+flaunting+at+this+year%27s+Warped+Tour.+Backed+by+a+solid+and+experienced+line+up+Shira+now+explores+the+stage+freedom+she+has+held+back+so+often+in+the+past.+The+result%2C+30+minutes+of+turbo+charged+Punk-rap-electronica+at+its+best%2C+flowing+along+so+rapidly+and+smoothly+when+you+finally+take+a+moment+and+step+back+you+realize+you+have+entered+a+new+sound+and+world+you+have+never+experienced+before.%0A++++++++%0A++++++++%0A%0A%3C%2FBODY%3E%3C%2FDOCUMENT%3E&paramsXML=++++%3Cc%3Apar
ams+xmlns%3Ac%3D%22http%3A%2F%2Fs.opencalais.com%2F1%2Fpred%2F%22+xmlns%3Ardf%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%22%3E%0A++++%3Cc%3AprocessingDirectives+c%3AcontentType%3D%22TEXT%2FXML%22+%0A++++++++++++++++++++++++++++c%3AoutputFormat%3D%22XML%2FRDF%22%0A++++++++++++++++++++++++++++c%3AcalculateRelevanceScore%3D%22true%22%3E%0A++++%3C%2Fc%3AprocessingDirectives%3E%0A++++%3Cc%3AuserDirectives+c%3AallowDistribution%3D%22true%22+%0A++++++++++++++++++++++c%3AallowSearch%3D%22true%22+++++++++++%0A++++++++++++++++++++++c%3AexternalID%3D%221229638262%22+%0A++++++++++++++++++++++c%3Asubmitter%3D%22Drupal%22%3E%0A++++%3C%2Fc%3AuserDirectives%3E%0A++++%3Cc%3AexternalMetadata%3E%0A++++++%3Cc%3Acaller%3EDrupal%3C%2Fc%3Acaller%3E%0A++++%3C%2Fc%3AexternalMetadata%3E%0A++++%3C%2Fc%3Aparams%3E [data] => Document conversion error. Please make sure that the content-type (passed through the paramsXML) matches this document contents. [ErrorMessage: org.dom4j.DocumentException: Error on line 18 of document : The entity "nbsp" was referenced, but not declared. Nested exception: The entity "nbsp" was referenced, but not declared. ]  [headers] => Array ( [Connection] => close [X-Lighty-Magnet-Uri-Path] => /enlighten/rest/ [X-Mashery-Responder] => proxyworker-i-1c36ec75.mashery.com [Content-Type] => text/xml; charset=utf-8 [X-Aspnet-Version] => 2.0.50727 [Server] => Microsoft-IIS/6.0 [Cache-Control] => private [Date] => Thu, 18 Dec 2008 22:11:03 GMT [X-Powered-By] => ASP.NET [Accept-Ranges] => bytes [Content-Length] => 512 ) [code] => 200 ) 

...and another:

Array ( [licenseID] => xxxxxxxxxxxxxxx [content] => Thu, 18 Dec 2008 12:17:35 -0500  A beached boat near the exterior of The Real World: Brooklyn house.  [paramsXML] =>       Drupal   ) stdClass Object ( [request] => POST /enlighten/rest/ HTTP/1.0 Host: api.opencalais.com User-Agent: Drupal (+http://drupal.org/) Content-Length: 1198 Content-Type: application/x-www-form-urlencoded licenseID=pkp95spkgs3n8rz82yzj6wxd&content=%3CDOCUMENT%3E%3CTITLE%3EReal+World+Brooklyn+House+Photos%3C%2FTITLE%3E%3CDATE%3EThu%2C+18+Dec+2008+12%3A17%3A35+-0500%3C%2FDATE%3E%3CBODY%3E%0A++++%0A++++++++++++%0A++++++++++++++++++++++++++++%0A++++++++%0A%0AA+beached+boat+near+the+exterior+of+The+Real+World%3A+Brooklyn+house.%0A%3C%2FBODY%3E%3C%2FDOCUMENT%3E&paramsXML=++++%3Cc%3Aparams+xmlns%3Ac%3D%22http%3A%2F%2Fs.opencalais.com%2F1%2Fpred%2F%22+xmlns%3Ardf%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%22%3E%0A++++%3Cc%3AprocessingDirectives+c%3AcontentType%3D%22TEXT%2FXML%22+%0A++++++++++++++++++++++++++++c%3AoutputFormat%3D%22XML%2FRDF%22%0A++++++++++++++++++++++++++++c%3AcalculateRelevanceScore%3D%22true%22%3E%0A++++%3C%2Fc%3AprocessingDirectives%3E%0A++++%3Cc%3AuserDirectives+c%3AallowDistribution%3D%22true%22+%0A++++++++++++++++++++++c%3AallowSearch%3D%22true%22+++++++++++%0A++++++++++++++++++++++c%3AexternalID%3D%221229638401%22+%0A++++++++++++++++++++++c%3Asubmitter%3D%22Drupal%22%3E%0A++++%3C%2Fc%3AuserDirectives%3E%0A++++%3Cc%3AexternalMetadata%3E%0A++++++%3Cc%3Acaller%3EDrupal%3C%2Fc%3Acaller%3E%0A++++%3C%2Fc%3AexternalMetadata%3E%0A++++%3C%2Fc%3Aparams%3E [data] => Real World Brooklyn House Photos2008-12-18 12:17:35  Drupal  Drupal8b49730c-8253-1350-732e-fbf955c836a1digestalg-1|C2yS/qSaQRiKOF21sl4vhoL+SDI=|Ej48jMnOQOkvPDOcVazXv0fXuZJDtkX0rdDZ/wLjRBuOBBK6TgrkfQ==  [headers] => Array ( [Connection] => close [X-Lighty-Magnet-Uri-Path] => /enlighten/rest/ [X-Mashery-Responder] => proxyworker-i-ef7bde86.mashery.com [Content-Type] => text/xml; charset=utf-8 
[X-Aspnet-Version] => 2.0.50727 [Server] => Microsoft-IIS/6.0 [Cache-Control] => private [Date] => Thu, 18 Dec 2008 22:13:21 GMT [X-Powered-By] => ASP.NET [Accept-Ranges] => bytes [Content-Length] => 2266 ) [code] => 200 ) 
bacchus101’s picture

Yes, it is working now with new content. I just posted a test page in a CCK field and it rendered the terms.

I am not sure what I did between when I first installed the patch and where I am now, but I have been doing a variety of things with my drupal installation over the last few hours (adding and removing other modules) so I can't be sure what changed the environ.

febbraro’s picture

Ok, the 1st one is working, that is great.

2nd one is not working; you also need to apply the patch from comment #17 on #344279: no tags created.

3rd one. I have a feeling that the text is too short to find any entities within it.
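
For the 2nd case, the Calais error says the entity "nbsp" was referenced but not declared: strip_tags() removes markup but leaves HTML entities such as &nbsp; in the text, and those are not valid inside the XML document that gets submitted. I have not checked what the patch on the other issue actually does; the following is just a general sketch of one way to handle it, with a made-up function name.

```php
<?php
// Hypothetical helper: turn rendered HTML into text that is safe inside XML.
function example_xml_safe_text($html) {
  // Remove markup first, as the patch in this thread does.
  $text = strip_tags($html);
  // Turn named entities such as &nbsp; into real UTF-8 characters.
  $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
  // Re-escape only the characters that are special in XML.
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo example_xml_safe_text('<div>Description:&nbsp;Tom &amp; Jerry</div>'), "\n";
```

After this, the non-breaking space is an ordinary UTF-8 character and the ampersand is written as &amp;, which an XML parser accepts without any extra entity declarations.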

bacchus101’s picture

That is what I thought with the 3rd one (although I was secretly hoping it could cull data from short descriptions of images and such.)

I applied the other patch and all is well. It created the terms for the 2nd one.

Thanks!

febbraro’s picture

Status: Needs review » Fixed

Nice. Glad it is all worked out. Enjoy.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

Johnny vd Laar’s picture

the problem with this line:
$loaded_node = node_build_content($node);

is that it changes the $node->body variable

we have built a module that sends the $node->body to another webservice as well. but the node_build_content function wrapped the $node->body with html tags

so we replaced this line with:
$loaded_node = node_build_content(clone $node);

such that the function works on a clone of the $node object, and thus doesn't change the $node object
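
A small standalone sketch of why the clone helps (PHP 5 and later, where objects are passed by handle). The function example_build_content is only a stand-in that wraps the body the way described above; it is not the real node_build_content.

```php
<?php
// Stand-in for a function that rewrites the body of the node it receives.
function example_build_content($node) {
  $node->body = '<div class="content">' . $node->body . '</div>';
  return $node;
}

$node = new stdClass();
$node->body = 'Plain body text';

// Passing a clone: the original object is left untouched.
$built = example_build_content(clone $node);
$after_clone = $node->body;
echo $after_clone, "\n"; // Plain body text

// Passing the object itself: the caller's $node changes too.
example_build_content($node);
echo $node->body, "\n"; // <div class="content">Plain body text</div>
```

Note that clone makes a shallow copy, so objects nested inside the node would still be shared between the original and the copy.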

Johnny vd Laar’s picture

Status: Closed (fixed) » Needs review

forgot to reopen the issue

febbraro’s picture

Status: Needs review » Closed (fixed)

@Johnny vd Laar I created a new issue just for this. #412796: Calais changes the node body when agregating CCK fields

Anonymous’s picture

Status: Closed (fixed) » Needs review

Does it make sense to let Calais process all CCK field labels? In my case, I have a list of hundreds of CCK fields per node type. The labels are not interesting (e.g. WebDAV Support). Now all my nodes get a Calais term for WebDAV Support, which is nonsense.

So I'd need to be able to specify which CCK fields to process and when, e.g. only if WebDAV Support == Yes. Then it would make sense again for me.

Anonymous’s picture

I decided to increase the relevancy threshold for terms. But it's not a real solution -- CCK field labels shouldn't be indexed, in my opinion.

febbraro’s picture

Status: Needs review » Closed (won't fix)

We can't expect this module to know what you intend or don't intend to get processed with OpenCalais. Labels might be a part of the content for all I know. If they are not useful, then why are they rendered on the node body itself? That said, I do provide a hook, hook_calais_body_alter, that will allow you to change the body however you want, like possibly rendering it without the labels, etc.
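
A sketch of what an implementation of that hook might look like. Big caveat: the hook's real signature is not shown in this thread, so the parameters below (the body passed by reference, plus the node) are an assumption, and the module name mymodule and the label list are made up for the example.

```php
<?php
// ASSUMPTION: the real signature of hook_calais_body_alter is not shown in
// this thread. This sketch guesses the body is passed by reference.
function mymodule_calais_body_alter(&$body, $node) {
  // Hypothetical list of field labels to drop before submission.
  $labels = array('WebDAV Support', 'Description');
  $quoted = array();
  foreach ($labels as $label) {
    $quoted[] = preg_quote($label, '/');
  }
  // Remove "Label:" (and spaces around it) at the start of any line.
  $pattern = '/^[ \t]*(?:' . implode('|', $quoted) . '):[ \t]*/m';
  $body = preg_replace($pattern, '', $body);
}

// Calling it directly here just to show the effect.
$body = "WebDAV Support: Yes\nDescription: A file server.";
mymodule_calais_body_alter($body, new stdClass());
echo $body, "\n";
```

The same place could also hold conditional logic, such as dropping a field's text entirely unless its value is Yes.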