use extractFormat as 'text' [#616426]

If extractOnly is true, an additional input parameter is available:

extractFormat=xml|text - Default is xml. Controls the serialization format of the extracted content. The xml format is actually XHTML, the same output as passing the -x option to the tika command-line application, while text corresponds to the -t option.
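
For illustration, here is a minimal sketch of how the two parameters combine in a request URL. It is not the module's code: the helper name build_extract_url, the base URL, and the /update/extract handler path are assumptions (the path depends on how the extracting request handler is mapped in solrconfig.xml). Only extractOnly and extractFormat come from the description above.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, extract_format="text"):
    """Build an extract-only request URL (hypothetical helper).

    The handler path '/update/extract' is an assumption; adjust it to
    match the local solrconfig.xml.
    """
    if extract_format not in ("xml", "text"):
        raise ValueError("extractFormat must be 'xml' or 'text'")
    params = {
        "extractOnly": "true",           # return the extracted content, do not index it
        "extractFormat": extract_format,  # 'xml' (XHTML) or 'text' (plain text)
    }
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    # Example base URL; assumed, not taken from the issue.
    print(build_extract_url("http://localhost:8983/solr"))
```

The file to extract would still have to be sent as the request body; the sketch covers only the query string.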

I had planned to include this for the last few weeks, since I knew my patch had gotten into Solr, but I forgot in my excitement at getting this module working with Solr at all in the last few days.

This probably doesn't matter much since we are stripping out all tags anyway, but it should give even greater consistency between using tika and Solr.

Comment	File	Size	Author
#2	text-format-616426-1.patch	897 bytes	pwolanin
#1	text-format-616426-1.patch	897 bytes	pwolanin

Comments

Comment #1

pwolanin commented 27 October 2009 at 23:45

Status:

Active

» Needs review

Status	File	Size
new	text-format-616426-1.patch	897 bytes

Comment #2

pwolanin commented 27 October 2009 at 23:57

Status:

Needs review

» Fixed

Status	File	Size
new	text-format-616426-1.patch	897 bytes

committed

Comment #3

11 November 2009 at 00:00

Status:

Fixed

» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

