Some old HTML pages don't have proper tag IDs.

Is it possible to select content by the text that comes before and after it (similar to Yahoo Pipes)?

For example: import the content after the text "text1" and before the text "text2"
and also: import the content starting from the text "text1" until "text2"

Comments

dman

Version: 6.x-1.x-dev » 7.x-1.x-dev
Status: Active » Postponed

Clearing old 6.x issues from the issue queue as part of a cleanup.

The very first D4 version DID have token-based and regular-expression-based data extraction, but only DOM-based methods have been carried forward since. What you want CAN be done in a custom module that uses the HOOK_import_html() callback, where you can do your own processing on the raw text and add the result to the new $node->body.
It would be an OK extension and I'd support someone who had a go at it, but it won't be built as part of the current roadmap.
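
For anyone who wants to try the custom-module route, here is a minimal sketch of the text-marker extraction itself. The function name example_extract_between() and its $inclusive flag are hypothetical and not part of import_html. The exact arguments that HOOK_import_html() receives are not shown in this thread, so the sketch only covers the string handling and leaves the hook wiring to the module author. Treating "start from text1 until text2" as "keep both markers" is one reading of the request, not a confirmed requirement.

```php
<?php

/**
 * Returns the part of $raw that lies between two marker strings.
 *
 * Hypothetical helper for illustration; not part of import_html.
 *
 * @param string $raw
 *   Raw source text, for example the HTML of the page being imported.
 * @param string $start
 *   Marker text that opens the wanted region.
 * @param string $end
 *   Marker text that closes the wanted region.
 * @param bool $inclusive
 *   TRUE to keep both markers in the result, FALSE to drop them.
 *
 * @return string|null
 *   The extracted text, or NULL if either marker is missing.
 */
function example_extract_between($raw, $start, $end, $inclusive = FALSE) {
  $from = strpos($raw, $start);
  if ($from === FALSE) {
    return NULL;
  }

  // Search for the end marker only after the start marker has finished.
  $content_from = $from + strlen($start);
  $to = strpos($raw, $end, $content_from);
  if ($to === FALSE) {
    return NULL;
  }

  if ($inclusive) {
    // Span from the first character of $start to the last character of $end.
    return substr($raw, $from, $to + strlen($end) - $from);
  }

  // Span strictly between the two markers.
  return substr($raw, $content_from, $to - $content_from);
}
```

Inside your hook implementation you would call this on the raw source and, if the result is not NULL, assign it to the new $node->body. If the markers vary from page to page, preg_match() with a pattern built from the two markers could replace the strpos() calls.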