I had a situation in which users were given the ability to embed content from a third-party provider. The default embed code contained a lot of junk markup: a wrapping div with forced styles, plus links to the provider's site, to the user's profile on that site, and to listing pages on that site for every tag on the content.
I used a computed field to strip out everything except the content between the <object> tags, and it worked great!
First, create a plaintext field for the embed code. Then, create a computed field using the snippet below. The snippet assumes the plaintext field is named field_embed_code.
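// grabs the value of the embed code from the plaintext field
// (CCK stores field values per delta, so the first value is at index 0)
$body = $node->field_embed_code[0]['value'];
// collapses runs of two or more whitespace characters into a single space
$body = preg_replace('/\s\s+/', ' ', $body);
// matches everything between the first pair of object tags;
// the "s" modifier lets "." also match any single newlines that remain
$pattern = '/<object[^>]*>(.*?)<\/object>/s';
// adds back the object tags around the captured content ($matches[1]);
// stores an empty string if no object tags were found
$node_field[0]['value'] = preg_match($pattern, $body, $matches) ? '<object>' . $matches[1] . '</object>' : '';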
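Note that preg_match() returns only the first pair of <object> tags, even if the embed code contains several. The captured content is in $matches[1], and $matches[0] holds the full match including the tags. To capture every pair, use preg_match_all() instead.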
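If the embed code can contain more than one pair of object tags, something along these lines might work. This is only a sketch: the sample $body string and the $cleaned variable are made up for illustration, and in a computed field you would read $body from the node and assign the result to $node_field as in the snippet above.

```php
<?php
// Hypothetical sample input: two embeds wrapped in provider markup.
$body = '<div><object width="1"><param name="a"></object><a href="#">x</a>'
      . '<object><embed src="b"></object></div>';

// "s" lets "." match newlines; "i" ignores the case of the tag name.
$pattern = '/<object[^>]*>(.*?)<\/object>/is';

// preg_match_all() fills $matches[1] with the captured content of every pair.
preg_match_all($pattern, $body, $matches);

// Rebuild each embed with bare object tags, dropping everything in between.
$cleaned = '';
foreach ($matches[1] as $inner) {
  $cleaned .= '<object>' . $inner . '</object>';
}
```

The attributes on the original opening tags are discarded here, just as in the single-match snippet.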
The same approach can be used to match the content between any other pair of tags by changing the tag name in the pattern.
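As a rough illustration of reusing the pattern for other tags, here is a small helper. The function name extract_tag_content() and the sample markup are hypothetical and not part of Computed Field or Drupal. It assumes the tags are not nested inside themselves, since the lazy match stops at the first closing tag.

```php
<?php
/**
 * Hypothetical helper: returns the content of the first <$tag>...</$tag>
 * pair in $html, or NULL when the tag is not found.
 */
function extract_tag_content($html, $tag) {
  // Escape the tag name so it is safe to place inside the pattern.
  $tag = preg_quote($tag, '/');
  // \b stops "p" from matching "param"; "s" lets "." span newlines.
  $pattern = '/<' . $tag . '\b[^>]*>(.*?)<\/' . $tag . '>/is';
  if (preg_match($pattern, $html, $matches)) {
    return $matches[1];
  }
  return NULL;
}

$html = '<div class="wrap"><p>Hello</p><a href="#">profile</a></div>';
$paragraph = extract_tag_content($html, 'p');
$missing = extract_tag_content($html, 'object');
```

For anything more complicated than a single, non-nested tag pair, a real HTML parser is usually a safer choice than a regular expression.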