Tech Specs:
Drupal 5.4
Glossary 1.8

I use FCKEditor to edit articles, so my articles are sometimes full of HTML code. All the tags I use are in the list of tags for the glossary filter to omit. That works, but the filter always adds the less-than (<) sign at the end of the tag. (It could probably just be a replacement that is one character too long.)

Comment   File                           Size      Author
#25       glossary.module.mb_.25.patch   5.81 KB   mwrochna

Comments

ohnemax’s picture

Title: Filter adds < on the end of every parsed HTML Tag » Filter adds > on the end of every parsed HTML Tag

Sorry, it adds a greater-than sign (>).

nancydru’s picture

I've seen this and it was something I was doing. I wish I could remember what it was.

ohnemax’s picture

I edited the module and removed the "<" from the list of standard opening tags and the ">" from the list of standard closing tags. I know this is insecure in the sense that somebody could break the HTML by having a glossary term such as "div", but this can easily be controlled. Without it, the module was worthless for me because of the added signs.
I couldn't find out exactly where they came from, which is why I used this dirty trick.

nancydru’s picture

Well, it works for you. I'm still wracking my pea-brain trying to remember what I did that caused this. It was pretty simple, I know that.

zugaldia’s picture

I have the same problem with version 6.x-1.0-beta4. It also adds an extra 'a' to the link closing (), breaking the markup. This makes the module unusable, and I haven't yet been able to trace the origin of the problem.

shane birley’s picture

This is occurring for me as well. It is a recent issue and I don't see where this is coming from. Looks like it started soon after the latest release. I am running Drupal 5.7 and Glossary version 2.1.

nancydru’s picture

Can someone help track this down, please? Without being able to recreate it, I can't fix it.

shane birley’s picture

One thing worth noting is that this behaviour has only appeared on a Windows IIS server. Does anyone have an opinion on that? PHP modules missing, perhaps?

shane birley’s picture

Do you want to check out the site I have this module on?

nancydru’s picture

Sure

nancydru’s picture

Status: Active » Postponed (maintainer needs more info)

zugaldia’s picture

I am using a Windows IIS server as well, with the same problem.

nancydru’s picture

I have no way to test IIS, so that's not helping resolve this problem. Have any of you searched the forums for IIS problems? I know there are many, but not whether they may be in effect here.

zugaldia’s picture

nancyw, can you point us to exactly which part of the glossary code is involved in this? With that, I'll try to track down which PHP module or server component is messing things up. Thanks!

nancydru’s picture

I would think it would be in _glossary_insertlink (around line 355).

zugaldia’s picture

OK, as stated before, the obvious hack is to check the value of $_SERVER["SERVER_SOFTWARE"] to see if it's "Microsoft-IIS/6.0" or similar. In that case you need to remove the greater-than and less-than signs from $open_tags and $close_tags. Clear the cache, and it'll be working fine.

Why is this happening? Haven't found out yet...

nancydru’s picture

That would be a last resort. In general, Drupal modules do not include browser-specific code. And I suspect that removing the < and > would break the code (consider having every "b" in your node confused with the bold tag, for example).
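
The concern about dropping the angle brackets can be sketched briefly. The snippet below is Python rather than the module's PHP, and both the sample text and the helper `count_matches` are hypothetical; it only counts how often a tag marker matches with and without its brackets.

```python
def count_matches(text: str, marker: str) -> int:
    """Count non-overlapping occurrences of marker in text."""
    return text.count(marker)


# Hypothetical node body containing one bold tag.
text = "A <b>bold</b> word about bubbles."

# With the brackets, only the real opening tag matches.
with_brackets = count_matches(text, "<b>")

# Without them, every letter "b" in the text matches as well.
without_brackets = count_matches(text, "b")

print(with_brackets, without_brackets)
```

With the brackets there is a single match; without them the ordinary letters in "bold", "about" and "bubbles" are counted too, which is the kind of confusion described above.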

shane birley’s picture

This behaviour seems to have disappeared in my situation.

nancydru’s picture

That's what happened to me. I know I changed something but can't remember what. And I can't make it happen again.

shane birley’s picture

This was very weird. I didn't change much either, it just...disappeared.

nancydru’s picture

Do you know what you changed at all? I know it was something really small. Maybe just the order of the filters.

robert castelo’s picture

I'm seeing the same bug.

On the live site I get '>' added to all links but on the development site, same code base and database but different server, it works fine.

I strongly suspect that the cause is the PHP mbstring extension being missing on the live site.

nancydru’s picture

Mbstring could be a big part of it, since Glossary needs that for non-English support. Ask your host to install it; it only takes a minute. They need to load "php_mbstring" from the extensions directory (this is set in php.ini).

Are the PHP versions the same on the two sites? More and more frequently, I'm seeing problems with people who are on very old (4.3.x) versions of PHP.

robert castelo’s picture

PHP Version 4.4.4

I've asked the hosting company (Zen) to install php_mbstring, but haven't heard back from them for a few days. They also refuse to allow cron, so hopefully this will be the deal breaker and I can persuade the client to move to a proper host.

mwrochna’s picture

Status: Postponed (maintainer needs more info) » Needs review

Status   File                           Size
new      glossary.module.mb_.25.patch   5.81 KB

The patch should work with mbstring on, and with mbstring off on plain (ASCII) text. It could break on foreign-language text with mbstring off, but I didn't manage to make it do so.

The problem is that there is no 'drupal_strpos' that would do multibyte handling by hand, so we have to use plain strpos() when there's no mbstring. But strpos() returns byte indexes, which are incompatible with the multibyte functions (mb_ and drupal_), so plain substr() has to be used with it. Combining mb_ functions with plain ones will surely fail on multibyte text; I don't know why it also fails on plain text.
The patch uses plain substr() and strlen() when there's no mbstring. That works perfectly with plain text, and it doesn't mean UTF-8 text won't work: there's just a small probability that single bytes from multibyte characters will match single-byte searches. I think the only place where it could break is when "[multibyte character here]foo" is searched for 'foo', because the multibyte character could be misjudged in _glossary_is_boundary(). For this reason I would also try turning mb_ off to see whether it improves speed a lot. I'm not sure, as I don't know much about UTF-8, but I tested the patch with mbstring off on a Polish text with blocking tags here and there.
When mbstring is on, only mb_ functions are used; they would be called anyway by the drupal_ functions. You can change $drupal_prefix back to 'drupal_', or you can delete it and use $mb_prefix instead.
I left the drupal_strtolower() calls in, because they don't use the indexes returned by strpos(), so that's safe, and it may allow some non-ASCII characters to be searched case-insensitively. I also left drupal_substr() calls in some other places; they're OK (and needed) there.
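
To illustrate the index mismatch described above, here is a minimal sketch in Python rather than PHP. The sample text and variable names are made up for illustration and are not taken from glossary.module. A byte-oriented search (like plain strpos()) returns a byte offset, while a character-oriented substring (like mb_substr() or drupal_substr()) expects a character offset. The two agree on ASCII text and diverge as soon as a multibyte character precedes the match.

```python
# Hypothetical sample: a Polish word followed by a glossary term.
text = "żółć foo bar"
term = "foo"

raw = text.encode("utf-8")

# Byte-oriented search, analogous to plain strpos() in PHP.
byte_pos = raw.find(term.encode("utf-8"))

# Character-oriented search, analogous to mb_strpos().
char_pos = text.find(term)

# Mixing the two: a byte offset fed to a character-based slice
# (analogous to strpos() followed by mb_substr()) misses the term.
mixed = text[byte_pos:byte_pos + len(term)]

# Staying byte-oriented throughout (strpos() followed by substr())
# stays consistent and recovers the term.
plain = raw[byte_pos:byte_pos + len(term)].decode("utf-8")

print(byte_pos, char_pos)        # 9 5
print(repr(mixed), repr(plain))  # 'bar' 'foo'
```

This only shows why the offsets must come from the same family of functions; it does not explain the failure on plain ASCII text, where the byte and character offsets are identical.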

P.S.: I'm not sure I made the patch correctly; I used the DRUPAL-6--1 versions.

nancydru’s picture

Thanks, Marcin. It looks okay. I will be applying it shortly.

NecroHill’s picture

That patch works just fine for me. Thank you very much, mate.
To the developers: why haven't you replaced the released version 1.3 with this patched one yet?

nancydru’s picture

Because it should be tested before forcing people to install it. And, with 18 modules and 16 web sites to maintain, I keep pretty busy. Someone with cash can change my priorities.

nancydru’s picture

Status: Needs review » Patch (to be ported)

kmillecam’s picture

FYI,

I was seeing this issue on a new site I was setting up.

I was using Glossary v5.x-2.6 so I applied the patch (by hand) and it fixed my problem.

PHP Version 5.2.6

HTH,
Kevin

nancydru’s picture

Assigned: Unassigned » nancydru
Status: Patch (to be ported) » Fixed

Applied to the 5.x branch finally.

nancydru’s picture

Status: Fixed » Closed (fixed)