DBLog module calls truncate_utf8() on an HTML string. This may break the string in the middle of an HTML tag and generate output like this:
<td><a href="/admin/reports/event/147">Session opened for <em ...</a></td>
Would it be okay just to display the entire string? Or would it be better to strip tags from the message before displaying it (this would allow it to be truncated without problems)?
| Comment | File | Size | Author |
|---|---|---|---|
| #19 | 718662-dblog-truncate-d7.patch | 9.61 KB | andypost |
| #17 | 718662-dblog-truncate-d7.patch | 9.52 KB | andypost |
| #12 | 718662-dblog-truncate-d7.patch | 6.52 KB | andypost |
| #10 | 718662-dblog-truncate-d7.patch | 6.48 KB | andypost |
| #8 | 718662-dblog-truncate-d7.patch | 6.47 KB | andypost |
Comments
Comment #1
c960657 commentedI also updated the tests to use the actual t() strings used in user.module, and to call check_plain() on the output from randomString() to verify that it is well-formed HTML.
Comment #2
c960657 commentedThis is blocking #535440: Random strings in SimpleTest should not contain $db_prefix.
Comment #3
damien tournoud commentedSounds reasonable. Cutting strings in the module just feels wrong anyway (that's clearly the job of the theme).
(That patch actually passed, but the results failed to be reported for some reason.)
Comment #4
andypostThis solves a problem but may break usability
Comment #5
andypostScreenshots added
Comment #6
berdirFor small strings, this is actually a huge improvement, but watchdog messages can quite big, not sure about that.....
See screenshots
Comment #7
Bojhan commentedLooked at this issue, there is no way to magically truncate the message to for example two sentences when there is a long message - because this depends on the width of the table.
Obviously future wise, we want to truncate this.
Comment #8
andypostNew extended patch:
- Recent log entries (admin/reports/dblog) displays truncated messages (filtered from html tags) because they have detailed view
- Top * entries prints whole message because they does not have details
- Fixed php-doc about difference of overview pages (maybe someone check my english grammar)
- tests are changed to follow this:
1) assertLogMessage() now checks for links with truncated messages
2) added test for event details page for a full message
3) added test for page-not-found display full message
Comment #9
damien tournoud commentedPlease remove that truncating all-together (as in previous patches). Truncating is not the job of the PHP code. The only proper way to truncate is at the theme level (and in javascript).
Comment #10
andypostForget about index argument for assertLink()
Comment #12
andypostLink for event details was not extracted right for testing bot...
Comment #13
andypost@Damien if we display full messages so details page makes mostly useless and listing became ugly-looking. So I ask for usability review.
Anyway now we have extended tests.
EDIT: I see no reason to move truncating to _preprocess or _theme function because it makes code more complicated.
Comment #14
damien tournoud commentedAgain, truncating is the job of the theme. That, and it can only reliably be performed in Javascript.
Comment #15
dries commentedThe entries in the watchdog listing page should be truncated.
Drupal does a lot of truncating because otherwise a lot of listing pages would blow up -- I don't think we do that consistently in the theme layer today. I suggest we keep the truncating in the module code for now, and that we move all truncating to theme functions in a follow-up issue. Let's keep this issue focused on fixing the bug.
Comment #16
Bojhan commentedNot sure I get it, if we don't do it consistently. Why wouldn't we do it the right way? Anyways, not to important to me :) So whatever is good.
Comment #17
andypostNew patch after #535440: Random strings in SimpleTest should not contain $db_prefix
Totally agree with Dries about short links on listing view and I do not like clogging up pages.
Formatting of message now done with theming!
Changed a test for a link presence at admin/reports/dblog, because variables could have special chars (<, >, &) xpath() in assertLink() could fail.
Added comments for checking the details page.
Comment #18
dries commentedCan we just call this 'log' or 'event' instead of 'dblog'?
Why do we need to do an is_object()?
Indentation issue. Tabs?
What is the legacy stuff again? I know it was in the original code but I don't remember what it was about. Now it is a theme function, it would be nice if we could remove all cruft (assuming it is cruft).
Indentation issue.
In documentation we write URL instead of url.
Comment #19
andypost"Event" seems more informative.
is_object() changed to check for required properties, same for $event->wid before trying to make a link
Legacy staff come from #24 #76588: Syslog module's serialization of log components possibly broken and this issue still open :)
So I change this comment to more informative.
Comment #20
andypostchanging status
Comment #21
andypostBump, patch still valid
Comment #22
damien tournoud commentedLooks good to me.
Comment #23
dries commentedCommitted to CVS HEAD. Thanks.
Comment #24
jhodgdonA question I asked on #269911: Search result trimming should not fall inside HTML entities/tags, which is about truncating for search results listing:
Does trucate_utf8( a, b, TRUE, TRUE) work with Chinese, Japanese, etc. where word boundaries are not spaces (I believe), and with other languages where maybe putting "..." after a truncated excerpt is not the right thing to do?
I don't know the answer to this question, but it applies to this issue, both before and after the patch that was just committed.
Comment #25
jhodgdonWe did get an answer to that question: It doesn't work with CJK languages. See
http://drupal.org/node/269911#comment-2795478
So the committed fix will NOT work for those languages.
Comment #26
andypostLet's mark this fixed and continue work on truncate_utf8() in #269911: Search result trimming should not fall inside HTML entities/tags
This issue about different thing
EDIT: or this issue #200185: truncate_utf8() is used as a substring function
Comment #27
jhodgdonAFAIK, there is no way to fix truncate_utf8() so that it will break on words, without knowing what language its input is in.
I don't think this issue is fixed. It can't break on word boundaries. It needs to use a different solution, like stripping tags before truncating.
Comment #28
jhodgdonSee also:
#768040: truncate_utf8() only works for latin languages (and drupal_substr has a bug)
Comment #29
jhodgdonOK, marking this back as fixed, as #768040 is on the verge of being committed, and the consensus is that if truncate_utf8() doesn't work as intended for non-Latin languages, it's a bug (and being taken care of on that issue one way or another).