Handling bibtex's {Protected} capitalization

macko - September 28, 2008 - 13:03
Project: Bibliography Module
Version: HEAD
Component: Code
Category: feature request
Priority: normal
Assigned: Unassigned
Status: active
Description

In bibtex, one uses curly brackets to protect capitalization – so that, for example, {3D} or {DNA} or {BioSysComp} won't turn into 3d or dna or biosyscomp.

I think these brackets should stay in the database as they are imported, but when displaying titles, I believe they could be removed... curly brackets are extremely rarely used in titles so removing them (only for display) would improve the look of bibliographies.

#1

rene_w - March 14, 2009 - 16:24

subscribing

#2

Frank Steiner - March 16, 2009 - 10:42

This problem is not trivial and has been discussed several times before. Biblio uses the normal node title for the publication title, and the node title is used in so many places by so many modules that you cannot keep the brackets in there. So I guess one would need an additional biblio_title field to store the (imported or entered) title with brackets while still using node->title for the unbracketed version.

We currently wrap the whole title in braces when exporting (via our famous bibtex hook ;-)) so that the title is at least preserved as a whole.

Ron, do you think that using an additional biblio->bibtex-title or similar could be a solution?

#3

rene_w - March 16, 2009 - 14:24

From my (end user's) perspective, it's important to have round-trip support for bibtex files. I work with LaTeX, so I need to get the exact same bibtex out of Biblio that I put in, which includes quirks like the {} thing in title/booktitle.

I suppose a separate node-title could work, but then there's a new problem in that changes to one of the titles must be reflected in the other (lest you end up with two different titles for the same entry). Presenting only the bibtex title (braces and all) for editing and automatically deriving the node title from it could work.

Note that there are also math-mode braces in BibTeX titles ("My {$\sqrt{\pi^2}$} cents"), but I imagine that'll be even more difficult to handle unless you embed tex4ht or somesuch.

#4

rjerome - March 16, 2009 - 19:16

I'd say it's pretty hard to guarantee round-trip support, mainly because we are not dealing only with bibtex data. One way might be to simply add a filter that removes any remaining {} when the node is displayed; the braces would remain in the database, so they should still be in the export file.

#5

Frank Steiner - March 16, 2009 - 22:16

I tried this once, but it fails for at least the node title, because an input filter doesn't apply to the title. So the braces will still show up in any Views listing, and I don't know where else. However, this might be a minor drawback that one can live with.

#6

Tecktron - April 24, 2009 - 01:41

[Subscribing]

I agree that this would be great for display purposes only, so that when it's exported it comes directly from the DB where it's stored correctly.
I'd do something along the lines of a preg_replace_callback to simply remove them.
I'll work on the function with a regex and callback so it removes the {} from any string you pass to it, eg:

"This is {CAPITAL} {T}ext in {3D}!" would become simply "This is CAPITAL Text in 3D!"

Then all someone has to do would be to implement the function where it needs to go =)

Would also be cool and convenient for an option to turn the feature on or off in the admin menu.
Something like: ($remBraceOpt==TRUE)?remBraceFunc($title_string):$title_string;

Would be a nice feature in my opinion. Expect something from me soon.

Thanks,
--Craig

#7

masood_mj - April 27, 2009 - 11:08

subscribing
would you help me to write a sql query to remove { } from biblio tables?

#8

Tecktron - May 1, 2009 - 22:19

So this takes in a string and removes the braces. The idea is this would be on a db pull and display, but never for saving into the db (you want the braces for export). I've made it so it ignores {$...$} in order to save any math notation. Give it a spin and let me know what you think.

OK Here is the code for the functions:

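<?php
/**
 * Removes BibTeX protection braces from a title, for display only.
 * {$...$} math groups are left untouched.
 */
function remBraceFunc($title_string) {
  // utf8_encode() converts ISO-8859-1 to UTF-8; skip this call if the
  // title is already UTF-8, otherwise non-ASCII characters get mangled.
  $title_string = utf8_encode($title_string);
  // First alternative: a {$...$} math group (matched, but not captured).
  // Second alternative: any other {...} group, captured as group 1.
  $matchpattern = '/\{\$(?:(?!\$\}).)*\$\}|(\{[^}]*\})/';
  $output = preg_replace_callback($matchpattern, 'remBraceFromMatch', $title_string);
  return $output;
}

/**
 * Callback: strips the braces from a captured {...} group and returns
 * math groups unchanged.
 */
function remBraceFromMatch($match) {
  // Group 1 is only set when the second alternative matched.
  if (isset($match[1])) {
    $braceless = str_replace('{', '', $match[1]);
    $braceless = str_replace('}', '', $braceless);
    return $braceless;
  }
  return $match[0];
}

// Usage: $title_string holds the title as stored in the database.
$displayOutput = remBraceFunc($title_string);
?>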

Now it just needs to be implemented into the hook... Interested Ron? Anybody?

Thanks,
--Craig
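
One case the pattern in #8 does not appear to cover is nested braces, for example a whole title wrapped in braces on export as described in #2: "{{DNA} in {3D}}" would seem to come out as "DNA in 3D}". Below is a minimal sketch of one way around that, assuming math groups always have the form {$...$}. The function name strip_protect_braces() is hypothetical and not part of Biblio.

```php
<?php
// Hypothetical helper, not part of Biblio: strips BibTeX protection braces,
// including nested ones, while leaving {$...$} math groups untouched.
function strip_protect_braces($title) {
  // Split into alternating plain-text and {$...$} math segments. With one
  // capture group and PREG_SPLIT_DELIM_CAPTURE, even indexes hold plain
  // text and odd indexes hold the captured math segments.
  $parts = preg_split('/(\{\$.*?\$\})/s', $title, -1, PREG_SPLIT_DELIM_CAPTURE);
  foreach ($parts as $i => $part) {
    if ($i % 2 == 0) {
      // Plain text: every remaining brace is a protection brace.
      $parts[$i] = str_replace(array('{', '}'), '', $part);
    }
  }
  return implode('', $parts);
}

echo strip_protect_braces('{{DNA} in {3D}}'), "\n";           // DNA in 3D
echo strip_protect_braces('My {$\sqrt{\pi^2}$} cents'), "\n"; // My {$\sqrt{\pi^2}$} cents
```

Splitting first and stripping afterwards avoids having to balance braces in the regex itself; the trade-off is that any brace outside a math group is removed, including literal ones.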

#9

rjerome - May 2, 2009 - 01:51

Thanks Craig,

I thought maybe I could use hook_nodeapi and the "view" op to achieve this, and it almost works but the page title seems to get rendered prior to this being called so the braces still show up in the page title.

I'll dig around a bit more, but we're close.

Ron.

#10

Tecktron - May 15, 2009 - 01:12

Ron,

I've been looking through this a bit (and slowly coming to grips with Drupal), and I think that the "load" case of the nodeapi would work. That's what I see some of the page title modules use (beyond that part I'm lost, but still learning). Maybe give that a shot.

Thanks,
--Craig

#11

rjerome - May 15, 2009 - 02:09

I did try that, but the problem with "load" is that it's too early in the process: if you edit the node, you will see that the braces are gone, and if you then save it the braces will be gone for good. The problem with "view" is that it's just a little too late in the render process, so the node "contents/body" are OK, but the title is rendered prior to calling nodeapi with the "view" op, so the page title still has the braces.

I just came up with a solution, which is sort of a hybrid: the braces are stripped off in hook_view, and then drupal_set_title is called in hook_nodeapi (op == 'view') to reset the title to the braceless version. Using this method, the braces are saved in the database, and you will see them when you edit the node, but they are not displayed.
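
The hybrid described in #11 might look roughly like the sketch below. This is not the code committed to Biblio. The "mymodule" prefix and the mymodule_strip_braces() helper are hypothetical, the hook signatures are assumed to be the Drupal 6 ones, and the two stand-in functions at the top exist only so the sketch can run outside Drupal.

```php
<?php
// Stand-ins so the sketch runs outside Drupal. Inside Drupal 6 the real
// drupal_set_title() and check_plain() exist and these are skipped.
if (!function_exists('drupal_set_title')) {
  function drupal_set_title($title = NULL) {
    static $stored_title;
    if (isset($title)) {
      $stored_title = $title;
    }
    return $stored_title;
  }
}
if (!function_exists('check_plain')) {
  function check_plain($text) {
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
  }
}

// Hypothetical helper: simplified brace removal (ignores {$...$} math).
function mymodule_strip_braces($title) {
  return str_replace(array('{', '}'), '', $title);
}

// hook_view(): strip braces from the in-memory node only; nothing is saved.
function mymodule_view($node, $teaser = FALSE, $page = FALSE) {
  $node->title = mymodule_strip_braces($node->title);
  return $node;
}

// hook_nodeapi(): for the 'view' op, $a3 is $teaser and $a4 is $page.
function mymodule_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  if ($op == 'view' && $a4) {
    // The page title was already set with braces, so reset it here.
    drupal_set_title(check_plain($node->title));
  }
}

// Demo outside Drupal, imitating the order of calls during a page view.
$node = new stdClass();
$node->title = 'Imaging {DNA} in {3D}';
$node = mymodule_view($node, FALSE, TRUE);
mymodule_nodeapi($node, 'view', FALSE, TRUE);
echo drupal_set_title(), "\n"; // prints "Imaging DNA in 3D"
```

Because neither hook writes the node back, the stored title keeps its braces for editing and export.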

#12

Tecktron - May 27, 2009 - 20:34

Brilliant!
Thanks!

I can't wait to see it in action!

--Craig

#13

rjerome - May 27, 2009 - 21:36

No need to wait Craig, you can try it now in the -dev version...

Ron.

#14

nicomat - September 24, 2009 - 22:21

subscribing

#15

rjerome - September 25, 2009 - 11:44

In case you were wondering Nico, this feature is in the 1.6 version.

Ron.

#16

nicomat - September 27, 2009 - 19:51

Hm, unconfirmed. -- I'm still getting the issue with 1.6; see, for example, the file at http://drupal.org/node/584616#comment-2088188.

Cheers,
Nico

 
 
