By hyperlogos on
Hi, I'm running Drupal 5.7 with quotes module and I somehow created lots of duplicate quotes. Since you cannot sort nodes by body with the views module (why not?! if I want to kill my database, let me!) there is no trivial way to do this with the views module. The duplicate nodes do not have the same titles, only the same body. How can I locate and remove the duplicates?
Comments
To view the body of the
To view the body of the nodes, log into mysql and run the following:
This will return all nodes in your database orderd by the body, which is the node content, and the node title. You will see the duplicates that way.
Depending on your SQL skills, you could add more to the statement so that you only return the nodes that have duplicate body content.
Use this list to determine what you want to do with the duplicate content within your Drupal install. I would not recommend trying to address this via SQL unless you really know and understand the Drupal schema.
An even better SQL
An even better SQL solution:
SELECT a.nid, b.nid, a.title, b.title FROM node_revisions a, node_revisions b WHERE a.nid < b.nid AND a.body = b.body;This will show you only the node ids and title of the nodes that are duplicates. Much easier to sift through :-)
Shows nodes with empty contents
How do I suppress nodes whose body is empty? Lots of nodes have no body (mostly images.) I tried adding
AND a.body != ''to the SQL statement but then I got not results. Of course, it is entirely possible that the duplicate nodes somehow have slightly different body content due to having different input formats... in which case the whole thing gets really horribly complicated, I bet. Perhaps I could match on the first n characters of the body, I'll go look that up.