Pathologic

Pathologic is an input filter for Drupal which attempts to make sure that links to other content in your Drupal installation, as well as images and other media, keep working in situations which would otherwise break them. It's probably easiest to explain by listing some situations where Pathologic would come in handy…

  • You run a personal site, and the address of your site recently changed. Perhaps you moved to a shiny new domain name, or perhaps you moved the Drupal installation from one subdirectory to another. Now all the images in your content are broken, as well as links to other content on your site. You could go through all your content and update all the links… or you could install Pathologic.
  • You oversee a site which has testing and production servers. Copy-editors (and/or you) edit content on the testing server, and that eventually gets pushed over to the production server. When those darn editors link to other content on the site, they sometimes link to the version of it that's up on the test server; these links break when the content is published on the production server. You could flog the editors (and/or yourself) with a bullwhip for each infraction… or you could install Pathologic.
  • Your Drupal site has been up for a while, but you've recently discovered the Clean URLs feature and enabled it. Your links still work, but they still have that ugly "?q=" thing in them, and you have better things to do with your time than go through all your content to prettify the links. Or maybe you're going the other way; you used to have Clean URLs enabled, but you've had to disable it, and now your links are broken. Pathologic to the rescue!
  • Your site content uses relative paths (e.g., <a href="tag/food/pizza">), which break for people reading your site's feed. You could just start using absolute paths instead, but you're too set in your ways and would rather have a tool do it for you.

Pathologic is designed to be a simple, set-it-and-forget-it utility. You don't need to type any special "tags" or special characters in your content to trigger Pathologic to work; it finds paths it can manage in your content automatically.

Installation

Pathologic is an input filter, so getting it installed and configured is a little bit more difficult than standard modules, but the instructions below will walk you through the process.

  1. Install the Pathologic module as normal. (If you're a total Drupal newbie, these instructions for installing modules may be helpful — and welcome to the community, by the way!)
  2. In the Administer menu, select "Input formats" from the "Site configuration" section. A list will appear of the various input formats your site uses. Find one in the list which you want to use Pathologic with, and click the "configure" link for that format.
  3. On the next page, find the section labeled "Filters." Check the box next to "Pathologic." All other options on this page can be left alone. Click the "Save configuration" button at the bottom.
  4. This will take you back to the same page, with a message telling you "The input format settings have been updated." Now, find the "Rearrange" tab at the top of the page and click it.
  5. This will bring you to a list of filters that this input format uses. Pathologic should be at the bottom of this list; if so, you don't have to do anything. If it is not, adjust the values in the "Weight" column so that Pathologic has the highest value. Click "Save configuration" when done.
  6. If you wish to use Pathologic with other input formats, go back to step 2 and repeat the process.
  7. Pathologic is now working on all old and new content which uses the input format(s) you added it to.

Is configuration necessary?

Depending on how you intend to use Pathologic and how the paths in your existing content are formed, further configuration may not be necessary. To help you decide whether it is necessary in your case, and to make the configuration steps easier to follow, allow me to take a moment to explain how Pathologic works.

Pathologic looks at paths that are located in href attributes of links (<a> tags), as well as the src attributes of image tags and tags for other embedded media (<img>, <embed>, etc). If you wish, you can configure Pathologic to work on src attributes but ignore href attributes, or vice versa.
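To make the idea of "finding paths in href and src attributes" concrete, here is a minimal sketch in Python. It is not Pathologic's actual code (Pathologic is a Drupal module); the names PathCollector and collect_paths are hypothetical and exist only for this illustration. The attributes argument mimics the option to work on src attributes but ignore href attributes, or vice versa.

```python
from html.parser import HTMLParser


class PathCollector(HTMLParser):
    """Collects (tag, attribute, value) triples for the chosen attributes.

    Hypothetical helper for illustration; not part of Pathologic.
    """

    def __init__(self, attributes=("href", "src")):
        super().__init__()
        self.attributes = set(attributes)
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # attributes written without a value.
        for name, value in attrs:
            if name in self.attributes and value is not None:
                self.found.append((tag, name, value))


def collect_paths(html, attributes=("href", "src")):
    """Return every href/src value found in the given HTML fragment."""
    collector = PathCollector(attributes)
    collector.feed(html)
    collector.close()
    return collector.found


sample = '<p><a href="tag/food/pizza">Pizza</a> <img src="files/pizza.png" /></p>'
print(collect_paths(sample))
# [('a', 'href', 'tag/food/pizza'), ('img', 'src', 'files/pizza.png')]
print(collect_paths(sample, attributes=("src",)))
# [('img', 'src', 'files/pizza.png')]
```

Note that no special markup is needed in the content itself; the sketch simply inspects whatever tags are already there.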

After finding a path in an attribute, Pathologic then determines whether that path is "local." It does its magic on local paths, but leaves other paths alone.

Let's assume that your Drupal site is up and running at http://example.com/drupal/. Pathologic considers a path local if:

  • The path is a relative path. That is, it does not have a protocol prefix (such as http://) and does not begin with a slash. For example, tag/food/pizza will be considered a local path, but /tag/food/pizza and http://drupal.org/tag/food/pizza are not.
  • The path is an absolute path that points to a resource located within your Drupal installation. Our example is located at http://example.com/drupal/, so http://example.com/drupal/tag/food/pizza is considered a local path. However, while http://example.com/not_drupal/ points to a resource on the same domain name, it points to something outside of the Drupal installation, so it is not considered local.
  • The path contains only an anchor fragment, such as #pizza.
  • The path is an absolute path which begins with the URI of another Drupal installation which you've instructed Pathologic to consider local.
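The four rules above can be sketched as a small Python function. This is only an illustration of the rules as described here, not Pathologic's real implementation; is_local, site_base and extra_bases are hypothetical names, and the simple prefix comparison is an assumption about how "begins with" is decided.

```python
from urllib.parse import urlsplit


def normalize_base(base):
    """Make sure a base URL ends with a slash so prefix checks stop at a path boundary."""
    return base if base.endswith("/") else base + "/"


def is_local(path, site_base, extra_bases=()):
    """Decide whether a path is "local" under the four rules listed above.

    site_base   -- where this Drupal installation lives (hypothetical parameter)
    extra_bases -- other installations to be considered local (hypothetical parameter)
    """
    # An anchor-only path such as "#pizza" is local.
    if path.startswith("#"):
        return True

    parts = urlsplit(path)

    # No protocol and no host: the path is relative only if it
    # also does not begin with a slash.
    if not parts.scheme and not parts.netloc:
        return not path.startswith("/")

    # Otherwise the path is absolute: it is local if it begins with this
    # installation's base or with one of the additional bases.
    bases = [normalize_base(b) for b in (site_base, *extra_bases)]
    return any(path.startswith(b) for b in bases)


site = "http://example.com/drupal/"
print(is_local("tag/food/pizza", site))                            # True
print(is_local("/tag/food/pizza", site))                           # False
print(is_local("http://example.com/drupal/tag/food/pizza", site))  # True
print(is_local("http://example.com/not_drupal/", site))            # False
print(is_local("#pizza", site))                                    # True
```

With an empty extra_bases, only the first three rules can match; the fourth rule only comes into play once additional bases are supplied.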

Aha! That last one is where things start getting interesting. Let's say you've grown tired of using http://example.com/drupal/, so you've moved your site over to http://example.net/. (For those interested in using Drupal in a test/production server paradigm, imagine that example.com is the test server and example.net is the production server.) If all the paths in your content are relative paths, then Pathologic will handle them perfectly, with no need for further configuration. If they are absolute paths that begin with http://example.com/drupal/, though, Pathologic will not consider them local paths and will ignore them, unless we tell it to consider such paths local and to fix them.

Configuring Pathologic

If you've determined that configuring Pathologic may be necessary, here's how to go about it.

  1. In the Administer menu, select "Input formats" from the "Site configuration" section. A list will appear of the various input formats your site uses. Find one in the list which you are using Pathologic with, and click the "configure" link for that format. (Note that if you are using Pathologic with more than one input format, you will have to repeat this configuration process for each input format.)
  2. Click the "Configure" tab at the top of the next page.
  3. Find the Pathologic section on the next page — it should be near the bottom.
  4. Toggle the "Transform values of href attributes" and "Transform values of src attributes" check boxes as may be necessary.
  5. Enter the paths of other/previous Drupal installations which should be considered local in the "Additional paths to be considered local" text field. Enter one path per line. For the above example, we'd want to enter http://example.com/drupal/.
  6. Click the "Save configuration" button when done.

(Note for those using testing and production servers: in cases where it would be inconvenient to have separate settings on each server, it's safe to put the path for the "current" server in the "Additional paths" field. Pathologic will simply remove it when it does its trick. In other words, both the example.com and example.net servers can have both http://example.com/drupal/ and http://example.net/ in the field.)
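Here is a rough Python sketch of why sharing one "Additional paths" list between servers is harmless. It is not Pathologic's actual code: rewrite_path and its parameters are hypothetical, and producing an absolute URL under the current server's base is an assumption about the output format. The idea is simply that any known base is stripped from the front of a path and the current server's base is put in its place, so listing the current server changes nothing.

```python
def rewrite_path(path, current_base, additional_bases=()):
    """Point a path at the current installation if it starts with a known base.

    Hypothetical helper for illustration; paths that match no known base
    are returned unchanged.
    """
    def with_slash(base):
        return base if base.endswith("/") else base + "/"

    current = with_slash(current_base)
    for base in (current, *map(with_slash, additional_bases)):
        if path.startswith(base):
            # Drop the old base and prepend the current one.
            return current + path[len(base):]
    return path


# The same list can be entered on both servers.
shared = ["http://example.com/drupal/", "http://example.net/"]

# On the production server (example.net):
print(rewrite_path("http://example.com/drupal/tag/food/pizza",
                   "http://example.net/", shared))
# http://example.net/tag/food/pizza

# On the testing server (example.com):
print(rewrite_path("http://example.net/tag/food/pizza",
                   "http://example.com/drupal/", shared))
# http://example.com/drupal/tag/food/pizza
```

A path that already points at the current server comes back unchanged, which is why the "current" entry in the list is effectively ignored.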

Now sit back and enjoy the fruits of Pathologic's labor.

Questions? Suggestions? Need help?

Please open an issue on Pathologic's issue queue or contact the author and I'll get back to you soon. Thanks for trying Pathologic!

 
 

Drupal is a registered trademark of Dries Buytaert.