Is there a module or feature that lists all external links within a Drupal website (i.e. all links pointing to something other than example.com on this example.com website)? I know of the Link Checker module and use it on my website, but since I'm migrating from plain HTML and PHP to Drupal, there are bound to be some links that point outside the site. Whether those links are online or not is not a concern; Link Checker will take care of that.
So basically what I am looking for is:
link to example.com/node/217, do not list this link
link to example.com/about-us, do not list this link
link to my-old-website.com/about-us, list this link
link to google.com, list this link
Any modules that accomplish this? I'm not sure if I'm willing to code my own module just for this purpose, since the time it takes to create something like this might be more than it would take to manually hunt for all those a href's inside the node content.
-Zelgadis85
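The listing rule described above (skip links that stay on example.com, list everything else) can be sketched in a few lines of Python. This is only an illustration, not an existing module: the function name `is_external`, the `SITE_HOSTS` set, and the default `page_url` are assumptions made for the example, and it assumes links carry an explicit scheme (a bare `google.com` would be read as a relative path).

```python
from urllib.parse import urljoin, urlparse

# Assumption: every host name that should count as "this site".
SITE_HOSTS = {"example.com", "www.example.com"}


def is_external(href, page_url="http://example.com/"):
    """Return True when href points to a web host outside SITE_HOSTS."""
    # Resolve relative links such as "/about-us" against the page they sit on.
    absolute = urljoin(page_url, href)
    parsed = urlparse(absolute)
    # mailto:, javascript: and similar schemes are not web links; skip them.
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.hostname not in SITE_HOSTS


for link in [
    "http://example.com/node/217",
    "/about-us",
    "http://my-old-website.com/about-us",
    "http://google.com",
]:
    print(link, "-> list" if is_external(link) else "-> do not list")
```

Adding the old domain to `SITE_HOSTS` would hide those links, so for a migration it is probably better left out.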
Comments
Need more requirements info
What are the requirements driving your need to show only non-site domain anchor URLs? Where are these links listed now? Will they be in your content? Sidebars? Banners? Does this need to be dynamic, or would a bot that recursively captures existing non-site domain anchor URLs fit your needs?
More info about URL links
Thanks for the fast response. The URL links are located mainly in node content, but if possible, I'd like to search through sidebars as well. Banners are of no concern on this site. Since the idea is to merge the old web site with the new one, a bot (or similar) that runs through the links once is sufficient.
Of course, the solution does not have to be a Drupal module if such tools already exist on the web. But a quick search turned up nothing, or maybe I was using the wrong words.
Thank you for your time.
-zelgadis85
just a link checker, then?
Use Xenu (Win) or Integrity (Mac) for link checking.
If your old site's domain is still active, add an entry to your computer's hosts file that points that domain name to an IP address where the site is NOT available. Links to the old domain will then fail and show up in the list of broken links.
The tool wasn't that helpful... or maybe it was just user fault
Thanks for the help, Lrrr, but I could not get much valuable information using Xenu. True, it listed all links on the site, but in the end, what I was looking for is something that also reports the 'referrer' (aka the page the link was found on). If this is available using Xenu, well then I guess I'll have to dig in deeper :)
Your help is much appreciated. I'd like to solve this before Christmas, if at all possible.
Thank you in advance.
-zelgadis85
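For the 'referrer' requirement, a one-off script is another option besides a desktop link checker. The sketch below is a minimal illustration under stated assumptions: the page HTML has already been fetched (or exported from the node bodies) into a dict keyed by page URL, and the names `HrefCollector` and `external_links_by_page` are made up for this example. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links_by_page(pages, site_hosts):
    """Return (referrer, link) pairs for links that leave the site.

    pages: dict mapping a page URL to that page's HTML (assumed to be
    fetched or exported beforehand). site_hosts: set of host names that
    count as internal.
    """
    report = []
    for page_url, html in pages.items():
        collector = HrefCollector()
        collector.feed(html)
        for href in collector.hrefs:
            target = urljoin(page_url, href)  # resolve relative links
            parsed = urlparse(target)
            if parsed.scheme in ("http", "https") and parsed.hostname not in site_hosts:
                report.append((page_url, target))
    return report


# Hypothetical sample data standing in for real node content.
pages = {
    "http://example.com/node/217": (
        '<p><a href="/about-us">About</a> '
        '<a href="http://my-old-website.com/about-us">Old about page</a></p>'
    ),
}
for referrer, link in external_links_by_page(pages, {"example.com"}):
    print(referrer, "->", link)
```

Each output line pairs the page the link was found on with the external target, which is the information the plain link list lacks.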
ctrl-b
When Xenu runs, you get a "live-updated" link list to show you "hey, I'm working here". You can limit this list to broken links only by pressing Ctrl-B. When the session is done, it will ask if you'd like a report; select YES. It may ask you for FTP credentials, but you don't have to give them. The FTP credentials are used (in part) to report orphan files, which may result in a bunch of false positives on a Drupal install (a file linked from an unpublished page is an orphan, etc.).
The report opens in your default web browser. You'll get a report for whatever was checked in the "options -> preferences" menu, which should include 3 different ways of seeing the broken links.
Xenu's Link Checker: one of the best web QA tools around
I'm with Lrrr: Xenu is an invaluable tool (and where I thought this discussion was going). Lrrr answers your "how-to" question. If you want to go deeper, you can always script lynx and its traversal feature (giving you more specific configuration and control), but Xenu does exactly what you're looking for.