A recent change seems to have been committed without going through the Issue system and community review. This Issue is to address it, and hopefully, reverse it. It's flawed in many ways:

  • Doesn't handle ports at all.
  • Doesn't handle paths at all.
  • Is a lot harder to edit then the original. Whereas the original was all in one set of quotes, this change now comes with lots of fun periods and dollar signs and multiple quotes and brackets and blah blah blah. Makes it far easier for a user to screw up and unintentionally cause a syntax error.
  • Does HTTP_HOST even exist in non-Apache servers?
  • The documentation says "the code below" but doesn't actually explain what the code does. It's about equivalent to saying "this function handles URLs using the code below" - rather useless.
  • Why wasn't this put up for community review?

In my opinion, this change is a big step backwards: we have many many users who install Drupal in a subdirectory, and they may erroneously expect this default determination to work for them. Likewise, with the new port support for sites directories, the same concern applies.

Comments

killes@www.drop.org’s picture

15:25 < killes> that's the kind of usability a programmer would come up with, I guess. :p

What I mean is that a programmer tries to fix things the automagic way. This is not appropriate if it has such a high rate of possible failure.
I could provide a way to set this including subdirs, but then I would demand that this be encapsulated in a function. settings.php is a configuration file that shoudl be understandable by people.

webchick’s picture

I agree with Morbus. Please remove this change. I ran into this today, since when I install test sites, they always go in http://localhost/whatever_patch_im_working_on.

Now, *I* know enough about PHP to know I can just change:

$base_url = 'http://'. $_SERVER['HTTP_HOST'];

to

$base_url = 'http://'. $_SERVER['HTTP_HOST'] .'mysub';

or even

$base_url = 'http://localhost/mysub';

I wouldn't expect an end-user to know this though (yes I know it's written in the docs above but I guarantee you most people don't bother reading that unless they're deviating from the normal course of action).

But I will +1 the two examples given:

* $base_url = 'http://localhost';
* $base_url = 'http://example.com/drupal';

It's come up in #drupal_support and on the support forums many times where users install to a subdirectory but think $base_url is just the http://www.example.com part. I normally clarify this with, "Just enter whatever you type into your browser to get to your Drupal page" but I don't know if that's appropriate for installation documentation.

morbus iff’s picture

As an example of how easy to screw this up, HTTP_HOST doesn't contain a slash, so webchick's second example would fail.

kbahey’s picture

Priority: Critical » Normal

But how is that different than having the default being localhost?

localhost is useless unless it is the same machine you are running the browser on.

People will change it anyways, so putting a default that has more chances of running out of the box is better than localhost, me thinks ...

For me, I will also change it anyway ...

chx’s picture

Status: Active » Closed (works as designed)

I was the one who spoke with Dries that we should use the 'tolerant base URL' in bootstrap and let settings.php override it if needed. He found that too complicated and made a quick commit. We _are_ better with the current settings than we were before -- most Drupal sites are at 80 and in the root. Those who are not are free to change the setting.

killes@www.drop.org’s picture

Status: Closed (works as designed) » Active

Sorry, I disagree. the new setting is much more confusing for users who have to change it. And you'd be surprised how many people start their Drupal experience with a site in a subdirectory because they want to test it.

morbus iff’s picture

The problem, of course, is since you and Dries went behind everyone's backs, you've won by default: just let this bug wallow in the queue system for years and everything will be dandy. Wonderful community involvement guys, awesome.

morbus iff’s picture

kbahey: I think it's irrelevant to argue that "well, people will change it" - the issue is how *easy* it is to change it. And as webchick's comment shows (in #2), even the Illustrious She would screw it up. As killes said: this is usability that only a programmer would dream up.

morbus iff’s picture

And "usability" that can be inaccurate a goodly amount of times (what with ignoring port or path possibilities). The settings.php is NOT for logic, it is for configuration. And using HTTP_HOST is a healthy dose of certainly fallible logic.

drewish’s picture

i'm going to agree with Morbus, with the old setting it was obvious what you needed to change. the new one has too much "magic" to it.

bonobo’s picture

This change in the settings.php file will confuse many users -- while I understand the perceived advantage (it will work fine for most people most of the time), it will deter many new users. So, either the syntax needs to be documented much more clearly -- such that a person who has never coded can make sense of what to do -- or this patch should be rolled back. As it stands, however, this will deter new users from trying/using drupal.

kbahey’s picture

Morbus,

The present setting (http://localhost) is more confusing. The site will display OK, but clicking on any link will not work.

The new setting has more chances of working right out of the box.

The comments already give examples:

* Examples:
* $base_url = 'http://localhost';
* $base_url = 'http://example.com/drupal';
*

The real solution to all this is for an installer to set all this correctly, and no need for manual editing at all.

Until then, I think the newer setting is better. What we need is some more comments in the file (e.g. "you have to replace the entire line by blah if you have installed in a subdirectory"), and of course documentation to match this.

Regarding lack of community input, I am with you on this. This should have been at least advertised. Of course, I am giving chx and Dries every benefit of the doubt on this. They are stressed, overworked, ....etc. so this just slipped by ...

kbahey’s picture

How about something like this then:

// 
// By default, Drupal tries to use a setting that should work in most cases.
$host = $_SERVER['HTTP_HOST'];
//
// $host = 'www.example.com';
// $host = 'www.example.com/subdirectory';
// You can uncomment a line above and put your own host name and subdirectory
// if you want to have Drupal in a subdirectory
//
// Do not change the line below
$base_url = 'http://' . $host;

All I did was break down into two components, and $host is the only place that needs to be uncommented and changed.

Again, this is mostly a documentation issue. If it is in the INSTALL.txt, on the Handbook, and in the comments, then it is a non issue.

chx’s picture

I The problem, of course, is since you and Dries went behind everyone's backs

Dries have made changes to Drupal core more than once without asking anyone and there was no rucus. And I have just asked his opinion in this matter nd have not indeded to backstab anyone, I wanted to roll a patch after knowing his views on this.

Morbus, just because you do not like this change you do not need to cry foul. Peace.

chx’s picture

StatusFileSize
new1.32 KB

So once again, I just asked Dries why we are not using this code, straight out of handbook, I thought there is a special reason.

chx’s picture

StatusFileSize
new1.32 KB

Get rid of double ?:

chx’s picture

StatusFileSize
new1.32 KB
kbahey’s picture

Better wording may be:

* This is the base URL of your website. Normally, this is the main page
* of your site. Usually Drupal can automatically find this by itself.
* If it does not work, remove the // characters and put your
* domain/subdomain name.
* It consists of a hostname, and an optional subdirectory
* name. If you installed Drupal in the main directory of your site, then
* do not include a directory. A trailing slash is not allowed, since Drupal
* will automatically add it for you.

Once concern (or maybe a clarification), what happens if a site is accessed as www.example.com and as example.com? Both will work, but some features in Drupal may break.

For example, the external referer in admin/logs will sometimes consider www.example.com to be external, and on others will consider example.com to be so, hence giving an incomplete picture. There are some cache set/get in bootstrap.inc. In aggregator, the base_url is used.

dries’s picture

Gosh, I've made thousands of commits without consulting the larger community, and I will continue to make changes without consulting the larger community. You make it sound as if I did something aweful, while this is what I used to do all the time. Relax, Morbus. I'll let you guys figure this out, and I'll commit whather change you come up with.

chx’s picture

Also note that the issue is not assigned to me, anyone (for eg. kbahey, Morbus) should feel free to take over.

morbus iff’s picture

Dries: either I've not been catching them, or you "used to do [this] all the time" before I got involved with Drupal. Either way, it offends me: I "signed up" for a "community" "open source" project. I'm not used to seeing code being committed without having the *chance* to voice my opinions on it. I hate to say, but if it ever gets to a point where you're once again "do[ing] [this] all the time", then I'll have to take my meagre toys and go home. I didn't "sign up" for a codebase where I'd have to fight an uphill battle after the fact.

As for the patch itself, I'm still slightly against chx's final version: it needs to support ports other than 80, and I don't like the extra (and easily missable) step of removing the "//" in the settings.php. I'd much rather have a "default" string that we check against (like $base_url = 'http://your-drupal-location-here.org'). But, of course, since I'm not even home right now, I can't generate a proper patch to fix this, and now have to push everything aside when I do get home to retroactively change something that should have gone through peer review anyways. But didn't.

Chx says "So once again, I just asked Dries why we are not using this code, straight out of handbook, I thought there is a special reason." Those special reasons, and concerns, is what this whole flipping Issue workflow thing is supposed to iron out, document, and give voice too. Committing without said workflow does, in fact, make me rabidly angry, yes. Sure, I know there are complaints that the patch workflow is slow and there's "only a few commiters to handle the workload", but if we're heading down the path of "the solution is simple: ignore the workflow entirely", presumably again ("used to do [this] all the time"), then that's the worst salve I've ever seen for a project's growing pains. And I can't be apart of it.

chx’s picture

Just a few _sweeping_ changes: last November Dries refactored statistics module. Steven recently made a unicode.inc . I do not quite remember whether the new filter system got a peer review last fall or not.

I am fine with this model. If you want, consider it this way: for the upper class, peer review happens after commit and not before. Case closed.

adrinux’s picture

for the upper class, peer review happens after commit and not before. Case closed

That's very arrogant. Almost to the point of flamebait.

I know Drupal isn't a democracy, but what is the big deal about putting all patches in the patch queue? Even if Dries applied the patch straight away without review the patch queue would at least be the correct 'Drupal' place to make comment and submit improved patches - bypassing the whole review process and leaving the responses to bug reports is imnsho just wrong.

Secondly, it sets a very bad example to other developers - pester Dries and get your patch applied...

chx’s picture

Noone reads what I say? Is my English so terrible that you can not understand what I am trying to communicate through?

There was no patch, let me quote myself: "I have just asked his opinion in this matter nd have not indeded to backstab anyone, I wanted to roll a patch after knowing his views on this" and then he committed something what surprised even me.

By "upper class" I meant Dries and Steven and noone else.

No, it's not about pestering Dries and getting your patch committed. If I want my patch committed I post a followup "what holds this patch?"

Bèr Kessels’s picture

StatusFileSize
new1.06 KB

I believe te fact a users gets as far as settings.php at least means he or she can read and can copy-paste. This patch does no longer use any "Smart" magic, but merely informs users about what to do .
I could not find a handbook page, but otherwise I would refer to also add a link to that page in the settings.php comments.

Bèr Kessels’s picture

StatusFileSize
new1014 bytes

some odd line slipped in.

adrian’s picture

I vote for having the host populated automatically by default, and having an extra (optional) subdirectory variable, and possibly even a port variable.

So it would be :

/**** top of file ****/
/* if you run drupal in a sub directory, change it here */
$subdir = '';


/*** base url configuration - closer to the bottom of the file ****/
$host = $_SERVER['http_host']
$port = 80;

$base_url = 'http://$host" . ($port ? ":$port" : '') . ($subdir ? "/$subdir" : "");

I also trust dries enough to agree with the current commit practices.

Bèr Kessels’s picture

In fact, I hardly understand all this buzz and fuzz about this setting. I really do not have the feeling that this is an issue at all.
It never comes up on the supp. mailinglist, if it comes up in the forums (and that is not too often) it is always solved within the day. There was one person who had some issues lately, but that is about the only one I can recall.

Also the CS usability investigation did not point out that this was a big issue. Can we not just leave it the way it was (never fix something that aint broken) and put all this energy into getting the install system up?

morbus iff’s picture

I'd be perfectly fine with a rollback and wait-for-installer. Though, I was going to work on a patch this afternoon too.

webchick’s picture

$base_url = 'http://$host" . ($port ? ":$port" : '') . ($subdir ? "/$subdir" : "");

Please, please no. :)

Users understand what an "address" (URL) is for a page. It's the thing they type into their browser to get to a website:

http://drupal.org/about

They don't know that "drupal.org" is the host name, and "about" is a subdirectory, and it's running at the default port 80 so they don't need to change it.

Either have enough flexibility in the "smart" method of figuring this out to detect:

1. http or https
2. host name
3. port number
4. subdirectory

(aka the full path to my Drupal installation, no matter what wacky place I stick it)

or keep it as-is, let the user enter it his/herself, and just write better docs above the section to explain what that setting is for.

A question about this "smart" method of figuring it out though. Will this mess up statistics packages? Because the nice thing about a 'hard-coded' $base_url right now is that if I specify:

$base_url = 'http://www.example.com';

If a user tries to go to just http://example.com, it will automatically expand it to www.example.com when they click on any links.

Bèr Kessels’s picture

@webchick my last patch does nearly all you ask for, mind reviewing it? :)

morbus iff’s picture

Hrm. The statistics problem is an interesting question. Bears investigation.

morbus iff’s picture

Regarding the statistics issue: I thought she meant statistics.module, when actually she meant server log analysers, like webtrends, analog, etc. Those COULD be affected if you're keeping track of referers, as referers are logged with the hostname. For example, here are two referers using different hostnames, though the same content:

# accessing www.disobey.com, clicking on link to disobey.com:
63.173.138.23 - - [28/Nov/2005:13:34:42 -0500] "GET /detergent/code/wowquests.pl HTTP/1.1" 200 6212 "http://www.disobey.com/node/1629" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8) Gecko/20051111 Firefox/1.5"

# accessing disobey.com, clicking on link to www.disobey.com:
63.173.138.23 - - [28/Nov/2005:13:35:44 -0500] "GET /d/code/wowquests.pl HTTP/1.1" 301 338 "http://disobey.com/node/1629" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8) Gecko/20051111 Firefox/1.5"

This can also get further exasperated with SSL: some certificates are issues ONLY to "www.domain.com", and attempts to access "https://domain.com/" will cause errors (Google AdSense (with no-www) vs. Google AdSense (with www).

At this point, I'm leaning more and more into Berkes direction: no logic, more documentation, wait for installer.

kbahey’s picture

I alluded to the statistics issue in #18. It will affect Drupal statistics, e.g. external vs. internal referers.

Anything that uses $base_url has the potential to be affected by this flexible host name thing.

webchick’s picture

@berkes: I like your "patch" except that it fails to recognize that $base_url presents a real problem for users. I answer questions about this on forums and in #drupal_support two or three times per week, easily (I literally just finished answering it yet again). I'm not sure why the mailing lists don't reflect this trend, other than perhaps the mailing lists tend to get a more "technically savvy" audience than the forums since the forums are just "there" for anyone to post to... I don't know if most users who weren't using the Internet back in the mid-90s have ever touched a mailing list.

There are two problems that I see coming up consistently:

1. User leaves setting at http://localhost since he can see that fine on his computer, but since everyone else is accessing it via http://something.homelinux.org or whatever, it breaks for his visitors. (this is less common, though should be solved by the current method)

2. User installs to subdirectory, such as http://www.example.com/drupal, thinks $base_url refers to only the "base" part of the URL (http://www.example.com) and can't figure out why nothing works. (this is VERY common, and the current way only makes solving that problem harder and even more confusing)

But overall, I agree. Revert, and improve documentation, imo. Unless the "fancy" way can be made to deal with all the possible scenarios a user can throw at it.

@kbahey: Sorry, I didn't mean to reiterate what you'd already said. :(

Timotheos’s picture

I have the same perception as webchick that this is a common problem. I've been using this snippet for awhile and really like it. So my vote is to use chx's settings_3.patch

morbus iff’s picture

Status: Active » Needs review
StatusFileSize
new1.09 KB

There appear to be three different camps for this issue:

  • Don't autodetermine $base_url. This group believes that no amount of determination logic can satisfy every possibility, nor wants code that *can't* handle every possibility. They're concerned about ease-of modification, largely due to perceived "hurdles" in modifying the settings (uncommenting some code, syntax ugliness, etc.). They also have misgivings about no-www vs. www, referers, and SSL certificates. This was the way it was before this patch.
  • Try to autodetermine everything. Every possibility should be attempted: HTTPS, subdirectories, port numbers, and so forth. Chx's patch comes closest to this, though misses out on port determination. This approach still has problems with no-www vs. www, referers, and SSL certificates. All problems can be solved by "editing base_url like always".
  • Only autodetermine the commonest possibilities: Determine whether it's HTTPS, the hostname, and a subdirectory. Ignore ports because they're used so infrequently that it'd be wasted code. This approach has problems with no-www vs. www, referers, SSL certifications, and ports. All problems can be solved by "editing base_url like always".

There is no happy medium here that satisfies every group, and I could belong in "the way it was" or "autodetermine everything" camps. However, I think the no-www vs. www and SSL certificates issues are "hidden dangers": you don't really know they're broken until it's too late: when your log analyser reports incorrect referer counts for your own site (www.disobey.com vs. disobey.com, etc.) or when you lose a customer due to a scary SSL prompt (see comment #33).

Thus, I'd like to suggest that:

  • we include more documentation about $base_url.
  • we don't autodetermine $base_url. YET. not for 4.7, at least.
  • we wait for logical $base_url determination in our installer.

Thus, the attached patch is largely a cleaned up version of berkes'.

webchick’s picture

+1 Morbus!! :)

Bèr Kessels’s picture

+1 For the Morbus way.
Was't KISS one of Drupals philosophies?

chx’s picture

Status: Needs review » Reviewed & tested by the community

There seems to be a consensus on Morbus' solution.

dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD. Thanks.

syscrusher’s picture

Okay, I know I'm weighing in late, but.....

+1 on the current state of HEAD (per Dries' latest commit). Among other things -- and I haven't seen anyone else mention this -- the use of "example.com" rather than something like "localhost" specifically complies with IETF RFC 2606, section 3, in the choice of second-level domains for documentation purposes.

Scott

sepeck’s picture

Yes well I liked Dries commit better as well but hadn't had time to test it until last night. tested fine last night. Now I will have to test the new one sooner than later.

Anonymous’s picture

Status: Fixed » Closed (fixed)