XML feeds over Excel uses IE

Aren Cambre - June 18, 2009 - 14:59
Project:Views Bonus Pack
Version:6.x-1.0-beta4
Component:Views Export
Category:support request
Priority:normal
Assigned:Unassigned
Status:closed
Description

This module returns nodes with missing fields if I request the XML feed with IE. If I request the same feed with Firefox, I consistently get data.

I have one feed pared down to one field type on a node. Attached file ie.png shows what I see when I load the feed using IE as the request initiator. Attached file ff.png shows the result if I access the exact same feed using FF. (Note that IE is the default XML viewer, so you see IE windows both times; trust me, however, I am really using FF to request the feed.)

An XML feed with multiple fields may have some fields that consistently display regardless of browser. E.g., if I have fields ABCDEF, a FF request will get all of them, but an IE request will consistently get ABDF and not C or E. Even if I reorder the nodes, I still will not get C or E with IE.
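One way to pin down exactly which fields go missing is to save both responses and compare them programmatically. The following is only a rough sketch: the `<nodes>`/`<node>` layout, the single-letter field names, and the helper names `populated_fields` and `missing_fields` are illustrative assumptions, not the module's actual output format.

```python
import xml.etree.ElementTree as ET


def populated_fields(feed_xml, node_tag="node"):
    """Return the names of child elements that hold non-empty text in any node."""
    root = ET.fromstring(feed_xml)
    fields = set()
    for node in root.iter(node_tag):
        for child in node:
            if (child.text or "").strip():
                fields.add(child.tag)
    return fields


def missing_fields(complete_xml, suspect_xml, node_tag="node"):
    """List fields populated in the complete feed but empty or absent in the suspect one."""
    complete = populated_fields(complete_xml, node_tag)
    suspect = populated_fields(suspect_xml, node_tag)
    return sorted(complete - suspect)


# Hypothetical sample feeds mirroring the ABCDEF example above.
ff_feed = "<nodes><node><A>1</A><B>2</B><C>3</C><D>4</D><E>5</E><F>6</F></node></nodes>"
ie_feed = "<nodes><node><A>1</A><B>2</B><C/><D>4</D><F>6</F></node></nodes>"

print(missing_fields(ff_feed, ie_feed))  # ['C', 'E']
```

The comparison treats an empty element and an absent element the same way, since either one shows up as missing data in Excel.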

Because of this IE problem, I am missing data when I import the XML feed directly into Microsoft Excel since it uses IE for HTTP requests.

I used Wireshark to review the underlying TCP traffic. Wireshark shows the network data before any application logic on my PC can possibly mess with it. Drupal is absolutely sending my client an incomplete feed.

I presume that the only way to explain the difference between what I'm observing with IE and Firefox is to compare the HTTP GET requests. I have done so and found little meaningful difference. See the Excel file in attachment IEvsFF.zip. The only differences I noticed are:

  • The Accept headers in the two HTTP GET requests differ, but neither specifies the text/xml content type that is actually returned (unless some of the wildcards in the FF Accept string include this?).
  • IE does not submit Accept-Charset or Keep-Alive headers.
  • Other minor differences that I think are trivial, like capitalization and Accept-Language header value differences.
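A header comparison like the one above can also be done in a few lines of code rather than by eye. This is a minimal sketch that assumes the captured request headers have already been copied into dictionaries; the `diff_headers` helper and the sample header values are made up for illustration and are not the values from the attachment.

```python
def diff_headers(a, b):
    """Compare two header dicts, ignoring capitalization of header names."""
    norm_a = {name.lower(): value for name, value in a.items()}
    norm_b = {name.lower(): value for name, value in b.items()}
    only_a = sorted(set(norm_a) - set(norm_b))
    only_b = sorted(set(norm_b) - set(norm_a))
    changed = {
        name: (norm_a[name], norm_b[name])
        for name in sorted(set(norm_a) & set(norm_b))
        if norm_a[name] != norm_b[name]
    }
    return only_a, only_b, changed


# Illustrative values only, not the real captures.
ff_headers = {
    "Accept": "text/html,*/*;q=0.8",
    "Accept-Charset": "utf-8",
    "Keep-Alive": "300",
    "Accept-Language": "en-us,en;q=0.5",
}
ie_headers = {
    "accept": "image/gif, */*",
    "accept-language": "en-US",
}

only_ff, only_ie, changed = diff_headers(ff_headers, ie_headers)
print(only_ff)          # ['accept-charset', 'keep-alive']
print(only_ie)          # []
print(sorted(changed))  # ['accept', 'accept-language']
```

Lowercasing the names first keeps capitalization differences from showing up as false positives.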

I even closed all instances of IE, opened a fresh instance, cleared my history, cookies, and cache, and turned on InPrivate browsing. This should totally isolate IE from any residue from prior sessions. Even then I have missing fields.

Attachment    Size
IEvsFF.zip    7.62 KB
ie.png        43.95 KB
ff.png        57.86 KB

#1

Aren Cambre - June 18, 2009 - 15:39

The above data was collected on a laptop. Just to provide further ammunition, I also ran Firefox on the desktop PC that is hosting Drupal in an Ubuntu VM. That Firefox sees the exact same results: Drupal declines to return some data when IE requests it.

HOWEVER: I cannot reproduce this on my desktop PC. When I request with IE on the desktop, I get perfect results every time.

The laptop is Vista 32 bit with IE 8. My desktop PC is Vista x64 with IE 8.

I attached a new version of the HTTP headers that includes my desktop IE 8. It says MSIE 7.0 because it was in compatibility view mode. ("Compatibility view" makes the browser act like IE 7.) I turned off compatibility view mode and still get perfect data from Drupal every time.

Attachment      Size
IEvsFFv2.zip    8.9 KB

#2

neclimdul - June 19, 2009 - 16:15

I'm curious: since it's only on that one server in that very constrained set of conditions, is there any caching/proxy setup on it that might be affecting the output? To my knowledge I'm not doing anything that could cause this, so it seems likely this is being done by something outside the Drupal code. Also, are both your setups hardened with Suhosin?

#3

Aren Cambre - June 19, 2009 - 16:19

No caching or proxy. This is straight IP-to-IP communications with only passive network equipment in the middle.

This happens regardless of whether the laptop is on the same class B network as the server or communicating through a home internet connection.

Not sure of meaning of "both your setups," but Drupal is running on a vanilla Ubuntu LAMP server, so Suhosin is in its as-delivered state.

#4

neclimdul - July 23, 2009 - 01:55
Version:6.x-1.0-beta2» 6.x-1.0-beta4
Status:active» postponed (maintainer needs more info)

Is this still valid with the current beta? I think some things might have changed that fixed it.

#5

Aren Cambre - July 27, 2009 - 18:14
Title:XML feed fields are sometimes empty or missing with IE» XML feeds over Excel uses IE
Category:bug report» support request
Status:postponed (maintainer needs more info)» needs review

Just figured it out. Excel uses Internet Explorer's object model to pull data over HTTP. The IE object model retains the current user's cookies. If I am currently signed in through IE, Excel gets all fields. If I am not signed in, then I only get the fields that CCK's Content Permissions module allows for anonymous users.

Solutions:

  • Remove content restrictions for anon users for fields that you want exportable through XML.
  • Sign in to your Drupal site with IE before importing data with Excel.

Marked this issue as "needs review" in case you want to add anything else before closing it.
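For anyone who wants to confirm the same cause on their own site, the difference comes down to whether the request carries a session cookie. Below is a rough sketch using Python's standard library; the feed URL and the cookie string are placeholders (copy the real session cookie from a signed-in browser), and `build_feed_request` is just an illustrative helper name.

```python
import urllib.request


def build_feed_request(url, session_cookie=None):
    """Build a GET request for the feed, optionally carrying a session cookie."""
    req = urllib.request.Request(url)
    if session_cookie:
        req.add_header("Cookie", session_cookie)
    return req


# Placeholders: substitute your own feed URL and session cookie.
FEED_URL = "http://example.com/my-view/xml"
SESSION_COOKIE = "SESSxxxxxxxx=placeholder"

anon = build_feed_request(FEED_URL)
signed_in = build_feed_request(FEED_URL, SESSION_COOKIE)

print(anon.has_header("Cookie"), signed_in.has_header("Cookie"))  # False True

# To fetch for real, pass either request to urllib.request.urlopen()
# and compare the two response bodies.
```

If the anonymous response lacks fields that the signed-in response has, field permissions for anonymous users are the likely cause.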

#6

neclimdul - July 27, 2009 - 21:20
Status:needs review» closed

No, I think that's enough to help anyone that stumbles across this sort of module/weird Windows interaction on their site in the future. Thanks for the detailed follow up explanation.

 
 
