Working on a weird bug.

Yesterday, a lot of listings were wiped out. I debugged the query, its all okay, I can import all the listings again. I went through the process of trying to figure out why the listing was deleted, and I keep coming back to this point in process_results method:


$this->handle_expired($in_rets, $connection->conid, $class);

It's called correctly, outside of the while loop, but what if no queue items were processed? Is that ever possible? If it is, $in_rets would be an empty array(). Then, in handle_expired:


$results = db_select('drealty_listing', 'dl')
      ->fields('dl', array('id', 'rets_key'))
      ->condition('conid', $conid)
      ->condition('class', $class->cid)
      ->execute()
      ->fetchAllAssoc('rets_key');

    $diff = array_diff_key($results, $in_rets);

    foreach ($diff as $item) {
      switch ($class->expired_handler) {
        case 0:
          $listing = drealty_listing_load($item->id);
          $listing->delete();
          break;
        case 1:
          db_update('drealty_listing')
            ->fields(array('active' => 0))
            ->condition('id', $item->id)
            ->execute();
          break;
        default:
          $listing = drealty_listing_load($item->id);
          $listing->delete();
      }
    }

Doing array_key_diff on a populated array against an empty array will return a diff of all the keys in the first array, which would be just about every single listing not in the previous update.

Example:


// sample listing ids returned from a system query in handle_expired, but could be thousands of them
$results = array(4691, 4692, 4693, 4694, 4695, 4696, 4697, 4698);

// $in_rets is defined in process_results as an empty array(); to start with.
$in_rets = array();
  
$diff = array_diff_key($results, $in_rets);

foreach ($diff as $item) {
    var_dump($item);
    #$listing = drealty_listing_load($item->id);
    #$listing->delete();
}

results in:

4691
4692
4693
4694
4695
4696
4697
4698

Which, if you have selected 'delete expired listings' for the class, it seems like it will just delete the listing even though it is still valid.

I am still debugging this, and its just a hunch right now- but a fix could be:

// end of process_results method
if (count($in_rets)) {
  //handle expired listings
  $this->handle_expired($in_rets, $connection->conid, $class);
}

And just commenting out the default case in the switch statement in handle_expired. Still, I am not entirely confident that this is a bulletproof method of determining what listings are not in RETS anymore- assuming something goes bad during the item queuing process, we're assuming that all the rets IDs that ever existed would always be in the queue, when I think we should really be checking RETS as it is the master database.

I could be wrong, but at the moment I am not sure this is functioning correctly. I am going to give this another hour or two to make sure that a listing isn't deleted unless it really does not exist in RETS.

CommentFileSizeAuthor
#23 Screenshot_6_24_14_12_24_AM.png114.38 KBjday
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

kevinquillen’s picture

stockliasteroid’s picture

I agree, there needs to be something in the dRealty import processing to handle an underlying exception in phRETS. As it is now if it gets nothing back from the underlying phRETS call expire_listings wipes the entire set of listings. The only time I ran into an exception was in the issue you referenced, but it could also be possible that the RETS server throws a 500 due to maintenance or something like that, and as far as I can tell that's not going to get gracefully handled.

There's quite a few instances in phRETS where SearchQuery will return false instead of a handle to the result set. dRealty should check for this and immediately halt processing of listings (in fetch_listings_*), because when it proceeds it happily builds an empty result set, which causes handle_expired to wipe the class.

So it could be the same evenly divisible issue, but it looks to me like there's a number of posible conditions in phRETS that could cause this.

Edit: I see that dRealty does check $rets->error(), but phRETS doesn't handle some exceptions well internally in such a way that error gets set. So dRealty should probably check the resultset to make sure it makes sense before proceeding (i.e. check to make sure false isn't returned instead of a real result, etc).

kevinquillen’s picture

Yeah, I had a feeling on that too.

I am having issue with Acquia Solr at the moment returning a 403 when it tries to POST new content (listings) to index to the server, which kills drush immediately. I can't pinpoint where this went wrong, but something between Drupal / phRETS / Innovia stopped dRealty or fed it lies somewhere, causing errant deletes.

kevinquillen’s picture

I guess the first step in finding a solid solution to this would be plugging the holes in phRETS where it returns errors from methods we don't currently see- yet. If any are detected it just quits the entire process.

I've thought for a bit this afternoon about how to re-enable the listing delete mechanism, but I can't think of any where I'd have confidence in allowing the delete to happen.

One reason why I am deterred in deleting now is that we have custom data attached to listings that clients populate after they are in the system, like documents, PDFs, comments, administrative stuff. When a listing is deleted, all that goes too.

stockliasteroid’s picture

Yeah, I have an identical problem, I have a lot of custom content attached to dRealty entities.

I guess the worst failure (and the easiest to check for) is if handle_expired gets an empty result set for $in_rets, but a non empty result set for $results. In that case it's going to dump everything, so I guess a quick sanity check would be just to make sure $in_rets isn't empty, and "cowardly refuse to delete" listings in that instance. I suppose in some edge cases a listing class could be totally empty in RETS, though.

Secondly, in the fetch_listings_* functions, $search needs to be checked to see if it's FALSE (as it is in some situations when phRETS chokes), and if so the whole function should return false, which can be tested for in ProcessRetsClass(). If fetch* returns false the queue should probably be cleared, the class processing stopped, and a message logged.

That would be a fairly nonintrusive way to at least prevent mass deletions when something goes wrong on the RETS side, I think.

kevinquillen’s picture

Seems likely (in my experience) each listing class will have at least one listing in it. If it has no listings, probably isn't getting processed anyway.

My concern with anything empty is, theres no way to know if RETS listings are truly empty, or there was a fault in the RETS system, or someone made a mistake flagging the listing when changing its data (status for instance) by mistake. All we know on the other end is oh, those listings were not there, guess they are not in the system, delete.

Maybe a safer guard would be setting the listings to inactive, then having another drush function clean up any listings that have been inactive for more than X weeks, just so there is enough window there to prevent and/or fix mistakes in data or debug any connection/server issues. I'd be totally fine if the listing was deleted, then just reimported, but in this case there is way too much tied on to an entity in Drupal. Non-rets based fields, Apache Solr indices tied through Views- any time an entity id is in the Solr index but doesn't exist causes massive PHP errors that result in a WSOD- I am still tracking that one down. It's elusive.

But insta-delete is too hasty when we aren't 100% sure. Probably should institute some sort of window.

kevinquillen’s picture

Status: Active » Needs review

I've made a handful of commits on this issue to dev.

- inactive_date added to schema and update_7305
- New Drush hook for deleting inactive listings (only listings older than two weeks are deleted)
- handle_expired now sets active 0 and a timestamp only
- Changes to process_results and ProcessRetsClass to ensure we are processing a valid result set returned from phRETS

http://drupal.org/node/824254/commits

kevinquillen’s picture

I forgot to remove the settings from Drealty that asks if you want to delete listings or set as inactive- I'll need to remove those too.

I will also need to add a little more code for listing entity_view to return drupal_not_found() if the active property is 0, and update the property_info_alter hook.

kevinquillen’s picture

Just did all those mentioned in #8. Going to need tests for this before we consider a new release. I've tested most of the import process changes, and listing purging, as well as the drealty_listing_view modification.

TwoD’s picture

I've got some input on this issue, but I don't have time to sum it up right now. Need to check the current code and compare it to some changes I've got in a private branch. Going away over the weekend so I hope I've got time to do it before that.

kevinquillen’s picture

I've also noticed that some listings are being errantly marked inactive as well, even though they are active. I have one instance where 2401 listings were marked inactive, when some of those I was able to find in the main vendors listings.

Something is off about this method, and should be reviewed.

kevinquillen’s picture

Yes, more listings are marked inactive every day.

I think the issue is we are comparing array keys, so we should be doing this:

$diff = array_diff_key(array_keys($results), $in_rets);

instead of:

$diff = array_diff_key($results, $in_rets);

I did a test of this and was able to see that even with the same MLS numbers in $in_rets, it was coming back in the $diff when it should not be (it exists in the system). Adding array_keys() around $results gave me the correct $diff I am looking for.

Does anyone else have this problem?

kevinquillen’s picture

Version: 7.x-3.0-beta3 » 7.x-3.x-dev

I think the safe method here is to assume a listing that is gone is deactivated.

The safe way of approaching this (in my opinion) is simply marking the record as inactive (in Drupal), with a new drush command that prunes inactive listings after X amount of time.

This should give people a decent window to correct mistakes - such as a listing being marked inactive accidentally, creating a bad chain of events into Drupal if you have additional non-RETS data attached to your listings.

I believe this is working in dev, but will give it another look over to confirm.

kevinquillen’s picture

Status: Needs review » Needs work

Just noticed something else.

In the update_single_listing and process_results, if the record being processed is Active, the listing needs to be set to active and the inactive date set to null.

Example:

if (isset($item[$field_mappings['rets_status']->systemname]) && $item[$field_mappings['rets_status']->systemname] === 'Active') {
  $listing->active = 1;
  $listing->inactive_date = NULL;
}

Otherwise, a listing that is brought back to life isn't properly set.

TwoD’s picture

For me, it's not safe to assume that listings no longer available in the feed are inactive, since I can only get listings which have changed in the last 10 days (except for an initial "full" run, which needs other credentials).
This is why I created the first patch in #1555740: Ignore expired listings so nothing is done to "missing" listings.

I've not been able to update my production version of drealty since that and related patches became incompatible with the current code, but I'm still searching for ways to integrate this type of RETS feed with drealty.

It's also not safe to assume the value of the status field is "Active", in my case it's simply "A", and I'm required by the feed provider to include Status=A in all my queries (though doing so broke it recently).

I know they are using StratusRETS/1.7, which they claim "fully supports the features of the Real Estate Transactions Standard (RETS)", yet just about every meta data field it reports is wrong in some way. :(

kevinquillen’s picture

Yes, I have seen that too. I am puzzled over what the standard actually means, because I have seen vast differences in 1.5 and 1.7 in different vendors, particularly dealing with how Status is noted on a listing.

kevinquillen’s picture

TwoD, you are correct about the statuses. I just noticed rets_status property is not used in some meaningful way when it comes to that.

I just pushed two commits to dev that fix the problems above, as well as set inactive_date to NULL when processing results (for listings that come back from another status to Active).

In update_single_listing, there needs to be a way to check that data and see if the listing is 'Active' by the vendor designation, if so, set active = 1 and set inactive_date to NULL.

Exploratus’s picture

I am having the problem in #12. I tried your solution in #12, but I just got an error after the import, so I reverted.

I have a whole bunch of nodes that get marked inactive for no apparent reason. Trying to debug.

citricguy’s picture

I've noticed that along with being mass-disabled, some listings are being duplicated as well. drealty_listing.rets_key and drealty_listing.rets_id are not unique.

I have the same issue as #5 though too. This gets tricky as I'm storing a ton of context and additional data with each listing.

As noted in #18, the code edit in #12 applied against 7.x-3.0-beta5 causes an error during import:

WD php: EntityMetadataWrapperException: Unable to set the data [error]
property active as the parent data structure is not set. in
EntityStructureWrapper->setProperty() (line 473 of
/var/www/example.local/sites/all/modules/entity/includes/entity.wrapper.inc).
[37.43 sec, 69.42 MB]
EntityMetadataWrapperException: Unable to set the data property active as the parent data structure is not set. in EntityStructureWrapper->setProperty() (line 473 of /var/www/example.local/sites/all/modules/entity/includes/entity.wrapper.inc).
Drush command terminated abnormally due to an unrecoverable error. [error]
[37.43 sec, 69.42 MB]

Will try the above again with the current dev version and will report back.

citricguy’s picture

This issue has gone away after a full RETS refresh (including photos) and after updating to the newest version of PHRets.php.

Because the problem is intermittent, I can't be sure that these resolved the issue but will update if I run into the issue again.

kevinquillen’s picture

New version of phRETS? I'll have to check that out. What version/tag is it?

chrisfree’s picture

@kevinquillen and citricguy - Have you guys had any luck resolving this issue? I've recently inherited management of a project that is having the intermittent abandoned records (set to inactive) issue. Looking at the dev version of the module, I'm not seeing the suggested fixes in the code.

Any help here would be very much appreciated.

Best,
Chris

jday’s picture

Same problems here: duplicate listings, active listings with a drealty status of 0, I've flushed listings and started from scratch, same results. I'm ready to try anything as the site is pretty useless at this point.

kevinquillen’s picture

That's odd. Are you sure you aren't getting the listings from RETS like that? The street names are different and some have a status and some do not.

jday’s picture

I've tried using the developer tools page but have not been able to get a result with it. I really need updated listings to either update the already existing listing or delete the old one because there are too many with the same mls ID.

jday’s picture

I keep additional buyer and seller information in a separate content type and link it to the listing with an entity reference field. Keeping multiple versions of a listing is really messing up the functionality of my site. I will pay for support.

shauntyndall’s picture

Status: Needs work » Active
veronicaSeveryn’s picture

Status: Active » Closed (outdated)

handle_expired() is not utilized any longer.
Drealty Reconciler module is implemented instead.