Working on a weird bug.
Yesterday, a lot of listings were wiped out. I debugged the query, its all okay, I can import all the listings again. I went through the process of trying to figure out why the listing was deleted, and I keep coming back to this point in process_results method:
$this->handle_expired($in_rets, $connection->conid, $class);
It's called correctly, outside of the while loop, but what if no queue items were processed? Is that ever possible? If it is, $in_rets would be an empty array(). Then, in handle_expired:
$results = db_select('drealty_listing', 'dl')
->fields('dl', array('id', 'rets_key'))
->condition('conid', $conid)
->condition('class', $class->cid)
->execute()
->fetchAllAssoc('rets_key');
$diff = array_diff_key($results, $in_rets);
foreach ($diff as $item) {
switch ($class->expired_handler) {
case 0:
$listing = drealty_listing_load($item->id);
$listing->delete();
break;
case 1:
db_update('drealty_listing')
->fields(array('active' => 0))
->condition('id', $item->id)
->execute();
break;
default:
$listing = drealty_listing_load($item->id);
$listing->delete();
}
}
Doing array_key_diff
on a populated array against an empty array will return a diff of all the keys in the first array, which would be just about every single listing not in the previous update.
Example:
// sample listing ids returned from a system query in handle_expired, but could be thousands of them
$results = array(4691, 4692, 4693, 4694, 4695, 4696, 4697, 4698);
// $in_rets is defined in process_results as an empty array(); to start with.
$in_rets = array();
$diff = array_diff_key($results, $in_rets);
foreach ($diff as $item) {
var_dump($item);
#$listing = drealty_listing_load($item->id);
#$listing->delete();
}
results in:
4691
4692
4693
4694
4695
4696
4697
4698
Which, if you have selected 'delete expired listings' for the class, it seems like it will just delete the listing even though it is still valid.
I am still debugging this, and its just a hunch right now- but a fix could be:
// end of process_results method
if (count($in_rets)) {
//handle expired listings
$this->handle_expired($in_rets, $connection->conid, $class);
}
And just commenting out the default case in the switch statement in handle_expired. Still, I am not entirely confident that this is a bulletproof method of determining what listings are not in RETS anymore- assuming something goes bad during the item queuing process, we're assuming that all the rets IDs that ever existed would always be in the queue, when I think we should really be checking RETS as it is the master database.
I could be wrong, but at the moment I am not sure this is functioning correctly. I am going to give this another hour or two to make sure that a listing isn't deleted unless it really does not exist in RETS.
Comment | File | Size | Author |
---|
Comments
Comment #1
kevinquillen CreditAttribution: kevinquillen commentedPossibly related:
http://drupal.org/node/1658316
Comment #2
stockliasteroid CreditAttribution: stockliasteroid commentedI agree, there needs to be something in the dRealty import processing to handle an underlying exception in phRETS. As it is now if it gets nothing back from the underlying phRETS call expire_listings wipes the entire set of listings. The only time I ran into an exception was in the issue you referenced, but it could also be possible that the RETS server throws a 500 due to maintenance or something like that, and as far as I can tell that's not going to get gracefully handled.
There's quite a few instances in phRETS where SearchQuery will return false instead of a handle to the result set. dRealty should check for this and immediately halt processing of listings (in fetch_listings_*), because when it proceeds it happily builds an empty result set, which causes handle_expired to wipe the class.
So it could be the same evenly divisible issue, but it looks to me like there's a number of posible conditions in phRETS that could cause this.
Edit: I see that dRealty does check $rets->error(), but phRETS doesn't handle some exceptions well internally in such a way that error gets set. So dRealty should probably check the resultset to make sure it makes sense before proceeding (i.e. check to make sure false isn't returned instead of a real result, etc).
Comment #3
kevinquillen CreditAttribution: kevinquillen commentedYeah, I had a feeling on that too.
I am having issue with Acquia Solr at the moment returning a 403 when it tries to POST new content (listings) to index to the server, which kills drush immediately. I can't pinpoint where this went wrong, but something between Drupal / phRETS / Innovia stopped dRealty or fed it lies somewhere, causing errant deletes.
Comment #4
kevinquillen CreditAttribution: kevinquillen commentedI guess the first step in finding a solid solution to this would be plugging the holes in phRETS where it returns errors from methods we don't currently see- yet. If any are detected it just quits the entire process.
I've thought for a bit this afternoon about how to re-enable the listing delete mechanism, but I can't think of any where I'd have confidence in allowing the delete to happen.
One reason why I am deterred in deleting now is that we have custom data attached to listings that clients populate after they are in the system, like documents, PDFs, comments, administrative stuff. When a listing is deleted, all that goes too.
Comment #5
stockliasteroid CreditAttribution: stockliasteroid commentedYeah, I have an identical problem, I have a lot of custom content attached to dRealty entities.
I guess the worst failure (and the easiest to check for) is if handle_expired gets an empty result set for $in_rets, but a non empty result set for $results. In that case it's going to dump everything, so I guess a quick sanity check would be just to make sure $in_rets isn't empty, and "cowardly refuse to delete" listings in that instance. I suppose in some edge cases a listing class could be totally empty in RETS, though.
Secondly, in the fetch_listings_* functions, $search needs to be checked to see if it's FALSE (as it is in some situations when phRETS chokes), and if so the whole function should return false, which can be tested for in ProcessRetsClass(). If fetch* returns false the queue should probably be cleared, the class processing stopped, and a message logged.
That would be a fairly nonintrusive way to at least prevent mass deletions when something goes wrong on the RETS side, I think.
Comment #6
kevinquillen CreditAttribution: kevinquillen commentedSeems likely (in my experience) each listing class will have at least one listing in it. If it has no listings, probably isn't getting processed anyway.
My concern with anything empty is, theres no way to know if RETS listings are truly empty, or there was a fault in the RETS system, or someone made a mistake flagging the listing when changing its data (status for instance) by mistake. All we know on the other end is oh, those listings were not there, guess they are not in the system, delete.
Maybe a safer guard would be setting the listings to inactive, then having another drush function clean up any listings that have been inactive for more than X weeks, just so there is enough window there to prevent and/or fix mistakes in data or debug any connection/server issues. I'd be totally fine if the listing was deleted, then just reimported, but in this case there is way too much tied on to an entity in Drupal. Non-rets based fields, Apache Solr indices tied through Views- any time an entity id is in the Solr index but doesn't exist causes massive PHP errors that result in a WSOD- I am still tracking that one down. It's elusive.
But insta-delete is too hasty when we aren't 100% sure. Probably should institute some sort of window.
Comment #7
kevinquillen CreditAttribution: kevinquillen commentedI've made a handful of commits on this issue to dev.
- inactive_date added to schema and update_7305
- New Drush hook for deleting inactive listings (only listings older than two weeks are deleted)
- handle_expired now sets active 0 and a timestamp only
- Changes to process_results and ProcessRetsClass to ensure we are processing a valid result set returned from phRETS
http://drupal.org/node/824254/commits
Comment #8
kevinquillen CreditAttribution: kevinquillen commentedI forgot to remove the settings from Drealty that asks if you want to delete listings or set as inactive- I'll need to remove those too.
I will also need to add a little more code for listing entity_view to return drupal_not_found() if the active property is 0, and update the property_info_alter hook.
Comment #9
kevinquillen CreditAttribution: kevinquillen commentedJust did all those mentioned in #8. Going to need tests for this before we consider a new release. I've tested most of the import process changes, and listing purging, as well as the drealty_listing_view modification.
Comment #10
TwoDI've got some input on this issue, but I don't have time to sum it up right now. Need to check the current code and compare it to some changes I've got in a private branch. Going away over the weekend so I hope I've got time to do it before that.
Comment #11
kevinquillen CreditAttribution: kevinquillen commentedI've also noticed that some listings are being errantly marked inactive as well, even though they are active. I have one instance where 2401 listings were marked inactive, when some of those I was able to find in the main vendors listings.
Something is off about this method, and should be reviewed.
Comment #12
kevinquillen CreditAttribution: kevinquillen commentedYes, more listings are marked inactive every day.
I think the issue is we are comparing array keys, so we should be doing this:
instead of:
I did a test of this and was able to see that even with the same MLS numbers in $in_rets, it was coming back in the $diff when it should not be (it exists in the system). Adding array_keys() around $results gave me the correct $diff I am looking for.
Does anyone else have this problem?
Comment #13
kevinquillen CreditAttribution: kevinquillen commentedI think the safe method here is to assume a listing that is gone is deactivated.
The safe way of approaching this (in my opinion) is simply marking the record as inactive (in Drupal), with a new drush command that prunes inactive listings after X amount of time.
This should give people a decent window to correct mistakes - such as a listing being marked inactive accidentally, creating a bad chain of events into Drupal if you have additional non-RETS data attached to your listings.
I believe this is working in dev, but will give it another look over to confirm.
Comment #14
kevinquillen CreditAttribution: kevinquillen commentedJust noticed something else.
In the update_single_listing and process_results, if the record being processed is Active, the listing needs to be set to active and the inactive date set to null.
Example:
Otherwise, a listing that is brought back to life isn't properly set.
Comment #15
TwoDFor me, it's not safe to assume that listings no longer available in the feed are inactive, since I can only get listings which have changed in the last 10 days (except for an initial "full" run, which needs other credentials).
This is why I created the first patch in #1555740: Ignore expired listings so nothing is done to "missing" listings.
I've not been able to update my production version of drealty since that and related patches became incompatible with the current code, but I'm still searching for ways to integrate this type of RETS feed with drealty.
It's also not safe to assume the value of the status field is "Active", in my case it's simply "A", and I'm required by the feed provider to include
Status=A
in all my queries (though doing so broke it recently).I know they are using StratusRETS/1.7, which they claim "fully supports the features of the Real Estate Transactions Standard (RETS)", yet just about every meta data field it reports is wrong in some way. :(
Comment #16
kevinquillen CreditAttribution: kevinquillen commentedYes, I have seen that too. I am puzzled over what the standard actually means, because I have seen vast differences in 1.5 and 1.7 in different vendors, particularly dealing with how Status is noted on a listing.
Comment #17
kevinquillen CreditAttribution: kevinquillen commentedTwoD, you are correct about the statuses. I just noticed rets_status property is not used in some meaningful way when it comes to that.
I just pushed two commits to dev that fix the problems above, as well as set inactive_date to NULL when processing results (for listings that come back from another status to Active).
In update_single_listing, there needs to be a way to check that data and see if the listing is 'Active' by the vendor designation, if so, set active = 1 and set inactive_date to NULL.
Comment #18
Exploratus CreditAttribution: Exploratus commentedI am having the problem in #12. I tried your solution in #12, but I just got an error after the import, so I reverted.
I have a whole bunch of nodes that get marked inactive for no apparent reason. Trying to debug.
Comment #19
citricguy CreditAttribution: citricguy commentedI've noticed that along with being mass-disabled, some listings are being duplicated as well. drealty_listing.rets_key and drealty_listing.rets_id are not unique.
I have the same issue as #5 though too. This gets tricky as I'm storing a ton of context and additional data with each listing.
As noted in #18, the code edit in #12 applied against 7.x-3.0-beta5 causes an error during import:
Will try the above again with the current dev version and will report back.
Comment #20
citricguy CreditAttribution: citricguy commentedThis issue has gone away after a full RETS refresh (including photos) and after updating to the newest version of PHRets.php.
Because the problem is intermittent, I can't be sure that these resolved the issue but will update if I run into the issue again.
Comment #21
kevinquillen CreditAttribution: kevinquillen commentedNew version of phRETS? I'll have to check that out. What version/tag is it?
Comment #22
chrisfree CreditAttribution: chrisfree commented@kevinquillen and citricguy - Have you guys had any luck resolving this issue? I've recently inherited management of a project that is having the intermittent abandoned records (set to inactive) issue. Looking at the dev version of the module, I'm not seeing the suggested fixes in the code.
Any help here would be very much appreciated.
Best,
Chris
Comment #23
jday CreditAttribution: jday commentedSame problems here: duplicate listings, active listings with a drealty status of 0, I've flushed listings and started from scratch, same results. I'm ready to try anything as the site is pretty useless at this point.
Comment #24
kevinquillen CreditAttribution: kevinquillen commentedThat's odd. Are you sure you aren't getting the listings from RETS like that? The street names are different and some have a status and some do not.
Comment #25
jday CreditAttribution: jday commentedI've tried using the developer tools page but have not been able to get a result with it. I really need updated listings to either update the already existing listing or delete the old one because there are too many with the same mls ID.
Comment #26
jday CreditAttribution: jday commentedI keep additional buyer and seller information in a separate content type and link it to the listing with an entity reference field. Keeping multiple versions of a listing is really messing up the functionality of my site. I will pay for support.
Comment #27
shauntyndall CreditAttribution: shauntyndall at Inclind Inc commentedComment #28
veronicaSeveryn CreditAttribution: veronicaSeveryn at Inclind Inc commentedhandle_expired() is not utilized any longer.
Drealty Reconciler module is implemented instead.