I don't believe the query does that at the moment.

db_query("UPDATE {pift_test}
SET status = %d
WHERE type = %d
AND id IN (
  SELECT f.fid
  FROM {files} f
  LEFT JOIN {upload} u
    ON f.fid = u.fid
  LEFT JOIN {comment_upload} cu
    ON f.fid = cu.fid

  JOIN {project_issues} pi
    ON " . $clause . "
  JOIN {pift_project} p
    ON pi.pid = p.pid
  JOIN {project_release_nodes} r
    ON pi.rid = r.nid

  JOIN {node} n
    ON r.nid = n.nid
  JOIN {term_node} t
    ON (n.vid = t.vid AND t.tid IN (" . db_placeholders($api_versions, 'int') . "))

  WHERE pi.sid IN (" . db_placeholders($sids, 'int') . ")
)
AND status > %d
AND last_tested < %d",
array_merge(array(PIFT_STATUS_QUEUE, PIFT_TYPE_FILE), $api_versions, $sids,
            array(PIFT_STATUS_SENT, $retest_time)));
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

boombatower’s picture

Title: Ensure/Refactor cron_rest() query to only re-test the last file on an issue » Ensure/Refactor cron_retest() query to only re-test the last file on an issue
boombatower’s picture

Using the following test code I was able to determine that the query does infact affect all patches on an issues instead of just the last.

  $api_versions = pift_core_api_versions();
  $sids = variable_get('pift_status', array());
//   $retest_time = time() - PIFT_RETEST * 20;
  $retest_time = time();
//   $retest_time = 0;
  
  drupal_set_message(print_r($api_versions, TRUE));
  drupal_set_message(print_r($sids, TRUE));
  drupal_set_message($retest_time);
  
  foreach (array('u.nid = pi.nid', 'cu.nid = pi.nid') as $clause) {
    $result = db_query("SELECT *
              FROM {pift_test}
              WHERE type = %d
              AND id IN (
                SELECT f.fid
                FROM {files} f
                LEFT JOIN {upload} u
                  ON f.fid = u.fid
                LEFT JOIN {comment_upload} cu
                  ON f.fid = cu.fid

                JOIN {project_issues} pi
                  ON " . $clause . "
                JOIN {pift_project} p
                  ON pi.pid = p.pid
                JOIN {project_release_nodes} r
                  ON pi.rid = r.nid

                JOIN {node} n
                  ON r.nid = n.nid
                JOIN {term_node} t
                  ON (n.vid = t.vid AND t.tid IN (" . db_placeholders($api_versions, 'int') . "))

                WHERE pi.sid IN (" . db_placeholders($sids, 'int') . ")
              )
              AND status > %d
              AND last_tested < %d",
              array_merge(array(PIFT_TYPE_FILE), $api_versions, $sids,
                          array(PIFT_STATUS_SENT, $retest_time)));
    while ($file = db_fetch_array($result)) {
      drupal_set_message(print_r($file, TRUE));
    }
  }

I am not really sure how to fix this since we cannot merge the node and comment table. Otherwise, have to do a select and then update inside a loop which doesn't sound good.

boombatower’s picture

Current query with values:

SELECT *
FROM {pift_test}
WHERE type = 2
AND id IN (
  SELECT f.fid
  FROM {files} f
  LEFT JOIN {upload} u
    ON f.fid = u.fid
  LEFT JOIN {comment_upload} cu
    ON f.fid = cu.fid

  JOIN {project_issues} pi
    ON u.nid = pi.nid
  JOIN {pift_project} p
    ON pi.pid = p.pid
  JOIN {project_release_nodes} r
    ON pi.rid = r.nid

  JOIN {node} n
    ON r.nid = n.nid
  JOIN {term_node} t
    ON (n.vid = t.vid AND t.tid IN (87, 103))

  WHERE pi.sid IN (8, 14)
)
AND status > 2
AND last_tested < 1262816107

Untested, but this should be what we are looking at:

SELECT *
FROM
(
  (
    SELECT *
    FROM {pift_test}
    WHERE type = 2
    AND id IN (
      SELECT f.fid
      FROM {files} f
      LEFT JOIN {upload} u
        ON f.fid = u.fid
      LEFT JOIN {comment_upload} cu
        ON f.fid = cu.fid

      JOIN {project_issues} pi
        ON u.nid = pi.nid
      JOIN {pift_project} p
        ON pi.pid = p.pid
      JOIN {project_release_nodes} r
        ON pi.rid = r.nid

      JOIN {node} n
        ON r.nid = n.nid
      JOIN {term_node} t
        ON (n.vid = t.vid AND t.tid IN (87, 103))

      WHERE pi.sid IN (8, 14)
    )
    AND status > 2
    AND last_tested < 1262816107
  )
  UNION
  (
    SELECT *
    FROM {pift_test}
    WHERE type = 2
    AND id IN (
      SELECT f.fid
      FROM {files} f
      LEFT JOIN {upload} u
        ON f.fid = u.fid
      LEFT JOIN {comment_upload} cu
        ON f.fid = cu.fid

      JOIN {project_issues} pi
        ON cu.nid = pi.nid
      JOIN {pift_project} p
        ON pi.pid = p.pid
      JOIN {project_release_nodes} r
        ON pi.rid = r.nid

      JOIN {node} n
        ON r.nid = n.nid
      JOIN {term_node} t
        ON (n.vid = t.vid AND t.tid IN (87, 103))

      WHERE pi.sid IN (8, 14)
    )
    AND status > 2
    AND last_tested < 1262816107
  )
  ORDER BY fid DESC
) x
GROUP BY nid
Damien Tournoud’s picture

Let's just add the nid to pift_test, and be done with that:

SELECT nid, MAX(id) FROM {pift_test} WHERE status > 2 AND last_tested < 1262816107 GROUP BY nid
webchick’s picture

Subscriiiiibe!

catch’s picture

Subscribing, I assume this is why patch re-testing is switched off at the moment.

rfay’s picture

Title: Ensure/Refactor cron_retest() query to only re-test the last file on an issue » Ensure/Refactor cron_retest() query to only re-test the last file on an issue; Automatically retest RTBC patches
jthorson’s picture

.

jthorson’s picture

Similar historical issue located at #635334: Patches are not being re-tested automatically.

jthorson’s picture

FileSize
2.57 KB

Attached patch contains a new 'Select' query for validation purposes before implementing the actual update.

Note: This patch is dependent on #1348958-6: Add 'nid' column to pift_test landing first.

jthorson’s picture

FileSize
2.88 KB

Needs some optimization, but this appears to work:

jthorson’s picture

Ooops. Wrong version. Getting late. :(

jthorson’s picture

FileSize
2.79 KB

Again (with a select instead of update).

jthorson’s picture

Status: Active » Needs work

Edit: Had an issue with a named constant.

jthorson’s picture

FileSize
2.86 KB

New candidate for testing/validation, but hard-coded to only work once, and trigger a very small number of retests (instead of flooding us with 1100+ tests, which is what would happen if we put this live without a careful launch strategy).

jthorson’s picture

Status: Needs work » Needs review
FileSize
3.63 KB

Last patch was a non-starter.

This one works. :)

But because enabling this could totally overload the testbots for a few days, I've left some safety checks in place. To actually enable re-testing using this patch, comment out the $retest_time = 1; line near the beginning of pift.cron.inc and reinstate the line before it.

We'll need to come up with a plan for a 'staged' retesting until we've cleared the backlog ... my thought was that enabling for "drush pift-cron", but not for pift_cron() would allow us to manually trigger batch retests as needed.

jthorson’s picture

Deployment Options:

1. Add a new drush command which accepts an argument 'X' and processes 'X' retests upon execution; leaving the 'retest' variable at 'disabled' until the backlog is cleared ... manually queueing during slow periods, and setting it up as a jenkins job once we know a reasonable value for 'X'.

2. Add the current 'queue depth' to the communication from PIFR to PIFT, and queue x retests based on queue depth triggers.

TODO: Update the 'retest interval' options to include weeks/months.

boombatower’s picture

Not a 100% sure what we did originally when this feature was turned on. Can we simply have a drush command for retesting that we then call on a different interval say twice a day and it only sends 30-40 tests until we are caught up? I mean having the the retests run an a slower interval is probably better anyway since they are lower priority.

jthorson’s picture

Yup ... that was the thought (Option 1).

The second thought was more for documentation's sake, and would probably be a longer term feature ... added it here so that I didn't forget it the next morning.

sun’s picture

Status: Needs review » Needs work
+++ b/pift.cron.inc
@@ -16,47 +16,47 @@ function pift_cron_retest() {
+    //$retest_time = time() - PIFT_RETEST;

Stale commented out debug code?

+++ b/pift.cron.inc
@@ -16,47 +16,47 @@ function pift_cron_retest() {
+    // Get last file test from each nid which is in 'needs review' or 'rtbc'.
+    $result = db_query(

Can you run an EXPLAIN on this query and post the results?

That is, because this query looks very expensive and potentially very long-running to me.

+++ b/pift.cron.inc
@@ -16,47 +16,47 @@ function pift_cron_retest() {
+        JOIN {project_issues} pi ON pt.nid = pi.nid
+        JOIN {project_release_nodes} prn ON pi.rid = prn.nid
+        JOIN {node} n ON prn.nid = n.nid
+        JOIN {term_node} t

JOINs should always be explicit INNER JOIN or LEFT JOIN.

+++ b/pift.cron.inc
@@ -16,47 +16,47 @@ function pift_cron_retest() {
+    // Update status for tests with final results (Passed or Failed)
+    // and which have not been tested in the retest period.

I've troubles to understand this comment. If I get the query correctly, then matching patches/tests are getting reset to be "queued"?

+++ b/pift.cron.inc
@@ -16,47 +16,47 @@ function pift_cron_retest() {
+      "Update {pift_test} set status = %d
...
+        ORDER BY test_id ASC
+        LIMIT 50",

1) All SQL statement keywords, such as UPDATE and SET should be all-uppercase.

2) ORDER BY and LIMIT is seems bogus here.

+++ b/pift.cron.inc
@@ -16,47 +16,47 @@ function pift_cron_retest() {
+    // TODO:  Remove 'Limit 50' above and DSM() before enabling in pift.module!
+    drupal_set_message("Re-Tests Queued: " . db_affected_rows());

A cron callback should not trigger dsm().

jthorson’s picture

Issue summary: View changes
FileSize
10.71 KB

Updated for the D7 drupal.org upgrade ... D7 Project* changes made the query look worse.

Will begin testing this tomorrow.

jthorson’s picture

jthorson’s picture

Status: Needs work » Needs review
FileSize
13 KB

Updated version.

jthorson’s picture

FileSize
13.46 KB

Release candidate

jthorson’s picture

FileSize
13.78 KB

Committed version.

Will be included in the next PIFT deploy, but the timeline for enabling the feature will be dependent on i) the number of outstanding RTBC retests, ii) confirming that the retest query doesn't negatively impact the drupal.org database, and iii) sorting out which option we use for actually triggering the retests.

jthorson’s picture

Status: Needs review » Needs work

Close, but first tests on prod uncovered that the query needs a bit of refactoring.

Right now, it chooses the 'last passed file that has not already been sent' per RTBC issue, as opposed to the 'last passed file *if it* has not already been sent'.

jthorson’s picture

Status: Needs work » Needs review
FileSize
2.38 KB

Okay, this looks better!

jthorson’s picture

Okay, did some more work on this, to try and see why it was negatively affecting drupal.org when I tried to enable it. I've now got pift-cron running only once per minute, and cleared out a test linking to a deleted file that was gumming things up ... and the PIFT logs look a lot more 'normal' now. There's a good chance that this query was trying to run twice concurrently.

The retest query itself looks like this:

SELECT pd.test_id AS test_id
FROM
pift_data pd
INNER JOIN (SELECT MAX(pd2.test_id) AS maxid
FROM
pift_data pd2
GROUP BY nid) maxids ON pd.test_id = maxids.maxid
INNER JOIN node n ON pd.nid = n.nid
INNER JOIN field_data_field_issue_status fdfis ON n.nid =
fdfis.entity_id AND n.type = fdfis.bundle
INNER JOIN field_data_field_project fdfp ON n.nid = fdfp.entity_id
AND n.type = fdfp.bundle
WHERE (pd.type = 2) AND (pd.status =
4) AND (pd.last_tested <
1394402320) AND (fdfis.entity_type =
'node') AND (fdfis.bundle IN
('project_issue')) AND (fdfis.field_issue_status_value IN
('14')) AND (fdfp.field_project_target_id IN
(3060))
ORDER BY pd.last_tested ASC
LIMIT 3 OFFSET 0

With the EXPLAIN pastebinned at http://privatepaste.com/9b19b39c2b

So ... any mysql optimizers able to help improve this?

drumm’s picture

Add ORDER BY NULL to the subquery:

(SELECT MAX(pd2.test_id) AS maxid FROM pift_data pd2 GROUP BY nid ORDER BY NULL)

GROUP BY implies a sort, which isn't too relevant in a subquery.

drumm’s picture

If any of the WHERE conditions on pd can also be added to pd2 in the subquery, that will reduce the number of rows examined.

Otherwise, the query is okay, since it is not run concurrently.

jthorson’s picture

The 'where' conditions need to be outside, as otherwise we get the last patch that meets the criteria, instead of getting the last patch and THEN seeing if it meets the criteria.

In any case, I've got a patch that restructures this into an EFQ to get the RTBC issues, and then uses php to filter out the last test. Will look at deploying for testing today.

jthorson’s picture

Status: Needs review » Fixed

The revamped patch (using EFQ to get issues, then loading all tests and sorting via PHP) is now live, via a new jenkins job.

The job will requeue up to 5 RTBC issues every 15 minutes. There are in the neighborhood of 400 RTBC patches given all of core and contrib, but 2/3 of these are D6/contrib tests which complete testing within just a couple minutes each; so they should not have a heavy impact on the queue. Core alone has 108 tests, which includes all of D6/D7/D8.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.