I create this ticket to avoid duplicated work, today I work on rewriting the queries of this module. So a patch can be expected today.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Aron Novak’s picture

FileSize
46.78 KB

Here is the patch. For me, the cron works (the automates test does not include cron-part) and all the tests are passed.

Aron Novak’s picture

Status: Active » Needs review
alex_b’s picture

Priority: Normal » Critical
Status: Needs review » Needs work

Just tried to patch current Drupal HEAD, 6 out of 16 hunks fail:
http://skitch.com/alexbarth/icj4/zend-studio

This patch is critical: pretty much all other proposed aggregator patches depend on it.

webchick’s picture

subscribing.

Aron Novak’s picture

Status: Needs work » Needs review
FileSize
48.03 KB

I rerolled the the patch.

xamanu’s picture

Status: Needs review » Needs work

patching works.
i did some user tests and almost everything worked: view feeds, categories of the feeds, aggregate feeds, update items, remove items, edit title of a feed....

but i found a problem editing the feed while changing the category, update interval, feed url and amount of items in a block without changing the feed name to a new one.
another problem is that cron job doesn't work anymore after applying the patch. i get a white screen when cron checks updates feeds.

Jose Reyero’s picture

FileSize
46.12 KB

Re-rolled the patch (Didn't apply anymore because of the REQUEST_TIME update)

It mostly works, passes tests, but has some issues:

- Error on cron run
"Fatal error: Cannot use object of type stdClass as array in /home/workspace/drupal/modules/aggregator/aggregator.module on line 613"

- Notice on manual update of the feed
notice: Undefined variable: num_rows in /home/workspace/drupal/modules/aggregator/aggregator.module on line 860.

- Coding style: uses some non descriptive variable names, i.e:

$f_alias = $categories->leftJoin('aggregator_category_feed', 'f', 'c.cid = f.cid AND f.fid = :fid', array(':fid' => $edit['fid']));
....
$delete_c = db_delete('aggregator_category_item');
$delete_i = db_delete('aggregator_item');
$condition_c = db_or();
$condition_i = db_or();
......

and I'm not yet familiar with the new db api, but it seems to me some parts could be more clearly written.

Also I think this cron error should be catched by the tests, so some more tests may be needed, this may be a different issue though.

Aron Novak’s picture

Status: Needs work » Needs review
FileSize
47.99 KB

xamanu and Jose: thanks for the reviews
I updated the patch:
- cron works
- feed editing works in all the cases
- the reported notice was killed
- improved variable names

I'm not sure about: what is the optimal way to use DB:TNG? I mean is it good that a SELECT query is executed via the new abstract way or not?
I tried to rewrite much queries to the new style. But maybe for some queries it only adds some overhead.

Dries’s picture

I seems like you fixed bugs that have not been caught by SimpleTest. Would it make sense to extend the SimpleTests?

I'll do a deep review shortly. The biggest /challenge/ are how the queries are rewriten -- it is not exactly easier to read/understand.

Damien Tournoud’s picture

That's a nice work, please continue. Here are a few remarks that should take some burden from your shoulders:

-  $categories = db_query('SELECT c.cid, c.title, f.fid FROM {aggregator_category} c LEFT JOIN {aggregator_category_feed} f ON c.cid = f.cid AND f.fid = %d ORDER BY title', $edit['fid']);
-  while ($category = db_fetch_object($categories)) {
+  $categories = db_select('aggregator_category', 'c');
+  $category_feed_alias = $categories->leftJoin('aggregator_category_feed', 'f', 'c.cid = f.cid AND f.fid = :fid', array(':fid' => $edit['fid']));
+  $categories->addField('c', 'cid', 'cid');
+  $title_field = $categories->addField('c', 'title', 'title');
+  $categories->addField($category_feed_alias, 'fid', 'fid');
+  $result = $categories->orderBy($title_field)->execute();

You only have to convert:
- SELECT queries that are dynamic in nature (ie. can't be written only with placeholder substitution)
- all INSERT, UPDATE, and DELETE queries

This query is not dynamic and we don't need to convert it (this applies to several other queries in the patch). Using the new placeholder style would be enough.

+        // Update the feed data
+        db_merge('aggregator_feed')->key(array('fid' => $feed['fid']))->fields(array('checked' => REQUEST_TIME, 'link' => $channel['LINK'], 'description' => $channel['DESCRIPTION'], 'image' => $image, 'hash' => $md5, 'etag' => $etag, 'modified' => $modified))->execute();

This will be clearer if the array was build outside the query.

@@ -457,13 +462,14 @@ function aggregator_form_category(&$form
 function aggregator_form_category_validate($form, &$form_state) {
   if ($form_state['values']['op'] == t('Save')) {
     // Check for duplicate titles
+    $query = db_select('aggregator_category', 'c')->condition('title', $form_state['values']['title']);
     if (isset($form_state['values']['cid'])) {
-      $category = db_fetch_object(db_query("SELECT cid FROM {aggregator_category} WHERE title = '%s' AND cid <> %d", $form_state['values']['title'], $form_state['values']['cid']));
+      $category = $query->condition('cid', $form_state['values']['cid'], '<>')->countQuery()->execute()->fetchField();
     }
     else {
-      $category = db_fetch_object(db_query("SELECT cid FROM {aggregator_category} WHERE title = '%s'", $form_state['values']['title']));
+      $category = $query->countQuery()->execute()->fetchField();
     }
-    if ($category) {
+    if ($category != 0) {
       form_set_error('title', t('A category named %category already exists. Please enter a unique title.', array('%category' => $form_state['values']['title'])));
     }
   }

Those countQuery-> are unnecessary (and they probably not even works...). You could also put the execute()->fetchField() outside the if clause and only add the cid <> condition in the first branch.

I seems like you fixed bugs that have not been caught by SimpleTest. Would it make sense to extend the SimpleTests?

We already new that one query in the Aggregator caused exceptions while not been caught by any assertion of the test case. That's what my work on have Simpletest catch error and exceptions from the tested site is all about.

Crell’s picture

Component: aggregator.module » database system

Refiling over into my queue so I can review it this weekend.

Crell’s picture

Status: Needs review » Needs work

Quick visual review of #10 above...

1) Yes, use the Select builder only if the query is dynamic, vis, it changes between page runs. If not, just use db_query() as it's faster and easier to read.

2) For db_merge() etc. the preferred format is for the array to go down, like so:

db_merge('aggregator_feed')
  ->key(array('fid' => $feed['fid']))
  ->fields(array(
    'checked' => REQUEST_TIME, 
    'link' => $channel['LINK'], 
    'description' => $channel['DESCRIPTION'], 
    'image' => $image, 
    'hash' => $md5, 
    'etag' => $etag, 
    'modified' => $modified,
  ))->execute();

3) $query->countQuery()->execute()->fetchField(); is correct, if you are dealing with a dynamic Select query. In this case it doesn't look like you need a dynamic query, so just use db_query("SELECT COUNT(*) FROM...");

4) db_fetch*() is deprecated and should be replaced with ->fetch() and friends.

There are now almost complete docs available, which I recommend reading. Cheers!

Aron Novak’s picture

FileSize
47.51 KB

New version of the patch:
- db_select only for dynamic queries
- db_merge and db_insert code style is according to the suggestions

Aron Novak’s picture

Status: Needs work » Needs review

Any more issues with the patch?

Aron Novak’s picture

@Dries: http://drupal.org/node/312316 - Here is my patch to avoid the bug that I made here.

Crell’s picture

Status: Needs review » Needs work

Dear god some of these queries are long. :-)

The queries in aggregator_form_feed_validate() could be collapsed into a single dynamic query with a conditional method, but I don't know that it's worth it here. Just for reference.

Line 335 of aggregator.module, there's a missing } on the table name.

Line 380 of aggregator.module, the ->execute() should go on its own line, with the arrow aligned with the )) on the previous line. (I know my sample code above didn't do that. My fault.)

Line 386 of aggregator.module, this line is over-complicated:

db_delete('blocks')->condition(db_and()->condition('module', 'aggregator')->condition('delta', 'category-' . $edit['cid']))->execute();

Multiple ->condition() calls stack with AND automatically, so you don't need to use db_and(). The following is sufficient:

db_delete('blocks')
  ->condition('module', 'aggregator')
  ->condition('delta', 'category-' . $edit['cid'])
  ->execute();

(And yes, that query is long enough it should use long format like that.)

Line 398 of aggregator.module, db_last_insert_id() is deprecated. If you need the ID, you must use an insert statement. Its return value is the generated serial id.

Line 418 of aggregator.module, that query is definitely long enough that it should use long format, and break the fields array into long format as well, just like db_merge(). I'd say anything that is longer than 80 characters should be split to long form for readability. There's a bunch more queries like that, so I won't list them all individually.

Line 425 and following of aggregator.module, excellent use of the db_or() but not necessary. You can instead do this, which will be clearer and a faster executing query, too.

$iids = db_query('SELECT iid FROM {aggregator_item} WHERE fid = :fid', array(':fid' => $edit['fid']))->fetchCol().
db_delete('aggregator_category_item')
  ->condition('iid', $iids, 'IN')
  ->execute();

The above will result in "DELETE FROM {aggregator_category_item} WHERE iid IN (...)", all nicely prepared for you.

Line 447 of aggregator.module, execute goes on its own line as above. There's a couple of places that needs to be done, so I won't list 'em all. (Again, my bad.) Also, line 450 db_last_insert_id() goes away, as above.

Line 477 and following, same WHERE IN change as above.

Line 847 of aggregator.module, same thing as above. This could be done as a single dynamic query with conditional method calls. I can go either way here. Committers?

LIne 872 of aggregator.module, same WHERE IN comment as above.

Line 893 of aggregator.module, if you use an associative array in values() then there's no need to specify fields() as well. Or more specifically, pass the associative array to fields() and don't use values() at all and you're done.

Line 102 of aggregator.pages.inc, there's no need to loop.

$item->categories = db_query(...)->fetchAll();

That gives you the same end result, and will be a bit faster since fetchAll() is happening in C code. Actually, that whole outer foreach loop can condense to a single line:

foreach ($result as $item) {
$items[$item->iid] = db_query(...)->fetchAll();
}

Line 213 of aggregator.pages.inc, that's a perfect case for a multi-insert statement. Instead of looping on the whole insert object, try this:

if (count($selection)) {
  $insert = db_insert('aggregator_category_item')->fields('iid', 'cid');
  foreach ($selection as $cid) {
    if ($cid) {
      $insert->values(array(
        'iid' => $iid,
        'cid' => $cid,
      ));
    }
  }
  $insert->execute();
}

That way it all happens in a single query in MySQL or in a transaction in Postgres. Much faster. (The If check is because insert queries don't currently check themselves against empty values. They probably should, but don't yet.)

Line 350 of aggregator.pages.inc, I don't get why you're defining $sql separately this one time only. If that's existing code, you may want to just go ahead and merge it with line 351 like the rest of the code base for cleanliness. Same on line 356.

Line 361 of aggregator.pages.inc, $feeds = $result->fetchAll(); is all you need.

aggregator.test line 68, there's a db_rewrite_sql(). We can't get rid of that yet, but let's remember to get back to it. Leave a todo in a comment, perhaps?

aggregator.test line 451 and following, that WHERE IN thing again. :-) I'd also suggest using either $select or $query for the name of the object. $count suggests a numeric count of objects.

Sorry to nitpick, but this is the first module upgrade I'm really going through with a fine toothed comb so I want to make sure we start from a high standard. :-) Thanks for your hard work!

Aron Novak’s picture

Status: Needs work » Needs review
FileSize
47.89 KB

The patch is updated again.

Crell’s picture

Status: Needs review » Needs work

aggregator.admin.inc:

line 136 and 139: Can we break those queries across multiple lines? At minimum the replacement values array should be split across multiple lines.

line 260 and following, that loop isn't needed, thanks to the wonders of array_map(). Instead, you can do this:

$options = array_map('check_plain', db_query("...")->fetchAllKeyed());

aggregator.module:

lines 436 and 437, webchick has asked that we standardize on always breaking those queries across multiple lines, even if they're short. There's a couple more of these elsewhere, too.

line 446 and following, I find it more readable and easier to edit if the closing )) of the fields(array( clause are on their own line, and the 'image' => '' has an extra comma at the end. That's more in line with the way arrays are split up in menu hooks (or they should be). I'm not going to block the patch on that alone, but may as well fix it in case webchick wants to. :-) The same applies to other places the builders are used, but I won't mention each individually.

aggregator.pages.inc:

Line 350:

db_query('SELECT cid, title FROM {aggregator_category} WHERE cid = :cid', array(':cid' => arg(2)))->fetchRow();

Er, fetchRow()? I think that may be one of the few fetch mechanisms we don't have. :-) You probably want fetchObject() or fetchAssoc() there.

aggregator.test:

line 370, there's still a db_fetch_object() call in there.

Almost there. Thanks, Aron!

Aron Novak’s picture

Just one question before doing the update of the patch:
"lines 436 and 437, webchick has asked that we standardize on always breaking those queries across multiple lines, even if they're short. There's a couple more of these elsewhere, too."
Does this mean that even those queries should be splitted?

db_delete('aggregator_feed')->execute();
Crell’s picture

Hrm. I will defer to her on that one. Personally I would say don't split that, as it's actually more readable on one line. Just split those that chain 2 or more method calls. Let's check with the boss first, though. :-)

webchick’s picture

One line is fine on that one, yep. Thanks, Aron/Crell!

Aron Novak’s picture

Status: Needs work » Needs review
FileSize
49.05 KB

I rerolled the patch and applied the reflections.
I also introduced the following, please judge on it:
At the similar queries like below:

$iids = db_query('SELECT iid FROM {aggregator_item} WHERE fid = :fid',
    array(':fid' => $feed['fid']))
    ->fetchCol();

I break the lines this style if only one param is here.
But there are several params, I break the query in that way to make it really similar to db_insert and db_update query styles:

$entry = db_query("SELECT iid, timestamp FROM {aggregator_item} WHERE fid = :fid AND guid = :guid",
        array(
          ':fid' => $feed['fid'],
          ':guid' => $guid,
        ))
        ->fetchObject();

Do you like it? My basic approach was to make db_query calls similar to other db_ calls.

Dries’s picture

+  $iids = db_query('SELECT iid FROM {aggregator_item} WHERE fid = :fid',
+    array(':fid' => $feed['fid']))
+    ->fetchCol();

IMO, that is just too much line-wrapping. Just write it in one line. Personally, I think we're somewhat exaggerating with the DBTNG line wrapping rules. I recommend that we write the above as one line.

Crell’s picture

We really should nail down when to wrap. :-) In most places I've been putting the array( on the line before it, in the db_query( line, and then the array elements on their own line. Webchick asked for the fetch*() call to be on its own line.

At this point, I defer to Webchick and Dries on the intricacies of whitespace. #22 looks good to go to me otherwise.

Aron Novak’s picture

FileSize
48.36 KB

Hm. Here is a version where the db_query() and db_query_range() function lines are not wrapped at all.
And I'll post soon a version where those are wrapped if longer than 80 chars. (array( will remain at the db_query line and then one parameter per line)

Aron Novak’s picture

FileSize
48.88 KB

At this version of the patch I wrapped the db_query() lines according to #24
I hope #25 or #26 will be adequate.

Dries’s picture

Status: Needs review » Fixed

Reviewed, tested and committed to CVS HEAD. Thanks Aron et al! :)

alex_b’s picture

yeay!

Crell’s picture

Dries: For reference, which one did you commit? :-)

Aron Novak’s picture

As I see the #25 was committed.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

Crell’s picture

Issue tags: +DBTNG Conversion

Tagging.