Simple view crashes MySQL - probably because of Amazontools data model

asb - August 21, 2008 - 16:15
Project:Amazon associate tools
Version:5.x-1.5
Component:Code
Category:bug report
Priority:critical
Assigned:Unassigned
Status:postponed (maintainer needs more info)
Description

Hi,

I'm trying to build a (very simple) view which is supposed to return a page (view type: table view). It's supposed to display images provided by Amazon, side by side with the "real" images for certain products (which often don't match).

As soon as the view is executed, the MySQL deamon starts spawning an increasing number of threads and queries; after a few minutes, the site becomes inaccessible, this continues for about ten minutes, then MySQL dies. No output is given over this time.

Samples for the queries:

SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype = 'amazon' LIMIT 1 OFFSET 214

SELECT count(node.nid) FROM node node LEFT JOIN image_attach image_attach ON node.nid = image_attach.nid LEFT JOIN amazonnode amazonnode ON node.nid = amazonnode.nid LEFT JOIN amazonitem amazonitem ON amazonnode.asin = amazonitem.asin LEFT JOIN content_type_ausgabe node_data_field_seriennummer_integer ON node.vid = node_data_field_seriennummer_integer.vid LEFT JOIN content_type_ausgabe node_data_field_artikelnummer ON node.vid = node_data_field_artikelnummer.vid LEFT JOIN content_type_ausgabe node_data_field_isbn ON node.vid = node_data_field_isbn.vid LEFT JOIN content_type_ausgabe node_data_field_erscheinungsdatum ON node.vid = node_data_field_erscheinungsdatum.vid

A longer description of what I'm trying to accomplish and what happens has been posted in the Views Issues queue, including an export of the view.

Merlinofchaos, one of the most knowledgeable coders of the Views module responded: "If you have views that create inefficient queries that cause MySQL problems, then you need to refactor your views or not use them. [...] Maybe the amazon module needs some tweaking to store its data more sanely so that this kind of thing doesn't happen. [...]".

After this I did some testing with Views and data from AAT; indeed, all forever-running queries were related to AAT data; in most, but not in all cases, similar problems occured, even on different sites with different AAT data. It's quite safe to say that there seems to be some major problems with the Amazontools data models which render the available data quite useless for harvesting through Views. Even the most simple queries built with views very often did not execute, or killed mysqld.

I'd love to hear about experiences from other AAT + Views users - do you encounter similar problems? Have you found workarounds or functions that should not be used?

@Eaton: How much of the data model from AAT went into the new "Amazon" module? Is the "Amazon" module Views safe?

Thanks & greetings, -asb

#1

MGParisi - August 28, 2008 - 15:52

I can not comment on the "safety" issue Eaton talks about and I have not touched or upgraded this project, I just took it over. I also have not played with that functionality AND I can not "tweak" the data model for restrictions on its ability to be upgraded. This project is essentially dead, expect a cosmetic upgrade with a few other minor fixes to be deployed and then the project will just die out in favor of views based projects.

if you find a solution that does not involve changing the data model then I will be happy to deploy it. I would not expect any changes to the data structure to be deployed as per the request of the community. Ill try to post the updated version. It is not going to fix this issue.

I appreciate any help.

There are associated projects that are in a better status then this one. If its not to late I would suggest moving over to one of them. If Eaton has any improvements that can be made in the performance of this project, he is also welcome to contribute.

#2

eaton - August 28, 2008 - 16:16

I heard my name, and I arrive. ;-)

In looking at the query, it doesn't look like anything particularly disastrous is being done. The queries being generated are no more inefficient than the average node related stuff in terms of basic data. It might be worth double-checking to make sure that the .install file properly sets up indexes on all of the fields being used for those joins, though -- unindexed queries on large data sets are always a performance killer.

@Eaton: How much of the data model from AAT went into the new "Amazon" module? Is the "Amazon" module Views safe?

The new Amazon module for D6 uses a from-scratch schema, not the AAT one. It's important to note, though, that its schema is actually MORE normalized than AAT's, which could cause even more serious performance problems if simple queries like the one you pasted are causing issues. If you have access to a MySQL client or something like phpMyAdmin, I'd HIGHLY recommend you take that query, and some of the others causing those slowdoesn, and run them through EXPLAIN to see if there is something in particular causing the problems.

AAT's data schema is definitely not complicated -- it just has lots of columns, something that shouldn't affect query speed.

#3

MGParisi - August 28, 2008 - 19:23

If the indexing is to blame, then I can include such SQL in the upgrade path. Unfortunately I DO NOT have a test machine for Drupal Development at this time. Something I will be fixing soon! Give me the index SQL needed and I will include it in the next update. As Eaton has said it is most likely a failure to properly index the tables. ASIN IS NOT INDEXED! and the relationships you are using, and others are on ASIN... Therefor this is most likely the cause of the failure.

Try running the following two Lines of SQL and tell me if it fixes the issue

ALTER TABLE `amazonitem` ADD INDEX ( `asin` ) 
ALTER TABLE `amazonnode` ADD INDEX ( `asin` ) 

If it does, then when I get a testbox up and running I will include this in the install file and publish all results.

#4

asb - August 29, 2008 - 07:51

Hi MGParisi,

thanks you for the quick reply!

adding indexes to the "amazonitem", and "amazonnode" tables works fine:

mysql> ALTER TABLE `amazonitem` ADD INDEX ( `asin` );
Query OK, 14485 rows affected (0.88 sec)
Records: 14485  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE `amazonnode` ADD INDEX ( `asin` );
Query OK, 16272 rows affected (0.23 sec)
Records: 16272  Duplicates: 0  Warnings: 0

However, the behaviour of the views is not changed. Sample output from Mytop, an Mysql monitor:

MySQL on localhost (5.0.32-Debian_7etch6)                                                                                            up 1+09:21:42 [09:27:29]
Queries: 24.0M  qps:  210 Slow:     0.0         Se/In/Up/De(%):    82/00/00/00
             qps now:   32 Slow qps: 0.0  Threads:   12 (  11/  33) 33/00/00/00
Key Efficiency: 98.2%  Bps in/out:   0.0/  0.5   Now in/out:   8.4/ 2.3k

      Id      User         Host/IP         DB      Time    Cmd Query or State
      --      ----         -------         --      ----    --- ----------
  103302      root       localhost drupa         0  Query show full processlist
   45018      root       localhost drupa         5  Sleep
  103290    drupal       localhost drupa        69  Query SELECT a.* FROM amazonitem a WHERE asin IN ('B001E3IKS6','B001E3EUAS')
  103291    drupal       localhost drupa        69  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
   68906      root       localhost drupa       108  Query EXPLAIN SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHE
  103260    drupal       localhost drupa       113  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103249    drupal       localhost drupa       123  Query SELECT a.* FROM amazonitem a WHERE asin IN ('3540229396','3936546282','020171499X','3540259953
  103244    drupal       localhost drupa       136  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103241    drupal       localhost drupa       150  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103239    drupal       localhost drupa       157  Query SELECT a.* FROM amazonitem a WHERE asin IN ('B000V3F38K','B000V3F394','B000V3F38U','B000VBC6G4
  103234    drupal       localhost drupa       167  Query update amazonitem set detailpageurl = 'http://www.amazon.de/Joomla-1-5-organisieren-gestalten-
  103219    drupal       localhost drupa       179  Query SELECT count(node.nid) FROM node node LEFT JOIN image_attach image_attach ON node.nid = image_
...

The "EXPLAIN" statement ("Query EXPLAIN SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin ...") issued from inside mytop by user "root" does not output anything.

The view does not result in any output for approx. 5 minutes, cut continues to grow queries. The output of Mytop now looks like this:

MySQL on localhost (5.0.32-Debian_7etch6)                                                                                            up 1+09:26:36 [09:32:23]
Queries: 24.1M  qps:  210 Slow:     0.0         Se/In/Up/De(%):    82/00/00/00
             qps now:   33 Slow qps: 0.0  Threads:   43 (  42/   2) 86/00/00/00
Key Efficiency: 98.2%  Bps in/out:   0.0/  0.4   Now in/out:   8.4/ 3.7k

      Id      User         Host/IP         DB      Time    Cmd Query or State
      --      ----         -------         --      ----    --- ----------
  103478      root       localhost drupa         0  Query show full processlist
   45018      root       localhost drupa         2  Sleep
  103481    drupal       localhost drupa         5  Query SELECT * FROM amazonitem WHERE asin = 'B000WL7FOQ'
  103480    drupal       localhost drupa         6  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103472    drupal       localhost drupa        19  Query SELECT * FROM amazonitem WHERE asin = 'B0007VVFTW'
  103469    drupal       localhost drupa        27  Query SELECT * FROM amazonitem WHERE asin = 'B000WL7FOQ'
  103468    drupal       localhost drupa        28  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103460    drupal       localhost drupa        47  Query SELECT * FROM amazonitem WHERE asin = 'B0007VVFTW'
  103455    drupal       localhost drupa        48  Query SELECT * FROM amazonitem WHERE asin = 'B000WL7FOQ'
  103454    drupal       localhost drupa        51  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103451    drupal       localhost drupa        57  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103447    drupal       localhost drupa        58  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103448    drupal       localhost drupa        58  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103449    drupal       localhost drupa        58  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103439    drupal       localhost drupa        75  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103440    drupal       localhost drupa        75  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103438    drupal       localhost drupa        76  Query SELECT a.* FROM amazonitem a WHERE asin IN ('B001E3IKS6','B001E3EUAS')
  103432    drupal       localhost drupa        87  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103429    drupal       localhost drupa        88  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103430    drupal       localhost drupa        88  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103431    drupal       localhost drupa        88  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103425    drupal       localhost drupa        97  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103426    drupal       localhost drupa        97  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103418    drupal       localhost drupa       113  Query SELECT * FROM amazonitem WHERE asin = 'B0007VVFTW'
  103414    drupal       localhost drupa       118  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103408    drupal       localhost drupa       139  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103391      root       localhost drupa       150  Query EXPLAIN SELECT count(node.nid) FROM node node LEFT JOIN image_attach image_attach ON node.nid
  103382    drupal       localhost drupa       209  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103380    drupal       localhost drupa       212  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103378    drupal       localhost drupa       216  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103361    drupal       localhost drupa       237  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103349    drupal       localhost drupa       258  Query SELECT * FROM amazonitem WHERE asin = '3899301978'
  103337    drupal       localhost drupa       268  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103290    drupal       localhost drupa       363  Query SELECT a.* FROM amazonitem a WHERE asin IN ('B001E3IKS6','B001E3EUAS')
  103291    drupal       localhost drupa       363  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
   68906      root       localhost drupa       402  Query EXPLAIN SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHE
  103260    drupal       localhost drupa       407  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103249    drupal       localhost drupa       417  Query SELECT a.* FROM amazonitem a WHERE asin IN ('3540229396','3936546282','020171499X','3540259953
  103244    drupal       localhost drupa       430  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103241    drupal       localhost drupa       444  Query SELECT n.nid, n.rating, a.* FROM amazonnode n JOIN amazonitem a ON a.asin = n.asin WHERE ntype
  103239    drupal       localhost drupa       451  Query SELECT a.* FROM amazonitem a WHERE asin IN ('B000V3F38K','B000V3F394','B000V3F38U','B000VBC6G4
  103234    drupal       localhost drupa       461  Query update amazonitem set detailpageurl = 'http://www.amazon.de/Joomla-1-5-organisieren-gestalten-
  103219    drupal       localhost drupa       473  Query SELECT count(node.nid) FROM node node LEFT JOIN image_attach image_attach ON node.nid = image_
...

At this point, all websites running on Drupal/Mysql are doing denial of service: "Unable to connect to database server ... The MySQL error was: Too many connections." The queries are still running and growing, until mysqld crashes, or is manually stopped and restarted.

Now we can check a view *without* touching the "Amazontools" tables, for example:

  $view = new stdClass();
  $view->name = 'Digitale_Bibliothek';
  $view->description = 'Digitale Bibliothek';
  $view->access = array (
);
  $view->view_args_php = '';
  $view->page = TRUE;
  $view->page_title = 'Digitale Bibliothek';
  $view->page_header = '\'\'\'Alle Bände der ›Digitalen Bibliothek‹ der Directmedia Publishing\'\'\':';
  $view->page_header_format = '4';
  $view->page_footer = '';
  $view->page_footer_format = '4';
  $view->page_empty = 'Keine Ausgaben aus den angegebenen Serien/Reihen gefunden!';
  $view->page_empty_format = '4';
  $view->page_type = 'table';
  $view->url = 'digibib';
  $view->use_pager = TRUE;
  $view->nodes_per_page = '25';
  $view->menu = TRUE;
  $view->menu_title = 'Digitale Bibliothek';
  $view->menu_tab = FALSE;
  $view->menu_tab_weight = '0';
  $view->menu_tab_default = FALSE;
  $view->menu_tab_default_parent = NULL;
  $view->menu_tab_default_parent_type = 'tab';
  $view->menu_parent_tab_weight = '0';
  $view->menu_parent_title = '';
  $view->sort = array (
    array (
      'tablename' => 'node_counter',
      'field' => 'totalcount',
      'sortorder' => 'DESC',
      'options' => '',
    ),
  );
  $view->argument = array (
  );
  $view->field = array (
    array (
      'tablename' => 'node',
      'field' => 'title',
      'label' => 'Titel',
      'handler' => 'views_handler_field_nodelink',
      'sortable' => '1',
      'defaultsort' => 'ASC',
      'options' => 'link',
    ),
    array (
      'tablename' => 'node',
      'field' => 'type',
      'label' => 'Typ',
      'sortable' => '1',
    ),
    array (
      'tablename' => 'users',
      'field' => 'name',
      'label' => 'Autor',
      'sortable' => '1',
    ),
    array (
      'tablename' => 'node',
      'field' => 'created',
      'label' => 'Datum',
      'handler' => 'views_handler_field_date_small',
      'sortable' => '1',
    ),
    array (
      'tablename' => 'term_node',
      'field' => 'name',
      'label' => 'Kategorien',
      'options' => 'link',
    ),
    array (
      'tablename' => 'node_counter',
      'field' => 'totalcount',
      'label' => '#',
      'sortable' => '1',
    ),
    array (
      'tablename' => 'votingapi_vote_vote_percent',
      'field' => 'value',
      'label' => 'Bewertung',
      'handler' => 'votingapi_views_formatter_cleaned',
      'sortable' => '1',
    ),
  );
  $view->filter = array (
    array (
      'tablename' => 'term_node_2',
      'field' => 'tid',
      'operator' => 'AND',
      'options' => '',
      'value' => array (
  0 => 'Digitale Bibliothek',
),
    ),
    array (
      'tablename' => 'node',
      'field' => 'status',
      'operator' => '=',
      'options' => '',
      'value' => '1',
    ),
    array (
      'tablename' => 'votingapi_vote_vote_percent',
      'field' => 'value',
      'operator' => '>',
      'options' => '',
      'value' => '0',
    ),
  );
  $view->exposed_filter = array (
    array (
      'tablename' => 'term_node_2',
      'field' => 'tid',
      'label' => '',
      'optional' => '0',
      'is_default' => '0',
      'operator' => '0',
      'single' => '0',
    ),
    array (
      'tablename' => 'votingapi_vote_vote_percent',
      'field' => 'value',
      'label' => '',
      'optional' => '0',
      'is_default' => '0',
      'operator' => '0',
      'single' => '0',
    ),
  );
  $view->requires = array(node_counter, node, users, term_node, votingapi_vote_vote_percent, term_node_2);
  $views[$view->name] = $view;

This one is executed quickly in less than one second after issuing the query. This does not prove much, but at least gives a hint that Mysql and Views seem to be working fine (IMHO).

Sorry for not being able to offer more help, but my skills as sql debugger are quite limited ;(

Thanks & greetings, -asb

#5

MGParisi - August 29, 2008 - 18:18

Check your indexes in the two Amazon tables. One of them has ISBN as a primary which will duplicate the key when using my two examples (this will not effect your output), I can easily fin out this table name, it is the seconf table that I am not 100% sure wasn't indexed, I need to make sure that it inst indexed already to include in a patch. I hope Eaton will take a look at this.

I would suggest cutting out portions of the SQL left join until you find the portion that is causing the problem.

Here is the SQL statement more readable

SELECT count(node.nid) FROM
node node
LEFT JOIN image_attach image_attach
ON node.nid = image_attach.nid
LEFT JOIN amazonnode amazonnode
ON node.nid = amazonnode.nid
LEFT JOIN amazonitem amazonitem
ON amazonnode.asin = amazonitem.asin
LEFT JOIN content_type_ausgabe node_data_field_seriennummer_integer
ON node.vid = node_data_field_seriennummer_integer.vid
LEFT JOIN content_type_ausgabe node_data_field_artikelnummer
ON node.vid = node_data_field_artikelnummer.vid
LEFT JOIN content_type_ausgabe node_data_field_isbn
ON node.vid = node_data_field_isbn.vid
LEFT JOIN content_type_ausgabe node_data_field_erscheinungsdatum
ON node.vid = node_data_field_erscheinungsdatum.vid

#6

MGParisi - September 17, 2008 - 20:26
Status:active» postponed (maintainer needs more info)

No New information from OP, cant reproduce problem :(

 
 

Drupal is a registered trademark of Dries Buytaert.