Firstly, thanks for the great work on Accumulative product import (drupal.org/node/1188994) - it works great!

My scenario:

1 CSV file, 2 feeds configured:

A product display import for product types
A product import for product variants (our SKUs).

We have 30 product types, each with multiple variants (between 1 and 2,000 variants depending on the product type).

Our CSV file contains a row for each variant. My test CSV file has 12k rows. I can't alter the file format and the import needs to run every day.

The product import runs in 240 seconds - great. 12k SKUs created.

But the product display import (same file) takes 7200 seconds - not so great. 30 product displays created.

I'm guessing that batching might help?
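To illustrate why batching might help (a toy sketch in Python, not the actual Feeds code; all names and data below are made up): the display import effectively re-loads and re-saves the same display node once per variant row, whereas grouping the rows by display first would save each display only once.

```python
from collections import defaultdict

# Illustrative rows: each CSV row is one variant (SKU) belonging to a
# product display. Hypothetical data, not taken from the real CSV.
rows = [
    {"display": "widget", "sku": "W-001"},
    {"display": "widget", "sku": "W-002"},
    {"display": "gadget", "sku": "G-001"},
]

def import_per_row(rows):
    """Accumulative update: one load + save of the display per variant row."""
    displays, saves = {}, 0
    for row in rows:
        display = displays.setdefault(row["display"], {"skus": []})  # ~node_load()
        display["skus"].append(row["sku"])
        saves += 1  # ~node_save() on every single row
    return displays, saves

def import_grouped(rows):
    """Batched alternative: group variants by display, save each display once."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["display"]].append(row["sku"])
    displays = {name: {"skus": skus} for name, skus in grouped.items()}
    return displays, len(grouped)  # one ~node_save() per display
```

With 12k rows and only 30 displays, the per-row version does 12k saves while the grouped version would do 30, which matches the kind of gap seen above.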

Many thanks,
Mike.

Comments

pcambra’s picture

Yeah, that's something that can happen with a lot of data when using update mode.

Have you tried replace mode, just to see if it's faster?

You can also try batch mode, and Drush: #608408: Drush integration for Feeds
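If the Drush route works out, a crontab entry could drive the daily import. This is illustrative only: the `feeds-import` command name is what the patch in #608408 proposes and may differ in the version you apply, and the importer id (`product_import`) and site path are placeholders.

```shell
# Run the daily import at 02:00. Verify the command name and arguments
# against the Drush integration patch you actually apply (#608408).
# "product_import" and the site path are placeholders for your setup.
0 2 * * *  cd /var/www/mysite && drush feeds-import product_import
```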

UPMikeD’s picture

Replace is much faster, but you only get one product per product display.

Obvious when you think about it.

I'll try Drush next, but I'm not holding my breath, as the code must be the same?

Mike.

pcambra’s picture

Replace is much faster, but you only get one product per product display.
Obvious when you think about it.

Yeah, it's obvious if the slowness comes from the accumulative update, not so obvious if it comes from Feeds itself.

The Drush option will let you script the import and run it from cron.

UPMikeD’s picture

Thanks for the tips. Your help is much appreciated!

What would it take for someone to look at a refactor to improve performance? Do you think it's a big job?

Cheers,
Mike

pcambra’s picture

Not sure, actually. Would you be willing to provide some sample data for testing? Have you noticed any threshold in the load times?

UPMikeD’s picture

I've not done extensive tests, but will do so tomorrow. I suspect the products that have many SKUs are the issue.

I can provide some test data once I've pinpointed where the stress is. I'll post an update tomorrow.

Thanks,
Mike.

UPMikeD’s picture

Attached: three new files (29.84 KB, 30.52 KB, 8.91 KB).

Test results using cut-down data:

100 products, 1 product display: 31 s
100 products, 100 product displays: 22 s
1000 products, 1 product display: 901 s
1000 products, 1000 product displays: 51 s

Tests were run on Commerce Kickstart 7.x-1.2, on Windows.
You'll need to create a new product type and product display type to match the CSV that includes Range.
All tests used "Update existing products", via the interactive upload on the import page.
Nodes and products were deleted before each test.
The actual parameter variations for size have been removed from the CSVs for simplicity.

CSV files and screen grabs of import setup attached.

Cheers,
Mike

pcambra’s picture

Priority: Major » Normal
Status: Active » Needs work

Thanks for the test data!

I'll do some benchmarking and see if we can improve the multiple-products importer.

UPMikeD’s picture

Excellent - thanks for your help.

Let me know if you need anything else.

Mike.

Mark Groves’s picture

Hi Pedro,

I am working with Mike on this project, which depends on the multiple-products importer. This will become a bigger issue for us as the project progresses. Is there a way we can accelerate the fix, perhaps by sponsoring it?

Let me know if you need any more details or test data.

Many thanks

Mark

pcambra’s picture

OK, I gave this a deeper test; here are my results.

I can positively say that this performance issue is not due to the commerce product reference processor; it probably adds some overhead, but it's not the main source of the slowness.
You can test this yourself by modifying the importer in #8 to remove the commerce product field and import just the title: you get more or less the same behaviour. See #631962: FeedsNodeProcessor: Update when changed for more background.

What I'd suggest for this case is to build an alternative FeedsNodeProcessor that avoids some of the node_load()s and extra queries; see entityLoad(), existingEntityId() and entitySave().
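A minimal sketch of that idea (illustrative only, not working code: the class name and the `skip` flag are hypothetical, and the method signatures should be checked against FeedsNodeProcessor in the Feeds release in use before implementing):

```php
<?php

/**
 * Sketch of a leaner node processor, per the suggestion above.
 * Hypothetical example: verify method signatures against the
 * FeedsNodeProcessor class in your Feeds release.
 */
class LeanProductDisplayProcessor extends FeedsNodeProcessor {

  /**
   * Look up an existing display by id; the aim would be a single indexed
   * query here instead of loading whole nodes to compare them.
   */
  protected function existingEntityId(FeedsSource $source, FeedsParserResult $result) {
    // Placeholder: falls through to the parent until a direct query
    // (e.g. against the feeds_item table) is written.
    return parent::existingEntityId($source, $result);
  }

  /**
   * Avoid node_save() when the accumulated product references did not
   * actually change since the last import run.
   */
  public function entitySave($entity) {
    if (!empty($entity->feeds_item->skip)) {
      // Hypothetical flag, set earlier when no change was detected.
      return;
    }
    parent::entitySave($entity);
  }
}
```

The saving here would come from cutting one full node_load()/node_save() round trip per CSV row down to one per changed display.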

pcambra’s picture

Status: Needs work » Fixed

Marking this as fixed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.