The user should be able to create a feed from the command line and refresh it.

CommentFileSizeAuthor
#153 interdiff-608408-147-153.txt2.49 KBmegachriz
#153 feeds-drush-integration-608408-153.patch28.59 KBmegachriz
#147 interdiff-608408-145-147.txt555 bytesmegachriz
#147 feeds-drush-integration-608408-147.patch28.39 KBmegachriz
#145 interdiff-608408-144-145.txt4.61 KBmegachriz
#145 feeds-drush-integration-608408-145.patch28.37 KBmegachriz
#144 diff-5.txt2.5 KBmegachriz
#144 diff-4.txt982 bytesmegachriz
#144 diff-2-3.txt1.79 KBmegachriz
#144 diff-1.txt1.87 KBmegachriz
#144 feeds-drush-integration-608408-144.patch26.88 KBmegachriz
#141 interdiff-608408-140-141.txt1.14 KBmegachriz
#141 feeds-drush-integration-608408-141.patch24.08 KBmegachriz
#140 interdiff-608408-139-140.txt10.53 KBmegachriz
#140 feeds-drush-integration-608408-140.patch23.94 KBmegachriz
#1 feeds.drush_.inc_.txt6.57 KBivanbueno
#3 feeds.drush_.inc_.txt6.67 KBivanbueno
#11 feeds-cron-alpha14.patch3.16 KBivanbueno
#11 feeds.drush_.inc_.txt3.88 KBivanbueno
#14 feeds.drush_.inc_.txt3.91 KBivanbueno
#14 feeds-cron-alpha14.patch2.6 KBivanbueno
#18 feeds.drush_.inc_.txt4.06 KBivanbueno
#19 feeds.drush_.inc_.txt3.46 KBericduran
#21 feeds.drush_.inc_.txt3.74 KBericduran
#23 608408-feeds_drush_integration-23.patch4.45 KBpcambra
#27 608408-feeds_drush_integration-27.patch9.36 KBpcambra
#29 608408-feeds_drush_integration-29.patch11.02 KBpcambra
#31 608408-feeds_drush_integration-31.patch11.35 KBpcambra
#39 feeds-profile-settings.png48.06 KBshaundychko
#41 608408-feeds_drush_integration-41.patch10.09 KBmvc
#42 feeds.drush_.inc_.txt10.04 KBsir_squall
#43 feeds_drush_integration-608408-42.patch10.52 KBtommyk
#45 feeds-drush-integration-608408-45.patch10.69 KBclemens.tolboom
#55 608408-45vs55-interdiff.patch3.12 KBandypost
#55 608408-feeds_drush_d6-55.patch11.08 KBandypost
#55 608408-31vs55-interdiff.patch7.8 KBandypost
#55 608408-feeds_drush_d7-55.patch12.11 KBandypost
#57 55vs57.interdiff.txt512 bytesandypost
#57 608408-feeds_drush_d7-57.patch12.11 KBandypost
#71 608408-interdiff-d7-54vs71.txt4.17 KBandypost
#71 608408-feeds_drush_vd7-71.patch12.1 KBandypost
#71 608408-interdiff-d6-54vs71.txt4.06 KBandypost
#71 608408-feeds_drush_vd6-71.patch11.08 KBandypost
#77 feeds-drush-integration-608408-77.patch12.69 KBjurriaanroelofs
#80 interdiff.txt2.79 KBqueenvictoria
#80 feeds-drush-integration-608408-80.patch12.01 KBqueenvictoria
#81 interdiff.txt3.36 KBqueenvictoria
#81 feeds-drush-integration-608408-81.patch14.52 KBqueenvictoria
#82 feeds-drush-integration-608408-81-82-interdiff.txt3.63 KBqueenvictoria
#82 feeds-drush-integration-608408-82.patch15.28 KBqueenvictoria
#84 feeds-drush-integration-608408-81-82-interdiff.txt3.69 KBqueenvictoria
#84 feeds-drush-integration-608408-83.patch15.3 KBqueenvictoria
#87 interdiff.txt1.58 KBmparker17
#87 feeds-drush_integration-608408-87.patch16.08 KBmparker17
#92 interdiff.txt822 bytesmparker17
#92 feeds-drush_integration-608408-92.patch16.2 KBmparker17
#102 feeds-drush_integration-608408-102.patch16.21 KBagnese.stelce
#102 interdiff-feeds-drush_integration-608408-92-102.txt626 bytesagnese.stelce
#103 feeds-drush_integration-608408-103.patch16.58 KBalan-ps
#103 interdiff-608408-102-103.txt3 KBalan-ps
#109 feeds-drush_integration-608408-109.patch14.6 KBtwistor
#110 interdiff.txt1.69 KBtwistor
#110 feeds-drush_integration-608408-110.patch14.57 KBtwistor
#113 interdiff.txt1.75 KBtwistor
#113 feeds-drush_integration-608408-113.patch15.36 KBtwistor
#128 feeds-drush_integration-608408-128.patch15.48 KBkenorb
#129 interdiff-feeds-drush_integration-608408-113.patch.txt7 KBkenorb
#135 feeds-drush-integration-608408-135.patch15.48 KBmegachriz
#135 interdiff-608408-128-135.txt8.87 KBmegachriz
#136 feeds-drush-integration-608408-136.patch15.37 KBmegachriz
#136 interdiff-608408-135-136.txt6.32 KBmegachriz
#137 feeds-drush-integration-608408-137.patch15.38 KBmegachriz
#137 interdiff-608408-136-137.txt2.08 KBmegachriz
#138 feeds-drush-integration-608408-138.patch17.8 KBmegachriz
#138 interdiff-608408-137-138.txt12.44 KBmegachriz
#139 feeds-drush-integration-608408-139.patch17.89 KBmegachriz
#139 interdiff-608408-138-139.txt7.08 KBmegachriz

Comments

ivanbueno’s picture

Version: » 6.x-1.0-alpha11
Status: Active » Needs review
StatusFileSize
new6.57 KB

I created some drush commands for the feeds module. (The file is coded for Drush 3.)

The commands available are:

drush feeds-config
* Displays all active importers or displays the config of a given importer (passed as arg).

drush feeds-refresh
* Refreshes a feed based on its schedule.

drush feeds-queue
* Adds a scheduled feed to the drupal_queue. (Needs to be run in conjunction with "drush queue cron".)

However, the feature needed above is not satisfied yet.

blakehall’s picture

Status: Needs review » Needs work

I gave this a quick test, it looks like the schema has changed. (last_scheduled_time -> last_executed_time)

Also running drush feeds-refresh gave me the following error:

Exception: Empty configuration identifier. in sites/all/modules/feeds/includes/FeedsConfigurable.inc on line 56

Marking code needs work, until I can give this further testing / work.

ivanbueno’s picture

Version: 6.x-1.0-alpha11 » 6.x-1.0-alpha12
Status: Needs work » Needs review
StatusFileSize
new6.67 KB

Here's an updated version for alpha12, where the schema changed happened. I haven't tested this yet for alpha14. Let me know how it goes.

alex_b’s picture

Version: 6.x-1.0-alpha12 » 6.x-1.x-dev
Status: Needs review » Needs work

I do not understand the intention of the patch in #3. Let me write out what it does to make sure we're on the same page:

1. drush feeds-refresh [importer_id] imports all feed subscriptions of a specific importer id IF they are up for a scheduled import.
2. drush feeds-queue [importer_id] queues all feed subscriptions of a specific importer id for import IF they are up for a scheduled import.

Right?

What I do not understand is:

Why do you not get away with writing a very simple wrapper for invoking feeds_scheduler()->cron(); ? This is essentially what's invoked on feeds_cron(). The logic of whether something is being queued or worked off immediately is present in FeedsScheduler::cron().

alex_b’s picture

ivanbueno: I knew we've started this discussion already #717390-12: What's the best way to refresh feeds faster with the Feeds module? - still my questions in #4 remain the same.

ivanbueno’s picture

alex_b: I did not use FeedsScheduler::cron() because it will act upon scheduled imports of all importer id's. I don't see a way to target a specific importer id. Let me know of the alternative or if I'm missing something.

alex_b’s picture

"because it will act upon scheduled imports of all importer id's"

Why is this a problem?

ivanbueno’s picture

What's executing the import for a particular importer is a daemon script that runs every 15 seconds. (We're not using cron to import the feeds.) When that importer runs, we don't want any other importers to run that might slow the process. The shell script has more control on which importer to run based on the importance, load, and nature of the importer.

On a side note, pubsubhubbub would have solved our problems, except we're not dealing with rss/atom feeds. It's where we want to go though.

alex_b’s picture

When that importer runs, we don't want any other importers to run that might slow the process.

I see. I'll assume now that these other importer's impact is not negligable. Here is how your use case needs to be addressed:

1) Add optional parameter importer_id to FeedsScheduler::cron(). If given, only work off subscriptions of feeds using importer_id.
2) Write a drush feeds-cron command that accepts an optional importer_id string that is passed on to FeedsScheduler::cron().
3) Optionally, introduce a Drupal variable that controls whether feeds_cron() invokes FeedsScheduler::cron() or not. This could allow you to shut off cron scheduling on cron.php altogether and move it to a separate process triggered by drush.

1) and 3) are a different issue than this one, 2) could be part of this issue.

On a side note, pubsubhubbub would have solved our problems, except we're not dealing with rss/atom feeds. It's where we want to go though.

What feeds are you dealing with? Pubsubhubbub is moving towards a content-type independent specification.

ivanbueno’s picture

Ok, 1, 2, and 3, works for my setup. I'll write a patch for 1 and 2. For #3, I'm not sure yet how to best handle that. (Additionally, I think the drush import process should use its own semaphore, instead of variable feeds_scheduler_cron... for cases where people have a drush script and cron running simultaneously.)

What feeds are you dealing with? Pubsubhubbub is moving towards a content-type independent specification

We're working with NewsML files (an xml standard for news press releases), where each post entry is an xml file. Content-type independent for pubsubhub would be great!

ivanbueno’s picture

Status: Needs work » Needs review
StatusFileSize
new3.16 KB
new3.88 KB

Please review this patch on feeds_scheduler::cron(). It's modified to cron($feed_name = NULL, $semaphore_type = NULL, $bypass_queue = FALSE).

$feed_name is added so cron can run a specific importer.
$semaphore_type is added so drush or any other caller can run their own cycles.
$bypass_queue is added so drush has an option to bypass cron-queue even if drupal_queue is enabled.

The drush file has been updated, and much cleaner:

% drush feeds-cron
Refreshes a feed right away. Uses feeds_scheduler::cron();

% drush feeds-queue
Adds a feed to queue. Nees "drush queue-cron" needs to be run afterwards. Also uses Uses feeds_scheduler::cron();

% drush feeds-config
Get all active importers for a site, or get details about an importer.

TODO:
Adding $bypass_drupal_cron for setting an importer to be ignored by drupal cron.

TESTED ON:
* Ran cron.php without drupal_queue
* Ran cron.php with drupal_queue, then drush queue-cron
* Ran drush feeds-cron
* Ran drush feeds-queue, then drush queue-cron

Thanks. Let me know if I'm missing anything.

alex_b’s picture

Status: Needs review » Needs work

- Why have an option to bypass the queue? If queue should not be used, drupal_queue module can be shut off.
- $feed_name should be $importer_id
- $semaphore_type can break the scheduler if callers use different semaphore types but work on the same importers. Let's not add different types of semaphores at all. This is what the queue is for.

In general, I'd love to keep it simple. Blowing up options on this sort of functionality is asking for trouble...

ivanbueno’s picture

I'm ok with importer_id and removal of semaphore_type. I need bypass_queue because we have other modules dependent on the drupal_queue module. We can't just shut it off to disable the feeds-queue.

ivanbueno’s picture

Status: Needs work » Needs review
StatusFileSize
new3.91 KB
new2.6 KB

here are the updated patches:

* $feed_name renamed to $importer_id
* $semaphore_type is removed.

I hope $bypass_queue stays. I need it.

Let me know if this works or not.

alex_b’s picture

I need bypass_queue because we have other modules dependent on the drupal_queue module. We can't just shut it off to disable the feeds-queue.

Hm. I guess what I'd like to know is why do you need to *not* use it? drush feeds-cron followed by drush queue-cron should do the job, doesn't it?

BTW, FeedsScheduler can be extended and the extending class can be injected by setting the feeds_scheduler_class variable (see FeedsScheduler::instance()).

It almost sounds like you have such a special use case that overriding FeedsScheduler to accomodate it makes a lot of sense...

ivanbueno’s picture

Hm. I guess what I'd like to know is why do you need to *not* use it? drush feeds-cron followed by drush queue-cron should do the job, doesn't it?

We need "drush feeds-cron" to be as atomic as possible, without any other processes that might slow it down. "drush queue-cron" might come with other scheduled jobs that could potentially slow down the refresh.

If there's no $bypass_queue, "drush feeds-cron" is not needed. "drush feeds-queue" would suffice because "drush queue-cron" is needed to be run anyways.

I don't think this is a special use case. It applies for anyone who will be using the drush commands because of performance/tight-scheduling reasons. Having "drush feeds-cron" and "drush feeds-queue" gives the shell script more options and control.

alex_b’s picture

Component: User interface » Code
Status: Needs review » Needs work

I don't plan to commit an option to bypass the queue - at least not in the suggested implementation. We should add a drush cron command that updates a single feed through the command line though - but not by calling through FeedsScheduler, but by calling FeedsSource::import() directly (see import menu callbacks).

ivanbueno’s picture

Status: Needs work » Needs review
StatusFileSize
new4.06 KB

here's a simplified drush for refreshing feeds. please review.

ericduran’s picture

StatusFileSize
new3.46 KB

I needed something like this.

This is a bit different than the other patches on this thread.

This has 3 drush commands feeds-list, feeds-import, and feeds-clear.

Feeds-list is a table display that displays all the feeds and their options such as name,description attached to, status, and state.

Feeds-import takes in the feed_name and does an import ignoring the schedular, also has the nid options if needs.

Feeds-clear is the same as import except it clears.

Sorry for being lazy and not dealing with fake cvsadd :-(

pcambra’s picture

suscribe

ericduran’s picture

StatusFileSize
new3.74 KB

Here's an update to my previous patch.

This time around it provides output to the screen of the batch process.

jelle_s’s picture

subscribe

pcambra’s picture

StatusFileSize
new4.45 KB

I've tested the file and works fine for me, Drush is drupal version agnostic so this patch could be applied both to D6 and D7 branches.

Here is the patch attached with git format and I've also added support for importing a file from drush into feeds (option file).

I don't like the way this patch import the feeds calling import() method directly, I think we should code it using batch api, see http://drupal.org/node/873132 I'll give it a try in a while but meantime this is fucntional enough.

ericduran’s picture

@pcambra, feeds already does the batch processing. Also see alex_b comment on #17.

pcambra’s picture

@ericduran, but why not use batch api with drush calling feeds_batch, I'll give it a try so it is more clear what I mean.

ericduran’s picture

@pcambra, I believe that the import will call feeds_batch itself. I might be wrong but that was my understanding.

pcambra’s picture

StatusFileSize
new9.36 KB

I don't think that import() method uses batch at all that's why you need to loop over it, here is a patch using drush and batch, also adds commands for deleting and reverting feeds and also enable/disable. When this gets into drush #1186480: Make the batch api benefit of backend show progress automatically. user will get better messages when doing batch operations.

shaundychko’s picture

Status: Needs review » Needs work

Wow, perfect timing. Thank you very much for this.

Bug: while the --nid option works great, I can't get the --file option to work.

I tried:
drush feeds-import res_core_user --file=sites/default/files/feeds/department.csv

resulting in the error message:
File sites/default/files/feeds/dept-commas-newemail_2.csv is not accessible.

Feeds is remembering a file name I uploaded previously, and doesn't seem to be accepting the new --file option. I also tried an absolute path. (and tried putting it in quotes...)

The workaround for me is to manually create an import node, which remembers the correct filename, and pass --nid=## to this drush command.

pcambra’s picture

Status: Needs work » Needs review
StatusFileSize
new11.02 KB

Ok, I see why now, here is the error fixed for #28, but as we don't really upload a file, we need to fake it, so if the file already exists in public://feeds, feeds-import will respect the older file, and if it doesn't, it will create it.
Probably it would make sense to check the file extension according to the feed config.

pcambra’s picture

Status: Needs review » Needs work

Discussing the subject with jonhattan, it seems that we'd need to improve the argument and options validation with a validate function to be more robust.

Also it may appear that these two lines might not be required for drush to handle batch.

$batch =& batch_get();
$batch['progressive'] = TRUE;
pcambra’s picture

Status: Needs work » Needs review
StatusFileSize
new11.35 KB

Another version of the patch that checks the file extension and removes unnecessary lines for drush to process batch.

jackbravo’s picture

sub!

reecemarsland’s picture

sub

tema’s picture

+1

andypost’s picture

#31 works, but there's no ability to see error log of import - drush finished silent

EDIT: errors are printed to stdout, Tested on D7!

brad.bulger’s picture

downloaded the patch from #31 and used it with an installation of Feeds 6.x-1.0-beta11 - works great!

we use the standalone form for importing - i added a "url" option to feeds-import and added this code to process it:

  elseif ($url = drush_get_option('url')) {
    $config = $feedsSource->getConfig();
    $config['FeedsHTTPFetcher']['source'] = $url;
    $feedsSource->setConfig($config);
    $feedsSource->save();
  }

seems to work OK.

i expect that needs some kind of validation as well, plus i wasn't sure about the naming conventions - if the argument should be "uri" not "url" - but though this might be of use to someone

shaundychko’s picture

When trying #31 I received the output:

Only files with the following extensions are allowed: .

When putting print_r($fetcher_config); on line 149 of feeds.drush.inc the output is:
Array
(
[direct] => 1
)
Only files with the following extensions are allowed: .

The extension is looking for $fetcher_config['allowed_extensions'], which doesn't exist in the $fetcher_config array, and which I can't find on the feeds config UI (outside of setting the parser to CSV file).

In any case, the --nid= option continues to work, but things would be a little more slick if I could get rid of these content types that exist only for importing, and instead use the --file option. Thanks for this improvement to feeds, as it makes a really important part of my project possible (regular data import triggered by cron, from a file).

pcambra’s picture

#37 you configure the allowed extensions in your feeds importer configuration, in the file upload settings, but maybe the patch can be improved to allow all the extensions if none is set.

shaundychko’s picture

StatusFileSize
new48.06 KB

Hi pcambra, thanks a lot for getting back to me. If I should open a new issue, please let me know, but perhaps the settings you're referring to are in D7? I'm on D6 (and this issue is tagged D6), and I've attached my file upload settings, using the latest feeds 6.x-1.0-beta11. There doesn't appear to be any place to configure allowed file extensions, or am I missing something?

mvc’s picture

Version: 6.x-1.x-dev » 7.x-2.x-dev

The patch in #31 looks like D7 code to me, for example:

$feed_dir = 'public://feeds';
file_prepare_directory($feed_dir, FILE_CREATE_DIRECTORY | FILE_MODIFY_PERMISSIONS);

file_prepare_directory() is a Drupal 7 function.

mvc’s picture

StatusFileSize
new10.09 KB

Here's a version of #31 that works for feeds 6.x, and bails with an error if the user attempts to import a file which doesn't exist.

sir_squall’s picture

StatusFileSize
new10.04 KB

Hello,

I have added the functionality to import by http, and not only by a local file.

tommyk’s picture

StatusFileSize
new10.52 KB

The added option to import via --http= parameter in #42 works for me on Drupal 6. Attached is the file from #42 in patch form.

tommyk’s picture

I've done some testing with this on Drupal 6 and have some questions. Here is my scenario:

---

* Nodes initially created via Import button on feed node.

Adding context and fields to the XPath XML parser settings on a feed importer and both detaching it from and leaving it attached to the content type that also contains the context and fields seems to creates duplicate nodes upon first import via drush with --http= option. Subsequent drush imports do not create duplicate nodes, but update the drush-created nodes. Something triggers new node creation when the importer changes that significantly.

* Users initially created via Import button on feed node.

Adding context and fields to the XPath XML parser settings on a feed importer and both detaching it from and leaving it attached to the content type that also contains the context and fields does not duplicate users upon first import via drush with --http= option.

---

Can anyone explain why drush feeds-import will not update existing nodes that were created with the import button on a feed node? I don't think it has to do with using Feeds XPath Parser, but could it? It's not the end of the world if I have to delete my already created nodes and start over. It is strange though that users are not affected in the same way.

clemens.tolboom’s picture

StatusFileSize
new10.69 KB

Attached patch contains minor fixes:
- Code style
- argument check
- importers count check

But I did not succeed in importing a file on a Drupal 7 site.

drush feeds-import test --verbose
drush feeds-import test --verbose --http=build2be.com/rss.xml

and the following fails miserably with error

drush feeds-import test --verbose --file=~/Downloads/my-data.csv 

PHP Fatal error:  Call to undefined function file_create_path() in /.../www/sites/default/modules/feeds/feeds.drush.inc on line 157

So the code is not Drupal 7 compatible as that function is D6: http://api.drupal.org/api/drupal/includes!file.inc/function/file_create_...
Using the feeds_ui everything is working fine.

pcambra’s picture

Status: Needs review » Needs work

There's no possible way of maintaining a single patch that fits both for D6 and D7.

The patch I posted in #31 was D7 compatible and doesn't use file_create_path

What I propose here is to attach 2 files, one for D6 and one for D7, or work in the D7 version and then backport, but this mix is clearly not going to work.

clemens.tolboom’s picture

Guess I took the wrong version then :( I took #43

This issue is about D7 so I hope someone takes the good things from #45 and fix it with code from other versions like #31

Unfortunately I have no time to fix this as other projects are running :(

lunk rat’s picture

Subscribing. What testing needs to be done to get this commited to 7.x?

If anyone can share their experience using this with 7.x; what patches, how it works, bugs to be fixed, etc. that would be helpful.

clemens.tolboom’s picture

@Lunk Rat : you should check #31 and #45 and make it work for D7.

lunk rat’s picture

I plan to, but I'm scared ;)

clemens.tolboom’s picture

Just give it a try ... we probably help you (if time permits)

andypost’s picture

Status: Needs work » Reviewed & tested by the community
Issue tags: +needs backport to 6.x

#45 Just fixes some comments and introduces '--http' import so let's commit it.

PS: #31 works in my production env from August 2011

lunk rat’s picture

Just used #31 to import a csv file into D7 with 4 passes using 4 different importers, run one-after-another, executed via a Python script. Everything worked great!

lunk rat’s picture

Info: I have found that #31 works much better with drush 5.1 when importing large csv files because of better batch-process handling in drush 5. I was able to import a csv with 320 rows and 20 columns no problem--that is, after I bumped up memory_limit in php.ini ...

andypost’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new3.12 KB
new11.08 KB
new7.8 KB
new12.11 KB

A combined patch for D7 & D6
Changes:
D7 - incorporated http fetcher from D6
both - a bit cleanup

okokokok’s picture

feeds.drush.inc in the d7 patch of #55 contains node_get_types() which is not working in d7.
Commenting it out and just putting dt('none') at least gives me some result for drush feeds-list

andypost’s picture

StatusFileSize
new512 bytes
new12.11 KB

@Kasper Souren thanx for review, fixed

mpv’s picture

Thank you! Worked great with http importer in drupal 7.

beatnikdude’s picture

Patch from #57 works great for me w/ 7.dev.
Edit: #57 v. #58

exeapps’s picture

Awesome work. No issues in D7.

Jason Dean’s picture

#57 is working for me, but...

I'm using the Unique Fields module solution as I need a specific field to be the unique target. This works fine when running the importer the usual ways, but the Drush feeds-import method doesn't respect the unique field, resulting in loads of duplicates.

I'll try to investigate this a bit further.

Jason Dean’s picture

Totally confused following my comment above.

The Unique Field patch simply adds an extra check to the loop that checks for existing feed item nodes (see last couple of quoted lines from FeedsNodeProcessor.inc) :

// Iterate through all unique targets and test whether they do already
    // exist in the database.
    foreach ($this->uniqueTargets($source, $result) as $target => $value) {
      switch ($target) {
        case 'nid':
          $nid = db_query("SELECT nid FROM {node} WHERE nid = :nid", array(':nid' => $value))->fetchField();
          break;
        case 'title':
          $nid = db_query("SELECT nid FROM {node} WHERE title = :title", array(':title' => $value))->fetchField();
          break;
        case 'feeds_source':
          if ($id = feeds_get_importer_id($this->config['content_type'])) {
            $nid = db_query("SELECT fs.feed_nid FROM {node} n JOIN {feeds_source} fs ON n.nid = fs.feed_nid WHERE fs.id = :id AND fs.source = :source", array(':id' => $id, ':source' => $value))->fetchField();
          }
          break;
        default:
          if (module_exists('unique_field')) {
            $scope = variable_get('unique_field_scope_'. $this->config['content_type'], UNIQUE_FIELD_SCOPE_TYPE);
            $nid = array_shift(unique_field_match_value($target, array(array('value' => $value)), variable_get('unique_field_scope_'. $this->config['content_type'], $scope)));
          }
      }

This works fine when using the Import button on the Feed node (updates existing), but not with Drush feeds-import command (imports all as new). Surely Drush feeds-import is using FeedsNodeProcessor.inc the same way?

twistor’s picture

Status: Needs review » Needs work
+++ b/feeds.drush.incundefined
@@ -0,0 +1,367 @@
+ * Implementation of hook_drush_command().

Implements

+++ b/feeds.drush.incundefined
@@ -0,0 +1,367 @@
+      'feed-name' => 'A space delimited list of feeds importers. Mandatory.',

importer

+++ b/feeds.drush.incundefined
@@ -0,0 +1,367 @@
+      'feeds-names' => 'A space delimited list of feeds importers . Mandatory.',

importers

+++ b/feeds.drush.incundefined
@@ -0,0 +1,367 @@
+    $fetcher_config = $feedsSource->importer->fetcher->getConfig();

Camel case?

+++ b/feeds.drush.incundefined
@@ -0,0 +1,367 @@
+    $regex = '/\.(' . preg_replace('/ +/', '|', preg_quote($fetcher_config['allowed_extensions'])) . ')$/i';

Why not pathinfo($filename, PATHINFO_EXTENSION)?

What about other types of fetchers?

twistor’s picture

It seems to me, that rather than custom coding support for the file and http fetcher, this should be made more generic.

attheshow’s picture

Patch #57 is working well for me.

attheshow’s picture

So strange. It was working for me yesterday to run commands like this:

drush @d7upgrade feeds-clear feed --nid=109
drush @d7upgrade feeds-import feed --nid=109

But today I'm getting errors on both:

Command feeds-import needs the following module(s) enabled to run:   [error]
feeds.
The drush command 'feeds-import feed' could not be executed.         [error]

I verified that both the feeds and feeds_ui modules ARE enabled for the site I'm working with. Any ideas what could be wrong?

markwk’s picture

Amazing work. Tested patch in #57 so that I can use drush to run the import and avoid timeout limits. I didn't do a code review or test all the parts but worked great.

patcon’s picture

I'll dig into it, but what are the thoughts on removing the file format restrictions on the command line? I would think this would be advanced users and automation attempts, so it seems a little overbearing for force.

In my case, I'm trying to set up a script that imports the first 10 lines of a file for demo purposes:

head -5 /path/to/my/file.csv | drush feeds-import my_feed_importer --file=/dev/stdin

Yes, a --limit flag is probably the prettiest way, but this is one of the many things that ppl can solve for themselves in the meantime if we just remove the barriers :)

4alldigital’s picture

Version: 7.x-2.x-dev » 7.x-2.0-alpha7
Component: Code » Feeds Import
Category: feature » support

First post on drupal.org, so shout if Im doing anything wrong.

Im using:
drush feeds-import --verbose my_importer_id path/to/file.csv

to import taxonomy terms into Drupal 7. When I do immediately after site installation (which is my goal), it does not work. When I run the importer in the UI (site.com/import/my_importer_id), that works fine. Then I delete imported terms using the UI and run :

drush feeds-import --verbose my_importer_id path/to/file.csv

again, and it works fine. So the UI must be doing something that the command line version isn't. any ideas?

Thanks

4alldigital’s picture

Priority: Normal » Minor
Status: Needs work » Fixed

my #69 issue resolved.

FYI - it was my own fault. I was not specifying the file correctly in the drush command.

drush feeds-import --verbose my_importer_id --file=path/to/file.csv

is what I should have been calling.

andypost’s picture

Version: 7.x-2.0-alpha7 » 7.x-2.x-dev
Category: support » task
Priority: Minor » Major
Status: Fixed » Needs review
StatusFileSize
new4.17 KB
new12.1 KB
new4.06 KB
new11.08 KB

New patch for D7, addresses mostly all #63
1) normalized parameter names
2) fixes camel-case

@twistor we should beware functions that does not respect locale, we have drupal_realpath() to work with miltibyte filenames - as php.net tells "pathinfo() is locale aware"

It seems to me, that rather than custom coding support for the file and http fetcher, this should be made more generic.

I suppose we should commit this to D7 as is and try to figure out a generic way in follow-up

@joe701 don't change status! Fixed ~ commited

markabur’s picture

Status: Needs review » Needs work

Cool, the feeds-import command is exactly what I was looking for. I noticed a few small things:

+  if (!isset($feed_name) || $feed_name == "") {

How about: if (empty($feed_name)) (There are two occurrences of this)

+ * Clears a given feed_name.
+ *
+ * The feed will be clear regardless of it's scheduled.

The feed will be cleared regardless of its schedule.

+ * Imports a given feed_name.
+ *
+ * The feed will be refreshed regardless of it's scheduled.

The feed will be refreshed regardless of its schedule.

+    // If the file already exists, create it in database if necessary and
+    // reuse it.

This would fit on one line.

+      // If the file doesn't exists, create it in files and database.

If the file doesn't exist...

markabur’s picture

Also, I'm confused by this behavior:

+ // If the file already exists, create it in database if necessary and
+ // reuse it.
+ if (file_exists($filename_schema)) {

I'm needing to re-import the same filename as before (this is a log file and now has more entries that I need to import), but this code is re-using the old locally-cached copy of the file. To me it would make more sense to assume the local file always needs to be replaced (otherwise why re-import it in the first place?) and delete it rather than re-use it. Or should I just delete the file manually whenever I'm done importing it? Or maybe the local file should be reused only if the remote file isn't newer?

mikran’s picture

feeds-list command description says

+ 'description' => "Displays all importers or displays the config of a given importer (passed as arg).",

but args are not accepted

-enzo-’s picture

Hello Guys

I did a personal implementation for Enable to import a individual feed via drush at http://drupal.org/node/1865210 with sandbox https://drupal.org/sandbox/enzo/1865202.

Please check if this is the same direction and if we can merge or let me know if this idea is a subset of current development.

andypost’s picture

@-enzo- can you roll a patch against current 7.x version of feeds?

jurriaanroelofs’s picture

Component: Feeds Import » Code
StatusFileSize
new12.69 KB

I added some code that allows you to override any configuration in JSON format. Not the prettiest solution but it beats writing a ton of glue code.

Example usage:

drush feeds-import my_importer --override='{"FeedsXPathCrawler":{"num_pages":"3","delay":"1"}}'

(don't forget the single quotes around the JSON string)

To see what configuration is available add the --verbose modifier to the feeds-import command and it will spit out the complete configuration in JSON which you can then edit and feed back to the override parameter. It will output something like this:

{"FeedsXPathCrawler":{"source":"https:\/\/www.example.com\/o\/profiles\/browse\/?q=drupal+-joomla","crawled":false,"num_pages":"2","delay":"1","first_run":"0","xpath":"(\/\/li[@class=\"next\"]\/a)[1]\/@href"},"FeedsQueryPathParser":[],"FeedsCrawler":{"crawled":true}}
wutu’s picture

I get this result:

drush feeds-import --verbose import_zbozi --file=/var/www/html/importISS/zbozi.xml
Initialized Drupal 7.22 root directory at /var/www/html/atok [notice]
Initialized Drupal site default at sites/default [notice]
Undefined index: allowed_extensions feeds.drush.inc:158 [notice]
Undefined index: allowed_extensions feeds.drush.inc:160 [notice]
Only files with the following extensions are allowed: .
Command dispatch complete

queenvictoria’s picture

Patch in #71 doesn't include the fix in #57 node_get_types -> node_type_get_types for D7

queenvictoria’s picture

StatusFileSize
new2.79 KB
new12.01 KB

Here is a patch and interdiff that addresses all the issues in #72, #74 and #79.
I've only done 7.x-2.x -- apologies I can't test D6 at the moment.

queenvictoria’s picture

Status: Needs work » Needs review
StatusFileSize
new3.36 KB
new14.52 KB

This is a patch (and Interdiff) based on the patch in #80. It is an attempt to work in the feature that @-enzo- was working on. It adds two new drush methods.

$ drush feeds-import-all [feed-name]
This imports all feeds importer instances (!?!). If a feed-name is specified then it only imports feeds in that importer. This is because my feeds are attached to a content type and feeds-import requires a feed-nid in order to import anything at all. With this command I can import everything all at once.

$ drush feeds-list-all [feed-name]
This lists (in a style similar to @-enzo- sandbox) all the feed importer instances (!?!?). I've added this because its a useful input if you are trying to feeds-import a single feed attached to a content type.

There are some issues.

The language. feeds importer instances. What is the correct term for these?

The method names. feeds-list-all overloads the feeds-list meaning and shows very different output. Should they be blended together? Should they be further separated?

The query. I've been looking high and low for the equivalent of feeds_importer_load_all for feed importer instances. In the end I've used @-enzo- db_query. There must be a better alternative using feeds-native calls.

queenvictoria’s picture

Well that one wasn't so good. Looks like calling feeds_drush_import directly causes some sort of recursion. I've switched to @-enzo- method. Patch and interdiff attached. I've also added a --limit argument so you can just import 100 at a time (for example).

Status: Needs review » Needs work

The last submitted patch, feeds-drush-integration-608408-82.patch, failed testing.

queenvictoria’s picture

Status: Needs work » Needs review
StatusFileSize
new3.69 KB
new15.3 KB

Oops. Limit cannot be zero.

micromegas’s picture

I got the following error testing with #84:

prompt$ drush feeds-import my_import --file=~/Projects/nightly-import/shipping-list-import083.csv
File public://feeds/shipping-list-import083.csv doesn't exist in feeds folder, creating it.                    [ok]
WD php: PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '' for key 'uri':  [error]
INSERT INTO {file_managed} (uid, filename, uri, filemime, filesize, status, timestamp) VALUES
(:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3,
:db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6); Array
(
    [:db_insert_placeholder_0] => 1
    [:db_insert_placeholder_1] =>
    [:db_insert_placeholder_2] =>
    [:db_insert_placeholder_3] => application/octet-stream
    [:db_insert_placeholder_4] => 0
    [:db_insert_placeholder_5] => 0
    [:db_insert_placeholder_6] => 1371590619
)
 in drupal_write_record() (line 7136 of /Users/user/Sites/test/includes/common.inc).
PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '' for key 'uri': INSERT INTO {file_managed} (uid, filename, uri, filemime, filesize, status, timestamp) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6); Array
(
    [:db_insert_placeholder_0] => 1
    [:db_insert_placeholder_1] =>
    [:db_insert_placeholder_2] =>
    [:db_insert_placeholder_3] => application/octet-stream
    [:db_insert_placeholder_4] => 0
    [:db_insert_placeholder_5] => 0
    [:db_insert_placeholder_6] => 1371590619
)
 in drupal_write_record() (line 7136 of /Users/user/Sites/test/includes/common.inc).
Drush command terminated abnormally due to an unrecoverable error.                                             [error]

When this occurs, the attempted-uploaded file is not present in sites/default/files/feeds.

The situation is that the content is created by importing one CSV file, and then a second CSV file is imported to update the changes in that content--in this case, it is a single user reference field. The changed file uploads fine through the import page, but does not when using drush feeds-import.

This importer was initially url-based, and then I later changed it to file-based for testing... in case that has to do with anything.

j0rd’s picture

I've posted another patch against enzo's drush_feeds_import. It now allows you to import all {feed_source}.id's or individual {feed_source}.id + {feed_source}.feed_nid

His modules been working very well for me, and is quite simple. You can apply this patch off his latest -dev.
https://drupal.org/node/1888356#comment-7661871

mparker17’s picture

StatusFileSize
new1.58 KB
new16.08 KB

I needed the ability to import a file on a remote site from a local computer, so I added a --stdin parameter to feeds-import so I could run drush @myremotesite feeds-import importer_name --stdin < some_file.csv.

kenorb’s picture

Testing by using XML files (feeds_xpathparser):

$ drush feeds-import page_import_xml --file /Users/kenorb/scripts/../data_migration/xml/pages-uk.xml
Only files with the following extensions are allowed: txt csv tsv xml opml html htm.

It seems that drush_get_option('file') returns 1

$ drush feeds-import page_import_xml --stdin < /Users/kenorb/scripts/../data_migration/xml/pages-uk.xml
Resource is not a file or it is an empty directory:                                                                                                [error]
The specified file temporary://fileUWOgKU could not be copied, because the destination directory is not properly configured. This may be caused by [error]
a problem with file or directory permissions. More information is available in the system log.

No watchdog errors, temp dir has the right permissions as it works in UI.

Maybe I'm doing it wrong.

Other commands are working like:

$ drush feeds-list
$ drush feeds-list-all

Tested with 7.x-2.0-alpha8 and dev version.

mparker17’s picture

@kenorb, the --stdin flag causes a temporary file to be created in a feeds folder in the private file system path (it saves the file to 'private://feeds/drush_feeds_' . $feed_name . '_' . time()).

Ensure that a private file system path is set and writeable by Drupal and Drush by going to /admin/config/media/file-system.

If that still doesn't work, try running the same command with the -v argument to make Drush run in verbose mode. You'll see a message saying Saved !num bytes to !filename.. Check that the file exists and contains the contents of your file.

If the file isn't there, Drupal and Drush probably cannot write to that path.

If the file is there but is empty, either something went wrong reading from STDIN, sending the file to the Drupal installation, or there's a filesystem permissions problem somewhere.

If the file is there and has the contents of the original file, something else went wrong. Try going to /import/page_import_xml... the file should be pre-populated. Can you import it there?

If there are any relevant watchdog logs or terminal output, please feel free to send them to me using my contact form. Let's not clutter up the main issue with my (potentially) buggy code if we can. :P

kenorb’s picture

Status: Needs review » Needs work

In UI private folder is set to: sites/default/protected
The temp file is created as /tmp/fileIp1jba and then I believe is copied to /private/tmp/fileZnF4mY and then it's copied to the final destination to:
sites/default/protected/feeds/drush_feeds_page_import_xml_1376335061
Anyway, the permissions are ok, but feeds directory doesn't exist, probably that's the reason.

So regarding stdin, probably the creation of dir is necessary:

file_prepare_directory('private://feeds', FILE_CREATE_DIRECTORY | FILE_MODIFY_PERMISSIONS);

before doing file_save_data().

--- a/public_html/sites/all/modules/feeds/feeds.drush.inc
+++ b/public_html/sites/all/modules/feeds/feeds.drush.inc
@@ -271,9 +271,11 @@ function drush_feeds_import($feed_name = NULL, $feed_nid = 0) {
   }
 
   if (drush_get_option('stdin')) {
+    $feed_dir = 'private://feeds';
     drush_log(dt('Importing !feed from stdin.', array('!feed' => $feed_name)), 'notice');
     $data = stream_get_contents(STDIN);
-    $file = file_save_data($data, 'private://feeds/drush_feeds_' . $feed_name . '_' . time());
+    file_prepare_directory($feed_dir, FILE_CREATE_DIRECTORY | FILE_MODIFY_PERMISSIONS);
+    $file = file_save_data($data, $feed_dir . $feed_name . '_' . time());

Tested and it works.

mparker17’s picture

@kenorb: Looks good to me! Thanks very much! I'll re-roll the patch.

mparker17’s picture

Status: Needs work » Needs review
StatusFileSize
new822 bytes
new16.2 KB

Re-rolled.

kenorb’s picture

Status: Needs review » Reviewed & tested by the community

Regarding:

$ drush feeds-import page_import_xml --file pages-uk.xml 
Only files with the following extensions are allowed: txt csv tsv xml opml html htm.

Drush actually requires --file=FILE, then it works.

Tested using file and stdin and it works fine.

kenorb’s picture

Title: Drush integration » Drush integration for Feeds
Status: Reviewed & tested by the community » Needs review

Also one question.

When importing the file with the same name (after changing it), it seems that it's ignoring the changed file and importing the existing one (from db).

$ drush feeds-import page_import_xml --file=xml/sample.xml
File public://feeds/sample.xml already exists in feeds folder, reusing it from there.                                                              [ok]
Created 2 nodes.                                                                                                                                   [status]

It's expected behaviour?

twistor’s picture

Category: task » feature
Status: Needs review » Needs work

The $feed_name variable, and all references in help text needs to be changed to $importer.

drush_feeds_import() is a giant mess and needs to be re-factored.

jmichelgarcia’s picture

Status: Needs work » Needs review
mparker17’s picture

Still passes! :D

johnennew’s picture

Issue summary: View changes
Status: Needs review » Active

This is working great - thanks for all this work!

Just one comment,

--file option on feeds-import fails if you pass a directory into it, but this is valid in the interface where you might import every feed present in a directory.

kenorb’s picture

Status: Active » Needs review

Please change it to appropriate status (Needs work/Needs review).

dbassendine’s picture

Thanks for this, it will be great to have these tools available in drush! The main commands all work well for me with a standalone importer - I haven't tested separate instances of the same importer.

However I did find when I set the --http option for drush feeds-import, I received a cURL timeout error:

drush feeds-import import_co2 --http="http://scrippsco2.ucsd.edu/data/in_situ_co2/monthly_mlo.csv"
cURL error (28) connect() timed out! for http://scrippsco2.ucsd.edu/data/in_situ_co2/monthly_mlo.csv

This is a known good URL, as it's the default URL used in the importer anyway.

megachriz’s picture

Status: Needs review » Needs work

According to comment #95, this issue needs work.

agnese.stelce’s picture

Added small change to patch on #92 - set the Feeds file folder to value stored on configuration not use hardcoded value.

alan-ps’s picture

Seems that import with --stdin parameter doesn't work correctly. The file will saved without extension (ie. drush_feeds_manual_page_import_1418654544). So, we can add extension as the --stdin value and save with file extension.

For example: drush feeds-import test --stdin=xml < test.xml

Tested using stdin and it works fine.

joelpittet’s picture

Status: Needs work » Needs review

Marking as needs review.

mglaman’s picture

Status: Needs review » Reviewed & tested by the community

Tested, and this works. We needed a way to clear and import feeds via Drush. This is a huge need for the module since Feeds handle a lot of data and most cases requiring command line operations.

goldii’s picture

#103 tested and works fine! Thanks!

b-prod’s picture

#103 seems to work fine with standalone importers. Not tested with importers attached to nodes.

forestmars’s picture

+1 RTBC, let's get this in!

twistor’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new14.6 KB

The stdin, http, and file arguments aren't going in this patch. It's too much and a very wrong approach. We can create a follow-up.

No interdiff since most lines have changed.

twistor’s picture

StatusFileSize
new1.69 KB
new14.57 KB

Minor fixes.

dillix’s picture

twistor, how can I run import of local csv file with this patch? Can you write an example, so I will test your patch on large data...

twistor’s picture

Status: Needs review » Postponed
Related issues: +#2531858: Add a FeedsSource::pushImport() method.

You can't at the moment.

Fine, we'll add that ability, but it needs to be done in a simple way for all configurations.

twistor’s picture

Status: Postponed » Active
StatusFileSize
new1.75 KB
new15.36 KB

@dillix, to import a csv file, you can do

drush feeds-import my_importer --nid 10 --file=path/to/file

twistor’s picture

twistor’s picture

Status: Active » Needs review
andypost’s picture

Issue tags: +Needs manual testing

Looks great! needs to be tested manually

ysaid’s picture

Hello @twistor,

I tried to import a csv file using drush feeds-import my_importer --file=path/to/file.

if nid option is not used, script get stuck in l.227 exception anyway. I think it's better to include the try/catch in a drush_get_option('nid') condition. What do you think ?

laVera’s picture

Hello:

I also run it with no nid (my feed came from features), it does actually do the import, but the process is never closed. My work around is to add a kill in the command, it not the cleanes, but does the work:
$drush feeds-import my_importer --file=path/to/file.csv & sleep 60 ; kill $!

But my huge issue, is that drush don't recognize the feed until I have used once trough web UI (The feed my_importer: 0 does not exist.). This defeat most of the propose of implementing drush for feeds, be able to automatice. It should have something to do that my feeds importers came from features?

Any ideas are welcome :)

johnny5th’s picture

This patch seems to not run my hook_feeds_after_import code, but it DOES run my hook_feeds_after_parse code. It runs fine straight from the GUI.
I'm expecting a node to be deleted when a cancel date is 30 days older than today, but it doesn't get removed until I run the importer from the GUI.

<?php

function dealer_feeds_feeds_after_import(FeedsSource $source) {
  if($source->id == "dealer_importer") {
    // Query all expired + 30 days fields for deletion
    $query = new EntityFieldQuery();
    $query->entityCondition('entity_type', 'node')
      ->entityCondition('bundle', 'dealer')
      ->fieldCondition('field_dealers_cancel_date', 'value', date('Y-m-d H:i:s', strtotime("-30 days")), '<=');
    $result = $query->execute();

    // Loop through results and save nids to an array
    $nodes_to_delete = array();
    foreach($result['node'] as $node) {
      $nodes_to_delete[] = $node->nid;
    }
    // Delete all selected nids
    node_delete_multiple($nodes_to_delete);
  }
}

function dealer_feeds_feeds_after_parse(FeedsSource $source, FeedsParserResult $result) {
  if($source->id == "dealer_importer") {
    foreach($result->items as &$item) {
      // Published by default
      $is_published = 1;

      // Check if a cancel date is set
      if(isset($item['dealer_published']) && !empty($item['dealer_published'])) {
        // Check if dealer cancel date is in the past
        if(strtotime($item['dealer_published']) < time()) {
          $is_published = 0; // Unpublish
        }
      }

      $item['dealer_published'] = $is_published;
    }
  }
}
csakiistvan’s picture

I try to run code with #113 patch, but it was failed:

Drush command terminated abnormally due to an unrecoverable error.                                            [error]
Error: Call to undefined method FeedsSource::pushImport() in
www/sites/all/modules/contrib/feeds/feeds.drush.inc, line 251
kebne’s picture

I have the same problem with #113 patch, seems like the method FeedsSource::pushImport () is missing in feeds.
Found that there's a patch to FeedsConfigurable.inc at https://www.drupal.org/node/2531858 which is supposed to add pushimport.
I could however not apply that patch with git. I'm using Feeds 7.x-2.0-beta1 which might be the culprit.

megachriz’s picture

@csakiistvan, kebne
Try to apply the patch on the dev version of Feeds, FeedsSource::pushImport() was added after the 7.x-2.0-beta1 release.

cmarcera’s picture

Updated Feeds yesterday and now my drush commands aren't working:

Initialized Drupal 7.41 root directory at /var/www/html          [notice]
Initialized Drupal site default at sites/default          [notice]
+Command feeds-import needs the following module(s) enabled to run: feeds.          [error]
The drush command 'feeds-import xml_files_to_stories' could not be executed.          [error]

Feeds is definitely enabled.

So far I've:

  • Run 'drush cc all'
  • Run 'drush updatedb'
  • Disabled and re-enabled Feeds

None of those things have helped. I'm stumped.

eiriksm’s picture

Status: Needs review » Needs work

I am using this, so wanted to see if we could get this in. But I am going to be a little party spoiler here and set back to needs work.

  1. +++ b/feeds.drush.inc
    @@ -0,0 +1,524 @@
    +    'examples' => array(
    +      'drush feeds-list-all' => 'List all instances of all feeds.',
    +      'drush feeds-list-all rss_feed' => 'List all feed instances of the rss_feed importer.',
    +    ),
    

    These examples are incorrect. feeds-list-all does not exist

  2. +++ b/feeds.drush.inc
    @@ -0,0 +1,524 @@
    +      'nid' => 'The nid of the Feeds importer that will be refreshed. Optional.',
    

    Here we are talking about importing, not refreshing

  3. +++ b/feeds.drush.inc
    @@ -0,0 +1,524 @@
    +      'nid' => 'The nid of the Feeds importer that will be clear. Optional.',
    

    Nitpick. Should probably be "that will be cleared"

Also, all of the commands and their descriptions, types and so on should ideally be wrapped in dt for max multilingual profit :)

Thanks for this great work, this has really helped me. Let's get this to RTBC! :)

Edit: Sorry, the second comment is about the description in the import command, that was not so obvious from the my comment :)

mikran’s picture

My importer is using file upload as source. Periodic import is set to 'Off' and when I run drush feeds-import my_importer --file=path/to/my.csv the import process is repeated indefinitely. I was expecting the file to be imported just once.

controla’s picture

Same as mikran, feed is importing correctly using --file=csv but it doesn't stop at the end and starts adding nodes again.
Other than that non-minor bug, this thing is working perfectly, and is hopefuly going to save my neck.

grahamc’s picture

Looping issue with --file=csv is not really part of the Drush integration, so I created a new issue #2624344: Import via pushImport() keeps looping / never completes (with possible fix)

kenorb’s picture

Status: Needs work » Needs review
Issue tags: -needs backport to 6.x
StatusFileSize
new15.48 KB

Added suggestions from #124

kenorb’s picture

jerry’s picture

Patch in #128 is working fine for me with 7.x-2.0-beta2.

johnny5th’s picture

Patch #128 seems to work on 7.x-2.0-beta2, though it updated all of my nodes when the GUI did not. After the second try, no new nodes were updated.

On my hook_feeds_after_import(), the SQL query I had timed out and threw a "gone away" error message. Feeds is processing a 30mb file taking 30 seconds to run, and my my.cnf settings were set to 30 seconds. After I upped my.cnf to 60 seconds, it ran fine. hook_feeds_after_import() was not having this issue from the GUI.

bès’s picture

Status: Needs review » Needs work

Hi, same problem than #118
drush feeds-import qa_importer --file=../ressources/QA.csv
return "The feed qa_importer: 0 does not exist." until I do the import once via the UI.

An exception is throw during drush_feeds_import() when testing the existance.
$source = feeds_source($importer, $feed_nid)->existing();

The FeedsSource existing method run this code :

$this->importer->existing();
return parent::existing();

The importer test is ok but the parent one fails with 'Object is not persistent.' exception probably because we don't use a node feed so FeedsSource consider our source as non persistent...

Anyway, this prevents any use via features.

Edit : maybe there is a link to : https://www.drupal.org/node/1197536

bès’s picture

megachriz’s picture

Assigned: Unassigned » megachriz

So looking at a patch for this issue for the first time, here is my review:

The drush commands are working great so far! I found a few small bugs. Furthermore, the documentation, UX and texts could use some improvement.

Details

  1. Overall
    • I don't think that dt() should be used in hook_drush_command(). At least, I don't see other modules doing that.
    • When using drush help [feeds-command] examples sometimes are listed with a zero, because there is no description for the example. All examples should have descriptions, in my opinion.
    • The order of the command definitions in feeds_drush_command() should match the function order. "feeds-delete" and "feeds-revert" come after "feeds-enable" and "feeds-disable" in feeds_drush_command(), but it is the other way around in the function definitions.
    • It would be great to have aliases for some commands.
  2. feeds-list-importers
    • (bug) "Attached to" is not shown when the importer is attached to a content type.
    • Other than that, this command works fine.
  3. feeds-list-feeds
    • (bug) When using the HTTP Fetcher, the source is not shown.
    • Other than that, this command works fine.
  4. feeds-import
    • A confirmation message would be nice and UX/texts could be improved. Other than that, I found no issues with this command so far.
  5. feeds-import-all
    • When one of the feed sources is "broken" it goes into a endless loop. Other than that, this command works fine. The broken feed source should be fixed in an other issue.
  6. feeds-clear
    • Before using it, it wasn't clear to me what this command did. It deletes all imported items, so we should document it like that.
    • A confirmation message would be nice and UX/texts could be improved. Other than that, I found no issues with this command.
  7. feeds-enable
    • Works good. Also nice that there is a warning when you try to enable an already enabled importer.
  8. feeds-disable
    • Works good, just like feeds-enable.
  9. feeds-delete
    • Deleting a feed importer that exists in the database works.
    • Deleting a feed importer that exists in code, but is overridden, is reverted instead. Probably OK.
    • (bug) Deleting a feed importer that exists in code has no effect. A warning should be displayed instead.
  10. feeds-revert
    • Reverting a feed importer works.
    • A warning message is shown when reverting a feed importer that exist in the database (and thus can't be reverted). This is good.

I have worked on improving the help texts and the UX and on fixing the bugs that I had found. Since that resulted in a lot of changes, I'm posting these in a series of patches, so it is easier to follow. Hang on...

megachriz’s picture

Status: Needs work » Needs review
StatusFileSize
new15.48 KB
new8.87 KB

Moved drush_feeds_delete() and drush_feeds_revert() to match the command definitions order in feeds_drush_command() with the function order. Minor improvement.

More to come...

megachriz’s picture

Removed dt() from command definitions.

megachriz’s picture

Fixing some bugs. drush_log() needs a second parameter in order to show up on the command line.

megachriz’s picture

Improved documentation, added drush_feeds_help() and improved messages.

megachriz’s picture

Renamed variable $importer to $importer_id at some places. This way $importer always represents an importer object and $importer_id always an importer ID.

Two more patches to come...

megachriz’s picture

Here are some bigger changes to improve the UX.

Confirmation messages are added to drush_feeds_import() and drush_feeds_clear(). When a combination of importer ID and feed NID cannot be found, it tries to be helpful to guide the user in the right direction. For example, when someone specifies a feed NID and the importer is not attached to a content type (so the importer does not work with feed nodes) it warns the user about that. The other way around it goes similar: when the importer is attached to a content type, the user is being asked to specify a feed NID.

In drush_feeds_delete(), if an importer cannot be deleted because it only exists in code, the user is informed about that is the reason that the importer was not deleted.

In drush_feeds_revert(), a more clear reason is given when an importer cannot be reverted. Because it is not overridden or because it does not exist in code.

When the commands for enable, disable, delete and revert are executed without an importer specified, a warning is shown that the user should specify that.

megachriz’s picture

And finally, some command aliases!

  • feeds-list-importers > feeds
  • feeds-list-feeds > feeds-lf
  • feeds-enable > feeds-en
  • feeds-disable > feeds-dis

For feeds-import and feeds-import-all I was thinking of using the aliases fi and fia, but I'm not sure if that would collide with drush commands defined by other modules. According to http://drushcommands.com/drush-7x/, at least the module Features does not use these aliases.

This was the last patch of the series ;).

megachriz’s picture

Assigned: megachriz » Unassigned
megachriz’s picture

StatusFileSize
new26.88 KB
new1.87 KB
new1.79 KB
new982 bytes
new2.5 KB

I've been testing the Drush patch some more and implemented the following improvements:

  1. Import file from current directory
    Previously, when supplying a file to import, the full absolute path to the file must be specified.
    I improved this by making it also possible to specify the file relative to the current directory. A function called _drush_feeds_find_file() is added for this. The drush function drush_cwd() is used for this solution. See diff 1.
  2. Don't require an existing feeds source when supplying a file or url
    I noticed that it wasn't possible to import from a file or url for an importer for which no feeds_source entry exist yet. This made its use limited: it wasn't possible to enable a feature module that contained an importer and immediately import content using that importer.
    I improved this in the following way: if no existing feeds source is found for an importer, the importer is not attached to a content type and either a file, url of "stdin" is specified, a new feeds source is created and saved. See diff 2-3.
  3. Cryptic error message when trying to initiate an import for an importer for which no source exists yet
    When trying to initiate an import for an importer using the standalone form and for which no source exist yet the following error message was shown:

    There is no feed node with ID 0 for importer 'my_importer'. To show a list of available feed nodes for this importer, use 'drush feeds-list-feeds my_importer'.

    I thought this error message made no sense for an importer not attached to a content type and changed it to this:

    No source exists yet for the importer 'my_importer'. There is nothing to import. You can import from a file or url using the options '--file' or '--url' or go to /import/my_importer to configure the source to import. See 'drush help feeds-import' for more information.

    See diff 2-3.

  4. Disabled importer is reported as non-existing
    When an importer is disabled, 'drush feeds-import' and 'drush feeds-clear' reported that the importer in question did not exist. I changed this message to the following:

    The importer 'my_importer' does not exist or is not enabled.

    See diff 4.

  5. Show titles of feed nodes in overview
    When listing the available feeds using the "drush feeds-list-feeds" command, I thought it would be useful to also display the titles of feed nodes, rather than only the feed node ID's. I adjusted this. See diff 5.
megachriz’s picture

I made some more improvements:

  • The drush command 'feeds-import-all' now asks for confirmation for each source to import. This is for consistency with similar drush commands from other modules, like the drush command 'features-revert-all' from the features module.
  • The drush command 'feeds-import' got 'feeds-im' as alias. The drush command 'feeds-import-all' got the aliases 'feeds-im-all' and 'feeds-ia'.
  • A constant is created for the default value for the option 'limit' (used by the commands 'feeds-list-feeds' and 'feeds-import-all').
  • A few corrections in the help texts.
g33kg1rl’s picture

This is exactly what I needed. I am using this patch now. :) I will let you know if I run into any issues, but so far so good! +1

megachriz’s picture

I noticed the menu's weren't updated after enabling a Feeds importer using drush feeds-enable. This resulted into a 404 page when navigating to the importer's log page.

New patch.

jm_drupal’s picture

Patch in #147 is working fine. Thanks MegaChriz. The drush command 'feeds-import' asks for confirmation, is it possible to add another option 'force' or sth similar to bypass this confirmation step? I require to run the drush command as a cron job included in my Crontab file.

nwom’s picture

@jm_drupal I run it as a cron job too, and just added "-y" to the end of the drush command. I hope this helps!

jm_drupal’s picture

@NWOM Thanks.

tedwyer’s picture

This is probably a dumb question but how do I apply this patch. It seems to patch a file that doesn't exist in my install. Thanks

nwom’s picture

@tedwyer: On a module's project page click on "Version control" to see how to download the newest dev via GIT. In this case the page is https://www.drupal.org/project/feeds/git-instructions. The branch you want to download is 7.x-2.x. It describes how to apply a patch as well.

This patch does add a new file indicated by "new file mode".

megachriz’s picture

Found one more bug: when supplying a file or url for an standalone importer for which no source exist yet, a source gets created, but the import is aborted anyway.

This issue has been open for long enough and I found the patch to be quite stable during the last 4 weeks (with only a few minor bugs found). So if the test results turn green, this patch will get in today!

  • MegaChriz committed 144a78b on 7.x-2.x
    Issue #608408 by MegaChriz, andypost, queenvictoria, pcambra, twistor,...
megachriz’s picture

Status: Needs review » Fixed

Committed #153.

Thanks all!

Finally this issue is done after six years, eight months and seven days!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.