Closed (fixed)
Project:
Feeds
Version:
7.x-2.x-dev
Component:
Code
Priority:
Major
Category:
Feature request
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
19 Oct 2009 at 11:27 UTC
Updated:
10 Jul 2016 at 19:54 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
ivanbueno commentedI created some drush commands for the feeds module. (The file is coded for Drush 3.)
The commands available are:
drush feeds-config
* Displays all active importers or displays the config of a given importer (passed as arg).
drush feeds-refresh
* Refreshes a feed based on its schedule.
drush feeds-queue
* Adds a scheduled feed to the drupal_queue. (Needs to be run in conjunction with "drush queue cron".)
However, the feature needed above is not satisfied yet.
Comment #2
blakehall commentedI gave this a quick test, it looks like the schema has changed. (last_scheduled_time -> last_executed_time)
Also running drush feeds-refresh gave me the following error:
Exception: Empty configuration identifier. in sites/all/modules/feeds/includes/FeedsConfigurable.inc on line 56Marking code needs work, until I can give this further testing / work.
Comment #3
ivanbueno commentedHere's an updated version for alpha12, where the schema changed happened. I haven't tested this yet for alpha14. Let me know how it goes.
Comment #4
alex_b commentedI do not understand the intention of the patch in #3. Let me write out what it does to make sure we're on the same page:
1. drush feeds-refresh [importer_id] imports all feed subscriptions of a specific importer id IF they are up for a scheduled import.
2. drush feeds-queue [importer_id] queues all feed subscriptions of a specific importer id for import IF they are up for a scheduled import.
Right?
What I do not understand is:
Why do you not get away with writing a very simple wrapper for invoking feeds_scheduler()->cron(); ? This is essentially what's invoked on feeds_cron(). The logic of whether something is being queued or worked off immediately is present in FeedsScheduler::cron().
Comment #5
alex_b commentedivanbueno: I knew we've started this discussion already #717390-12: What's the best way to refresh feeds faster with the Feeds module? - still my questions in #4 remain the same.
Comment #6
ivanbueno commentedalex_b: I did not use FeedsScheduler::cron() because it will act upon scheduled imports of all importer id's. I don't see a way to target a specific importer id. Let me know of the alternative or if I'm missing something.
Comment #7
alex_b commented"because it will act upon scheduled imports of all importer id's"
Why is this a problem?
Comment #8
ivanbueno commentedWhat's executing the import for a particular importer is a daemon script that runs every 15 seconds. (We're not using cron to import the feeds.) When that importer runs, we don't want any other importers to run that might slow the process. The shell script has more control on which importer to run based on the importance, load, and nature of the importer.
On a side note, pubsubhubbub would have solved our problems, except we're not dealing with rss/atom feeds. It's where we want to go though.
Comment #9
alex_b commentedI see. I'll assume now that these other importer's impact is not negligable. Here is how your use case needs to be addressed:
1) Add optional parameter importer_id to FeedsScheduler::cron(). If given, only work off subscriptions of feeds using importer_id.
2) Write a
drush feeds-croncommand that accepts an optional importer_id string that is passed on to FeedsScheduler::cron().3) Optionally, introduce a Drupal variable that controls whether feeds_cron() invokes FeedsScheduler::cron() or not. This could allow you to shut off cron scheduling on cron.php altogether and move it to a separate process triggered by drush.
1) and 3) are a different issue than this one, 2) could be part of this issue.
What feeds are you dealing with? Pubsubhubbub is moving towards a content-type independent specification.
Comment #10
ivanbueno commentedOk, 1, 2, and 3, works for my setup. I'll write a patch for 1 and 2. For #3, I'm not sure yet how to best handle that. (Additionally, I think the drush import process should use its own semaphore, instead of variable feeds_scheduler_cron... for cases where people have a drush script and cron running simultaneously.)
We're working with NewsML files (an xml standard for news press releases), where each post entry is an xml file. Content-type independent for pubsubhub would be great!
Comment #11
ivanbueno commentedPlease review this patch on feeds_scheduler::cron(). It's modified to cron($feed_name = NULL, $semaphore_type = NULL, $bypass_queue = FALSE).
$feed_name is added so cron can run a specific importer.
$semaphore_type is added so drush or any other caller can run their own cycles.
$bypass_queue is added so drush has an option to bypass cron-queue even if drupal_queue is enabled.
The drush file has been updated, and much cleaner:
% drush feeds-cron
Refreshes a feed right away. Uses feeds_scheduler::cron();
% drush feeds-queue
Adds a feed to queue. Nees "drush queue-cron" needs to be run afterwards. Also uses Uses feeds_scheduler::cron();
% drush feeds-config
Get all active importers for a site, or get details about an importer.
TODO:
Adding $bypass_drupal_cron for setting an importer to be ignored by drupal cron.
TESTED ON:
* Ran cron.php without drupal_queue
* Ran cron.php with drupal_queue, then drush queue-cron
* Ran drush feeds-cron
* Ran drush feeds-queue, then drush queue-cron
Thanks. Let me know if I'm missing anything.
Comment #12
alex_b commented- Why have an option to bypass the queue? If queue should not be used, drupal_queue module can be shut off.
- $feed_name should be $importer_id
- $semaphore_type can break the scheduler if callers use different semaphore types but work on the same importers. Let's not add different types of semaphores at all. This is what the queue is for.
In general, I'd love to keep it simple. Blowing up options on this sort of functionality is asking for trouble...
Comment #13
ivanbueno commentedI'm ok with importer_id and removal of semaphore_type. I need bypass_queue because we have other modules dependent on the drupal_queue module. We can't just shut it off to disable the feeds-queue.
Comment #14
ivanbueno commentedhere are the updated patches:
* $feed_name renamed to $importer_id
* $semaphore_type is removed.
I hope $bypass_queue stays. I need it.
Let me know if this works or not.
Comment #15
alex_b commentedHm. I guess what I'd like to know is why do you need to *not* use it? drush feeds-cron followed by drush queue-cron should do the job, doesn't it?
BTW, FeedsScheduler can be extended and the extending class can be injected by setting the feeds_scheduler_class variable (see FeedsScheduler::instance()).
It almost sounds like you have such a special use case that overriding FeedsScheduler to accomodate it makes a lot of sense...
Comment #16
ivanbueno commentedWe need "drush feeds-cron" to be as atomic as possible, without any other processes that might slow it down. "drush queue-cron" might come with other scheduled jobs that could potentially slow down the refresh.
If there's no $bypass_queue, "drush feeds-cron" is not needed. "drush feeds-queue" would suffice because "drush queue-cron" is needed to be run anyways.
I don't think this is a special use case. It applies for anyone who will be using the drush commands because of performance/tight-scheduling reasons. Having "drush feeds-cron" and "drush feeds-queue" gives the shell script more options and control.
Comment #17
alex_b commentedI don't plan to commit an option to bypass the queue - at least not in the suggested implementation. We should add a drush cron command that updates a single feed through the command line though - but not by calling through FeedsScheduler, but by calling FeedsSource::import() directly (see import menu callbacks).
Comment #18
ivanbueno commentedhere's a simplified drush for refreshing feeds. please review.
Comment #19
ericduran commentedI needed something like this.
This is a bit different than the other patches on this thread.
This has 3 drush commands feeds-list, feeds-import, and feeds-clear.
Feeds-list is a table display that displays all the feeds and their options such as name,description attached to, status, and state.
Feeds-import takes in the feed_name and does an import ignoring the schedular, also has the nid options if needs.
Feeds-clear is the same as import except it clears.
Sorry for being lazy and not dealing with fake cvsadd :-(
Comment #20
pcambrasuscribe
Comment #21
ericduran commentedHere's an update to my previous patch.
This time around it provides output to the screen of the batch process.
Comment #22
jelle_ssubscribe
Comment #23
pcambraI've tested the file and works fine for me, Drush is drupal version agnostic so this patch could be applied both to D6 and D7 branches.
Here is the patch attached with git format and I've also added support for importing a file from drush into feeds (option file).
I don't like the way this patch import the feeds calling import() method directly, I think we should code it using batch api, see http://drupal.org/node/873132 I'll give it a try in a while but meantime this is fucntional enough.
Comment #24
ericduran commented@pcambra, feeds already does the batch processing. Also see alex_b comment on #17.
Comment #25
pcambra@ericduran, but why not use batch api with drush calling feeds_batch, I'll give it a try so it is more clear what I mean.
Comment #26
ericduran commented@pcambra, I believe that the import will call feeds_batch itself. I might be wrong but that was my understanding.
Comment #27
pcambraI don't think that import() method uses batch at all that's why you need to loop over it, here is a patch using drush and batch, also adds commands for deleting and reverting feeds and also enable/disable. When this gets into drush #1186480: Make the batch api benefit of backend show progress automatically. user will get better messages when doing batch operations.
Comment #28
shaundychkoWow, perfect timing. Thank you very much for this.
Bug: while the --nid option works great, I can't get the --file option to work.
I tried:
drush feeds-import res_core_user --file=sites/default/files/feeds/department.csv
resulting in the error message:
File sites/default/files/feeds/dept-commas-newemail_2.csv is not accessible.
Feeds is remembering a file name I uploaded previously, and doesn't seem to be accepting the new --file option. I also tried an absolute path. (and tried putting it in quotes...)
The workaround for me is to manually create an import node, which remembers the correct filename, and pass --nid=## to this drush command.
Comment #29
pcambraOk, I see why now, here is the error fixed for #28, but as we don't really upload a file, we need to fake it, so if the file already exists in public://feeds, feeds-import will respect the older file, and if it doesn't, it will create it.
Probably it would make sense to check the file extension according to the feed config.
Comment #30
pcambraDiscussing the subject with jonhattan, it seems that we'd need to improve the argument and options validation with a validate function to be more robust.
Also it may appear that these two lines might not be required for drush to handle batch.
Comment #31
pcambraAnother version of the patch that checks the file extension and removes unnecessary lines for drush to process batch.
Comment #32
jackbravo commentedsub!
Comment #33
reecemarsland commentedsub
Comment #34
tema commented+1
Comment #35
andypost#31 works, but there's no ability to see error log of import - drush finished silent
EDIT: errors are printed to stdout, Tested on D7!
Comment #36
brad.bulger commenteddownloaded the patch from #31 and used it with an installation of Feeds 6.x-1.0-beta11 - works great!
we use the standalone form for importing - i added a "url" option to feeds-import and added this code to process it:
seems to work OK.
i expect that needs some kind of validation as well, plus i wasn't sure about the naming conventions - if the argument should be "uri" not "url" - but though this might be of use to someone
Comment #37
shaundychkoWhen trying #31 I received the output:
Only files with the following extensions are allowed: .
When putting print_r($fetcher_config); on line 149 of feeds.drush.inc the output is:
Array
(
[direct] => 1
)
Only files with the following extensions are allowed: .
The extension is looking for $fetcher_config['allowed_extensions'], which doesn't exist in the $fetcher_config array, and which I can't find on the feeds config UI (outside of setting the parser to CSV file).
In any case, the --nid= option continues to work, but things would be a little more slick if I could get rid of these content types that exist only for importing, and instead use the --file option. Thanks for this improvement to feeds, as it makes a really important part of my project possible (regular data import triggered by cron, from a file).
Comment #38
pcambra#37 you configure the allowed extensions in your feeds importer configuration, in the file upload settings, but maybe the patch can be improved to allow all the extensions if none is set.
Comment #39
shaundychkoHi pcambra, thanks a lot for getting back to me. If I should open a new issue, please let me know, but perhaps the settings you're referring to are in D7? I'm on D6 (and this issue is tagged D6), and I've attached my file upload settings, using the latest feeds 6.x-1.0-beta11. There doesn't appear to be any place to configure allowed file extensions, or am I missing something?
Comment #40
mvcThe patch in #31 looks like D7 code to me, for example:
file_prepare_directory() is a Drupal 7 function.
Comment #41
mvcHere's a version of #31 that works for feeds 6.x, and bails with an error if the user attempts to import a file which doesn't exist.
Comment #42
sir_squall commentedHello,
I have added the functionality to import by http, and not only by a local file.
Comment #43
tommyk commentedThe added option to import via
--http=parameter in #42 works for me on Drupal 6. Attached is the file from #42 in patch form.Comment #44
tommyk commentedI've done some testing with this on Drupal 6 and have some questions. Here is my scenario:
---
* Nodes initially created via Import button on feed node.
Adding context and fields to the XPath XML parser settings on a feed importer and both detaching it from and leaving it attached to the content type that also contains the context and fields seems to creates duplicate nodes upon first import via drush with
--http=option. Subsequent drush imports do not create duplicate nodes, but update the drush-created nodes. Something triggers new node creation when the importer changes that significantly.* Users initially created via Import button on feed node.
Adding context and fields to the XPath XML parser settings on a feed importer and both detaching it from and leaving it attached to the content type that also contains the context and fields does not duplicate users upon first import via drush with
--http=option.---
Can anyone explain why drush feeds-import will not update existing nodes that were created with the import button on a feed node? I don't think it has to do with using Feeds XPath Parser, but could it? It's not the end of the world if I have to delete my already created nodes and start over. It is strange though that users are not affected in the same way.
Comment #45
clemens.tolboomAttached patch contains minor fixes:
- Code style
- argument check
- importers count check
But I did not succeed in importing a file on a Drupal 7 site.
and the following fails miserably with error
So the code is not Drupal 7 compatible as that function is D6: http://api.drupal.org/api/drupal/includes!file.inc/function/file_create_...
Using the feeds_ui everything is working fine.
Comment #46
pcambraThere's no possible way of maintaining a single patch that fits both for D6 and D7.
The patch I posted in #31 was D7 compatible and doesn't use
file_create_pathWhat I propose here is to attach 2 files, one for D6 and one for D7, or work in the D7 version and then backport, but this mix is clearly not going to work.
Comment #47
clemens.tolboomGuess I took the wrong version then :( I took #43
This issue is about D7 so I hope someone takes the good things from #45 and fix it with code from other versions like #31
Unfortunately I have no time to fix this as other projects are running :(
Comment #48
lunk rat commentedSubscribing. What testing needs to be done to get this commited to 7.x?
If anyone can share their experience using this with 7.x; what patches, how it works, bugs to be fixed, etc. that would be helpful.
Comment #49
clemens.tolboom@Lunk Rat : you should check #31 and #45 and make it work for D7.
Comment #50
lunk rat commentedI plan to, but I'm scared ;)
Comment #51
clemens.tolboomJust give it a try ... we probably help you (if time permits)
Comment #52
andypost#45 Just fixes some comments and introduces '--http' import so let's commit it.
PS: #31 works in my production env from August 2011
Comment #53
lunk rat commentedJust used #31 to import a csv file into D7 with 4 passes using 4 different importers, run one-after-another, executed via a Python script. Everything worked great!
Comment #54
lunk rat commentedInfo: I have found that #31 works much better with drush 5.1 when importing large csv files because of better batch-process handling in drush 5. I was able to import a csv with 320 rows and 20 columns no problem--that is, after I bumped up memory_limit in php.ini ...
Comment #55
andypostA combined patch for D7 & D6
Changes:
D7 - incorporated http fetcher from D6
both - a bit cleanup
Comment #56
okokokok commentedfeeds.drush.inc in the d7 patch of #55 contains node_get_types() which is not working in d7.
Commenting it out and just putting dt('none') at least gives me some result for drush feeds-list
Comment #57
andypost@Kasper Souren thanx for review, fixed
Comment #58
mpv commentedThank you! Worked great with http importer in drupal 7.
Comment #59
beatnikdude commentedPatch from #57 works great for me w/ 7.dev.
Edit: #57 v.
#58Comment #60
exeapps commentedAwesome work. No issues in D7.
Comment #61
Jason Dean commented#57 is working for me, but...
I'm using the Unique Fields module solution as I need a specific field to be the unique target. This works fine when running the importer the usual ways, but the Drush feeds-import method doesn't respect the unique field, resulting in loads of duplicates.
I'll try to investigate this a bit further.
Comment #62
Jason Dean commentedTotally confused following my comment above.
The Unique Field patch simply adds an extra check to the loop that checks for existing feed item nodes (see last couple of quoted lines from FeedsNodeProcessor.inc) :
This works fine when using the Import button on the Feed node (updates existing), but not with Drush feeds-import command (imports all as new). Surely Drush feeds-import is using FeedsNodeProcessor.inc the same way?
Comment #63
twistor commentedImplements
importer
importers
Camel case?
Why not pathinfo($filename, PATHINFO_EXTENSION)?
What about other types of fetchers?
Comment #64
twistor commentedIt seems to me, that rather than custom coding support for the file and http fetcher, this should be made more generic.
Comment #65
attheshow commentedPatch #57 is working well for me.
Comment #66
attheshow commentedSo strange. It was working for me yesterday to run commands like this:
But today I'm getting errors on both:
I verified that both the feeds and feeds_ui modules ARE enabled for the site I'm working with. Any ideas what could be wrong?
Comment #67
markwk commentedAmazing work. Tested patch in #57 so that I can use drush to run the import and avoid timeout limits. I didn't do a code review or test all the parts but worked great.
Comment #68
patcon commentedI'll dig into it, but what are the thoughts on removing the file format restrictions on the command line? I would think this would be advanced users and automation attempts, so it seems a little overbearing for force.
In my case, I'm trying to set up a script that imports the first 10 lines of a file for demo purposes:
Yes, a
--limitflag is probably the prettiest way, but this is one of the many things that ppl can solve for themselves in the meantime if we just remove the barriers :)Comment #69
4alldigital commentedFirst post on drupal.org, so shout if Im doing anything wrong.
Im using:
drush feeds-import --verbose my_importer_id path/to/file.csvto import taxonomy terms into Drupal 7. When I do immediately after site installation (which is my goal), it does not work. When I run the importer in the UI (site.com/import/my_importer_id), that works fine. Then I delete imported terms using the UI and run :
drush feeds-import --verbose my_importer_id path/to/file.csvagain, and it works fine. So the UI must be doing something that the command line version isn't. any ideas?
Thanks
Comment #70
4alldigital commentedmy #69 issue resolved.
FYI - it was my own fault. I was not specifying the file correctly in the drush command.
is what I should have been calling.
Comment #71
andypostNew patch for D7, addresses mostly all #63
1) normalized parameter names
2) fixes camel-case
@twistor we should beware functions that does not respect locale, we have drupal_realpath() to work with miltibyte filenames - as php.net tells "pathinfo() is locale aware"
I suppose we should commit this to D7 as is and try to figure out a generic way in follow-up
@joe701 don't change status! Fixed ~ commited
Comment #72
markabur commentedCool, the feeds-import command is exactly what I was looking for. I noticed a few small things:
How about:
if (empty($feed_name))(There are two occurrences of this)The feed will be cleared regardless of its schedule.
The feed will be refreshed regardless of its schedule.
This would fit on one line.
If the file doesn't exist...
Comment #73
markabur commentedAlso, I'm confused by this behavior:
+ // If the file already exists, create it in database if necessary and
+ // reuse it.
+ if (file_exists($filename_schema)) {
I'm needing to re-import the same filename as before (this is a log file and now has more entries that I need to import), but this code is re-using the old locally-cached copy of the file. To me it would make more sense to assume the local file always needs to be replaced (otherwise why re-import it in the first place?) and delete it rather than re-use it. Or should I just delete the file manually whenever I'm done importing it? Or maybe the local file should be reused only if the remote file isn't newer?
Comment #74
mikran commentedfeeds-list command description says
+ 'description' => "Displays all importers or displays the config of a given importer (passed as arg).",but args are not accepted
Comment #75
-enzo- commentedHello Guys
I did a personal implementation for Enable to import a individual feed via drush at http://drupal.org/node/1865210 with sandbox https://drupal.org/sandbox/enzo/1865202.
Please check if this is the same direction and if we can merge or let me know if this idea is a subset of current development.
Comment #76
andypost@-enzo- can you roll a patch against current 7.x version of feeds?
Comment #77
jurriaanroelofs commentedI added some code that allows you to override any configuration in JSON format. Not the prettiest solution but it beats writing a ton of glue code.
Example usage:
(don't forget the single quotes around the JSON string)
To see what configuration is available add the --verbose modifier to the feeds-import command and it will spit out the complete configuration in JSON which you can then edit and feed back to the override parameter. It will output something like this:
Comment #78
wutu commentedI get this result:
drush feeds-import --verbose import_zbozi --file=/var/www/html/importISS/zbozi.xml
Initialized Drupal 7.22 root directory at /var/www/html/atok [notice]
Initialized Drupal site default at sites/default [notice]
Undefined index: allowed_extensions feeds.drush.inc:158 [notice]
Undefined index: allowed_extensions feeds.drush.inc:160 [notice]
Only files with the following extensions are allowed: .
Command dispatch complete
Comment #79
queenvictoria commentedPatch in #71 doesn't include the fix in #57 node_get_types -> node_type_get_types for D7
Comment #80
queenvictoria commentedHere is a patch and interdiff that addresses all the issues in #72, #74 and #79.
I've only done 7.x-2.x -- apologies I can't test D6 at the moment.
Comment #81
queenvictoria commentedThis is a patch (and Interdiff) based on the patch in #80. It is an attempt to work in the feature that @-enzo- was working on. It adds two new drush methods.
$ drush feeds-import-all [feed-name]This imports all feeds importer instances (!?!). If a feed-name is specified then it only imports feeds in that importer. This is because my feeds are attached to a content type and feeds-import requires a feed-nid in order to import anything at all. With this command I can import everything all at once.
$ drush feeds-list-all [feed-name]This lists (in a style similar to @-enzo- sandbox) all the feed importer instances (!?!?). I've added this because its a useful input if you are trying to feeds-import a single feed attached to a content type.
There are some issues.
The language. feeds importer instances. What is the correct term for these?
The method names. feeds-list-all overloads the feeds-list meaning and shows very different output. Should they be blended together? Should they be further separated?
The query. I've been looking high and low for the equivalent of feeds_importer_load_all for feed importer instances. In the end I've used @-enzo- db_query. There must be a better alternative using feeds-native calls.
Comment #82
queenvictoria commentedWell that one wasn't so good. Looks like calling feeds_drush_import directly causes some sort of recursion. I've switched to @-enzo- method. Patch and interdiff attached. I've also added a --limit argument so you can just import 100 at a time (for example).
Comment #84
queenvictoria commentedOops. Limit cannot be zero.
Comment #85
micromegas commentedI got the following error testing with #84:
When this occurs, the attempted-uploaded file is not present in sites/default/files/feeds.
The situation is that the content is created by importing one CSV file, and then a second CSV file is imported to update the changes in that content--in this case, it is a single user reference field. The changed file uploads fine through the import page, but does not when using
drush feeds-import.This importer was initially url-based, and then I later changed it to file-based for testing... in case that has to do with anything.
Comment #86
j0rd commentedI've posted another patch against enzo's drush_feeds_import. It now allows you to import all {feed_source}.id's or individual {feed_source}.id + {feed_source}.feed_nid
His modules been working very well for me, and is quite simple. You can apply this patch off his latest -dev.
https://drupal.org/node/1888356#comment-7661871
Comment #87
mparker17I needed the ability to import a file on a remote site from a local computer, so I added a
--stdinparameter tofeeds-importso I could rundrush @myremotesite feeds-import importer_name --stdin < some_file.csv.Comment #88
kenorb commentedTesting by using XML files (feeds_xpathparser):
It seems that drush_get_option('file') returns 1
No watchdog errors, temp dir has the right permissions as it works in UI.
Maybe I'm doing it wrong.
Other commands are working like:
Tested with 7.x-2.0-alpha8 and dev version.
Comment #89
mparker17@kenorb, the --stdin flag causes a temporary file to be created in a
feedsfolder in the private file system path (it saves the file to'private://feeds/drush_feeds_' . $feed_name . '_' . time()).Ensure that a private file system path is set and writeable by Drupal and Drush by going to
/admin/config/media/file-system.If that still doesn't work, try running the same command with the
-vargument to make Drush run in verbose mode. You'll see a message sayingSaved !num bytes to !filename.. Check that the file exists and contains the contents of your file.If the file isn't there, Drupal and Drush probably cannot write to that path.
If the file is there but is empty, either something went wrong reading from STDIN, sending the file to the Drupal installation, or there's a filesystem permissions problem somewhere.
If the file is there and has the contents of the original file, something else went wrong. Try going to
/import/page_import_xml... the file should be pre-populated. Can you import it there?If there are any relevant watchdog logs or terminal output, please feel free to send them to me using my contact form. Let's not clutter up the main issue with my (potentially) buggy code if we can. :P
Comment #90
kenorb commentedIn UI private folder is set to: sites/default/protected
The temp file is created as /tmp/fileIp1jba and then I believe is copied to /private/tmp/fileZnF4mY and then it's copied to the final destination to:
sites/default/protected/feeds/drush_feeds_page_import_xml_1376335061
Anyway, the permissions are ok, but feeds directory doesn't exist, probably that's the reason.
So regarding stdin, probably the creation of dir is necessary:
before doing file_save_data().
Tested and it works.
Comment #91
mparker17@kenorb: Looks good to me! Thanks very much! I'll re-roll the patch.
Comment #92
mparker17Re-rolled.
Comment #93
kenorb commentedRegarding:
Drush actually requires --file=FILE, then it works.
Tested using file and stdin and it works fine.
Comment #94
kenorb commentedAlso one question.
When importing the file with the same name (after changing it), it seems that it's ignoring the changed file and importing the existing one (from db).
It's expected behaviour?
Comment #95
twistor commentedThe $feed_name variable, and all references in help text needs to be changed to $importer.
drush_feeds_import() is a giant mess and needs to be re-factored.
Comment #96
jmichelgarcia commented92: feeds-drush_integration-608408-92.patch queued for re-testing.
Comment #97
mparker17Still passes! :D
Comment #98
johnennew commentedThis is working great - thanks for all this work!
Just one comment,
--file option on feeds-import fails if you pass a directory into it, but this is valid in the interface where you might import every feed present in a directory.
Comment #99
kenorb commentedPlease change it to appropriate status (Needs work/Needs review).
Comment #100
dbassendine commentedThanks for this, it will be great to have these tools available in drush! The main commands all work well for me with a standalone importer - I haven't tested separate instances of the same importer.
However I did find when I set the --http option for drush feeds-import, I received a cURL timeout error:
This is a known good URL, as it's the default URL used in the importer anyway.
Comment #101
megachrizAccording to comment #95, this issue needs work.
Comment #102
agnese.stelce commentedAdded small change to patch on #92 - set the Feeds file folder to value stored on configuration not use hardcoded value.
Comment #103
alan-ps commentedSeems that import with --stdin parameter doesn't work correctly. The file will saved without extension (ie. drush_feeds_manual_page_import_1418654544). So, we can add extension as the --stdin value and save with file extension.
For example: drush feeds-import test --stdin=xml < test.xml
Tested using stdin and it works fine.
Comment #104
joelpittetMarking as needs review.
Comment #105
mglamanTested, and this works. We needed a way to clear and import feeds via Drush. This is a huge need for the module since Feeds handle a lot of data and most cases requiring command line operations.
Comment #106
goldii commented#103 tested and works fine! Thanks!
Comment #107
b-prod commented#103 seems to work fine with standalone importers. Not tested with importers attached to nodes.
Comment #108
forestmars commented+1 RTBC, let's get this in!
Comment #109
twistor commentedThe stdin, http, and file arguments aren't going in this patch. It's too much and a very wrong approach. We can create a follow-up.
No interdiff since most lines have changed.
Comment #110
twistor commentedMinor fixes.
Comment #111
dillix commentedtwistor, how can I run import of local csv file with this patch? Can you write an example, so I will test your patch on large data...
Comment #112
twistor commentedYou can't at the moment.
Fine, we'll add that ability, but it needs to be done in a simple way for all configurations.
Comment #113
twistor commented@dillix, to import a csv file, you can do
drush feeds-import my_importer --nid 10 --file=path/to/file
Comment #114
twistor commentedComment #115
twistor commentedComment #116
andypostLooks great! needs to be tested manually
Comment #117
ysaid commentedHello @twistor,
I tried to import a csv file using drush feeds-import my_importer --file=path/to/file.
if nid option is not used, script get stuck in l.227 exception anyway. I think it's better to include the try/catch in a drush_get_option('nid') condition. What do you think ?
Comment #118
laVera commentedHello:
I also run it with no nid (my feed came from features), it does actually do the import, but the process is never closed. My work around is to add a kill in the command, it not the cleanes, but does the work:
$drush feeds-import my_importer --file=path/to/file.csv & sleep 60 ; kill $!But my huge issue, is that drush don't recognize the feed until I have used once trough web UI (The feed my_importer: 0 does not exist.). This defeat most of the propose of implementing drush for feeds, be able to automatice. It should have something to do that my feeds importers came from features?
Any ideas are welcome :)
Comment #119
johnny5th commentedThis patch seems to not run my hook_feeds_after_import code, but it DOES run my hook_feeds_after_parse code. It runs fine straight from the GUI.
I'm expecting a node to be deleted when a cancel date is 30 days older than today, but it doesn't get removed until I run the importer from the GUI.
Comment #120
csakiistvanI try to run code with #113 patch, but it was failed:
Comment #121
kebne commentedI have the same problem with #113 patch, seems like the method FeedsSource::pushImport () is missing in feeds.
Found that there's a patch to FeedsConfigurable.inc at https://www.drupal.org/node/2531858 which is supposed to add pushimport.
I could however not apply that patch with git. I'm using Feeds 7.x-2.0-beta1 which might be the culprit.
Comment #122
megachriz@csakiistvan, kebne
Try to apply the patch on the dev version of Feeds,
FeedsSource::pushImport()was added after the 7.x-2.0-beta1 release.Comment #123
cmarcera commentedUpdated Feeds yesterday and now my drush commands aren't working:
Feeds is definitely enabled.
So far I've:
None of those things have helped. I'm stumped.
Comment #124
eiriksmI am using this, so wanted to see if we could get this in. But I am going to be a little party spoiler here and set back to needs work.
These examples are incorrect. feeds-list-all does not exist
Here we are talking about importing, not refreshing
Nitpick. Should probably be "that will be cleared"
Also, all of the commands and their descriptions, types and so on should ideally be wrapped in dt for max multilingual profit :)
Thanks for this great work, this has really helped me. Let's get this to RTBC! :)
Edit: Sorry, the second comment is about the description in the import command, that was not so obvious from the my comment :)
Comment #125
mikran commentedMy importer is using file upload as source. Periodic import is set to 'Off' and when I run
drush feeds-import my_importer --file=path/to/my.csvthe import process is repeated indefinitely. I was expecting the file to be imported just once.Comment #126
controla commentedSame as mikran, feed is importing correctly using --file=csv but it doesn't stop at the end and starts adding nodes again.
Other than that non-minor bug, this thing is working perfectly, and is hopefuly going to save my neck.
Comment #127
grahamcLooping issue with --file=csv is not really part of the Drush integration, so I created a new issue #2624344: Import via pushImport() keeps looping / never completes (with possible fix)
Comment #128
kenorb commentedAdded suggestions from #124
Comment #129
kenorb commentedComment #130
jerry commentedPatch in #128 is working fine for me with 7.x-2.0-beta2.
Comment #131
johnny5th commentedPatch #128 seems to work on 7.x-2.0-beta2, though it updated all of my nodes when the GUI did not. After the second try, no new nodes were updated.
On my hook_feeds_after_import(), the SQL query I had timed out and threw a "gone away" error message. Feeds is processing a 30mb file taking 30 seconds to run, and my my.cnf settings were set to 30 seconds. After I upped my.cnf to 60 seconds, it ran fine. hook_feeds_after_import() was not having this issue from the GUI.
Comment #132
bès commentedHi, same problem than #118
drush feeds-import qa_importer --file=../ressources/QA.csvreturn "The feed qa_importer: 0 does not exist." until I do the import once via the UI.
An exception is throw during drush_feeds_import() when testing the existance.
$source = feeds_source($importer, $feed_nid)->existing();The FeedsSource existing method run this code :
The importer test is ok but the parent one fails with 'Object is not persistent.' exception probably because we don't use a node feed so FeedsSource consider our source as non persistent...
Anyway, this prevents any use via features.
Edit : maybe there is a link to : https://www.drupal.org/node/1197536
Comment #133
bès commentedComment #134
megachrizSo looking at a patch for this issue for the first time, here is my review:
The drush commands are working great so far! I found a few small bugs. Furthermore, the documentation, UX and texts could use some improvement.
Details
dt()should be used inhook_drush_command(). At least, I don't see other modules doing that.drush help [feeds-command]examples sometimes are listed with a zero, because there is no description for the example. All examples should have descriptions, in my opinion.feeds_drush_command()should match the function order. "feeds-delete" and "feeds-revert" come after "feeds-enable" and "feeds-disable" infeeds_drush_command(), but it is the other way around in the function definitions.I have worked on improving the help texts and the UX and on fixing the bugs that I had found. Since that resulted in a lot of changes, I'm posting these in a series of patches, so it is easier to follow. Hang on...
Comment #135
megachrizMoved
drush_feeds_delete()anddrush_feeds_revert()to match the command definitions order infeeds_drush_command()with the function order. Minor improvement.More to come...
Comment #136
megachrizRemoved
dt()from command definitions.Comment #137
megachrizFixing some bugs.
drush_log()needs a second parameter in order to show up on the command line.Comment #138
megachrizImproved documentation, added
drush_feeds_help()and improved messages.Comment #139
megachrizRenamed variable
$importerto$importer_idat some places. This way$importeralways represents an importer object and$importer_idalways an importer ID.Two more patches to come...
Comment #140
megachrizHere are some bigger changes to improve the UX.
Confirmation messages are added to
drush_feeds_import()anddrush_feeds_clear(). When a combination of importer ID and feed NID cannot be found, it tries to be helpful to guide the user in the right direction. For example, when someone specifies a feed NID and the importer is not attached to a content type (so the importer does not work with feed nodes) it warns the user about that. The other way around it goes similar: when the importer is attached to a content type, the user is being asked to specify a feed NID.In
drush_feeds_delete(), if an importer cannot be deleted because it only exists in code, the user is informed about that is the reason that the importer was not deleted.In
drush_feeds_revert(), a more clear reason is given when an importer cannot be reverted. Because it is not overridden or because it does not exist in code.When the commands for enable, disable, delete and revert are executed without an importer specified, a warning is shown that the user should specify that.
Comment #141
megachrizAnd finally, some command aliases!
For
feeds-importandfeeds-import-allI was thinking of using the aliasesfiandfia, but I'm not sure if that would collide with drush commands defined by other modules. According to http://drushcommands.com/drush-7x/, at least the module Features does not use these aliases.This was the last patch of the series ;).
Comment #142
megachrizComment #143
kenorb commentedComment #144
megachrizI've been testing the Drush patch some more and implemented the following improvements:
Previously, when supplying a file to import, the full absolute path to the file must be specified.
I improved this by making it also possible to specify the file relative to the current directory. A function called
_drush_feeds_find_file()is added for this. The drush functiondrush_cwd()is used for this solution. See diff 1.I noticed that it wasn't possible to import from a file or url for an importer for which no feeds_source entry exist yet. This made its use limited: it wasn't possible to enable a feature module that contained an importer and immediately import content using that importer.
I improved this in the following way: if no existing feeds source is found for an importer, the importer is not attached to a content type and either a file, url of "stdin" is specified, a new feeds source is created and saved. See diff 2-3.
When trying to initiate an import for an importer using the standalone form and for which no source exist yet the following error message was shown:
I thought this error message made no sense for an importer not attached to a content type and changed it to this:
See diff 2-3.
When an importer is disabled, 'drush feeds-import' and 'drush feeds-clear' reported that the importer in question did not exist. I changed this message to the following:
See diff 4.
When listing the available feeds using the "drush feeds-list-feeds" command, I thought it would be useful to also display the titles of feed nodes, rather than only the feed node ID's. I adjusted this. See diff 5.
Comment #145
megachrizI made some more improvements:
Comment #146
g33kg1rl commentedThis is exactly what I needed. I am using this patch now. :) I will let you know if I run into any issues, but so far so good! +1
Comment #147
megachrizI noticed the menu's weren't updated after enabling a Feeds importer using
drush feeds-enable. This resulted into a 404 page when navigating to the importer's log page.New patch.
Comment #148
jm_drupal commentedPatch in #147 is working fine. Thanks MegaChriz. The drush command 'feeds-import' asks for confirmation, is it possible to add another option 'force' or sth similar to bypass this confirmation step? I require to run the drush command as a cron job included in my Crontab file.
Comment #149
nwom commented@jm_drupal I run it as a cron job too, and just added "-y" to the end of the drush command. I hope this helps!
Comment #150
jm_drupal commented@NWOM Thanks.
Comment #151
tedwyer commentedThis is probably a dumb question but how do I apply this patch. It seems to patch a file that doesn't exist in my install. Thanks
Comment #152
nwom commented@tedwyer: On a module's project page click on "Version control" to see how to download the newest dev via GIT. In this case the page is https://www.drupal.org/project/feeds/git-instructions. The branch you want to download is 7.x-2.x. It describes how to apply a patch as well.
This patch does add a new file indicated by "new file mode".
Comment #153
megachrizFound one more bug: when supplying a file or url for an standalone importer for which no source exist yet, a source gets created, but the import is aborted anyway.
This issue has been open for long enough and I found the patch to be quite stable during the last 4 weeks (with only a few minor bugs found). So if the test results turn green, this patch will get in today!
Comment #155
megachrizCommitted #153.
Thanks all!
Finally this issue is done after six years, eight months and seven days!