Project: Import HTML
Version: master
Component: Code
Category: feature request
Priority: normal
Assigned: Unassigned
Status: closed (fixed)

Issue Summary

Is any work being done on this?

Comments

#1

Not yet.
I did try reading the docs for the import/export API, though, to see whether we can now start using it for the internals.
.. :(

.. I'll have to read it 2X more to grok it tho'!

#2

Status: active » postponed

Update.
Import/Export is WAY too heavy on the system for me to use for bulk imports: serious timeouts and maxed-out memory, because of the complex way the API holds so much stuff in memory.
This module already has timeout issues. With import/export I can only slurp a dozen pages at a time.
This is a killer, so unless and until I can port the functionality into a daemon process (like search indexing does), that's off the cards.
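
For anyone wondering what the "daemon process" idea might look like in practice, here is a minimal sketch, assuming the pending pages are listed one per line in a queue file and that something like cron calls the script repeatedly. The names `process_batch`, `import_one` and `BATCH_SIZE` are made up for illustration and are not part of Import HTML or the import/export API.

```shell
#!/bin/sh
# Hypothetical sketch: drain a queue of pending HTML files in small batches,
# one batch per run, so that no single run hits the timeout or memory limit.

BATCH_SIZE=${BATCH_SIZE:-12}   # "a dozen pages at a time"

# Placeholder for the real per-page import; here it only logs the path.
import_one() {
  echo "imported: $1"
}

# Process at most BATCH_SIZE lines from the queue file, then drop them from it.
# Note: pages are dropped even if import_one fails; a real version would
# want to record failures somewhere.
process_batch() {
  queue=$1
  [ -s "$queue" ] || return 0           # nothing left to do
  head -n "$BATCH_SIZE" "$queue" | while IFS= read -r page; do
    import_one "$page"
  done
  # Keep everything after the first BATCH_SIZE lines for the next run.
  tail -n "+$((BATCH_SIZE + 1))" "$queue" > "$queue.tmp"
  mv "$queue.tmp" "$queue"
}
```

Each call handles one small batch and leaves the rest of the queue on disk, which is roughly the trade-off search indexing makes: slower overall, but no single request has to hold everything in memory.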

As for a Drupal 5 port without the API, ... we'll see. Anyone is welcome to have a go.

#3

subscribing.

#4

I'm also interested in utilizing this module on Drupal 5.x. Could the maintainer please provide a "best estimate" of a time frame for Drupal 5 compatibility. I'd like to use this estimate to make a business decision between manual conversion and deferring until Import HTML for Drupal 5 is available. Thanks!

#5

I tried a port of one of my other modules last week, and it's not too painful.
Available options are :
- Wait around until it spontaneously happens (possibly by the end of the month)
- Offer an incentive of sorts and I'll be able to dedicate a day to it away from current paid work :) $300 would be about right.
- Hope somebody else has a go in the meantime
- Try using it with a 4.7 install and then 'upgrading' to 5.0 (no idea if that's a good idea or not)
- for less than 200 pages, I'd suggest just doing it by hand. Really.

.dan.

#6

Status: postponed » needs review

This patch is against the 4.7.x-dev branch. I tested typical functionality.

Biggest changes are:

  • I moved some of the menus around. Now all pages including settings are under admin/import_html .
  • I moved the shared logic in import_html_import_files_page() to _import_html_import_files() and factored out form specific logic into import_html_demo_form_submit().
  • I separated form builders from pages
  • other 5.x porting changes
  • I probably also fixed some bugs that were present in the existing code.

I really wanted to make path and menu dependencies, but I resisted the temptation.
Does anyone want to send me $300? :)

Attachment                     Size
import_html_47xto5x.patch      23.27 KB

#7

I had originally thought about making tabs on the menu, but I forgot to clean up that code. This patch has those changes cleaned up.

Attachment                     Size
import_html_47xto5x_1.patch    23.25 KB

#8

subscribe

#9

Ho-Kay...
I've created a 5.0 branch for this patch, although it had to be rooted at the October release of the code for the patch to work. That was the last version actually marked as a release rather than dev, so I guess that made sense :(

Thus we are missing the latest 4 features that have been added - taxonomy, CCK, nodewords, user support, plus many bugfixes.

I'll mess around to see if CVS is clever enough to actually merge it all, but somehow I doubt it, and it may take a lot of diffing to get things up to date.

In good news, however, the port DOES WORK in 5.0 (no serious testing, but the UI does its thing).
Thanks larrychu.

I think I can give you CVS on this branch if you are keen to take it on - the taxonomy additions are very useful.

Unfortunately the revisions between now and then look a bit of a mess. Reformatting, among other things, makes the diff look bigger than it really should be.

#10

Status: needs review » needs work

Took me a heck of a while, but I merged in all the changes I could see. A 5.0, branched from today's HEAD, will be rolled tonight.
There may well be a couple of bits broken, but as above, the UI works. Hope that's all there is to it.

Open for testing

#11

I should probably open another bug for this, but I noticed that I could only access the Import HTML pages correctly as User #1. I fixed this by doing a global search for "access admin" in import_html.module and replacing it with "access administration pages".
I didn't test this in 4.7, but only in the patched code I submitted.
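
In case it helps anyone repeat that fix, here is a rough sketch of the search and replace as a shell function. It assumes the permission string is always written in single quotes in the module file; `fix_permission` is just an illustrative name, and it is worth keeping a backup of the file first.

```shell
#!/bin/sh
# Hypothetical sketch of the search-and-replace described above.
# Matching the surrounding quotes keeps the edit from touching strings
# that already read 'access administration pages'.

fix_permission() {
  # $1: path to the module file (import_html.module in this thread)
  sed "s/'access admin'/'access administration pages'/g" "$1" > "$1.new" &&
    mv "$1.new" "$1"
}
```

Usage would be something like `fix_permission import_html.module`. Because the pattern includes the closing quote, running it a second time changes nothing.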

#12

The access restriction was sorta on purpose ... and it seems sorta unimplemented. It dates back a while :)

I thought I'd documented that running this process pretty much required every permission going (admin), as I couldn't imagine doing permissions checks at every single step.
Thus yeah, I assumed user 1.
It looks like the user_access('access admin') may have been a total red herring however, and never would have worked. And at one point I thought I'd try and create

<?php
// Declare the module's own permission (hook_perm).
function import_html_perm() {
  return array('access import_html');
}
?>

... but never followed through ! It was over a year ago now when I was first starting on modules :-/

So certainly an old bug.

... I may have done something wrong in the tagging with new release numbers, there doesn't seem to have been a 5.0 rollout from the CVS system. sigh

#13

i suck at cvs and i've had the issues.

process i go by thanks to dopry:

- cvs tag -b DRUPAL-5
- cvs tag DRUPAL-5-1--1-0 (whatever this exact version will be)

then in the project/whatever on drupal, you need to make a release for that tag.

thought i'd spill that in case you needed this info. i'm waiting on this module too now :)
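
A small dry-run sketch of the steps above, for checking the tag names before touching the repository. It only prints the commands. The function name `cvs_release_plan` is made up, and the `cvs update -r` step is my own assumption (it moves the working copy onto the new branch) rather than part of the process quoted above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the CVS commands for the steps
# listed above instead of running them. Tag names are passed in, not assumed.

cvs_release_plan() {
  branch=$1    # e.g. DRUPAL-5
  release=$2   # the exact release tag, whatever it ends up being
  echo "cvs tag -b $branch"      # create the branch tag from the working copy
  echo "cvs update -r $branch"   # assumption: switch the working copy to the branch
  echo "cvs tag $release"        # tag the release on that branch
}
```

For example, `cvs_release_plan DRUPAL-5 RELEASE_TAG` prints the three commands with those names filled in. The release itself still has to be created for that tag on the project page, as noted above.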

#14

:-/

My CVS graph shows that the DRUPAL-5 branch happened ...

but my release tag after that was DRUPAL-5--0-1 ... possibly not quite right !

I'll try again and see what happens.

Thanks a lot for the suggestion! I hung around on IRC for a few hours last night trying to get someone who knew...

#15

Ho-kay.
I didn't actually MAKE a release node. My Bad.
That used to be automatic, or irrelevant when there was only one branch, or something. It may roll tonight.

#16

Status: needs work » fixed

Long fixed. Time to clean up the queue.

#17

Status: fixed » closed (fixed)