You can now use the new ARC2/SPARQL RDF support.

How to:
Download semsol ARC2 from https://github.com/semsol/arc2/downloads
and move the files to /sites/all/libraries/ARC2/.
If the libraries and ARC2 folders do not exist yet, just create them.
Your folder should then look like this:

/sites/all/libraries/ARC2/ARC2.php
....

After that:
Install the File RDF module.
Disable the RDF module, but do not uninstall it or your data will be lost!

Go to /admin/settings/file/file_rdf
and run the converter script. This will copy your RDF data
to the new ARC2 tables.

If something goes wrong or it does not work, you can
switch back by enabling the RDF module again.
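
A minimal sketch for checking the folder layout described above before installing. It is plain PHP, does not use the Drupal API, and `arc2_library_status()` is a hypothetical helper name, not something File RDF provides. It assumes the script is run from the Drupal root. On a case-sensitive filesystem, a folder named `arc` or `arc2` would not match.

```php
<?php
// Hypothetical helper (not part of File RDF): reports whether ARC2.php
// sits at the path the instructions above expect.
function arc2_library_status($drupal_root, $library_dir = 'sites/all/libraries/ARC2') {
  $path = rtrim($drupal_root, '/') . '/' . $library_dir . '/ARC2.php';
  return array(
    'path' => $path,
    'found' => is_file($path),
  );
}

// Assumption: the current working directory is the Drupal root.
$status = arc2_library_status(getcwd());
echo $status['found']
  ? "ARC2 found at {$status['path']}\n"
  : "ARC2 missing, expected {$status['path']}\n";
```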

It would be nice to have the new File RDF module reviewed.

Comment  File                     Size     Author
#1       rdf_file_test_error.txt  45.4 KB  jvieille

Comments

jvieille’s picture


I tried it on a test site:
- an existing file shows no data in its properties, and the preview is gone
- uploading a new file fails; the generated error is attached
- admin/settings/file/file_rdf/database also reports an error:
user warning: Table 'shareontheweb.file_rdf_dbs' doesn't exist query: SELECT * FROM file_rdf_dbs in /home/shareontheweb/public_html/sites/all/modules/fileframework/modules/rdf/file_rdf.module on line 204.
The database does not seem to have been correctly updated. Running update does not help.

Could you clarify this:
- does the ARC2 library need to be in an ARC2 folder (will arc or arc2 work?) - the site status is always happy, whatever the ARC2 directory is
- the RDF module is a dependency for DAV, File relation server and File taxonomy server. The new File RDF module will definitely invalidate these modules

Hope this helps!

johanneshahn’s picture

Hi jvieille,
first of all, thanks for taking the time to test it.

Normally, before installing the file_rdf module, you have to
upload ARC2 into /sites/all/libraries/ARC2/

It should look like this:
/sites/all/libraries/ARC2/ARC2.php
/sites/all/libraries/ARC2/parsers/
......

Then the required database tables are created:
file_rdf_dbs
...

You can uninstall file_rdf completely and install it again.
After that, go to /admin/settings/file/file_rdf/database
There has to be a list with "ARC2 default Database | file_rdf ..."
After that, you have to copy the file metadata from the old RDF tables
to the new ones. Go to /admin/settings/file/file_rdf
and click the convert/copy button.

Does that help?

Quoting jvieille: "the RDF module is a dependency for DAV, File relation server, File taxonomy server. the new file RDF module will definitely invalidate these modules"

I have to check how these modules work together with RDF directly.
Normally fileframework has wrapper functions to work with both the file_rdf and rdf modules.
I think we won't need these modules in the future if fileframework itself can handle what "File taxonomy server" and
"File relation server" do.
#281298: Batch upload of multiple files
#1456426: Provide a file bulk download

jvieille’s picture

I tried again, without much success. The tables are not created and no ARC database shows up. Instead, I get this self-explanatory warning:

user warning: Table 'shareontheweb.file_rdf_dbs' doesn't exist query: SELECT * FROM file_rdf_dbs in /home/shareontheweb/public_html/sites/all/modules/fileframework/modules/rdf/file_rdf.module on line 204.

Also, when I uninstall and reinstall file_rdf, I get these errors each time:

warning: Illegal offset type in isset or empty in /home/shareontheweb/public_html/includes/bootstrap.inc on line 997.
warning: Illegal offset type in /home/shareontheweb/public_html/includes/bootstrap.inc on line 998.
warning: Illegal offset type in /home/shareontheweb/public_html/includes/bootstrap.inc on line 1002.

I hope this helps

johanneshahn’s picture

Hi jvieille,
sorry for the late answer, but I'm currently very busy at my
job.
Do you already have another / older ARC version installed in your project?
The new File RDF module looks for existing ARC classes before including
the one inside /libraries/ARC2
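
To see whether another copy of ARC2 has already been loaded, and from which file, something like the following could help. It is a diagnostic sketch in plain PHP. `loaded_class_file()` is a hypothetical helper name, and the assumption that the library's main class is called `ARC2` comes from the `ARC2.php` file name, not from the File RDF code.

```php
<?php
// Hypothetical diagnostic: returns the file that defined a class,
// or NULL if the class is not loaded (or is built into PHP).
function loaded_class_file($class_name) {
  // Second argument FALSE: do not trigger autoloading while checking.
  if (!class_exists($class_name, FALSE)) {
    return NULL;
  }
  $reflection = new ReflectionClass($class_name);
  $file = $reflection->getFileName();
  return $file === FALSE ? NULL : $file;
}

// Assumption: the ARC2 library defines a class named "ARC2".
$file = loaded_class_file('ARC2');
echo $file === NULL
  ? "No ARC2 class loaded yet\n"
  : "ARC2 already loaded from $file\n";
```

Run after the other modules have been included, this would show whether the copy in libraries/arc or the one bundled in a module won.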

jvieille’s picture

Don't apologize, this is not a blocker. I was late testing the feature myself.

Yes, I have two other ARC2 libraries...

one in libraries/arc
one in the RDF module - which is the one reported in admin/reports/status
I can delete the one in /arc

jvieille’s picture

Actually, the /arc library is required by RDF; it has to be in /libraries/arc
The one in the RDF module itself is probably a useless copy of the library that I made earlier...

Why don't you use the same library / directory - libraries/arc instead of libraries/ARC2?

Update: it seems that the RDF module needs the arc library inside the module itself

jvieille’s picture

I am back to test this new feature, because I have a performance problem that seems to be related to the RDF module.
Every page with file views processes about 6000 queries, most of them RDF... (about 1000 files in the bitcache)

I finally succeeded with this module. To do so:

1) Disable the RDF module and save the modules page
2) Enable File RDF and save again

I had done the two things at once, and this was the problem.
You might want to update the explanation above.

It seems that this removed a huge roadblock on my site... A page that took 30 seconds to load now takes 6.
Thanks!

elaman’s picture

1. Downloaded the latest dev version.
2. Downloaded ARC2 and placed it in the sites/all/libraries/ARC2 folder.
3. Disabled the RDF module and all dependent modules.
4. Enabled File RDF.
5. Converted the old RDF data to the new ARC2 tables. About ~12000 records with no problems.
6. Enabled the disabled modules except RDF.
7. Disabled File RDF.

Maybe we need the following to complete this (numbers refer to the steps above):
1. On a fresh install, the RDF module is required. We need to implement our own rdf_create_repository function.
2-4. Move the requirements into .install
- Require the RDF module to be disabled, or disable it automatically.
- Require ARC2 to be in place, or download it automatically if drush en is used.
5. Add a success message.
6. The DAV API, DAV file system and File taxonomy server modules still require RDF.

jvieille’s picture

I would be more explicit:
"Once the File RDF module is operating, the file repository is no longer compatible with the database API, DAV file system, File taxonomy and File relation server modules, i.e. files uploaded after the change will never be accessible to these modules / functions, even if the RDF module is enabled again and File RDF is disabled. If the RDF module is uninstalled, these modules will never see any files again."

This is why I reactivated this issue, which is becoming critical:
http://drupal.org/node/281298#comment-6999974

jvieille’s picture

#8 is incorrect.
The right process:
1. Download the latest dev version.
2. Download ARC2 and place it in the sites/all/libraries/ARC2 folder.
3. Disable the RDF module (and all dependent modules: they won't work anymore).
4. Save the modules page (do not disable RDF and enable File RDF in the same save).
5. Enable File RDF.
6. Convert the old RDF data to the new ARC2 tables at admin/settings/file/file_rdf. About ~12000 records with no problems. Takes a while.
7. Enable the disabled modules except RDF.
8. Disable File RDF.

What about finishing this feature? The old RDF approach is not scalable.

jvieille’s picture

Back on this issue.
I had to switch to this new rdf_file module to address a serious performance issue when file information was used in a displayed View.

But increasingly I was facing another serious performance concern with File Framework.
In the meantime, my application had grown, with a bigger database (400 tables, 1.5 GB) and a large installed code base (415 modules).

The issue was now with uploading a file. Before I decided to look closer at this issue, it took about 4 minutes to upload a very small file.

I disabled useless converters, which improved things a little bit, but an upload still took 2 to 3 minutes.
I used Devel to drill down into what exactly happened, and found a first culprit: the ARC2 library fires an "optimizeTables" function that ate up about half of this time. I replaced this function in ARC2_Store.php:

  function optimizeTables($level = 2) {
    if ($this->v('ignore_optimization')) return 1;
    return $this->processTables($level, 'optimize');
  }

with this one:

  function optimizeTables($level = 2) {
    // Skip ARC2's own table optimization and always report success.
    return 1;
  }

db_maintenance takes care of table optimization at cron, so I don't think this change will be a problem - I am just wondering why ARC2 does that job so intensively.

This cut the upload time by a factor of 2 to 3. XHProf reported:

Overall Summary
Total Incl. Wall Time (microsec): 85,145,734 microsecs
Total Incl. CPU (microsecs): 4,792,000 microsecs
Total Incl. MemUse (bytes): 145,132,120 bytes
Total Incl. PeakMemUse (bytes): 179,822,384 bytes
Number of Function Calls: 787,015

62 stream_get_contents calls take 63 seconds
1920 mysqli queries take 13 seconds

85 seconds is still unacceptable.

So I decided to reactivate the RDF module. I got these numbers for the same file:

Overall Summary
Total Incl. Wall Time (microsec): 31,163,134 microsecs
Total Incl. CPU (microsecs): 3,872,000 microsecs
Total Incl. MemUse (bytes): 140,091,000 bytes
Total Incl. PeakMemUse (bytes): 178,827,936 bytes
Number of Function Calls: 599,486

62 stream_get_contents calls take 18 seconds
2482 mysqli queries take 9 seconds

Finally, because this RDF feature seems to induce so many database INSERTs, I changed the following tables from InnoDB to MyISAM:
file_rdf_file_rdf_g2t
file_rdf_file_rdf_o2val
file_rdf_file_rdf_triple
file_rdf_file_rdf_s2val
Now the results match the old module's performance:
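
For anyone wanting to reproduce the engine switch, here is a small sketch that only prints the MySQL statements for the four tables listed above, so they can be reviewed before being run by hand. `engine_switch_statements()` is a hypothetical helper, and the table names assume no Drupal table prefix. Keep in mind that MyISAM has no transactions and locks whole tables on write, so this is a trade-off rather than a free win.

```php
<?php
// Tables named in the comment above (assumption: no table prefix).
$tables = array(
  'file_rdf_file_rdf_g2t',
  'file_rdf_file_rdf_o2val',
  'file_rdf_file_rdf_triple',
  'file_rdf_file_rdf_s2val',
);

// Hypothetical helper: builds one "ALTER TABLE ... ENGINE" statement per table.
function engine_switch_statements(array $tables, $engine = 'MyISAM') {
  $statements = array();
  foreach ($tables as $table) {
    $statements[] = sprintf('ALTER TABLE `%s` ENGINE = %s;', $table, $engine);
  }
  return $statements;
}

// Print the statements only; nothing is executed against the database.
echo implode("\n", engine_switch_statements($tables)), "\n";
```

Passing 'InnoDB' as the second argument produces the statements to switch back.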

Overall Summary
Total Incl. Wall Time (microsec): 29,243,029 microsecs
Total Incl. CPU (microsecs): 4,916,000 microsecs
Total Incl. MemUse (bytes): 140,554,264 bytes
Total Incl. PeakMemUse (bytes): 179,244,368 bytes
Number of Function Calls: 754,052

stream_get_contents calls take 18 seconds
1808 mysqli queries take 6 seconds
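
To make the three XHProf summaries above easier to compare, the wall times can be converted to seconds and expressed relative to the first run. This is only arithmetic on the numbers already posted; the run labels are my own shorthand.

```php
<?php
// Wall times in microseconds, copied from the three XHProf summaries above.
$runs = array(
  'after optimizeTables change' => 85145734,
  'RDF module reactivated' => 31163134,
  'after MyISAM switch' => 29243029,
);

// Convert microseconds to seconds.
function to_seconds($microseconds) {
  return $microseconds / 1000000;
}

// Print each run in seconds and as a fraction of the first run.
$baseline = $runs['after optimizeTables change'];
foreach ($runs as $label => $usec) {
  printf("%-30s %6.1f s  (%.2f of first run)\n", $label, to_seconds($usec), $usec / $baseline);
}
```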

OK, this is still bad, but I guess it is not directly the fault of File Framework (FF).

I am just wondering about the Drupal craziness - nearly a million function calls just to save a node is really puzzling...

gobinathm’s picture

Status: Needs review » Closed (outdated)

Closing the issue. It has been inactive for a long time and relates to a Drupal version that is no longer supported.