| Project: | Swish-E Indexer |
| Version: | 5.x-1.x-dev |
| Component: | Code |
| Category: | support request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | active |
Issue Summary
Hi,
When I (or cron.php) run an index, I see the following in the Drupal log:
Indexing Data Source: "File-System"
Indexing "/var/www/drupal5/files"
Checking dir "/var/www/drupal5/files"...
PBSProERS_71.pdf - Using DEFAULT (HTML2) parser - (no words indexed)
SCM-080114.doc - Using DEFAULT (HTML2) parser - (no words indexed)
PBSProUG_7.1.pdf - Using DEFAULT (HTML2) parser - (no words indexed)
KnowledgeTreeUserManua.pdf - Using DEFAULT (HTML2) parser - (no words indexed)
PBSProQS_7.1.pdf - Using DEFAULT (HTML2) parser - (no words indexed)
PBSProAG_7.1.pdf - Using DEFAULT (HTML2) parser - (no words indexed)
BiografX.pdf - Using DEFAULT (HTML2) parser - (no words indexed)
ADMEnsa.TXT - Using DEFAULT (HTML2) parser - (68 words)
DRAFT EPIX Pharmaceuticals AI License Agreement 31 Oct 2007.doc - Using DEFAULT (HTML2) parser - (no words indexed)
Software_inventory_2008-02-15-2.xls - Using DEFAULT (HTML2) parser - (no words indexed)
And the indexing is not completed.
And in the Apache error log:
Error: Couldn't open file '/var/www/drupal5/files/PBSProERS\_71\.pdf'
catdoc: No such file or directory
Error: Couldn't open file '/var/www/drupal5/files/PBSProUG\_7\.1\.pdf'
Error: Couldn't open file '/var/www/drupal5/files/KnowledgeTreeUserManua\.pdf'
Error: Couldn't open file '/var/www/drupal5/files/PBSProQS\_7\.1\.pdf'
Error: Couldn't open file '/var/www/drupal5/files/PBSProAG\_7\.1\.pdf'
Error: Couldn't open file '/var/www/drupal5/files/BiografX\.pdf'
catdoc: No such file or directory
/var/www/drupal5/files/Software\_inventory\_2008\-02\-15\-2\.xls: No such file or directory
Error: Couldn't open file 'files/KnowledgeTreeUserManual_2007-05-17.pdf'
I cannot find when and where those backslashes are added, or whether they are indeed the reason the files are not indexed.
Comments
#1
I suspect this is an issue with Swish-E escaping the files when it is pulling them into the search index.
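To illustrate how backslash escaping alone could produce errors like the ones in the Apache log, here is a minimal Python sketch. It is not the module's or Swish-E's actual code: `backslash_escape` is a hypothetical helper that only mimics the pattern seen in the log, and `shlex.split` stands in for POSIX shell tokenization. It shows that the same escaped name resolves to the real file name when left unquoted, but keeps its backslashes when it is also wrapped in single quotes.

```python
import re
import shlex

def backslash_escape(name):
    # Hypothetical escaper (an assumption, not taken from the module):
    # puts a backslash before '_', '.', and '-', which reproduces the
    # pattern seen in the Apache error log.
    return re.sub(r"([_.\-])", r"\\\1", name)

escaped = backslash_escape("PBSProERS_71.pdf")
print(escaped)           # PBSProERS\_71\.pdf

# Unquoted: POSIX-style tokenization consumes the backslashes,
# so the filter program would receive the real file name.
unquoted_args = shlex.split("pdftotext " + escaped + " -")
print(unquoted_args[1])  # PBSProERS_71.pdf

# Single-quoted: backslashes are kept literally, so the filter program
# would receive a name that does not exist on disk.
quoted_args = shlex.split("pdftotext '" + escaped + "' -")
print(quoted_args[1])    # PBSProERS\_71\.pdf
```

If something like the second case is happening here (the name is escaped and then also quoted, or escaped twice), the filter programs would be handed a path containing literal backslashes, which would match the "Couldn't open file" and "No such file or directory" messages above. Whether that is the actual cause would need to be confirmed in the code that builds the filter command.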
#2
I have the same escaping problem.
When I index the files, those whose names contain escaped characters cannot be opened, and I get this message:
catdoc: No such file or directory
catdoc: No such file or directory
pdftotext version 3.02
Copyright 1996-2007 Glyph & Cog, LLC
Usage: pdftotext [options] <PDF-file> [<text-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-layout : maintain original physical layout
-raw : keep strings in content stream order
-htmlmeta : generate a simple HTML file, including the meta information
-enc <string> : output text encoding name
-eol <string> : output end-of-line convention (unix, dos, or mac)
-nopgbrk : don't insert page breaks between pages
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
-q : don't print any messages or errors
-cfg <string> : configuration file to use in place of .xpdfrc
-v : print copyright and version info
-h : print usage information
-help : print usage information
--help : print usage information
-? : print usage information
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
catdoc: No such file or directory
I have no problem with files whose names do not need escaping.