Well, I have tried to set up this module. I have uploaded a couple of files and indexed them, but they do not appear in the search results. I have checked the database: the fulltext fields are filled with text from the files. Does anyone know how to solve this problem?

Comments

joeBro’s picture

Same with me here.
The settings are all in place. Swish-e recognizes all plug-ins (or at least Drupal recognizes the plug-ins' paths), and the indexing seems to work (there are corresponding entries in the database), but a search returns no results. It may be worth noting that the "Begin Swish-E Indexing" checkbox stays unchecked after I check it and save the settings (I get no error message, only an empty bullet point and a "Your Swish-E settings has been saved" statement).

(see also http://drupal.org/node/163518)

berlinonline2’s picture

Hi Marcin, Hi joeBro,

I think I have a solution for your problem if you are using the Swish-e indexer on a Windows system. I had the same issue, and after an hour I found out that there was a problem with the slash/backslash handling in the code. Try changing the following lines in the files mentioned below. That worked fine for me. Good luck.

In File: swish.integration.inc

Line 13 find
$file_path = getcwd() .'/' . file_directory_path();
and replace with: 
$file_path = getcwd() .'\\' . file_directory_path();

Line 59 find
$swish_indx_cmd .= " -f $file_path/my_swish_index"; // save the index to the files directory
and replace with:
$swish_indx_cmd .= " -f $file_path\my_swish_index"; // save the index to the files directory

Line 60 find
exec (escapeshellcmd($swish_indx_cmd), $results, $rv);
and replace with:
exec ($swish_indx_cmd, $results, $rv);

In File: swish.module

Line 61 find
$swish_index = getcwd() .'/' . file_directory_path().'/'. 'my_swish_index';
and replace with:
$swish_index = getcwd() .'\\' . file_directory_path().'\\'. 'my_swish_index';

Line 78 find
$swish_command = variable_get("swish_path","/usr/local/bin/swish-e") . escapeshellcmd(" -m 50 -f $swish_index  -w ").$words;      
and replace with:
$swish_command = variable_get("swish_path","/usr/local/bin/swish-e") . " -m 50 -f $swish_index  -w ".$words;      

After applying the changes, go to the config menu for Swish-e and try "Begin Swish-E Indexing" again. After that, run cron.php and then try a search. In the advanced search menu you will find a tab called "files", and there you can search within the files.
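
As an alternative to hard-coding backslashes and removing the escaping altogether, here is a minimal PHP sketch that builds the index path with DIRECTORY_SEPARATOR and quotes each argument with escapeshellarg(). The function names swish_example_index_path() and swish_example_search_command() are hypothetical and not part of the module. Only the my_swish_index file name and the -m, -f and -w options are taken from the lines quoted above. Treat it as an untested idea rather than a patch.

```php
<?php
// Hypothetical helpers for illustration; they are not part of the Swish-E module.

// Join the pieces of the index path and normalise both slash styles
// to the separator of the platform PHP is running on.
function swish_example_index_path($base_dir, $files_dir, $index_name = 'my_swish_index') {
  $path = implode(DIRECTORY_SEPARATOR, array($base_dir, $files_dir, $index_name));
  return str_replace(array('/', '\\'), DIRECTORY_SEPARATOR, $path);
}

// Build the search command. The index path and the search words are quoted
// individually, so the command string as a whole does not need escapeshellcmd().
// The executable path itself is left unquoted (an assumption: it contains no spaces).
function swish_example_search_command($swish_path, $index_file, $words) {
  return $swish_path
    . ' -m 50 -f ' . escapeshellarg($index_file)
    . ' -w ' . escapeshellarg($words);
}
```

Inside the module this could be called as swish_example_index_path(getcwd(), file_directory_path()), with the result passed to swish_example_search_command() together with the configured swish_path and the search words.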

nrasmus’s picture

I'm having the same issue here--and I'm on Debian Etch, so I don't think the slash issue is at play here . . .

nrasmus’s picture

Just checking in about this--I have a multisite install, and am experiencing the same behavior on several sites. For each site where swish is enabled, it looks like each cron run is creating a new swishstring file in its respective tmp directory. Indexing via command line works, but nothing is getting indexed for the particular site. Any ideas?

Miszel’s picture

Hi nrasmus

I have given up using the Swish-e indexer and installed this module: http://drupal.org/project/search_attachments
It works all right for me.

For my new project, a multisite, I am trying to use Xapian: http://www.trellon.com/blog/xapian-search-drupal
It looks promising. I have installed it and indexed all my nodes. However I have not tried to index any external files.

geme4472’s picture

In the common.conf file, set reporting to 4, run the on-the-fly indexing, and see if there are troubles on the indexing side of things.

IndexReport 4
ParserWarnLevel 4

You could also rip open the index file and check that there are words in there. It is semi-human-readable.

If everything still looks fine, the issue is on the search side. There's always the chance that Apache doesn't have the rights to run swish, which may be why you can search from the CLI and get results.
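
One way to test the permissions theory is to run the binary from a PHP page served by Apache, so that it executes as the web server user, and look at the exit code. Below is a minimal sketch for a Unix-like host. The function name swish_example_can_run() is hypothetical, and the default -V argument assumes your swish-e build prints its version with that switch.

```php
<?php
// Hypothetical diagnostic helper; it is not part of the Swish-E module.
// Reports whether the current PHP user can see and execute a binary.
function swish_example_can_run($binary, $args = '-V') {
  $report = array(
    'exists' => file_exists($binary),
    'executable' => is_executable($binary),
    'exit_code' => NULL,
    'output' => array(),
  );
  if ($report['executable']) {
    // 2>&1 folds error messages into the captured output lines.
    exec(escapeshellarg($binary) . ' ' . $args . ' 2>&1', $report['output'], $report['exit_code']);
  }
  return $report;
}
```

Calling print_r(swish_example_can_run('/usr/local/bin/swish-e')) from a page in the browser and again from the command line should show whether the two users get different results. The path here is only the module's default from the code quoted earlier in this thread.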

Swish-e is insanely fast, and, depending on your needs, might be a great option, so don't give up quite yet!

miiimooo’s picture

I get this - any ideas why it doesn't use the converters?

html.doc - Using DEFAULT (HTML2) parser - (no words indexed)
Email_Internet.doc - Using DEFAULT (HTML2) parser - (no words indexed)
dreamweaver front.doc - Using DEFAULT (HTML2) parser - (no words indexed)
PGP.doc - Using DEFAULT (HTML2) parser - (no words indexed)
dreamweaver.doc - Using DEFAULT (HTML2) parser - (no words indexed)

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 65 words alphabetically
Writing header ...
Writing index entries ...
Writing word text: ... Complete
Writing word hash: ... Complete
Writing word data: ... Complete
65 unique words indexed.
Sorting property: swishdocpath
Sorting property: swishtitle
Sorting property: swishdocsize
Sorting property: swishlastmodified
4 properties sorted.
8 files indexed. 601,911 total bytes. 70 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!

airliner’s picture

After editing the files as berlinonline2 suggested, I got this error:
Indexing aborted. err: IncludeConfigFile: requires one value

Any ideas?

populist’s picture

Priority: Critical » Normal