swish in UTF-8 environment: input/output conversion
schildi - October 7, 2007 - 10:46
| Project: | Swish-E Indexer |
| Version: | 5.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
When running in a UTF-8 environment, there are currently problems handling special characters such as the German umlauts and ß (ä, ö, ü, ß).
During indexing, the text is now converted to UTF-8 before it is stored in the database, if it is not already in that encoding.
When searching, the keywords are converted from UTF-8 to ISO-8859-1 (regardless of the input character set; this code needs work!).
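As a rough illustration of the two conversions described above (this is not the code from the attached patch), a minimal PHP sketch using the mbstring functions might look like the following. The function names and the ISO-8859-1 fallback for input that is not valid UTF-8 are assumptions made for the example.

```php
<?php
// Hypothetical helpers for illustration only; names are not from the patch.

// Indexing side: return the text as UTF-8, converting only when the
// input is not already valid UTF-8. The source charset is an assumption.
function example_to_utf8($text, $assumed_source = 'ISO-8859-1') {
  if (mb_check_encoding($text, 'UTF-8')) {
    return $text;
  }
  return mb_convert_encoding($text, 'UTF-8', $assumed_source);
}

// Search side: convert UTF-8 keywords to ISO-8859-1 before they are
// handed to the search backend.
function example_keywords_to_latin1($keywords) {
  return mb_convert_encoding($keywords, 'ISO-8859-1', 'UTF-8');
}
```

Note that mb_check_encoding() only reports whether the bytes are valid UTF-8; it cannot tell which single-byte charset the text came from, so the fallback charset remains a guess.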
Regards
Schildi
| Attachment | Size |
|---|---|
| swish-UTF8-environment.patch | 1.19 KB |

#1
Great contribution. German characters now appear.
#2
Could you test this in different environments (UTF-8, ISO-???)?
Regards
P.S.
I had to disable the function swish_update_index:
// An experiment
function swish_update_index() {
  return;
since it overwrites the results of the core search functionality.
#3
swish_update_index() is a function that has been removed from the module since the 4.7 release, I believe. I tested your patch against some items, but I will try to get to more thorough testing later, since we need to test on Mac, Linux, and Windows with PPT, PDF, DOC, XLS, and RTF files. I also only have English documents, so I will need to find some German and other UTF-8 documents. If you have any of those, or are interested in testing the latest 5.x release, that would definitely be really helpful.
Also, there is a 6.x version in development!
#4
I will check this!
#5
Automatically closed -- issue fixed for two weeks with no activity.