Links

Sandbox: https://www.drupal.org/sandbox/gh640/2240267

Git clone: git clone --branch 7.x-1.x http://git.drupal.org/sandbox/gh640/2240267.git mecabsearch

Automated review: http://pareview.sh/pareview/httpgitdrupalorgsandboxgh6402240267git

Smooth Review: Quick tips checklist

Problem/Motivation

Using Drupal core's Search module to index Japanese text is problematic because the text structure of Japanese is different from English: notably, words in Japanese are not separated by spaces, which results in the Search module's database tables becoming huge and inefficient.

This module helps Drupal core's Search module to index Japanese text by using the MeCab morphological analysis library through the php-mecab binding to preprocess strings of Japanese text before the Search module's indexer runs. The preprocessor extracts words and inserts spaces between them.

Alternatives to MeCab include using Apache Solr, TSearch2, or other morphological analysis tools; but they aren't available on every server environment. But for some environments, only MeCab is available because it is relatively easy to install, which is why I would like to make this available for common use.

Proposed resolution

Give hgoto permission to promote mecabsearch to a full project.

Manual test notes

To make this module work correctly, you need to install MeCab and php-mecab.

To insall MeCab on Mac, please use Homebrew:

brew install mecab
brew install mecab-ipadic

(The first one is morphological analysis engine, and second one is dictionary.)

To install php-mecab on Mac, please follow the instructions on it's GitHub page:
https://github.com/rsky/php-mecab/blob/master/README.md#install

Comments

hgoto’s picture

Issue summary: View changes
hgoto’s picture

Assigned: hgoto » Unassigned

changed 'assigned to'.

PA robot’s picture

We are currently quite busy with all the project applications and we prefer projects with a review bonus. Please help reviewing and put yourself on the high priority list, then we will take a look at your project right away :-)

Also, you should get your friends, colleagues or other community members involved to review this application. Let them go through the review checklist and post a comment that sets this issue to "needs work" (they found some problems with the project) or "reviewed & tested by the community" (they found no major flaws).

I'm a robot and this is an automated message from Project Applications Scraper.

MattWithoos’s picture

Status: Needs review » Needs work

I would suggest updating the git clone link to something like:

git clone --branch 7.x-1.x http://git.drupal.org/sandbox/gh640/2240267.git mecabsearch

Your README.txt file must end in a single new line character. Also best practice dictates the mecabsearch.module file should have /** on the second, not the third, line. Just remove that line break between <?php and /**

Otherwise looking good.

znaeff’s picture

Hi.

1. Please remove next lines in mecabsearch.info file:
version = "7.x-alpha0"
core = "7.x"
project = "mecabsearch"
They will be added automatically later.

2. Move mecabsearch_admin_settings_form() to separated mecabsearch.admin.inc file.
So menu function will be:

function mecabsearch_menu() {
  // Define MeCab Search admin page.
  $items['admin/config/search/mecab'] = array(
    'title' => 'Mecab Search Settings',
    'page callback' => 'drupal_get_form',
    'page arguments' => array('mecabsearch_admin_settings_form'),
    'access arguments' => array('administer search'),
    'file' => 'mecabsearch.admin.inc',
    'description' => 'Configure MeCab search index settings.',
  );

  return $items;
}
hgoto’s picture

Hi, MattWithoos, znaeff,

I'm sorry to respond late though you kindly posted comments.

Thank you for your kind feedback.
Your advices are quite right and I would like to change the points you noted and submit the module again.

hgoto’s picture

Issue summary: View changes
hgoto’s picture

Status: Needs work » Needs review

Thank you for your kind review, MattWithoos, znaeff.

I've improved the code mainly in .module and .info file and added new file mecabsearch.admin.inc according to the review comment #4 by MattWithoos and #5 by znaeff.

I'm glad if someone review my module again. Thank you in advance.

PA robot’s picture

Status: Needs review » Needs work

There are some errors reported by automated review tools, did you already check them? See http://pareview.sh/pareview/httpgitdrupalorgsandboxgh6402240267git

I'm a robot and this is an automated message from Project Applications Scraper.

hgoto’s picture

Status: Needs work » Needs review

I fixed some points that `PA robot` pointed.

I would like someone to review my code again. Thank you in advance.

4d’s picture

Hi, I'm a Japanese Drupal user. Thanks for your module.

Tested with mecabed environment and Japanese content. It works well. (for instance, typical sentence 'すもももももももものうち') ;)

Well documented and simple, clean coding. Very nice work for every Japanese users.

hgoto’s picture

4d san, hello. I'm sorry to respond late.

Thank you for your time and kind comment! I'm happy to get such a feedback.

I'll work to make this module contribute one continuously.

gwprod’s picture

Fails automated testing:

http://pareview.sh/pareview/httpgitdrupalorgsandboxgh6402240267git

In addition, this module should detect whether MeCab is loaded and make it a requirement for enabling. To not do so would be unnecessarily confusing to your users, I believe.

gwprod’s picture

Status: Needs review » Needs work
hgoto’s picture

gwprod, thank you for your comment.

As you told me, I should fix code which breaks coding standard and add code which checks MeCab is loaded or not.
I fixed those points in the new commit.
Thank you.

hgoto’s picture

Status: Needs work » Needs review

I fixed code which doesn't follow Drupal coding standard. And add `hook_requirements()` to check MeCab library (which is must for this module) is properly loaded or not.

naveenvalecha’s picture

Issue summary: View changes

Updated issue summary.

mparker17’s picture

Assigned: Unassigned » mparker17
Category: Task » Feature request
Issue summary: View changes

Updating the issue summary and assigning to myself for manual review.

mparker17’s picture

Issue summary: View changes

Code review

Looks like the automated project reviewer reports no more problems.

Overall, the code looks great. The code is clean, well-documented, and easy to understand. I cannot see any security issues, and I cannot see anything that needs to be changed.

Some entirely optional improvements you could make:

  • In mecabsearch.module::mecabsearch_search_preprocess(), it might be good to log an error if MeCab Search is available but the mecab_split() function is not available.
  • You could move mecabsearch_admin_settings_form_submit() from mecabsearch.module to mecabsearch.admin.inc.
  • In mecabsearch.module::mecabsearch_admin_settings_form_submit(), you set the overlap_cjk variable (part of the core Search module) to disable it's Simple CJK Handling. It would be nice to either remind the user to re-set it (via drupal_set_messageI()), or restore it's original setting (by saving the old value to a variable, e.g.: mecabsearch_original_overlap_cjk) when the mecabsearch module is disabled (i.e.: in hook_disable()).

I've updated the issue summary with the results of the code review.

mparker17’s picture

Assigned: mparker17 » Unassigned

Automated Review

Done, no problems identified.

Manual Review

Individual user account
Yes
No duplication
Does not cause module duplication and/or fragmentation.
Master Branch
Follows the guidelines for master branch.
Licensing
Follows the licensing requirements.
3rd party assets/code
Follows the guidelines for 3rd party assets/code.
README.txt/README.md
Follows the guidelines for in-project documentation and/or the README Template.
The README could be expanded to include an introduction and configuration section; but that's not urgent.
Code long/complex enough for review
Follows the guidelines for project length and complexity.
Secure code
Meets the security requirements.
Coding style & Drupal API usage
Looks good to me.

This review uses the Project Application Review Template.

mparker17’s picture

I tried installing php-mecab for testing on my Debian 8 "Jesse" machine and I got "configure: error: wrong MeCab library version or lib not found. Check config.log for more information" errors that I couldn't figure out how to fix, so I was unfortunately unable to manually test this.

It appears that @4d was able to test in #11, but the code has changed since then. @4d if you are still able to test successfully with the latest code, then I think you can mark it as RTBC.

PA robot’s picture

Status: Needs review » Closed (duplicate)
Multiple Applications
It appears that there have been multiple project applications opened under your username:

Project 1: https://www.drupal.org/node/2501525

Project 2: https://www.drupal.org/node/2241231

As successful completion of the project application process results in the applicant being granted the 'Create Full Projects' permission, there is no need to take multiple applications through the process. Once the first application has been successfully approved, then the applicant can promote other projects without review. Because of this, posting multiple applications is not necessary, and results in additional workload for reviewers ... which in turn results in longer wait times for everyone in the queue. With this in mind, your secondary applications have been marked as 'closed(duplicate)', with only one application left open (chosen at random).

If you prefer that we proceed through this review process with a different application than the one which was left open, then feel free to close the 'open' application as a duplicate, and re-open one of the project applications which had been closed.

I'm a robot and this is an automated message from Project Applications Scraper.

avpaderno’s picture

Category: Feature request » Task
Related issues: +#2501525: [D7] Japan Postal Code