We're having lots of issues with files that used to be defined as audio, image, or video that are now being classified as 'undefined'. This requires integrating modules to both use hook_file_mimetype_mapping_alter() and hook_file_default_types_alter(). Now if we add a new file mimetype in core or things change we have to update the default and people that have altered the description or something else on the file type, no longer get the change.

I think it would be best for usability that the three main media file types use a wildcard selection of audio/*, image/* and video/*. I think this will give us a much better default config that admins don't have to go back and edit every time there is a new mime type.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Dave Reid’s picture

Priority: Normal » Major
Issue tags: +7.x-2.0 alpha blocker

Bumping to major. We shouldn't be requiring modules like media_youtube to be updating the file types themselves.

azinck’s picture

aaron’s picture

Status: Active » Needs work
FileSize
1.14 KB

here is a proof of concept.

slashrsm’s picture

Looks good. Attached patch adds submit validation.

bneil’s picture

Status: Needs work » Needs review
aaron’s picture

FileSize
3.19 KB

Thanks so much! This patch improves it by adding some sensible defaults.

slashrsm’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me. Ready to roll?

Dave Reid’s picture

Status: Reviewed & tested by the community » Needs work

I'm a little confused as to the usage of fnmatch, since I've never seen that function before. It appears that according to http://php.net/manual/en/function.fnmatch.php it's not available on Windows PHP until 5.3.0, so let's use preg for regex instead of fnmatch.

ParisLiakos’s picture

first time i saw this function here as well..nice job aaron digging this up:)

Dave Reid’s picture

Also if we're going to use a fnmatch() like pattern, then we should support all possible syntaxes that it can use, like *gr[ae]y!?

I would prefer we use something that checks for the characters *?[] in *any* part of the mimetype, and then do the following:

function file_entity_file_type($file) {
  $types = array();
  foreach (file_type_get_enabled_types() as $type) {
    if (in_array($file->filemime, $type->mimetypes)) {
      // The file mime type matches a file type exactly.
      $types[] = $type->type;
    }
    else {
      // Check to see if the file type contains wildcard mime types.
      $wildcards = array();
      foreach ($type->mimetypes as $mimetype) {
        if (strpbrk($mimetype, '*?[]')) {
          $wildcards[] = strtr(preg_quote($mimetype, '!'), array('\*' => '.*', '\?' => '.', '\[' => '[', '\]' => ']'));
        }
      }
      if (!empty($wildcards)) {
        $regex = '#' . implode('|', $wildcards) . '#';
        if (preg_match($regex, $file->filemime)) {
          $types[] = $type->type;
        }
      }
    }
    foreach ($type->mimetypes as $mimetype) {
      if (fnmatch($mimetype, $file->filemime)) {
        $types[] = $type->type;
      }
    }
  }
...

And we could add a @todo that this logic could be replaced by fnmatch() in Drupal 8.

slashrsm’s picture

I agree that we should support all possible syntaxes, but I am not sure about fnmatch(). Code that uses it looks MUCH simpler and more readable. Drupal 7 recommends PHP 5.3 anyway and most hosting providers are at least offering it nowadays.

ParisLiakos’s picture

well, i have seen many hosts with CentOs and php 5.2..but the thing here is that this applies for windows servers...i have no clue what happens there and what version the usually are in

slashrsm’s picture

True. Windows market share in server environments is very low. Dev envs should be even less problematic when it comes to upgrades.

Dave Reid’s picture

My point with the code is #10 is that I love the fnmatch syntax as well, but we cannot rely on it. Which is why I emulated the use of fnmatch using preg_replace, but the actual syntax is the same. We could make a file_entity_fnmatch() wrapper that calls fnmatch directly if it exists, or does the alternative code if necessary.

slashrsm’s picture

I may be stupid, but I really don't get why we cannot rely on fnmatch(). Alternative implementation sounds good, though. Maybe something like this (as proposed on php.net):

<?php
if(!function_exists('fnmatch')) {

    function fnmatch($pattern, $string) {
        return preg_match("#^".strtr(preg_quote($pattern, '#'), array('\*' => '.*', '\?' => '.'))."$#i", $string);
    } // end

} // end if
?>
Dave Reid’s picture

We cannot rely on fnmatch because the minimum requirement for Drupal 7 is PHP 5.2, and this function isn't available at all on Windows systems until PHP 5.3.0. We also shouldn't define PHP-namespaced functions like that. If anything we should just use a wrapper named file_entity_fnmatch() that calls fnmatch() if it exists, or does the backup logic. Also note that fnmatch is case sensitive by default, so we cannot use the 'i' modifier in preg_match().

aaron’s picture

Status: Needs work » Needs review
FileSize
3.83 KB

Thanks for the suggestions, here is a reroll with a wrapper named file_entity_fnmatch().

slashrsm’s picture

Status: Needs review » Needs work

Thanks aaron. I tested some patterns and I found that those do not work properly with fallback code (while they do with fnmatch()):

file_entity_fnmatch('im[ae]ge/*', 'image/png')
file_entity_fnmatch('im[ae]ge/pn[gf]', 'image/png')
file_entity_fnmatch('im[ae]ge/pn*', 'image/png')

The following patterns work correctly with both implementations:

file_entity_fnmatch('image/*', 'image/png')
file_entity_fnmatch('*', 'image/png')
file_entity_fnmatch('*/*', 'image/png')
Dave Reid’s picture

+++ b/file_entity.admin.incundefined
@@ -788,6 +788,20 @@ function file_entity_file_type_form_validate($form, &$form_state) {
+  $valid_wildcards = array();
+  foreach ($invalid_mimetypes as $mimetype) {
+    $parts = explode('/', $mimetype);
+    if ($parts[1] == '*') {
+      foreach ($valid_mimetypes as $valid) {
+        if (file_entity_fnmatch($mimetype, $valid)) {
+          $valid_wildcards[] = $mimetype;
+          break;
+        }
+      }
+    }
+  }
+  $invalid_mimetypes = array_diff($invalid_mimetypes, $valid_wildcards);
+

I feel like this logic and the one in file_entity_file_type() could use a helper function for 'does this mimetype match an array of possible mimetypes and mimetype wildcards'.

This should also be checking the other characters of ?[]. Also, the wildcard shouldn't be limited to only the second part of the mimetype. It should likely do a check like strpbrk($mimetype, '*?[]');

+++ b/file_entity.moduleundefined
@@ -1933,3 +1924,17 @@ function path_file_delete($file) {
+        return preg_match("#^".strtr(preg_quote($pattern, '#'), array('\*' => '.*', '\?' => '.'))."$#i", $string);

This needs to add support for '\[' => '[', '\]' => ']'

And it needs to remove the i modifier from the regex as fnmatch() is case-sensitive by default.

slashrsm’s picture

Status: Needs work » Needs review
FileSize
4.06 KB
3.21 KB

Comments from #19 taken into consideration + some whitespace fixes.

Status: Needs review » Needs work

The last submitted patch, wildcards_1911970_20.patch, failed testing.

slashrsm’s picture

Status: Needs work » Needs review
FileSize
899 bytes
4.94 KB

Let's see if we fixed those nasty tests....

aaron’s picture

Status: Needs review » Reviewed & tested by the community

This looks great to me.

Devin Carlson’s picture

Status: Reviewed & tested by the community » Fixed

The patch in #22 applied with an offset.

Looks good and I tested this in our Windows [read: only] environments with no issues.

Committed to 7.x-2.x. Great work everyone!

Dave Reid’s picture

Dave Reid’s picture

Dave Reid’s picture

Automatically closed -- issue fixed for 2 weeks with no activity.

Status: Closed (fixed) » Needs review

thynan queued 22: wildcards_1911970_22.patch for re-testing.

Status: Needs review » Needs work

The last submitted patch, 22: wildcards_1911970_22.patch, failed testing.

Devin Carlson’s picture

Issue summary: View changes
Status: Needs work » Closed (fixed)