To generate image styles, the origin (source) image must be on the server's disk.

Problem: I'm using Amazon S3 to store my site's images. I have a lot of images and don't want to keep them on the server because they are already on S3, so I deleted all images from the server. Then I added a new image style, and when I try to render an image with this new style I get a broken image. When I open the image link (for example: s3/files/styles/large_new/s3/test/myimage.jpg?itok=IPykZqaz) I get the error "Error generating image, missing source file.".


Comments

semjuel created an issue. See original summary.

mibstar’s picture

Category: Feature request » Bug report

I'm getting the same error, so I'm changing this to a bug report.

Using 8.x-3.0-alpha8 with Drupal set to download private files via Amazon S3. The source file exists in the S3 bucket.

mibstar’s picture

Also, and I'm not sure if this is related, but when I try to load the source file I get the following error at http://example.org/system/files/photo/2017-11/image.gif:

This page isn’t working

example.org unexpectedly closed the connection.
ERR_CONTENT_LENGTH_MISMATCH

The file exists on:
s3bucket/root/private/hd_photo/2017-11/image.gif

The image style url is:
http://example.org/system/files/styles/small/private/hd_photo/2017-11/im...

semjuel’s picture

To solve this issue you can try the attached patch. It works for me.

ArnaudDabouis’s picture

This patch is a workaround, but downloading the image to local storage defeats the purpose of this module. We should find something else.

ArnaudDabouis’s picture

The issue is on line 76 of s3fs/src/Controller/S3fsImageStyleDownloadController.php: file_exists() returns FALSE even when the file exists.

I will update the issue when I have time to dig further.

firewaller’s picture

Status: Active » Needs review
FileSize
1.47 KB

With PHP 7.0 on Ubuntu 17.04 and allow_url_fopen=On, file_exists() always returns false when checking a remote file via HTTP.

- http://php.net/manual/en/function.file-exists.php#121436

This is happening for me even on PHP 5.6.30, though. I also tried checking the absolute URL to make sure it wasn't an issue with the scheme registration:

if (!file_exists($image_uri) && !file_exists(file_create_url($image_uri))) {
  // ...
}

It seems to have something to do with accessing remote servers with file_exists(). I went ahead and used the attached patch (influenced by https://stackoverflow.com/a/10444151/2036095) to get it working for me. I'm sure we could make it cleaner, though (e.g. not checking the HTTP version, etc.).
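
For context, the approach in that Stack Overflow answer amounts to checking the remote file over HTTP instead of relying on file_exists(). A minimal sketch of that idea (not the attached patch itself, and not part of the s3fs API) could look like:

// Illustrative helper: check a remote file by its HTTP status line.
function remote_file_exists($url) {
  // get_headers() returns FALSE on failure; the first element is the
  // status line, e.g. "HTTP/1.1 200 OK".
  $headers = @get_headers($url);
  return $headers && strpos($headers[0], '200') !== FALSE;
}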

Status: Needs review » Needs work

The last submitted patch, 7: s3fs-error-generating-image-2915322-7.patch, failed testing. View results

jansete’s picture

Hi all,

Can anybody tell me the config and steps to reproduce?

Thanks!

scaldinglake’s picture

Reroll #7

As for settings, the only thing I have set in this module is the Montreal datacentre. I'm using settings.php for the access key/secret, and I have $settings['s3fs.use_s3_for_public'] = TRUE; set.

I'm using Media for fieldable entities, and Twig Tweak to generate URIs for the image styles in Twig via the image_style filter. Happy to answer any other questions.

Core: 8.5

jansete’s picture

Issue tags: +alpha target

emb03’s picture

I am getting this error on my local environment as well. The patch in #10 is not applying for me:

Executing command (CWD): git -C 'web/core' apply --check -v '-p2' '/var/folders/s7/qygp10sd0v72_zcc7hlp4zmr0000gp/T/5bae62397fd29.patch'
Checking patch Controller/S3fsImageStyleDownloadController.php...
error: Controller/S3fsImageStyleDownloadController.php: No such file or directory

Does this mean we can't use image styles until this gets fixed?

Update: Works after I changed the image thumbnail in the entity browser view. I was using the image style format in the view.

adrian1231’s picture

#7 worked for me

aguilarm’s picture

I thought I was experiencing this issue as well, because file_exists was returning false for source images.

Relevant environment setup:
PHP-FPM 7.2 behind nginx, use_s3_for_public = TRUE, and I'm using Google Cloud Storage rather than Amazon S3.

I copied my default/files directory to my bucket's s3fs-public folder manually (the provided copy-local is too slow for the number of files I have; gsutil rsync is very fast), which appears to be why this isn't working: copy-local must be entering every item into the s3fs_file table.

Making an HTTP request works (I get 200s back), but these things combined mask the real issue, from what I can tell.

When you switch use_s3_for_public to TRUE, this module replaces the public:// scheme's stream wrapper with the S3 stream wrapper (in s3fs/src/S3fsServiceProvider.php:25), which is some custom Drupal sugar on top of the PHP AWS S3 library.

Replacing the stream wrapper means file_exists() will use s3fs to find files with the public:// scheme, so it goes through S3fsStream->url_stat() (the function file_exists() ends up calling) to look up files.

The trouble appears when that eventually calls S3fsStream->getS3fsObject(). That function only ever reads from the cache to look up files. Setting s3fs.settings ignore_cache to TRUE forces real lookups and makes file_exists() behave as expected.

I assume drush s3fs-rc is supposed to fully populate the s3fs_file table with every file in my bucket (?), but when I run it, it doesn't look like it scans the s3fs-public folder. I have yet to dig into that.

S3fsStream->getS3fsObject() has this comment:

  /**
   * Try to fetch an object from the metadata cache.
   *
   * If that file isn't in the cache, we assume it doesn't exist.
   ...

So I may be missing something, but I think it's appropriate for S3fsStream->getS3fsObject() to attempt a cache lookup first and fall back to a direct metadata call that fills the cache with the result, no? That potentially means you'll be looking up files that don't exist directly against S3 a lot (whereas now it basically never does that), but it's still a 404 that isn't much more expensive than a 404 without the extra request, unless I'm mistaken.

I'm going to try patching this so it creates a cache entry if the metadata lookup was successful.

TL;DR: Setting s3fs.settings ignore_cache to TRUE will 'fix' this problem, but as things stand the cache has to be filled before files are looked up, from what I can tell.
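
For reference, a quick way to force that behaviour while testing is a standard configuration override in settings.php (the setting name is the one mentioned above):

// settings.php: bypass the s3fs metadata cache so every lookup hits S3.
$config['s3fs.settings']['ignore_cache'] = TRUE;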

aguilarm’s picture

Attaching a patch that will look up files that are not in the cache, and save them to the cache if found.

If ignore_cache is TRUE, it does not interact with the cache at all and does a direct lookup every time.

This way you won't have to add every file to the s3fs cache for the module to work; the cache will warm up as requests come in. In my case that is tens of thousands of files, so warming the cache up front would take a pretty long time. Initial lookups and 404s are a bit slower, but I think they should be.
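
A rough sketch of that fallback, reusing the readCache(), writeCache() and getS3Metadata() methods referenced in this thread (the config access and exact signatures are assumptions; this is not the committed patch):

protected function getS3fsObject($uri) {
  // Assumption: the stream wrapper exposes the module settings on $this->config.
  $cache_enabled = empty($this->config['ignore_cache']);

  // Try the metadata cache first, unless caching is ignored.
  $metadata = $cache_enabled ? $this->readCache($uri) : FALSE;

  if (!$metadata) {
    // Fall back to a direct lookup against the bucket.
    $metadata = $this->getS3Metadata($uri);
    if ($metadata && $cache_enabled) {
      // Warm the cache so the next request can skip the remote lookup.
      $this->writeCache($metadata);
    }
  }

  return $metadata;
}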

I also noticed the annotation for getS3Metadata() has an @throws, but it looks like it does not actually throw exceptions; it catches them itself and returns false instead of throwing, so I adjusted that.

mikeryan’s picture

We're seeing a related situation, with the error message "TypeError: Uncaught exception 'TypeError' with message 'Argument 1 passed to Drupal\s3fs\StreamWrapper\S3fsStream::writeCache() must be of the type array, boolean given'". This happens when getS3Metadata(), as aguilarm points out, returns FALSE on an S3Exception instead of letting the exception bubble up. Given that callers expect an array in all situations, rather than change the documentation to allow returning a boolean, it makes more sense to return []. Also, wouldn't it be helpful to log the error somewhere so there's a chance of knowing why it failed?
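
A minimal sketch of the shape being suggested, with the lookup itself stubbed out (getObjectMetadata() is an illustrative stand-in, not the module's real API):

protected function getS3Metadata($uri) {
  try {
    // Stand-in for the module's direct lookup against the bucket.
    return $this->getObjectMetadata($uri);
  }
  catch (S3fsException $e) {
    // Log why the lookup failed so there's a chance of diagnosing it, and
    // return an empty array so callers such as writeCache() still receive
    // an array rather than a boolean.
    \Drupal::logger('s3fs')->error('S3 metadata lookup failed for @uri: @message', [
      '@uri' => $uri,
      '@message' => $e->getMessage(),
    ]);
    return [];
  }
}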

mikeryan’s picture

  1. +++ b/src/StreamWrapper/S3fsStream.php
    @@ -1033,21 +1034,26 @@ class S3fsStream extends StreamWrapper implements StreamWrapperInterface {
    +    $metadata = false;
    

    Drupal style is FALSE.

  2. +++ b/src/StreamWrapper/S3fsStream.php
    @@ -1033,21 +1034,26 @@ class S3fsStream extends StreamWrapper implements StreamWrapperInterface {
    +        } catch (S3fsException $exception) {
    

    Drupal style: move catch to separate line from }.

  3. +++ b/src/StreamWrapper/S3fsStream.php
    @@ -1170,12 +1176,9 @@ class S3fsStream extends StreamWrapper implements StreamWrapperInterface {
    +   * @return array|bool
    +   *   An array of DB-compatible file metadata or false if lookup fails.
    

    As mentioned in my previous comment, we should return [] instead of FALSE on failure.

mikeryan’s picture

Just a minimal patch for us to include in our composer.json...

Anas_maw’s picture

Status: Needs work » Needs review

Patch in #15 worked for me, thanks

SuperfluousApostrophe’s picture

Patch in #15 solved my issues too.

Dinesh18’s picture

Patch #15 worked for me. I can't see the same code committed in the alpha8 version. Any idea when it will be added?

  • jansete committed f990c30 on 8.x-3.x authored by semjuel
    Issue #2915322 by mikeryan, semjuel, aguilarm, firewaller, scaldinglake...
jansete’s picture

Status: Needs review » Fixed

It's now in the dev branch, good work guys!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Dinesh18’s picture

When will it be added to the alpha8 version?

jrb’s picture

These changes caused the bug in #3151913-7: Can't upload file to S3 bucket on version 8.x-3.0-alpha15. It's a problem only when "Ignore the file metadata cache" is checked. I've added a patch there.

The specific problem is the change around checking the metadata: it now only calls $this->readCache($uri) if the cache is enabled.

if ($cache_enabled) {
  $metadata = $this->readCache($uri);
}

Once that conditional is removed, that other bug is fixed:

$metadata = $this->readCache($uri);

This is how it worked prior to these changes. Would this change break anything?

cmlara’s picture

As a follow-up,

Input welcomed in #3201248: Disallow realtime unkown file lookup to S3 -- Reverts #2915322

It appears this issue was originally opened because files may have been added to S3 without syncing the database metadata afterwards.

Why did we not just require the administrator to run a metadata sync, as documented in the README? Were we hitting issues with the batch limits / metadata imports that are being worked on in other tickets? (Seeing as the initial ticket was opened in 2017, I'm guessing yes.) Were we hitting the 255-character S3FS path limit? Why did we not solve the batch import concerns rather than allowing this cache 'warm-up' behaviour?

The more edge cases I come across that this could eventually be involved in, the more I feel this patch will hide faults rather than fix them, and it needs to be reverted.