Don't mess with files

One thing that's been causing me some issues recently is storing managed files and the side effects of changing their properties. Because Drupal records a number of these properties in the database, changing them can have unwanted effects.

Records for managed files are stored in the file_managed table. file_save saves or updates a file's record there, and file_load retrieves it.

When a file is loaded the $file object contains the following information:

  • fid - The unique identifier of the file (similar to a node ID or user ID);
  • uid - The user ID of the user who owns the file (usually the uploader);
  • filename - The name of the file without any paths;
  • uri - The path to where the file is stored in the filesystem;
  • filemime - The MIME type of the file;
  • filesize - The size of the file in bytes;
  • status - In the simplest terms 1 for permanent, 0 for temporary (removed at cron);
  • timestamp - A UNIX timestamp signifying when the file was added.

Whilst writing modules I've had to alter the uri, the filename and the filesize properties of the $file object. In doing so there were a few repercussions I didn't expect which I've had to resolve.

Why are you messing with the filesystem?

Ideally you shouldn't mess with files. Drupal does an excellent job of managing everything: updates, moves, copies, deletes, and even renaming a file should you have a duplicate.

But.

When another system that Drupal neither controls nor observes operates on your files, it is useful to be able to record those changes within Drupal. Programmatically creating files, or requesting and downloading them (Drupal 7 Module Development), requires such knowledge.

Why bother telling Drupal?

Letting the uri and the file's real location fall out of step will cause two major issues.

  • The user will not be able to download the file through Drupal;
  • Drupal will error out if the user attempts to save a file of the same name.

The uri is the path Drupal uses to find the file. If it no longer matches where the file is actually stored, then when the user clicks on a link to download the file Drupal will look in the place the uri tells it to, find nothing, and return a 404.

When Drupal is saving new files to the filesystem it checks to see if there is already one there with the same name. If there is, the standard behaviour is to rename the new file thus:

my_uploaded_file.txt -> my_uploaded_file_0.txt
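
The renaming rule is easy to sketch. What follows is a small Python model of the behaviour described above, not Drupal's own code: create_filename is a hypothetical helper, and the _0, _1, ... numbering is my assumption, extrapolated from the example rename.

```python
import os
import tempfile


def create_filename(basename, directory):
    """Return a path inside `directory` that does not clash with an existing file."""
    destination = os.path.join(directory, basename)
    if not os.path.exists(destination):
        return destination  # the name is free, so keep it unchanged

    # The name is taken: append _0, _1, ... before the extension until one is free.
    name, ext = os.path.splitext(basename)
    counter = 0
    while True:
        destination = os.path.join(directory, f"{name}_{counter}{ext}")
        if not os.path.exists(destination):
            return destination
        counter += 1


with tempfile.TemporaryDirectory() as directory:
    first = create_filename("my_uploaded_file.txt", directory)
    open(first, "w").close()  # simulate the first upload landing on disk
    second = create_filename("my_uploaded_file.txt", directory)
    first_name = os.path.basename(first)
    second_name = os.path.basename(second)

print(first_name, "->", second_name)
```

Note that the check is made against the filesystem only, which is exactly why the scenario below goes wrong.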

Picture this scenario:

  1. The user saves the file foo.txt.
  2. The file is moved (manually) to bar.txt, but the uri in the database remains foo.txt.
  3. A new foo.txt is uploaded.
  4. Drupal runs file_exists($uri), which returns false because foo.txt no longer exists in the filesystem (it was moved to bar.txt). Drupal therefore thinks it does not need to rename the new file or change its uri.
  5. Drupal attempts to insert a record into the file_managed table, but the uri is a unique key, so the insert for foo.txt fails: the old record from the first file still exists.
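
The five steps can be reproduced in miniature. This is a toy model using Python and SQLite rather than Drupal itself: the upload function is hypothetical, and the table keeps only the columns needed to show the clash (a unique uri, as described in step 5).

```python
import os
import sqlite3
import tempfile


def upload(db, directory, filename, data):
    """Toy upload: rename only if the name is taken on disk, then insert a row."""
    name, ext = os.path.splitext(filename)
    uri, counter = filename, 0
    while os.path.exists(os.path.join(directory, uri)):
        uri = f"{name}_{counter}{ext}"
        counter += 1
    with open(os.path.join(directory, uri), "wb") as handle:
        handle.write(data)
    with db:  # commit on success, roll back on error
        db.execute(
            "INSERT INTO file_managed (uri, filesize) VALUES (?, ?)",
            (uri, len(data)),
        )
    return uri


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE file_managed ("
    "fid INTEGER PRIMARY KEY, uri TEXT NOT NULL UNIQUE, filesize INTEGER NOT NULL)"
)

with tempfile.TemporaryDirectory() as directory:
    upload(db, directory, "foo.txt", b"first")  # step 1
    os.rename(  # step 2: moved behind the database's back
        os.path.join(directory, "foo.txt"), os.path.join(directory, "bar.txt")
    )
    try:
        upload(db, directory, "foo.txt", b"second")  # steps 3 to 5
        collided = False
    except sqlite3.IntegrityError:
        collided = True  # the unique key on uri rejects the second foo.txt

rows = db.execute("SELECT uri FROM file_managed").fetchall()
print(collided, rows)
```

The disk check passes because foo.txt is gone, yet the stale row still claims that uri, so the insert is refused.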

Changing the filesize of the file can cause downloads to end prematurely, as the browser is given a Content-Length header built from the recorded value. If it only expects 20 KB it's not going to keep downloading after that!
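
To make the truncation concrete, here is a deliberately simple Python model. The serve and download functions are hypothetical stand-ins for the server and the browser, and the sketch assumes a client that reads exactly as many bytes as Content-Length promises; real clients and servers may react differently to a mismatch.

```python
import io


def serve(body, recorded_filesize):
    """Build a response whose Content-Length comes from the stored filesize."""
    headers = {"Content-Length": str(recorded_filesize)}
    return headers, io.BytesIO(body)


def download(headers, stream):
    """Read only as many bytes as the header promised, then stop."""
    expected = int(headers["Content-Length"])
    return stream.read(expected)


body = b"x" * 30_000  # the file on disk has grown to 30,000 bytes
headers, stream = serve(body, 20_000)  # but the record still says 20,000
received = download(headers, stream)

print(len(received), "of", len(body), "bytes received")
```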

What if I really want to alter files?

If none of this has put you off then the best advice I can offer is:

If in doubt, file_save.

This will update (via drupal_write_record) the file_managed table and hopefully head off any potential problems!
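
The idea behind that advice is to refresh every property that depends on the file and write the record back in one go. Below is a Python and SQLite analogue, not Drupal code: resync is a hypothetical helper standing in for "set the new values on $file, then file_save", and the table is cut down to three of the columns listed earlier.

```python
import os
import sqlite3
import tempfile


def resync(db, fid, directory, new_name):
    """Re-read the file from disk and update its record to match."""
    path = os.path.join(directory, new_name)
    with db:  # commit on success, roll back on error
        db.execute(
            "UPDATE file_managed SET uri = ?, filename = ?, filesize = ? WHERE fid = ?",
            (new_name, new_name, os.path.getsize(path), fid),
        )


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE file_managed (fid INTEGER PRIMARY KEY, "
    "uri TEXT NOT NULL UNIQUE, filename TEXT NOT NULL, filesize INTEGER NOT NULL)"
)

with tempfile.TemporaryDirectory() as directory:
    foo = os.path.join(directory, "foo.txt")
    bar = os.path.join(directory, "bar.txt")
    data = b"first upload"
    with open(foo, "wb") as handle:
        handle.write(data)
    fid = db.execute(
        "INSERT INTO file_managed (uri, filename, filesize) VALUES (?, ?, ?)",
        ("foo.txt", "foo.txt", len(data)),
    ).lastrowid

    os.rename(foo, bar)  # the external system moves the file...
    with open(bar, "ab") as handle:
        handle.write(b" plus edits")  # ...and changes its size

    resync(db, fid, directory, "bar.txt")  # tell the database about both changes
    row = db.execute(
        "SELECT uri, filename, filesize FROM file_managed WHERE fid = ?", (fid,)
    ).fetchone()

print(row)
```

With the record back in step, a later foo.txt can be saved without tripping over the unique uri.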