Uploading Very Large Files

Over the past few months, I've been working off and on to get an upload form to handle very large file uploads. (In this case, "very large" means up to 1GB.) I'm still not done, but I've finally reached the "I think I have this mostly working" stage. So I figured I should write a bit on the problems I ran into, the tools I've tried out, and what I ended up doing.

Problems

INI Settings

The first issue I've had is that uploading large files using a PHP script requires increasing a number of limits set in php.ini. This pops up for any file larger than 2MB, since that's the default maximum file upload size for PHP (unless your host is already configured with a larger or smaller value). Usually I can set these values by adding "php_value" statements to the .htaccess file. In some cases, however, php.ini needs to be edited directly.

The following values MIGHT have an effect on uploads. I'm not 100% certain, so if you know exactly when these settings could affect whether PHP accepts a file upload, let me know:

  • max_input_time
  • max_execution_time
  • memory_limit

These values set limits on how long PHP can run (max_execution_time), how long it can take to parse the uploaded POST data (max_input_time) and how much memory it can use on the server (memory_limit).

The following values absolutely MUST be updated:

  • post_max_size
  • upload_max_filesize

These values limit the maximum size of the POST data (post_max_size) and the maximum size of any single uploaded file (upload_max_filesize). An upload form is submitted as multipart/form-data, which sends the file's bytes as-is rather than Base64-encoding them, but the POST body also carries the part boundaries and headers plus any other form values submitted, so it is always somewhat larger than the file on disk. This means that post_max_size also limits the size of the file you can upload, and it needs to be at least a little larger than the file itself. I usually don't bother trying to figure out the math for how much bigger one should be than the other. Often I just double the maximum filesize I want to upload and use that value for both. Not perfect, but good enough.

Since PHP will usually die without any feedback if the uploaded file is larger than upload_max_filesize, it makes sense to set the limit higher than the largest file you actually want to accept. For Catalyst, we've implemented an upload behavior that will disallow uploads of files larger than a specified size, and provide an error message that the file is too large. This can only be done if the "too large" file was actually able to upload, though.
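As a rough sketch, the settings discussed above might look like this in an .htaccess file. The numbers are assumptions for a 1GB target, following the "double it" rule of thumb. The time and memory values are guesses rather than tested recommendations.

```apache
# .htaccess: php_value only works when PHP runs as an Apache module (mod_php).
# Sizes are illustrative: roughly double a 1GB target.
php_value upload_max_filesize 2000M
php_value post_max_size 2000M

# These MIGHT matter for long uploads; the values below are guesses.
php_value max_input_time 3600
php_value max_execution_time 3600
php_value memory_limit 256M
```

If php.ini has to be edited directly instead, the same settings are written as "upload_max_filesize = 2000M" and so on, one per line.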

In my tests on shared hosting servers, updating these values didn't help for files larger than approximately 128MB or 256MB (depending on the server). I suspect that this is due to additional security software blocking these uploads. The behavior was a bit odd. In my tests I already had upload progress meters working, so I could see when things stopped. On the servers that had this problem, files smaller than the magic limit would upload without issue. Once I went over the limit, however, a few megabytes would get uploaded and then the progress meter would stop completely. My coworkers would also stop complaining about having trouble accessing the internet, so it was clear that data was no longer being sent. I suspect that something on the server was closing the connection when it detected that the amount of POST data was bigger than the magic number, most likely anti-denial-of-service software. When I ran tests on a server where I had full control over ALL software on the system, this didn't happen.

When I set up a Perl CGI script on one of the shared hosting servers, it suffered the same problem. I asked Wil to look into this a bit more, and when he set up a Perl CGI script to handle uploads he noticed that he could not track the upload progress of the file. While his script was written to append data to a file as it came in, the file would just suddenly show up at full size. It seemed like the entire file was being uploaded to the server and then passed off to his script ... which would make sense if the security software were proxying POST data so it could prevent certain types of attacks.

Progress Bars

The second main issue with uploading large files using PHP is that PHP doesn't support progress bars for files being uploaded by default. There are numerous ways to get a progress bar to show up.

Client side solutions:

  • Use an HTML5 JavaScript uploader.
  • Use Flash to upload the file.
  • Use a Java applet to upload the file.

Server side solutions:

  • Use Perl CGI.
  • Install the Upload Progress PECL Package http://pecl.php.net/package/uploadprogress/
  • Re-compile PHP with the Alternative PHP Cache Extension http://php.net/manual/en/book.apc.php
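For the HTML5 JavaScript route, here is a minimal sketch of how a progress bar can be driven from the browser, assuming a browser that supports upload progress events on XMLHttpRequest. The function names, the "file" field name, and the url argument are all hypothetical, and this is not necessarily how any of the uploaders discussed here work internally.

```javascript
// Pure helper: turn a progress event's byte counts into a whole percentage.
function percentComplete(loaded, total) {
  if (!total || total <= 0) {
    return 0; // total is unknown or zero, so report no progress
  }
  return Math.min(100, Math.floor((loaded / total) * 100));
}

// Browser-only: POST one File (from an <input type="file">) and report progress.
// `url` and the "file" field name are placeholders; match them to your upload script.
function uploadWithProgress(file, url, onProgress, onDone) {
  var form = new FormData();
  form.append('file', file);

  var xhr = new XMLHttpRequest();
  // Progress events for the request body fire on xhr.upload, not on xhr itself.
  xhr.upload.addEventListener('progress', function (event) {
    if (event.lengthComputable) {
      onProgress(percentComplete(event.loaded, event.total));
    }
  });
  xhr.addEventListener('load', function () { onDone(xhr.status); });
  xhr.addEventListener('error', function () { onDone(0); });

  xhr.open('POST', url);
  xhr.send(form);
  return xhr;
}
```

In a page, onProgress would typically set the width of a progress bar element, and onDone would check for a 200 status before telling the user the upload finished.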

Initially I wanted to use a Flash uploader. I've used these in the past and they are quite easy to integrate and allow things to fall back to a standard HTML form submission if Flash is not available. However, I soon ran into a problem. It seems that Flash on Mac OS X will try to load the entire file into memory before uploading it. From what I've read, Flash on Windows or Linux doesn't have this problem, but since I suspect a large percentage of the end users of this particular upload form would be using Macs, I couldn't just ignore the issue. Plus I'm testing on a Mac myself ... I'd really prefer to have an upload form that I could make use of. Trying to use a Flash uploader to upload a 1GB file on my laptop caused everything to freeze up, which meant Flash was out of the question. So I kept hunting. I ended up trying out quite a wide variety of tools.

 

 

Tools, with pros and cons:

SWF Upload

  Pros:
  • Well established and well documented.
  • MIT licensed.
  Cons:
  • Requires Flash.
  • Flash on OS X loads entire file into memory, which freezes the browser when uploading very large files.

PLUpload

  Pros:
  • Multiple runtimes (Flash, Silverlight, Google Gears, HTML5, HTML 4).
  • Can automatically fall back to best available runtime.
  Cons:
  • HTML5 and Flash runtimes both load entire file into memory on OS X before uploading, which freezes the browser when uploading very large files.
  • GPLv2 or Proprietary license.

FileChucker

  Pros:
  • Perl CGI (if you use Perl).
  Cons:
  • Perl CGI (if you don't).

jQuery File Upload

  Pros:
  • HTML5 based, requires no additional plugins.
  • MIT licensed.
  • Doesn't freeze the browser when uploading large files on OS X.
  Cons:
  • Separating out the jQuery-UI features so you can use your own styles is a bit tricky.
  • No progress bars in IE or Opera.

JumpLoader

  Pros:
  • Can pause and restart uploads.
  • Can upload directly to an FTP server.
  Cons:
  • Java applet.
  • Requires a proprietary license to use without a logo.

Jotform

  Pros:
  • Can create complex forms.
  • Syncs with a Dropbox account.
  • Can automatically email form results to you.
  • Saves other form data in a PDF.
  • Externally hosted, so upload bandwidth does not apply to your server.
  Cons:
  • Not integrated into the website. An iframe must be embedded.
  • Styled by Jotform, not the containing website. Customizing the form appearance might be difficult or impossible.
  • Requires monthly subscription in addition to Dropbox subscription.

YUI Uploader

  Pros:
  • BSD Licensed.
  • Good integration with YUI controls (if you use them).
  Cons:
  • Beta software.
  • Requires Flash.

Conclusion

I'm currently leaning toward using the jQuery File Upload script. It's simple to integrate, and doesn't require anything special on the client side. The only downside is that IE users will see a spinner instead of a progress bar. Unfortunately, the only way I could get progress bars in IE was through a Flash, Silverlight, or Java uploader, and none of the ones I tested worked well on OS X. So it comes down to IE or everything else.
