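Fastest way to serve a file using PHP

I'm trying to put together a function that receives a file path, identifies what it is, sets the appropriate headers, and serves it just like Apache would.

The reason I'm doing this is that I need to use PHP to process some information about the request before serving the file.

Speed is critical

virtual() isn't an option

Must work in a shared hosting environment where the user has no control over the web server (Apache/nginx, etc.)

Here's what I've got so far: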

File::output($path);


<?php
class File {
static function output($path) {
// Check if the file exists
if(!File::exists($path)) {
header('HTTP/1.0 404 Not Found');
exit();
}


// Set the content-type header
header('Content-Type: '.File::mimeType($path));


// Handle caching
$fileModificationTime = gmdate('D, d M Y H:i:s', File::modificationTime($path)).' GMT';
$headers = getallheaders();
if(isset($headers['If-Modified-Since']) && $headers['If-Modified-Since'] == $fileModificationTime) {
header('HTTP/1.1 304 Not Modified');
exit();
}
header('Last-Modified: '.$fileModificationTime);


// Read the file
readfile($path);


exit();
}


static function mimeType($path) {
preg_match("|\.([a-z0-9]{2,4})$|i", $path, $fileSuffix);


switch(strtolower(isset($fileSuffix[1]) ? $fileSuffix[1] : '')) {
case 'js' :
return 'text/javascript';
case 'json' :
return 'application/json';
case 'jpg' :
case 'jpeg' :
case 'jpe' :
return 'image/jpeg';
case 'png' :
case 'gif' :
case 'bmp' :
case 'tiff' :
return 'image/'.strtolower($fileSuffix[1]);
case 'css' :
return 'text/css';
case 'xml' :
return 'application/xml';
case 'doc' :
return 'application/msword';
case 'docx' :
return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';
case 'xls' :
case 'xlt' :
case 'xlm' :
case 'xld' :
case 'xla' :
case 'xlc' :
case 'xlw' :
case 'xll' :
return 'application/vnd.ms-excel';
case 'ppt' :
case 'pps' :
return 'application/vnd.ms-powerpoint';
case 'rtf' :
return 'application/rtf';
case 'pdf' :
return 'application/pdf';
case 'html' :
case 'htm' :
case 'php' :
return 'text/html';
case 'txt' :
return 'text/plain';
case 'mpeg' :
case 'mpg' :
case 'mpe' :
return 'video/mpeg';
case 'mp3' :
return 'audio/mpeg';
case 'wav' :
return 'audio/wav';
case 'aiff' :
case 'aif' :
return 'audio/aiff';
case 'avi' :
return 'video/x-msvideo';
case 'wmv' :
return 'video/x-ms-wmv';
case 'mov' :
return 'video/quicktime';
case 'zip' :
return 'application/zip';
case 'tar' :
return 'application/x-tar';
case 'swf' :
return 'application/x-shockwave-flash';
default :
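if(function_exists('mime_content_type')) {
// mime_content_type() returns the full type as a string, or false on failure
$detected = mime_content_type($path);
if($detected !== false) {
return $detected;
}
}
return 'application/octet-stream';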
}
}
}
?>

The fastest way: don't. Look into the X-Sendfile header (X-Accel-Redirect on nginx); there are similar mechanisms for other web servers too. This means you can still do access control etc. in PHP but delegate the actual sending of the file to a web server designed for that.

P.S.: I get chills just thinking about how much more efficient this is with nginx compared to reading and sending the file in PHP. Just think: if 100 people are downloading a file with PHP + Apache, being generous, that's probably 100 × 15 MB = 1.5 GB of RAM right there (approximately, shoot me). Nginx will just hand off sending the file to the kernel, and then it's loaded directly from the disk into the network buffers. Speedy!

P.P.S: And, with this method you can still do all the access control, database stuff you want.

header('Location: ' . $path);
exit(0);

Let Apache do the work for you.

My previous answer was partial and not well documented. Here is an update with a summary of the solutions from it and from others in the discussion.

The solutions are ordered from best to worst, but also from the one needing the most control over the web server to the one needing the least. There doesn't seem to be an easy way to have one solution that is both fast and works everywhere.


Using the X-SendFile header

As documented by others, it's actually the best way. The idea is that you do your access control in PHP and then, instead of sending the file yourself, you tell the web server to do it.

The basic PHP code is:

header("X-Sendfile: $file_name");
header("Content-type: application/octet-stream");
header('Content-Disposition: attachment; filename="' . basename($file_name) . '"');

Where $file_name is the full path on the file system.

The main problem with this solution is that it needs to be allowed by the web server, and it either isn't installed by default (Apache), isn't active by default (lighttpd), or needs a specific configuration (nginx).

Apache

Under Apache, if you use mod_php you need to install a module called mod_xsendfile and then configure it (either in the Apache config or in .htaccess if you allow it):

XSendFile on
XSendFilePath /home/www/example.com/htdocs/files/

With this module the file path can be either absolute or relative to the specified XSendFilePath.

Lighttpd

mod_fastcgi supports this when configured with:

"allow-x-send-file" => "enable"

The documentation for the feature is on the lighttpd wiki. It documents the X-LIGHTTPD-send-file header, but the X-Sendfile name also works.

Nginx

On nginx you can't use the X-Sendfile header; you must use its own header, named X-Accel-Redirect. It is enabled by default, and the only real difference is that its argument should be a URI, not a file system path. The consequence is that you must define a location marked as internal in your configuration to prevent clients from finding the real file URL and going directly to it. The nginx wiki contains a good explanation of this.
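As a rough sketch of the nginx variant, the PHP side could look like the code below. The function name accel_redirect_headers and the /protected/ prefix are made up for illustration; the prefix has to match whatever location you marked as internal in your own nginx configuration, and URL-encoding the file name is an assumption about how your nginx version decodes the header value.

```php
<?php
// Hypothetical helper: builds the headers that hand a download over to nginx.
// Assumes nginx has a location such as /protected/ marked "internal" that
// maps to the directory where the real files live.
function accel_redirect_headers($file_name, $internal_prefix = '/protected/')
{
    // Drop any directory part supplied by the caller.
    $name = basename($file_name);

    return array(
        'X-Accel-Redirect: ' . rtrim($internal_prefix, '/') . '/' . rawurlencode($name),
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . str_replace('"', '', $name) . '"',
    );
}

// Usage, once your own access control has passed:
// foreach (accel_redirect_headers('report 2024.pdf') as $line) {
//     header($line);
// }
// exit;
```

Returning the header lines instead of calling header() directly keeps the helper easy to test outside a web request.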

Symlinks and Location header

You could use symlinks and redirect to them: create symlinks to your file with random names when a user is authorized to access it, and redirect the user to one using:

header("Location: " . $url_of_symlink);

Obviously you'll need a way to prune them, either when the script that creates them is called or via cron (on the machine if you have access, or via some webcron service otherwise).

Under Apache you need to be able to enable FollowSymLinks in a .htaccess or in the Apache config.
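A minimal sketch of the create-and-prune cycle, assuming PHP 7 or later (for random_bytes) and a host where PHP is allowed to create symlinks. Both function names and the one-hour default are hypothetical; $link_dir stands for a web-accessible directory with FollowSymLinks enabled.

```php
<?php
// Creates a randomly named symlink to $target inside $link_dir.
// Returns the link's file name, or false if the link could not be created.
function create_download_link($target, $link_dir)
{
    $ext = pathinfo($target, PATHINFO_EXTENSION);
    $name = bin2hex(random_bytes(16)) . ($ext !== '' ? '.' . $ext : '');

    if (!symlink($target, $link_dir . '/' . $name)) {
        return false;
    }
    return $name;
}

// Removes symlinks in $link_dir older than $max_age seconds.
// Returns the number of links removed.
function prune_download_links($link_dir, $max_age = 3600)
{
    $removed = 0;
    foreach (scandir($link_dir) as $entry) {
        $path = $link_dir . '/' . $entry;
        if (!is_link($path)) {
            continue; // skips ".", ".." and regular files
        }
        $info = lstat($path); // lstat() describes the link itself, not its target
        if ($info !== false && time() - $info['mtime'] > $max_age) {
            unlink($path);
            $removed++;
        }
    }
    return $removed;
}
```

Calling prune_download_links() at the start of the script that creates links avoids the need for cron on hosts that don't offer it.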

Access control by IP and Location header

Another hack is to generate Apache access files from PHP that explicitly allow the user's IP. Under Apache this means using mod_authz_host (mod_access) Allow from directives.

The problem is that locking access to the file (as multiple users may want to do this at the same time) is non-trivial and could lead to some users waiting a long time. And you still need to prune the file anyway.

Obviously, another problem is that multiple people behind the same IP could potentially access the file.
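To give an idea of what the locking looks like, here is a sketch that appends an Allow from line under an exclusive flock(). The function name allow_ip is hypothetical, and the directives written are the older Order/Deny/Allow style that goes with mod_access and mod_authz_host; check what your Apache version expects before relying on it.

```php
<?php
// Hypothetical helper: adds an "Allow from" line for $ip to an .htaccess file.
// Returns false for an invalid IP or an unwritable file, true otherwise.
function allow_ip($htaccess_path, $ip)
{
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return false; // never write unvalidated input into a config file
    }

    $fp = fopen($htaccess_path, 'c+'); // create if missing, do not truncate
    if ($fp === false) {
        return false;
    }

    flock($fp, LOCK_EX); // blocks until concurrent writers are done
    $content = stream_get_contents($fp);
    if ($content === '') {
        $content = "Order deny,allow\nDeny from all\n";
    }

    $line = "Allow from $ip\n";
    if (strpos($content, $line) === false) {
        $content .= $line;
    }

    // Rewrite the whole file while still holding the lock.
    ftruncate($fp, 0);
    rewind($fp);
    fwrite($fp, $content);
    fflush($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return true;
}
```

The blocking lock is exactly where the waiting mentioned above comes from: every request that needs to add an IP queues behind the current writer.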

When everything else fails

If you really don't have any way to get your web server to help you, the only remaining solution is readfile. It's available in all PHP versions currently in use and works pretty well (but isn't really efficient).


Combining solutions

In the end, the best way to send a file really fast, if you want your PHP code to be usable everywhere, is to have a configurable option somewhere, with instructions on how to activate it depending on the web server, and maybe auto-detection in your install script.

It is pretty similar to what is done in a lot of software for:

  • Clean urls (mod_rewrite on apache)
  • Crypto functions (mcrypt php module)
  • Multibyte string support (mbstring php module)
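Such a configurable option could be sketched as below. The method names 'xsendfile', 'xaccel' and 'readfile' are invented for this example (they would come from your application's own configuration), as are the two function names and the /protected/ prefix.

```php
<?php
// Hypothetical dispatcher: returns the header lines for the configured
// delivery method, or an empty array when PHP must send the body itself.
function delivery_headers($method, $path, $internal_prefix = '/protected/')
{
    switch ($method) {
        case 'xsendfile': // Apache with mod_xsendfile, or lighttpd
            return array('X-Sendfile: ' . $path);
        case 'xaccel': // nginx, with a matching internal location
            return array('X-Accel-Redirect: ' . $internal_prefix . rawurlencode(basename($path)));
        default: // no server support available
            return array();
    }
}

// Sends $path using the configured method, falling back to readfile().
function send_file($method, $path)
{
    $headers = delivery_headers($method, $path);
    foreach ($headers as $line) {
        header($line);
    }
    if (count($headers) === 0) {
        readfile($path); // fallback when the web server cannot help
    }
}
```

An install script could then try each method against a small test file and store the first one that works.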

If you have the possibility to add PECL extensions to your PHP, you can simply use the functions from the Fileinfo package to determine the content type and then send the proper headers.
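For illustration, a small sketch of content-type detection with Fileinfo. Note that Fileinfo has been bundled with PHP since 5.3, so on many hosts no PECL install is needed. The function name detect_mime_type and the application/octet-stream fallback are choices made for this example.

```php
<?php
// Detects the MIME type of $path with the Fileinfo extension.
// Falls back to a generic binary type if the extension is missing,
// the file does not exist, or detection fails.
function detect_mime_type($path)
{
    if (class_exists('finfo') && is_file($path)) {
        $finfo = new finfo(FILEINFO_MIME_TYPE);
        $type = $finfo->file($path);
        if (is_string($type) && $type !== '') {
            return $type;
        }
    }
    return 'application/octet-stream';
}

// Usage:
// header('Content-Type: ' . detect_mime_type($path));
```

Unlike a switch on the file extension, this inspects the file contents, which costs a read but also works for files with missing or misleading extensions.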

Here goes a pure PHP solution. I've adapted the following function from my personal framework:

function Download($path, $speed = null, $multipart = true)
{
while (ob_get_level() > 0)
{
ob_end_clean();
}


if (is_file($path = realpath($path)) === true)
{
$file = @fopen($path, 'rb');
$size = sprintf('%u', filesize($path));
$speed = (empty($speed) === true) ? 1024 : floatval($speed);


if (is_resource($file) === true)
{
set_time_limit(0);


if (strlen(session_id()) > 0)
{
session_write_close();
}


if ($multipart === true)
{
$range = array(0, $size - 1);


if (array_key_exists('HTTP_RANGE', $_SERVER) === true)
{
$range = array_map('intval', explode('-', preg_replace('~.*=([^,]*).*~', '$1', $_SERVER['HTTP_RANGE'])));


if (empty($range[1]) === true)
{
$range[1] = $size - 1;
}


foreach ($range as $key => $value)
{
$range[$key] = max(0, min($value, $size - 1));
}


if (($range[0] > 0) || ($range[1] < ($size - 1)))
{
header(sprintf('%s %03u %s', 'HTTP/1.1', 206, 'Partial Content'), true, 206);
}
}


header('Accept-Ranges: bytes');
header('Content-Range: bytes ' . sprintf('%u-%u/%u', $range[0], $range[1], $size));
}


else
{
$range = array(0, $size - 1);
}


header('Pragma: public');
header('Cache-Control: public, no-cache');
header('Content-Type: application/octet-stream');
header('Content-Length: ' . sprintf('%u', $range[1] - $range[0] + 1));
header('Content-Disposition: attachment; filename="' . basename($path) . '"');
header('Content-Transfer-Encoding: binary');


if ($range[0] > 0)
{
fseek($file, $range[0]);
}


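// Send only the requested range instead of reading on to the end of the file
$remaining = $range[1] - $range[0] + 1;
while (($remaining > 0) && (feof($file) !== true) && (connection_status() === CONNECTION_NORMAL))
{
$chunk = fread($file, min($remaining, max(1, (int) round($speed * 1024))));
if (($chunk === false) || ($chunk === '')) { break; }
$remaining -= strlen($chunk); echo $chunk; flush(); sleep(1);
}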


fclose($file);
}


exit();
}


else
{
header(sprintf('%s %03u %s', 'HTTP/1.1', 404, 'Not Found'), true, 404);
}


return false;
}

The code is as efficient as it can be: it closes the session handler so that other PHP scripts can run concurrently for the same user / session. It also supports serving downloads in ranges (which is also what Apache does by default, I suspect), so that people can pause/resume downloads and also benefit from higher download speeds with download accelerators. It also lets you specify the maximum speed (in KB/s) at which the download (part) should be served via the $speed argument.
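The range handling is the fiddly part, so here is a separate sketch of parsing a Range header on its own. It is not the parsing used in the function above: parse_byte_range is a hypothetical helper that only looks at the first range and also handles suffix ranges such as bytes=-200 (the last 200 bytes).

```php
<?php
// Parses the first range of a "bytes=..." Range header for a file of $size bytes.
// Returns array($start, $end) with inclusive offsets, or null if the header
// is malformed or the range cannot be satisfied.
function parse_byte_range($header, $size)
{
    if (!preg_match('/^bytes=(\d*)-(\d*)/', trim($header), $m)) {
        return null;
    }
    list(, $first, $last) = $m;

    if ($first === '' && $last === '') {
        return null; // "bytes=-" carries no usable range
    }

    if ($first === '') {
        // Suffix range: the final $last bytes of the file.
        $length = (int) $last;
        if ($length === 0) {
            return null;
        }
        $start = max(0, $size - $length);
        $end = $size - 1;
    } else {
        $start = (int) $first;
        $end = ($last === '') ? $size - 1 : min((int) $last, $size - 1);
    }

    if ($size <= 0 || $start > $end || $start >= $size) {
        return null;
    }
    return array($start, $end);
}
```

A null result would typically be answered with a 416 status and a Content-Range: bytes */size header, while a valid result gives the offsets for fseek() and the Content-Length.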

The PHP Download function mentioned here was causing some delay before the file actually started to download. I don't know if this was caused by using Varnish cache or something else, but for me it helped to remove the sleep(1); completely and set $speed to 1024. Now it works without any problem and is fast as hell. Maybe you could modify that function too, because I saw it used all over the internet.

A better implementation, with cache support and customized HTTP headers.

serveStaticFile($fn, array(
'headers'=>array(
'Content-Type' => 'image/x-icon',
'Cache-Control' =>  'public, max-age=604800',
'Expires' => gmdate("D, d M Y H:i:s", time() + 30 * 86400) . " GMT",
)
));


function serveStaticFile($path, $options = array()) {
$path = realpath($path);
if (is_file($path)) {
if(session_id())
session_write_close();


header_remove();
set_time_limit(0);
$size = filesize($path);
$lastModifiedTime = filemtime($path);
$fp = @fopen($path, 'rb');
$range = array(0, $size - 1);


header('Last-Modified: ' . gmdate("D, d M Y H:i:s", $lastModifiedTime)." GMT");
if (( ! empty($_SERVER['HTTP_IF_MODIFIED_SINCE']) && strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) == $lastModifiedTime ) ) {
header("HTTP/1.1 304 Not Modified", true, 304);
return true;
}


if (isset($_SERVER['HTTP_RANGE'])) {
//$valid = preg_match('^bytes=\d*-\d*(,\d*-\d*)*$', $_SERVER['HTTP_RANGE']);
if(substr($_SERVER['HTTP_RANGE'], 0, 6) != 'bytes=') {
header('HTTP/1.1 416 Requested Range Not Satisfiable', true, 416);
header('Content-Range: bytes */' . $size); // Required in 416.
return false;
}


$ranges = explode(',', substr($_SERVER['HTTP_RANGE'], 6));
$range = explode('-', $ranges[0]); // to do: only support the first range now.


if ($range[0] === '') $range[0] = 0;
if ($range[1] === '') $range[1] = $size - 1;


if (($range[0] >= 0) && ($range[1] <= $size - 1) && ($range[0] <= $range[1])) {
header('HTTP/1.1 206 Partial Content', true, 206);
header('Content-Range: bytes ' . sprintf('%u-%u/%u', $range[0], $range[1], $size));
}
else {
header('HTTP/1.1 416 Requested Range Not Satisfiable', true, 416);
header('Content-Range: bytes */' . $size);
return false;
}
}


$contentLength = $range[1] - $range[0] + 1;


//header('Content-Disposition: attachment; filename="xxxxx"');
$headers = array(
'Accept-Ranges' => 'bytes',
'Content-Length' => $contentLength,
'Content-Type' => 'application/octet-stream',
);


if(!empty($options['headers'])) {
$headers = array_merge($headers, $options['headers']);
}
foreach($headers as $k=>$v) {
header("$k: $v", true);
}


if ($range[0] > 0) {
fseek($fp, $range[0]);
}
$sentSize = 0;
while (!feof($fp) && (connection_status() === CONNECTION_NORMAL)) {
$readingSize = $contentLength - $sentSize;
$readingSize = min($readingSize, 512 * 1024);
if($readingSize <= 0) break;


$data = fread($fp, $readingSize);
if(!$data) break;
$sentSize += strlen($data);
echo $data;
flush();
}


fclose($fp);
return true;
}
else {
header('HTTP/1.1 404 Not Found', true, 404);
return false;
}
}

I coded a very simple function to serve files with PHP and automatic MIME type detection:

function serve_file($filepath, $new_filename=null) {
$filename = basename($filepath);
if (!$new_filename) {
$new_filename = $filename;
}
$mime_type = mime_content_type($filepath);
header('Content-type: '.$mime_type);
header('Content-Disposition: attachment; filename="'.$new_filename.'"');
readfile($filepath);
}

Usage

serve_file("/no_apache/invoice243.pdf");