Remote file size without downloading the file

Is there a way to get the size of the remote file http://my_url/my_file.txt without downloading it?


Sure. Make a headers-only request and look for the Content-Length header.

Found something about this here:

Here's the best way (that I've found) to get the size of a remote file. Note that a HEAD request doesn't retrieve the body of the response, only the headers. So a HEAD request to a 100MB resource takes about the same time as a HEAD request to a 1KB resource.

<?php
/**
* Returns the size of a file without downloading it, or -1 if the file
* size could not be determined.
*
* @param $url - The location of the remote file to download. Cannot
* be null or empty.
*
* @return The size of the file referenced by $url, or -1 if the size
* could not be determined.
*/
function curl_get_file_size( $url ) {
// Assume failure.
$result = -1;


$curl = curl_init( $url );


// Issue a HEAD request and follow any redirects.
curl_setopt( $curl, CURLOPT_NOBODY, true );
curl_setopt( $curl, CURLOPT_HEADER, true );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
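// The original called get_user_agent_string(), which is not a PHP built-in.
// The literal below is only a placeholder; substitute your own user agent.
curl_setopt( $curl, CURLOPT_USERAGENT, "my_application/1.0" );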


$data = curl_exec( $curl );
curl_close( $curl );


if( $data ) {
$content_length = "unknown";
$status = "unknown";


if( preg_match( "/^HTTP\/1\.[01] (\d\d\d)/", $data, $matches ) ) {
$status = (int)$matches[1];
}


if( preg_match( "/Content-Length: (\d+)/", $data, $matches ) ) {
$content_length = (int)$matches[1];
}


// http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
if( $status == 200 || ($status > 300 && $status <= 308) ) {
$result = $content_length;
}
}


return $result;
}
?>

Usage:

$file_size = curl_get_file_size( "http://stackoverflow.com/questions/2602612/php-remote-file-size-without-downloading-file" );

Since this question is already tagged "php" and "curl", I'm assuming you know how to use Curl in PHP.

If you set curl_setopt(CURLOPT_NOBODY, TRUE) then you will make a HEAD request and can probably check the "Content-Length" header of the response, which will be only headers.
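Pulling the length out of the returned header text could look roughly like the sketch below. The function name parse_content_length() is hypothetical (it is not part of cURL or PHP), and it assumes the raw header string is what curl_exec() returns when CURLOPT_HEADER and CURLOPT_NOBODY are set.

```php
<?php
/**
 * Hypothetical helper: extract the last Content-Length value from raw
 * HTTP header text.
 *
 * @return int|null Size in bytes, or null if no Content-Length header exists.
 */
function parse_content_length(string $rawHeaders): ?int
{
    // Header names are case-insensitive (HTTP/2 responses are usually
    // reported in lower case), hence the "i" flag. The "m" flag lets
    // ^ and $ match at the start and end of every header line.
    if (!preg_match_all('/^Content-Length:\s*(\d+)\s*$/im', $rawHeaders, $matches)) {
        return null;
    }
    // When redirects are followed there is one header block per hop;
    // the last match belongs to the final response.
    return (int) end($matches[1]);
}
```

Taking the last match rather than the first matters once CURLOPT_FOLLOWLOCATION is enabled, because the redirect responses usually carry their own small Content-Length.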

Try this code

function retrieve_remote_file_size($url){
$ch = curl_init($url);


curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_NOBODY, TRUE);


$data = curl_exec($ch);
$size = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);


curl_close($ch);
return $size;
}

Try the function below to get the remote file size:

function remote_file_size($url){
$head = "";
$url_p = parse_url($url);


$host = $url_p["host"];
if(!preg_match("/[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*/",$host)){


$ip=gethostbyname($host);
if(!preg_match("/[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*/",$ip)){


return -1;
}
}
if(isset($url_p["port"]))
$port = intval($url_p["port"]);
else
$port    =    80;


if(!$port) $port=80;
$path = $url_p["path"];


$fp = fsockopen($host, $port, $errno, $errstr, 20);
if(!$fp) {
return false;
} else {
fputs($fp, "HEAD "  . $url  . " HTTP/1.1\r\n");
fputs($fp, "HOST: " . $host . "\r\n");
fputs($fp, "User-Agent: http://www.example.com/my_application\r\n");
fputs($fp, "Connection: close\r\n\r\n");
$headers = "";
while (!feof($fp)) {
$headers .= fgets ($fp, 128);
}
}
fclose ($fp);


$return = -2;
// Initialise so that missing headers do not trigger undefined-variable notices.
$status = "";
$size   = 0;
$newurl = "";
$arr_headers = explode("\n", $headers);
foreach($arr_headers as $header) {
// Strip the trailing "\r" left over from the "\r\n" line endings.
$header = trim($header);


$s1 = "HTTP/1.1";
$s2 = "Content-Length: ";
$s3 = "Location: ";


if(substr(strtolower ($header), 0, strlen($s1)) == strtolower($s1)) $status = substr($header, strlen($s1));
if(substr(strtolower ($header), 0, strlen($s2)) == strtolower($s2)) $size   = substr($header, strlen($s2));
if(substr(strtolower ($header), 0, strlen($s3)) == strtolower($s3)) $newurl = substr($header, strlen($s3));
}


if(intval($size) > 0) {
$return=intval($size);
} else {
$return=$status;
}


if (intval($status)==302 && strlen($newurl) > 0) {


$return = remote_file_size($newurl);
}
return $return;
}
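The function above puts the full URL on the request line and never uses $path. A sketch of building the raw HEAD request from the path and query string instead might look like this; build_head_request() is a hypothetical name, and the header set is kept to the minimum for illustration.

```php
<?php
/**
 * Hypothetical helper: build a raw HTTP/1.1 HEAD request for a URL,
 * putting only the path and query string on the request line.
 *
 * @return string|null The request text, or null if the URL has no host.
 */
function build_head_request(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null; // not an absolute URL
    }
    // Fall back to "/" when the URL has no path component.
    $target = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $target .= '?' . $parts['query'];
    }
    // Include the port in the Host header only when one was given.
    $host = $parts['host'];
    if (isset($parts['port'])) {
        $host .= ':' . $parts['port'];
    }
    return "HEAD {$target} HTTP/1.1\r\n"
         . "Host: {$host}\r\n"
         . "Connection: close\r\n\r\n";
}
```

The returned string could then be written to the socket with a single fputs() call.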

Most answers here either use cURL or rely on reading headers. In some situations, though, there is a much simpler solution. See the note in the filesize() documentation on PHP.net, which includes this tip: "As of PHP 5.0.0, this function can also be used with some URL wrappers. Refer to Supported Protocols and Wrappers to determine which wrappers support stat() family of functionality".

So, if your server and PHP are properly configured, you can simply call filesize() with the full URL of the remote file whose size you want and let PHP do all the magic. Check the wrapper's documentation first: the PHP manual lists the http:// wrapper as not supporting stat(), while ftp:// supports it for filesize(), so this approach applies to FTP URLs rather than plain HTTP ones.
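Since filesize() warns and returns false whenever the wrapper cannot stat the target, a defensive wrapper is worth having. The sketch below assumes nothing beyond filesize() itself; wrapper_filesize() is a hypothetical name.

```php
<?php
/**
 * Hypothetical helper: ask PHP's stream wrapper for the size via
 * filesize(), returning null when the wrapper cannot stat the target.
 */
function wrapper_filesize(string $url): ?int
{
    // filesize() emits a warning and returns false on failure;
    // suppress the warning and map false to null.
    $size = @filesize($url);
    return $size === false ? null : $size;
}
```

A call such as wrapper_filesize('ftp://example.com/file.txt') (a placeholder URL) would then yield either the size in bytes or null, without warnings leaking into the output.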

The simplest and most efficient implementation:

function remote_filesize($url, $fallback_to_download = false)
{
static $regex = '/^Content-Length: *+\K\d++$/im';
if (!$fp = @fopen($url, 'rb')) {
return false;
}
if (isset($http_response_header) && preg_match($regex, implode("\n", $http_response_header), $matches)) {
return (int)$matches[0];
}
if (!$fallback_to_download) {
return false;
}
return strlen(stream_get_contents($fp));
}

I'm not sure, but couldn't you use the get_headers function for this?

$url     = 'http://example.com/dir/file.txt';
$headers = get_headers($url, true);


if ( isset($headers['Content-Length']) ) {
$size = 'file size:' . $headers['Content-Length'];
}
else {
$size = 'file size: unknown';
}


echo $size;
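One caveat with the snippet above: when get_headers() follows redirects, a header that appears in several responses comes back as an array, and the key casing depends on the server. A sketch that handles both cases is shown below; content_length_from_headers() is a hypothetical helper name, and it expects the array returned by get_headers($url, true).

```php
<?php
/**
 * Hypothetical helper: read Content-Length from the associative array
 * returned by get_headers($url, true).
 *
 * @return int|null Size in bytes, or null if the header is missing or invalid.
 */
function content_length_from_headers(array $headers): ?int
{
    // Header names are case-insensitive, so normalise the keys first.
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['content-length'])) {
        return null;
    }
    $value = $headers['content-length'];
    // After redirects there is one value per response;
    // the last one describes the final resource.
    if (is_array($value)) {
        $value = end($value);
    }
    $value = trim((string) $value);
    return preg_match('/^\d+$/', $value) ? (int) $value : null;
}
```

It could be called as content_length_from_headers(get_headers($url, true)) once get_headers() has been checked for a false return.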

As mentioned a couple of times, the way to go is to retrieve the information from the response header's Content-Length field.

However, you should note that

  • the server you're probing does not necessarily implement the HEAD method(!)
  • there's absolutely no need to craft a HEAD request manually (which, again, might not even be supported) using fopen or the like, or even to invoke the curl library, when PHP has get_headers() (remember: K.I.S.S.)

Use of get_headers() follows the K.I.S.S. principle and works even if the server you're probing does not support the HEAD request.

So, here's my version (gimmick: returns human-readable formatted size ;-)):

Gist: https://gist.github.com/eyecatchup/f26300ffd7e50a92bc4d (curl and get_headers version)
get_headers()-Version:

<?php
/**
*  Get the file size of any remote resource (using get_headers()),
*  either in bytes or - default - as human-readable formatted string.
*
*  @author  Stephan Schmitz <eyecatchup@gmail.com>
*  @license MIT <http://eyecatchup.mit-license.org/>
*  @url     <https://gist.github.com/eyecatchup/f26300ffd7e50a92bc4d>
*
*  @param   string   $url          Takes the remote object's URL.
*  @param   boolean  $formatSize   Whether to return size in bytes or formatted.
*  @param   boolean  $useHead      Whether to use HEAD requests. If false, uses GET.
*  @return  string                 Returns human-readable formatted size
*                                  or size in bytes (default: formatted).
*/
function getRemoteFilesize($url, $formatSize = true, $useHead = true)
{
if (false !== $useHead) {
stream_context_set_default(array('http' => array('method' => 'HEAD')));
}
$head = array_change_key_case(get_headers($url, 1));
// content-length of download (in bytes), read from Content-Length: field
$clen = isset($head['content-length']) ? $head['content-length'] : 0;


// cannot retrieve file size, return "-1"
if (!$clen) {
return -1;
}


if (!$formatSize) {
return $clen; // return size in bytes
}


$size = $clen;
// Switch on true so that each case condition is evaluated as a boolean.
switch (true) {
case $clen < 1024:
$size = $clen .' B'; break;
case $clen < 1048576:
$size = round($clen / 1024, 2) .' KiB'; break;
case $clen < 1073741824:
$size = round($clen / 1048576, 2) . ' MiB'; break;
case $clen < 1099511627776:
$size = round($clen / 1073741824, 2) . ' GiB'; break;
}


return $size; // return formatted size
}

Usage:

$url = 'http://download.tuxfamily.org/notepadplus/6.6.9/npp.6.6.9.Installer.exe';
echo getRemoteFilesize($url); // echoes "7.51 MiB"
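The formatting step can also be separated from the network call. The sketch below is one way to do it with a loop instead of a switch; format_bytes() is a hypothetical helper, not part of the gist.

```php
<?php
/**
 * Hypothetical helper: format a byte count with binary (IEC) units.
 */
function format_bytes(int $bytes): string
{
    $units = ['B', 'KiB', 'MiB', 'GiB', 'TiB'];
    $value = (float) $bytes;
    $i = 0;
    // Divide by 1024 until the value fits the current unit
    // or the largest unit has been reached.
    while ($value >= 1024 && $i < count($units) - 1) {
        $value /= 1024;
        $i++;
    }
    return round($value, 2) . ' ' . $units[$i];
}

echo format_bytes(1536), "\n"; // prints "1.5 KiB"
```

Keeping the byte count as an integer until the last moment makes it easier to compare or sum sizes before displaying them.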

Additional note: The Content-Length header is optional. Thus, as a general solution it isn't bulletproof!
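Note also that stream_context_set_default() changes the default context for every later stream call in the script. A sketch of passing a HEAD context to a single get_headers() call instead is shown below; it assumes PHP 7.1 or later (where get_headers() accepts a context argument), and the helper names and the 10-second timeout are illustrative choices.

```php
<?php
/**
 * Hypothetical helper: a stream context that makes one call use HEAD,
 * leaving the global default context untouched.
 *
 * @return resource
 */
function head_context(int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'  => 'HEAD',
            'timeout' => $timeoutSeconds,
        ],
    ]);
}

/**
 * Hypothetical helper: fetch response headers with a HEAD request.
 *
 * @return array|false Associative header array, or false on failure.
 */
function get_headers_via_head(string $url)
{
    // The third (context) argument requires PHP 7.1 or later.
    return @get_headers($url, true, head_context());
}
```

Other code in the same script that relies on the default GET behaviour is then unaffected.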


Here is another approach that will work with servers that do not support HEAD requests.

It uses cURL to request the content with an HTTP Range header asking for only the first two bytes of the file (bytes=0-1).

If the server supports range requests (most media servers will) then it will receive the response with the size of the resource.

If the server does not respond with a byte range, it will look for a Content-Length header to determine the length.

If the size is found in a range or content-length header, the transfer is aborted. If the size is not found and the function starts reading the response body, the transfer is aborted.

This could be a supplementary approach if a HEAD request results in a 405 Method Not Allowed response.

/**
* Try to determine the size of a remote file by making an HTTP request for
* a byte range, or look for the content-length header in the response.
* The function aborts the transfer as soon as the size is found, or if no
* length headers are returned, it aborts the transfer.
*
* @return int|null null if size could not be determined, or length of content
*/
function getRemoteFileSize($url)
{
$ch = curl_init($url);


$headers = array(
'Range: bytes=0-1',
'Connection: close',
);


$in_headers = true;
$size       = null;


curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2450.0 Iron/46.0.2450.0');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 0); // set to 1 to debug
curl_setopt($ch, CURLOPT_STDERR, fopen('php://output', 'w')); // php://output is write-only


curl_setopt($ch, CURLOPT_HEADERFUNCTION, function($curl, $line) use (&$in_headers, &$size) {
$length = strlen($line);


if (trim($line) == '') {
$in_headers = false;
}


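// The status line and the blank separator line contain no colon, so pad to two elements.
list($header, $content) = array_pad(explode(':', $line, 2), 2, '');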
$header = strtolower(trim($header));


if ($header == 'content-range') {
// found a content-range header
list($rng, $s) = explode('/', $content, 2);
$size = (int)$s;
return 0; // aborts transfer
} else if ($header == 'content-length' && 206 != curl_getinfo($curl, CURLINFO_HTTP_CODE)) {
// found content-length header and this is not a 206 Partial Content response (range response)
$size = (int)$content;
return 0;
} else {
// continue
return $length;
}
});


// Capture $in_headers by reference so changes made by the header callback are visible here.
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function($curl, $data) use (&$in_headers) {
if (!$in_headers) {
// shouldn't be here unless we couldn't determine file size
// abort transfer
return 0;
}


// write function is also called when reading headers
return strlen($data);
});


$result = curl_exec($ch);
$info   = curl_getinfo($ch);


return $size;
}

Usage:

$size = getRemoteFileSize('http://example.com/video.mp4');
if ($size === null) {
echo "Could not determine file size from headers.";
} else {
echo "File size is {$size} bytes.";
}
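The Content-Range handling inside the header callback can be isolated and made stricter. The sketch below is one possible version; total_from_content_range() is a hypothetical helper, and it takes the header value (the part after the colon) as input.

```php
<?php
/**
 * Hypothetical helper: extract the total size from a Content-Range value
 * such as "bytes 0-1/12345". Returns null when the total is "*"
 * (unknown) or the value is malformed.
 */
function total_from_content_range(string $value): ?int
{
    // Accept "bytes <first>-<last>/<total>" and "bytes */<total>".
    if (preg_match('~^\s*bytes\s+(?:\d+-\d+|\*)/(\d+)\s*$~i', $value, $m)) {
        return (int) $m[1];
    }
    return null;
}
```

Unlike a plain (int) cast, this returns null rather than 0 when the server reports the total as "*".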

The PHP function get_headers() works for me to check the Content-Length:

$headers = get_headers('http://example.com/image.jpg', 1);
$filesize = $headers['Content-Length'];

For more detail, see the PHP function get_headers().

The best one-line solution:

echo array_change_key_case(get_headers("http://.../file.txt",1))['content-length'];

PHP is too delicious.

function urlsize($url):int{
return array_change_key_case(get_headers($url,1))['content-length'];
}


echo urlsize("http://.../file.txt");

Try this; I use it and get good results.

function getRemoteFilesize($url)
{
$file_headers = @get_headers($url, 1);
if($size =getSize($file_headers)){
return $size;
} elseif($file_headers[0] == "HTTP/1.1 302 Found"){
if (isset($file_headers["Location"])) {
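// Location is a string for a single redirect and an array for several.
$location = $file_headers["Location"];
$url = is_array($location) ? $location[0] : $location;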
if (strpos($url, "/_as/") !== false) {
$url = substr($url, 0, strpos($url, "/_as/"));
}
$file_headers = @get_headers($url, 1);
return getSize($file_headers);
}
}
return false;
}


function getSize($file_headers){


if (!$file_headers || $file_headers[0] == "HTTP/1.1 404 Not Found" || $file_headers[0] == "HTTP/1.0 404 Not Found") {
return false;
} elseif ($file_headers[0] == "HTTP/1.0 200 OK" || $file_headers[0] == "HTTP/1.1 200 OK") {


$clen=(isset($file_headers['Content-Length']))?$file_headers['Content-Length']:false;
$size = $clen;
if($clen) {
// Switch on true so that each case condition is evaluated as a boolean.
switch (true) {
case $clen < 1024:
$size = $clen . ' B';
break;
case $clen < 1048576:
$size = round($clen / 1024, 2) . ' KiB';
break;
case $clen < 1073741824:
$size = round($clen / 1048576, 2) . ' MiB';
break;
case $clen < 1099511627776:
$size = round($clen / 1073741824, 2) . ' GiB';
break;
}
}
return $size;


}
return false;
}

Now test it like this:

echo getRemoteFilesize('http://mandasoy.com/wp-content/themes/spacious/images/plain.png').PHP_EOL;
echo getRemoteFilesize('http://bookfi.net/dl/201893/e96818').PHP_EOL;
echo getRemoteFilesize('https://stackoverflow.com/questions/14679268/downloading-files-as-attachment-filesize-incorrect').PHP_EOL;

Results:

24.82 KiB

912 KiB

101.85 KiB

To handle HTTP/2 responses, the function provided here https://stackoverflow.com/a/2602624/2380767 needs to be changed a bit:

<?php
/**
* Returns the size of a file without downloading it, or -1 if the file
* size could not be determined.
*
* @param $url - The location of the remote file to download. Cannot
* be null or empty.
*
* @return The size of the file referenced by $url, or -1 if the size
* could not be determined.
*/
function curl_get_file_size( $url ) {
// Assume failure.
$result = -1;


$curl = curl_init( $url );


// Issue a HEAD request and follow any redirects.
curl_setopt( $curl, CURLOPT_NOBODY, true );
curl_setopt( $curl, CURLOPT_HEADER, true );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
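// The original called get_user_agent_string(), which is not a PHP built-in.
// The literal below is only a placeholder; substitute your own user agent.
curl_setopt( $curl, CURLOPT_USERAGENT, "my_application/1.0" );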


$data = curl_exec( $curl );
curl_close( $curl );


if( $data ) {
$content_length = "unknown";
$status = "unknown";


if( preg_match( "/^HTTP\/1\.[01] (\d\d\d)/", $data, $matches ) ) {
$status = (int)$matches[1];
} elseif( preg_match( "/^HTTP\/2 (\d\d\d)/", $data, $matches ) ) {
$status = (int)$matches[1];
}


if( preg_match( "/Content-Length: (\d+)/", $data, $matches ) ) {
$content_length = (int)$matches[1];
} elseif( preg_match( "/content-length: (\d+)/", $data, $matches ) ) {
$content_length = (int)$matches[1];
}


// http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
if( $status == 200 || ($status > 300 && $status <= 308) ) {
$result = $content_length;
}
}


return $result;
}
?>
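Rather than adding one branch per protocol version, the status line could be matched with a single pattern. The sketch below is an alternative under that assumption; parse_final_status() is a hypothetical helper, and it returns the status of the last response so that redirect hops are skipped.

```php
<?php
/**
 * Hypothetical helper: return the status code of the last response in raw
 * header text, accepting status lines such as "HTTP/1.0", "HTTP/1.1",
 * "HTTP/2" and "HTTP/3".
 *
 * @return int|null The status code, or null if no status line was found.
 */
function parse_final_status(string $rawHeaders): ?int
{
    // "m" makes ^ match at each line start, so every status line is found,
    // including those of intermediate redirect responses.
    if (!preg_match_all('~^HTTP/\d(?:\.\d)?\s+(\d{3})~m', $rawHeaders, $matches)) {
        return null;
    }
    return (int) end($matches[1]);
}
```

With the final status in hand, checking for 200 is enough when redirects are followed, since the intermediate 3xx responses have already been resolved.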

If you are using Laravel 7 or later:

use Illuminate\Support\Facades\Http;


Http::head($url)->header('Content-Length');