You can just send the string through base64_decode (with $strict set to TRUE); it will return FALSE if the input is invalid.
You can also use, for instance, a regular expression to see whether the string contains any characters outside the base64 alphabet, and check whether it has the right amount of padding at the end (= characters). But just using base64_decode is much easier, and there shouldn't be any risk of a malformed string causing harm.
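A minimal sketch of the two checks described above, combined in one place. The helper name looks_like_base64 is hypothetical, and the sketch assumes the standard alphabet with = padding and treats the empty string as invalid; adjust those assumptions to your input.

```php
<?php
// Hypothetical helper: alphabet/padding regex plus a strict decode.
function looks_like_base64(string $s): bool
{
    // Assumption: padded base64, so the length must be a non-zero multiple of 4.
    if ($s === '' || strlen($s) % 4 !== 0) {
        return false;
    }
    // Only alphabet characters, followed by at most two '=' at the very end.
    if (preg_match('/^[A-Za-z0-9+\/]+={0,2}\z/', $s) !== 1) {
        return false;
    }
    // Strict mode returns false for characters outside the alphabet.
    return base64_decode($s, true) !== false;
}

var_dump(looks_like_base64('aGVsbG8='));    // bool(true)
var_dump(looks_like_base64('not base64!')); // bool(false)
```

Note that this only says the string is well-formed base64, not that it was actually produced by an encoder.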
I realise that this is an old topic, but using the strict parameter isn't necessarily going to help.
Running base64_decode on a string such as "I am not base 64 encoded" will not return false.
If, however, you decode the string in strict mode and re-encode it with base64_encode, you can compare the result with the original data to determine whether it's a valid base64 encoded value:
if ( base64_encode(base64_decode($data, true)) === $data){
echo '$data is valid';
} else {
echo '$data is NOT valid';
}
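To make the difference concrete, here is a small sketch that runs the example string from this answer through both checks. The helper name round_trips is hypothetical; it just wraps the comparison shown above.

```php
<?php
// Hypothetical helper: true only if decoding and re-encoding reproduces the input.
function round_trips(string $data): bool
{
    $decoded = base64_decode($data, true);
    return $decoded !== false && base64_encode($decoded) === $data;
}

$sample = 'I am not base 64 encoded';

// Strict decoding alone accepts it: whitespace is skipped and the
// remaining characters are all in the base64 alphabet.
var_dump(base64_decode($sample, true) !== false);

// The round trip rejects it: the re-encoded value has no spaces and gains padding.
var_dump(round_trips($sample));

// A canonical value passes.
var_dump(round_trips('aGVsbG8='));
```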
Just for strings, you could use this function, which checks several base64 properties before returning true:
function is_base64($s){
// Check if there are valid base64 characters
if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $s)) return false;
// Decode the string in strict mode and check the results
$decoded = base64_decode($s, true);
if(false === $decoded) return false;
// Encode the string again
if(base64_encode($decoded) !== $s) return false;
return true;
}
This is a really old question, but I found the following approach to be practically bullet proof. It also takes into account those weird strings with invalid characters that would cause an exception when validating.
public static function isBase64Encoded($str)
{
try
{
$decoded = base64_decode($str, true);
if ( base64_encode($decoded) === $str ) {
return true;
}
else {
return false;
}
}
catch(Exception $e)
{
// If exception is caught, then it is not a base64 encoded string
return false;
}
}
I got the idea from this page and adapted it to PHP.
Basically, I check for every character that is not printable ([:graph:]), is not whitespace (\s), and is not a Unicode letter (\p{L}, which covers accented letters such as èéùìà).
I still get false positives with these characters: £§°, but I never use them in a string, so for me it is perfectly fine to invalidate them.
I combined this check with the function proposed by @merlucin.
The result:
function is_base64($s)
{
// Check if there are valid base64 characters
if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $s)) return false;
// Decode the string in strict mode and check the results
$decoded = base64_decode($s, true);
if(false === $decoded) return false;
// if string returned contains not printable chars
if (0 < preg_match('/((?![[:graph:]])(?!\s)(?!\p{L}))./', $decoded, $matched)) return false;
// Encode the string again
if(base64_encode($decoded) !== $s) return false;
return true;
}
1. Base64-decode the string with the strict parameter set to true.
2. Base64-encode the result of the previous step. If the result is not the same as the original string, the original string is not base64 encoded.
3. If the result is the same as the original string, check whether the decoded string contains only printable characters. I used the PHP function ctype_print for this; it returns false if the input string contains one or more non-printable characters.
The following code implements the above steps:
public function IsBase64($data) {
$decoded_data = base64_decode($data, true);
$encoded_data = base64_encode($decoded_data);
if ($encoded_data !== $data) return false;
else if (!ctype_print($decoded_data)) return false;
return true;
}
The above code may return unexpected results. For example, for the string "json" it returns false. Yet "json" may be a valid base64 encoded string, since its length is a multiple of 4 and all of its characters are in the allowed range for base64 encoded strings. It seems we must know the range of characters allowed in the original string and then check whether the decoded data consists only of those characters.
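A short sketch of what happens to "json" at each step, to show where the rejection comes from. It uses only base64_decode, base64_encode, bin2hex and ctype_print, and assumes the ctype extension is available (it normally is).

```php
<?php
$decoded = base64_decode('json', true);

// The four characters decode to three raw bytes.
var_dump(bin2hex($decoded));                  // string(6) "8eca27"

// The round trip succeeds, so "json" is well-formed base64.
var_dump(base64_encode($decoded) === 'json'); // bool(true)

// The bytes 0x8e and 0xca are not printable ASCII, so ctype_print rejects them.
var_dump(ctype_print($decoded));              // bool(false)
```

So the function says "not base64" only because the decoded bytes are binary, which is a statement about the original data rather than about the encoding.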
I wrote this method, and it works well in my projects. When you pass a base64 image to it, it returns true if the image is valid and false otherwise. Try it and let me know if anything is wrong; I will edit the answer and learn from it.
In fact, there is no reliable answer: a lot of text that was never base64 encoded can still be read as base64, so there is no default way to know for sure.
Further, it's worth noting that base64_decode will decode many invalid strings.
For example, "and" is not valid base64 encoding, but base64_decode WILL decode it, specifically as "jw". (I learned this the hard way.)
That said, your most reliable method is, if you control the input, to add an identifier to the string after you encode it that is unique and not base64, and include it along with other checks. It's not bullet-proof, but it's a lot more bullet resistant than any other solution I've seen. For example:
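A minimal sketch of that idea. The marker value '#b64:' and the function names encode_marked and decode_marked are hypothetical choices for illustration; any prefix containing characters outside the base64 alphabet would serve.

```php
<?php
// Hypothetical marker; '#' and ':' are not in the base64 alphabet.
const B64_MARKER = '#b64:';

// Encode and tag the value so it can be recognised later.
function encode_marked(string $raw): string
{
    return B64_MARKER . base64_encode($raw);
}

// Returns the decoded bytes, or null if the marker or the payload is invalid.
function decode_marked(string $value): ?string
{
    if (strncmp($value, B64_MARKER, strlen(B64_MARKER)) !== 0) {
        return null; // no marker: treat as not encoded by us
    }
    $payload = substr($value, strlen(B64_MARKER));
    $decoded = base64_decode($payload, true);
    // Keep the strict decode and round-trip comparison as additional checks.
    if ($decoded === false || base64_encode($decoded) !== $payload) {
        return null;
    }
    return $decoded;
}

var_dump(decode_marked(encode_marked('hello'))); // string(5) "hello"
var_dump(decode_marked('aGVsbG8='));             // NULL
```

This only works when you control both the encoding and the decoding side, as the answer says.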
Alright guys... I have finally found a bulletproof solution for this problem. Use the function below to check whether the string is base64 encoded or not -
If you are making API calls from JS to upload an image or file to the back end, this might help:
function is_base64_string($string) // check for a base64 data URI prefix
{
// Match a data URI prefix such as "data:image/png" (text, image or application types).
// The original pattern used [data]{4}, a character class that also matched e.g. "aaaa:".
return preg_match('/^data:(text|image|application)\/[a-z]*/', $string) === 1;
}
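The prefix check alone does not look at the payload. As a sketch of one way to go a step further, the hypothetical helper below (decode_data_uri is not from the answer) assumes the common shape data:<type>/<subtype>;base64,<payload> and strict-decodes the part after the comma.

```php
<?php
// Hypothetical helper: returns the decoded payload of a base64 data URI, or null.
function decode_data_uri(string $uri): ?string
{
    // Assumed shape: data:<type>/<subtype>;base64,<payload>
    if (preg_match('~^data:[a-z]+/[a-z0-9.+-]+;base64,(.+)\z~is', $uri, $m) !== 1) {
        return null;
    }
    // Strict mode rejects characters outside the base64 alphabet.
    $decoded = base64_decode($m[1], true);
    return $decoded === false ? null : $decoded;
}

var_dump(decode_data_uri('data:text/plain;base64,aGVsbG8=')); // string(5) "hello"
var_dump(decode_data_uri('aGVsbG8='));                        // NULL
```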
Those who use base64_encode(base64_decode('xxx')) as a check may find that it sometimes fails to reject strings such as test or 5555.
If base64_decode() accepts an invalid base64 string without returning false, json_encode() will still fail on the result, because the decoded bytes are not valid UTF-8.
So, I use this method to check for valid base 64 encoded string.
Here is the code.
/**
* Check if the given string is valid base 64 encoded.
*
* @param string $string The string to check.
* @return bool Return `true` if valid, `false` for otherwise.
*/
function isBase64Encoded($string): bool
{
if (!is_string($string)) {
// if check value is not string.
// base64_decode require this argument to be string, if not then just return `false`.
// don't use type hint because `false` value will be converted to empty string.
return false;
}
$decoded = base64_decode($string, true);
if (false === $decoded) {
return false;
}
if (json_encode([$decoded]) === false) {
return false;
}
return true;
}// isBase64Encoded
And here is the test code.
// each test value must be 'original string' => 'base 64 encoded string'
$testValues = [
555 => 'NTU1',
5555 => 'NTU1NQ==',
'hello' => 'aGVsbG8=',
'สวัสดี' => '4Liq4Lin4Lix4Liq4LiU4Li1',
'test' => 'dGVzdA==',
];
foreach ($testValues as $invalid => $valid) {
if (isBase64Encoded($invalid) === false) {
echo '<strong>' . $invalid . '</strong> is invalid base 64<br>';
} else {
echo '<strong style="color:red;">Error:</strong>';
echo '<strong>' . $invalid . '</strong> should not be valid base 64<br>';
}
if (isBase64Encoded($valid) === true) {
echo '<strong>' . $valid . '</strong> is valid base 64<br>';
} else {
echo '<strong style="color:red;">Error:</strong>';
echo '<strong>' . $valid . '</strong> should not be invalid base 64<br>';
}
echo '<br>';
}
Tests result:
555 is invalid base 64 NTU1 is valid base 64
5555 is invalid base 64 NTU1NQ== is valid base 64
hello is invalid base 64 aGVsbG8= is valid base 64
สวัสดี is invalid base 64 4Liq4Lin4Lix4Liq4LiU4Li1 is valid base 64
To handle, without raising errors, the case where someone sends a truncated base64 string or one that is not an image, use this function to check the base64 first and then check whether it is really an image.
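A minimal sketch of such a two-step check, assuming a hypothetical helper name is_base64_image. It uses PHP's getimagesizefromstring() to decide whether the decoded bytes look like an image.

```php
<?php
// Hypothetical helper: strict base64 decode, then ask PHP whether the bytes form an image.
function is_base64_image(string $b64): bool
{
    $bytes = base64_decode($b64, true);
    if ($bytes === false || $bytes === '') {
        return false;
    }
    // getimagesizefromstring() returns false when it cannot recognise an image;
    // @ suppresses any notices it may raise on unreadable data.
    return @getimagesizefromstring($bytes) !== false;
}

// A 1x1 GIF, base64 encoded.
$gif = 'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7';
var_dump(is_base64_image($gif));                         // bool(true)
var_dump(is_base64_image(base64_encode('hello world'))); // bool(false)
```

Because getimagesizefromstring() mainly inspects the image header, an image whose body was cut off may still pass; a stricter check would need to actually load the image.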