PHP - Manual: escapeshellarg
2025-01-21
(PHP 4 >= 4.0.3, PHP 5, PHP 7, PHP 8)
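escapeshellarg — Escape a string to be used as a shell argument
escapeshellarg(string $arg): string
escapeshellarg() adds single quotes around a string and quotes or escapes any existing single quotes, so that the string can be passed directly to a shell function and treated as a single safe argument. Use this function to escape individual arguments that come from user input. The shell functions include exec(), system(), and the backtick operator.
arg
The argument that will be escaped.
Returns the escaped string.
Example #1 escapeshellarg() example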
<?php
system('ls '.escapeshellarg($dir));
?>
When escapeshellarg() was stripping my non-ASCII characters from a UTF-8 string, adding the following fixed the problem:
<?php
setlocale(LC_CTYPE, "en_US.UTF-8");
?>
If you want empty input to produce an empty argument rather than a missing one, use the form
escapeshellarg($input)."''"
The shell treats foo'' as foo, but empty input becomes an empty argument instead of a missing one.
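The trick above can be wrapped in a small helper. This is a minimal sketch for POSIX shells only; the name `shell_arg_or_empty()` is hypothetical and not part of PHP. On Windows, where escapeshellarg() uses double quotes and single quotes are not quoting characters, the appended '' would most likely be passed through literally.

```php
<?php
// Hypothetical helper (not part of PHP): quote $input for a POSIX shell so that
// it always occupies exactly one argument slot, even when it is empty.
function shell_arg_or_empty(string $input): string
{
    // The trailing '' is an empty quoted string. The shell joins it onto
    // whatever precedes it, so foo'' is read as foo and a bare '' as "".
    return escapeshellarg($input) . "''";
}

// Count the arguments a POSIX shell sees: the (empty) first one plus "x".
$count = shell_exec('set -- ' . shell_arg_or_empty('') . ' x; echo $#');
echo trim((string) $count), "\n";
```

If the empty input had been dropped, the shell would report one argument instead of two.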
Under Windows, this function wraps the string in double quotes, not single quotes, and replaces % (percent sign) with a space. That is why it is impossible to pass a filename containing percent signs through this function.
Most of the comments above have misunderstood this function. It does not need to escape characters such as '$' and '`' - it uses the fact that the shell does not treat any characters as special inside single quotes (except the single quote character itself). The correct way to use this function is to call it on a variable that is intended to be passed to a command-line program as a single argument to that program - you do not call it on the command line as a whole.
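To illustrate escaping one argument at a time, here is a minimal sketch. The helper name `build_command()` is hypothetical; it assumes a POSIX shell, a program name that comes from trusted code, and that each array element should reach the program as exactly one argument.

```php
<?php
// Hypothetical helper: escape each argument on its own, then join with spaces.
// The program name is left untouched and must not come from user input.
function build_command(string $program, array $args): string
{
    $escaped = array_map(
        static function ($arg): string {
            return escapeshellarg((string) $arg);
        },
        $args
    );
    return $program . ' ' . implode(' ', $escaped);
}

// $ and ` are not expanded inside single quotes, so they need no extra escaping.
$userInput = 'it\'s $HOME and `id`';
$cmd = build_command('printf', ['%s', $userInput]);
echo $cmd, "\n";
```

By contrast, calling escapeshellarg() on a whole string such as 'ls -l /tmp' would make the shell look for a program literally named "ls -l /tmp".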
The person above who comments that this function behaves badly if given the empty string as input is correct - this is a bug. It should indeed return two single quotes in this case.
If the escapeshellarg() function removes your accents (like á, a with an acute accent) from the given string, ensure your LC_ALL variable is correct. If you are using it via the web, you need to restart Apache or the corresponding web server after setting LC_ALL with an export LC_ALL=es_ES.utf8 (for example) from your shell.
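A minimal sketch of setting and checking the locale from PHP itself rather than relying on the web server's environment. The helper name `use_utf8_ctype()` and the candidate locale names are assumptions; which locales are installed depends on the system.

```php
<?php
// Hypothetical helper: try several UTF-8 locale names for LC_CTYPE and report
// which one took effect, or null if none of them is available.
function use_utf8_ctype(array $candidates = ['en_US.UTF-8', 'C.UTF-8', 'es_ES.UTF-8']): ?string
{
    // setlocale() tries each name in order and returns the one it applied,
    // or false when none could be set.
    $applied = setlocale(LC_CTYPE, $candidates);
    return $applied === false ? null : $applied;
}

$locale = use_utf8_ctype();
if ($locale === null) {
    error_log('No UTF-8 locale available; escapeshellarg() may strip non-ASCII characters');
}
```

Checking the return value matters because setlocale() fails quietly when the requested locale is not installed.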
On Windows, this function naively strips special characters and replaces them with spaces. The resulting string is always safe for use with exec() etc, but the operation is not lossless - strings containing " or % will not be passed through to the child process correctly.
Correctly escaping shell commands on Windows is not a simple matter. Programs must consider two distinct escape mechanisms which serve different purposes:
1) The convention used by the CommandLineToArgV() Windows system function, used by the child process to interpret the command line string
2) The convention used by cmd.exe to escape shell meta-characters (e.g. output redirection controls)
All commands should be escaped for CommandLineToArgV() - this mechanism is applied to each argument individually before it is appended to the command line string. The resulting string may be safely used with the CreateProcess() family of system functions. However...
In almost all cases when creating a child process from PHP on Windows, it is done indirectly by invoking cmd.exe - this is to enable the use of shell functionality such as I/O redirection and environment variable substitution. As a consequence, the entire command string must be further escaped for cmd.exe. If the executed command contains further indirect calls through cmd.exe, each child command must be escaped again for each level of indirection.
The following functions can be used to correctly escape strings such that they are safely passed through to a child process:
<?php
/**
 * Escape a single value in accordance with CommandLineToArgV()
 * https://docs.microsoft.com/en-us/previous-versions/17w5ykft(v=vs.85)
 */
function escape_win32_argv(string $value): string
{
    static $expr = '(
        [\x00-\x20\x7F"] # control chars, whitespace or double quote
      | \\\\++ (?=("|$)) # backslashes followed by a quote or at the end
    )ux';

    if ($value === '') {
        return '""';
    }

    $quote = false;
    $replacer = function($match) use($value, &$quote) {
        switch ($match[0][0]) { // only inspect the first byte of the match
            case '"': // double quotes are escaped and must be quoted
                $match[0] = '\\"';
                // intentional fall-through: an escaped quote also forces quoting
            case ' ': case "\t": // spaces and tabs are ok but must be quoted
                $quote = true;
                return $match[0];
            case '\\': // matching backslashes are escaped if quoted
                return $match[0] . $match[0];
            default: throw new InvalidArgumentException(sprintf(
                "Invalid byte at offset %d: 0x%02X",
                strpos($value, $match[0]), ord($match[0])
            ));
        }
    };

    $escaped = preg_replace_callback($expr, $replacer, (string)$value);
    if ($escaped === null) {
        throw preg_last_error() === PREG_BAD_UTF8_ERROR
            ? new InvalidArgumentException("Invalid UTF-8 string")
            : new Error("PCRE error: " . preg_last_error());
    }

    return $quote // only quote when needed
        ? '"' . $escaped . '"'
        : $value;
}

/** Escape cmd.exe metacharacters with ^ */
function escape_win32_cmd(string $value): string
{
    return preg_replace('([()%!^"<>&|])', '^$0', $value);
}

/** Like shell_exec() but bypass cmd.exe */
function noshell_exec(string $command): string
{
    static $descriptors = [['pipe', 'r'],['pipe', 'w'],['pipe', 'w']],
           $options = ['bypass_shell' => true];

    if (!$proc = proc_open($command, $descriptors, $pipes, null, null, $options)) {
        throw new \Error('Creating child process failed');
    }

    fclose($pipes[0]);
    $result = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    stream_get_contents($pipes[2]);
    fclose($pipes[2]);
    proc_close($proc);

    return $result;
}

// usage
$badString = 'String with "C:\\quotes\\" or malicious %OS% stuff \\';

$cmdParts = [
    'php',
    '-d', 'display_errors=1', '-d', 'error_reporting=-1',
    '-r', 'echo $argv[1];',
    $badString // child process $argv[1] value
];

/* The typical approach - works fine on POSIX shells but totally wrong
   on Windows */
$wrong = implode(' ', array_map('escapeshellarg', $cmdParts));

/* Always escape each argument individually */
$escaped = implode(' ', array_map('escape_win32_argv', $cmdParts));

/* In almost all cases, escape for cmd.exe as well - the only exception is
   when using proc_open() with the bypass_shell option. cmd doesn't handle
   arguments individually, so the entire command line string can be escaped,
   no need to process arguments individually */
$cmd = escape_win32_cmd($escaped);

$cmds = [
    'escapeshellarg() - wrong' => $wrong,
    'escape_win32_argv() - correct for bypass_shell' => $escaped,
    'escape_win32_cmd(escape_win32_argv()) - correct everywhere else' => $cmd,
];

function check($original, $received)
{
    $match = $original === $received ? '=' : 'X';
    return "$match '$received'";
}

foreach ($cmds as $description => $cmd) {
    echo "$description\n";
    echo " $cmd\n";
    echo " original: '$badString'\n";
    echo " shell_exec(): " . check($badString, shell_exec($cmd)) . "\n";
    echo " noshell_exec(): " . check($badString, noshell_exec($cmd)) . "\n";
    echo "\n";
}
?>
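As of PHP 7.4, proc_open() also accepts the command as an array of arguments; in that form the process is started directly, without a shell, and PHP takes care of the argument escaping. The following is a minimal sketch of that alternative, assuming PHP 7.4 or later and the CLI SAPI (so that PHP_BINARY points at the php executable). The helper name `run_without_shell()` is hypothetical.

```php
<?php
// Hypothetical helper: run a program from an argument array, bypassing the
// shell entirely, and return whatever it wrote to stdout.
function run_without_shell(array $argv): string
{
    $descriptors = [['pipe', 'r'], ['pipe', 'w'], ['pipe', 'w']];
    $proc = proc_open($argv, $descriptors, $pipes); // array form: no shell involved
    if (!is_resource($proc)) {
        throw new RuntimeException('Creating child process failed');
    }

    fclose($pipes[0]);                        // the child gets no input
    $stdout = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    stream_get_contents($pipes[2]);           // drain and discard stderr
    fclose($pipes[2]);
    proc_close($proc);

    return (string) $stdout;
}

$badString = 'String with "C:\\quotes\\" or malicious %OS% stuff \\';
$received = run_without_shell([PHP_BINARY, '-r', 'echo $argv[1];', '--', $badString]);
var_dump($received === $badString);
```

Because no command string is ever built, neither of the two escaping layers described above has to be applied by hand.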
The reason % is replaced with a space on Windows is that cmd.exe offers no way to escape or quote it so that environment variables are not expanded. If, for instance, %path% is in your argument, it will always be expanded, so the only safe thing to do is to replace % with something else.
Alternatively, you could wipe the environment before making the call to exec(), but that has its side-effects.
The comment from 'rmays at castlecomm dot com' is incorrect: single quotes cannot be backslash-escaped inside a single-quoted string when constructing a shell argument. The output from this function is in fact correct. It drops out of the single-quoted string, includes a literal single quote with a backslash-escape, then resumes the single-quoted string. Observe:
[shellarg.php]
<?php
system("echo ' single quote\'d '");
system("echo ' single quote'\''d '");
?>
$ php shellarg.php
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
single quote'd
Take care if using escapeshellarg() on serialized objects. Serialized objects contain null bytes, and escapeshellarg stops on the first null byte so you will not receive the full argument. (I consider this a bug, though not sure what it should do in this case. Probably serialize shouldn't have used null bytes, but too late for that now).
The workaround I've found to pass serialized objects on the command line is to base64_encode() them first and decode on the other side.
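A minimal sketch of that workaround. The class `Job`, the helpers `encode_payload()` and `decode_payload()`, and the script name worker.php are all hypothetical. Restricting `allowed_classes` on the receiving side is an extra precaution added here, not something the note above prescribes.

```php
<?php
// Hypothetical example class. Private properties are what put null bytes
// into the serialized form (the property name is stored as "\0Job\0id").
class Job
{
    private $id;

    public function __construct(int $id)
    {
        $this->id = $id;
    }

    public function id(): int
    {
        return $this->id;
    }
}

// Sender side: serialize, then base64-encode so that no null bytes remain.
function encode_payload($value): string
{
    return base64_encode(serialize($value));
}

// Receiver side: strict base64 decode, then unserialize only the listed classes.
function decode_payload(string $payload, array $allowedClasses)
{
    $raw = base64_decode($payload, true);
    if ($raw === false) {
        throw new InvalidArgumentException('Payload is not valid base64');
    }
    return unserialize($raw, ['allowed_classes' => $allowedClasses]);
}

// worker.php (hypothetical) would call decode_payload($argv[1], [Job::class]).
$cmd = 'php worker.php ' . escapeshellarg(encode_payload(new Job(7)));
```

The base64 alphabet consists only of ASCII letters, digits, +, / and =, so the payload is also unaffected by the locale-dependent stripping described in other notes.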
escapeshellarg() will strip all invalid characters according to your locale settings (e.g. latin-1 characters are stripped when locale/LC_CTYPE is UTF-8).
Please keep in mind that the locale support depends on your C standard library while compiling. This might result in strange behavior on embedded systems that use a standard library with poor locale support, other than glibc.
Ubuntu: wondering why your system locale (e.g. 'en_US.UTF-8') is not inherited by Apache (which still uses 'C')?
Check `/etc/apache2/envvars` ... activate the line `. /etc/default/locale`
If you need to generate Linux arguments even when running on Windows, try
<?php
/**
 * quote arguments using linux escape rules, regardless of host OS
 * (eg, it will use linux escape rules even when running on Windows)
 *
 * @param string $arg
 * @throws \InvalidArgumentException if argument contains null bytes
 * @return string
 */
/*public static*/ function linux_escapeshellarg(string $arg): string
{
    if (false !== strpos($arg, "\x00")) {
        throw new \InvalidArgumentException("argument contains null bytes, it's impossible to escape null bytes!");
    }
    return "'" . strtr($arg, [
        "'" => "'\\''"
    ]) . "'";
}
?>
The best alternative to escapeshellarg() for Windows I've come up with is this:
<?php
function w32escapeshellarg($s)
{
    return '"' . addcslashes($s, '\\"') . '"';
}
?>
Here's a quick and dirty replacement of this function in case you need to deal with special characters.
<?php
/**
 * An ugly, non-ASCII-character safe replacement of escapeshellarg().
 */
function escapeshellarg_special($file) {
    return "'" . str_replace("'", "'\"'\"'", $file) . "'";
}
?>
When a string of LaTeX code containing hyphens is escaped with this function and passed as an argument to pdflatex, the call fails.
If escapeshellarg() returned something on a null input it would probably break more programs than it helps. Even if it's two "'s or two ''s, this function wouldn't work the way it's supposed to (that is, returning nothing).
However, most people do not put "" into their commands but I can see where it might be useful at the same time.
Perhaps an option in the command that would return the type of null we want. I might want the null character to be returned, someone else might want '', and someone else might want nothing at all.
Official page: https://www.php.net/manual/en/function.escapeshellarg.php