
A Short Introduction to Qemu

Table of Contents
Introduction

Introduction-What is Qemu and What It Can Do

Register Allocation-How Qemu allocates register for IR: movi_i32?

A Short Introduction to Qemu

I got to know Qemu through a dynamic taint analysis project that I have been involved in for
almost a year. This short introduction is a summary of what I have learned about Qemu.
Qemu is a huge project, and the contents I touch on here account for only a tiny portion
of it (I hope it is the core portion, though). Since I understand only a small piece of Qemu and
can only introduce what I have learned, the contents follow no particular order.
Basically, treat this as a set of blog articles rather than a complete reference for
Qemu.

Introduction


What is Qemu and What It Can Do


System Emulator
When people mention "using a PC or a smart device", they actually mean that they are using
its operating system, such as Mac OS or Windows for PCs, or Android or iOS for smart
devices.
So what about running one operating system inside another, such as running Windows
under Mac OS, or vice versa, like a Matryoshka doll?
The question actually entails two sub-questions:
1. Is it possible to run nested operating systems?
2. Is it possible to run nested operating systems even across different instruction set
architectures?
An instruction is a "command" that a CPU can execute (e.g., move a value from A to B, or
add the values of A and B and store the result in A). An instruction set is the set of all
"commands" that a CPU can execute. An instruction set architecture includes the
instruction set as well as the registers, addressing modes, etc. [1]
As you might have guessed, Qemu (short for Quick Emulator [2]), originally written by Fabrice
Bellard, allows
1. running nested operating systems
2. running nested operating systems across different instruction set architectures (e.g.,
running an Android system on a Windows or Mac OS host with an x86 architecture)
To achieve this, Qemu essentially translates one CPU instruction set to another
on the fly.
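
To make the vocabulary above a little more concrete, here is a minimal sketch in C of a made-up instruction set with two "commands" (move and add) and a tiny register file. The names ToyOp, ToyInsn, ToyCpu and toy_execute are invented for this illustration and have nothing to do with Qemu's real data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy ISA: two instructions and four registers. */
enum ToyOp { TOY_MOV, TOY_ADD };

typedef struct {
    enum ToyOp op;
    int dst;   /* index of the destination register */
    int src;   /* index of the source register */
} ToyInsn;

typedef struct {
    uint32_t regs[4];   /* the register file of the toy CPU */
} ToyCpu;

/* Execute one instruction, as a CPU (or an interpreter) would. */
void toy_execute(ToyCpu *cpu, ToyInsn insn)
{
    assert(insn.dst >= 0 && insn.dst < 4);
    assert(insn.src >= 0 && insn.src < 4);
    switch (insn.op) {
    case TOY_MOV:   /* move the value from src to dst */
        cpu->regs[insn.dst] = cpu->regs[insn.src];
        break;
    case TOY_ADD:   /* add src to dst and store the result in dst */
        cpu->regs[insn.dst] += cpu->regs[insn.src];
        break;
    }
}
```

In this picture, the enum is the instruction set, and the enum together with the register file and the operand encoding is a (very small) instruction set architecture.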

Binary Translation
To understand what the translation means, let's take a look at a small example. The piece of
code shown below executes in the guest OS:


#include <stdio.h>
int main(void){
    printf("hello qemu\n");
    return 0;
}

Code 1.1: Simple Example: hello qemu


And its corresponding i386 instructions:
0x080483e4 <+0> push %ebp
0x080483e5 <+1> mov %esp,%ebp
0x080483e7 <+3> and $0xfffffff0,%esp
0x080483ea <+6> sub $0x10,%esp
0x080483ed <+9> movl $0x80484c0,(%esp)
0x080483f4 <+16> call 0x8048318
0x080483f9 <+21> mov $0x0,%eax
0x080483fe <+26> leave
0x080483ff <+27> ret

Code 1.2: i386 Instructions of 1.1


So what really happens when the instructions shown in Code 1.2 execute in the guest
OS that Qemu emulates?
Generally, the process can be divided into two steps:
1. Qemu translates the guest instructions into its Intermediate Representations (IRs)
2. Qemu translates the corresponding IRs into host instructions
First, let's take a look at an example of step 1:


OP after optimization and liveness analysis:


ld_i32 tmp12,env,$0xfffffffc
movi_i32 tmp13,$0x0
brcond_i32 tmp12,tmp13,ne,$L0
---- 0x80483e4
movi_i32 tmp12,$0x4
sub_i32 tmp2,esp,tmp12
qemu_st_i32 ebp,tmp2,leul,$0x1
mov_i32 esp,tmp2
---- 0x80483e5
mov_i32 ebp,esp
---- 0x80483e7
movi_i32 tmp1,$0xfffffff0
and_i32 tmp0,esp,tmp1
mov_i32 esp,tmp0
discard cc_src
discard cc_src2
discard cc_op
...

Code 1.3: Portion of IRs that Qemu Translates from Code 1.2
Code 1.3 shows a portion of the IRs (from 0x80483e4 to 0x80483e7) that are translated from
the instructions shown in Code 1.2.
Afterwards, Qemu further translates these IRs into host instructions, as shown below:
OUT: [size=326]
0xaa9ab870: mov -0x4(%ebp),%ebx
0xaa9ab873: test %ebx,%ebx
0xaa9ab875: jne 0xaa9ab946
0xaa9ab87b: mov 0x10(%ebp),%ebx
0xaa9ab87e: sub $0x4,%ebx
0xaa9ab881: mov 0x14(%ebp),%esi
0xaa9ab884: mov %ebx,%eax
0xaa9ab886: mov %ebx,%edx
...

Code 1.4: Portion of Host Instructions Translated from the IRs Shown in Code 1.3
Eventually, the instructions shown in Code 1.4 are executed on the host machine. That
completes the whole translation process from guest instructions to host instructions, which is
sometimes called binary translation.
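
The two steps can be mimicked with a toy sketch in C. It is only a sketch under heavy assumptions: it models a single guest instruction (push), represents operands as plain strings, and emits pseudo host text instead of machine code. GuestPush, ToyIr, toy_guest_to_ir and toy_ir_to_host are names I made up, and real Qemu does far more work (for example, the guest memory store in qemu_st_i32 is not a single host mov).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical guest instruction: only "push <reg>" is modeled. */
typedef struct { const char *reg; } GuestPush;

/* Hypothetical IR op: an opcode name plus up to three operand strings. */
typedef struct {
    const char *opc;
    const char *args[3];
} ToyIr;

/* Step 1: guest instruction -> IR ops (shape borrowed from Code 1.3).
 * Returns the number of IR ops written to out (always 4 here). */
int toy_guest_to_ir(GuestPush g, ToyIr *out)
{
    out[0] = (ToyIr){ "movi_i32",    { "tmp12", "$0x4", NULL } };
    out[1] = (ToyIr){ "sub_i32",     { "tmp2", "esp", "tmp12" } };
    out[2] = (ToyIr){ "qemu_st_i32", { g.reg, "tmp2", NULL } };
    out[3] = (ToyIr){ "mov_i32",     { "esp", "tmp2", NULL } };
    return 4;
}

/* Step 2: one IR op -> pseudo host text in a two-operand (src,dst) style.
 * Returns the number of host instructions written into buf. */
int toy_ir_to_host(const ToyIr *ir, char *buf, size_t size)
{
    if (strcmp(ir->opc, "sub_i32") == 0) {
        /* a three-operand IR op becomes "copy, then subtract in place" */
        snprintf(buf, size, "mov %s,%s\nsub %s,%s\n",
                 ir->args[1], ir->args[0], ir->args[2], ir->args[0]);
        return 2;
    }
    if (strcmp(ir->opc, "qemu_st_i32") == 0) {
        snprintf(buf, size, "mov %s,(%s)\n", ir->args[0], ir->args[1]);
        return 1;
    }
    /* movi_i32 and mov_i32 both map to a single host mov in this toy */
    snprintf(buf, size, "mov %s,%s\n", ir->args[1], ir->args[0]);
    return 1;
}
```

Feeding the sub_i32 op through toy_ir_to_host yields two pseudo host lines, which matches the general flavor of Code 1.4, where one IR op can expand into more than one host instruction.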


[1] Instruction set. (2015, March 21). In Wikipedia, The Free Encyclopedia.
[2] QEMU. (2015, April 14). In Wikipedia, The Free Encyclopedia.

Register Allocation-How Qemu allocates register for IR: movi_i32?

What is movi_i32?
movi_i32 is one of Qemu's intermediate representations (IRs), defined in the file tcg/tcg-opc.h:
DEF(movi_i32, 1, 0, 1, 0)

According to the syntax for defining an IR (tcg/tcg-opc.h):


/*
* DEF(name, oargs, iargs, cargs, flags)
*/

movi_i32 has one output argument (oargs = 1), no input arguments (iargs = 0), and one
constant argument (cargs = 1). Its semantics are to move the constant into the output temporary.
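
The DEF(...) form looks like the classic "X-macro" pattern, where one list of entries is expanded several times with different definitions of DEF. As far as I can tell, Qemu uses tcg-opc.h this way, but the sketch below is only an illustration: it keeps a two-entry list inline instead of including the real file, and TOY_OPC_LIST, ToyOpDef, toy_op_defs and toy_nb_args are hypothetical names. The mov_i32 row (one output, one input, no constants) is my assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for tcg/tcg-opc.h: a two-entry op list.
 * The mov_i32 row is an assumption; movi_i32 is copied from the text. */
#define TOY_OPC_LIST \
    DEF(mov_i32,  1, 1, 0, 0) \
    DEF(movi_i32, 1, 0, 1, 0)

/* First expansion: one enum constant per op. */
#define DEF(name, oargs, iargs, cargs, flags) TOY_op_##name,
enum ToyOpcode { TOY_OPC_LIST TOY_NB_OPS };
#undef DEF

typedef struct {
    const char *name;
    int nb_oargs, nb_iargs, nb_cargs, flags;
} ToyOpDef;

/* Second expansion: one table row per op, in the same order as the enum. */
#define DEF(name, oargs, iargs, cargs, flags) \
    { #name, oargs, iargs, cargs, flags },
const ToyOpDef toy_op_defs[] = { TOY_OPC_LIST };
#undef DEF

/* Total number of arguments an op carries (outputs + inputs + constants). */
int toy_nb_args(enum ToyOpcode opc)
{
    assert((unsigned)opc < TOY_NB_OPS);
    const ToyOpDef *d = &toy_op_defs[opc];
    return d->nb_oargs + d->nb_iargs + d->nb_cargs;
}
```

With this table, movi_i32 carries two arguments in total: the output temporary and the constant, which is exactly the args[0] and args[1] pair that the register allocator reads.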

How does Qemu allocate a register for this IR?

The corresponding source code is defined in tcg/tcg.c:
static void tcg_reg_alloc_movi(TCGContext *s, const TCGArg *args)
{
    TCGTemp *ots;
    tcg_target_ulong val;

    ots = &s->temps[args[0]];
    val = args[1];

    if (ots->fixed_reg) {
        /* for fixed registers, we do not do any constant
           propagation */
        tcg_out_movi(s, ots->type, ots->reg, val);
    } else {
        /* The movi is not explicitly generated here */
        if (ots->val_type == TEMP_VAL_REG)
            s->reg_to_temp[ots->reg] = -1;
        ots->val_type = TEMP_VAL_CONST;
        ots->val = val;
    }
}

The big picture is to assign the input constant to the output. Details are explained below.


1. args[0] is the index of the output temporary.
2. args[1] is the value of the constant input.
3. Via the index args[0], Qemu can access the output temporary ( &s->temps[args[0]] ),
   where temps[] is an array that stores all temporaries of the IRs being used in the
   current translation block.
4. A temporary is defined as a structure, and one of its members is fixed_reg. Most
   temporaries do not use this member; for the few exceptions, a nonzero value indicates
   that the temporary has a fixed register in the host machine.
   For example, the env temporary uses this member. For the rest, fixed_reg is set to 0.
5. If the output temporary's fixed_reg is nonzero, Qemu generates a host
   instruction via the function tcg_out_movi(s, ots->type, ots->reg, val);
   - s is the TCGContext (not discussed in detail here)
   - ots is the output temporary
   - ots->type indicates the temporary type, either 32-bit or 64-bit wide
   - ots->reg indicates the index of the host register that the temporary currently
     occupies (that is, which register the fixed temporary is pinned to)
   The semantics of this function are to issue a host instruction that assigns the value
   val to the register ots->reg, which corresponds to the semantics of the IR.
6. If the output temporary's fixed_reg is zero, then:
   - If ots->val_type is TEMP_VAL_REG, the temporary currently occupies a register in
     the host machine, and s->reg_to_temp[ots->reg] = -1; frees that register. The array
     reg_to_temp[] is a member of the TCGContext that records which host registers
     correspond to which temporaries.
   - Update val_type to TEMP_VAL_CONST, indicating that the current value of the
     temporary is stored in the member val rather than in a host register or in host memory.
   - Assign the input constant val to the val member of the temporary.
   In this case, Qemu performs constant propagation without issuing any host instruction.
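
The steps above can be replayed with a self-contained model in C. This is a simplified sketch, not Qemu's code: ToyTemp and ToyContext keep only the fields discussed here, the array sizes are arbitrary, and toy_out_movi is a hypothetical stand-in for tcg_out_movi that just counts "emitted" host instructions and writes to a fake register array.

```c
#include <assert.h>
#include <stdint.h>

/* Values as listed in tcg.h (stable-1.0). */
enum { TEMP_VAL_DEAD, TEMP_VAL_REG, TEMP_VAL_MEM, TEMP_VAL_CONST };

#define TOY_NB_REGS  8    /* arbitrary size for this sketch */
#define TOY_NB_TEMPS 16   /* arbitrary size for this sketch */

/* Hypothetical, trimmed-down stand-ins for TCGTemp and TCGContext. */
typedef struct {
    int val_type;
    int reg;
    uint32_t val;
    int fixed_reg;
} ToyTemp;

typedef struct {
    ToyTemp temps[TOY_NB_TEMPS];
    int reg_to_temp[TOY_NB_REGS];     /* -1 means the host register is free */
    uint32_t host_regs[TOY_NB_REGS];  /* fake host register file */
    int host_insns_emitted;           /* stand-in for real code generation */
} ToyContext;

/* Stand-in for tcg_out_movi(): pretend to emit "mov $val, reg". */
void toy_out_movi(ToyContext *s, int reg, uint32_t val)
{
    s->host_regs[reg] = val;
    s->host_insns_emitted++;
}

/* Mirrors the control flow of tcg_reg_alloc_movi() shown above. */
void toy_reg_alloc_movi(ToyContext *s, const uint32_t *args)
{
    assert(args[0] < TOY_NB_TEMPS);
    ToyTemp *ots = &s->temps[args[0]];   /* steps 1 and 3 */
    uint32_t val = args[1];              /* step 2 */

    if (ots->fixed_reg) {
        toy_out_movi(s, ots->reg, val);  /* step 5: a host instruction */
    } else {
        if (ots->val_type == TEMP_VAL_REG) {
            s->reg_to_temp[ots->reg] = -1;   /* step 6: free the register */
        }
        ots->val_type = TEMP_VAL_CONST;  /* the value now lives in ots->val */
        ots->val = val;
    }
}
```

Calling toy_reg_alloc_movi on a temporary whose fixed_reg is 0 leaves host_insns_emitted unchanged, which is the "constant propagation without any host instruction" case described in step 6.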

A concrete example
Consider an assembly instruction running in the guest machine of Qemu.
IN:
0x000fe070: mov $0xe5852,%edx

It assigns the constant value 0xe5852 to register edx.


So which IRs does Qemu generate from this instruction?


OP:
---- 0xfe070
movi_i32 tmp0,$0xe5852
mov_i32 edx,tmp0

Qemu translates the guest instruction into two IRs:


1. Assign constant 0xe5852 to a temporary tmp0
2. Assign tmp0 to edx
Let's focus on the first IR, since it is a movi_i32.
Using gdb, we can see that the index of the output temporary, args[0], is:
(gdb) p args[0]
$38 = 7

Then Qemu can access the content of the output temporary.


(gdb) p *ots
$39 = {base_type = TCG_TYPE_I32, type = TCG_TYPE_I32, val_type = 2, reg = 0,
val = 0, mem_reg = 5, mem_offset = 8, fixed_reg = 0, mem_coherent = 0,
mem_allocated = 1, temp_local = 0, temp_allocated = 0, next_free_temp = 0,
name = 0x8026e302 "edx"}

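The output temporary is the global temporary edx (as the name member shows). Note that this is edx rather than tmp0: presumably Qemu's optimization pass has already folded the two IRs into a single movi_i32 edx,$0xe5852 by the time register allocation runs.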
The val_type is 2, that is, TEMP_VAL_MEM, which indicates that its current value is
stored in host memory. The values are defined in tcg.h:
#define TEMP_VAL_DEAD 0
#define TEMP_VAL_REG 1
#define TEMP_VAL_MEM 2
#define TEMP_VAL_CONST 3

fixed_reg is 0, so Qemu will not issue any host instruction.

ots->val_type is not TEMP_VAL_REG, so Qemu does not need to free any host register.
The key lines that Qemu executes at run time are:


...
ots->val_type = TEMP_VAL_CONST;
ots->val = val;
...

After the allocation finished executing, the output temporary became:


{base_type = TCG_TYPE_I32, type = TCG_TYPE_I32, val_type = 3, reg = 0,
val = 0xe5852, mem_reg = 5, mem_offset = 8, fixed_reg = 0, mem_coherent = 0,
mem_allocated = 1, temp_local = 0, temp_allocated = 0, next_free_temp = 0,
name = 0x8026e302 "edx"}
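
Since gdb prints val_type as a bare number, a small decoder can help when reading such dumps. The sketch below is a hypothetical helper of my own (toy_val_type_name and toy_holds_host_reg do not exist in Qemu); it only relies on the four values from tcg.h listed earlier. The val_type of 2 seen before the allocation decodes to TEMP_VAL_MEM, and the assignment ots->val_type = TEMP_VAL_CONST corresponds to the value 3.

```c
#include <assert.h>
#include <string.h>

/* Values as defined in tcg.h, per the listing earlier in this article. */
#define TEMP_VAL_DEAD  0
#define TEMP_VAL_REG   1
#define TEMP_VAL_MEM   2
#define TEMP_VAL_CONST 3

/* Hypothetical helper: map the raw val_type number from a gdb dump
 * to its symbolic name. */
const char *toy_val_type_name(int val_type)
{
    switch (val_type) {
    case TEMP_VAL_DEAD:  return "TEMP_VAL_DEAD";
    case TEMP_VAL_REG:   return "TEMP_VAL_REG";
    case TEMP_VAL_MEM:   return "TEMP_VAL_MEM";
    case TEMP_VAL_CONST: return "TEMP_VAL_CONST";
    default:             return "unknown";
    }
}

/* Hypothetical helper: only a temporary in the TEMP_VAL_REG state
 * occupies a host register that movi would have to free. */
int toy_holds_host_reg(int val_type)
{
    return val_type == TEMP_VAL_REG;
}
```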

Conclusion
For this particular case, Qemu did not issue any host instruction; it only performed constant
propagation.
(All code presented should be found in stable-1.0, and assumes that both the guest and host
machines are x86 platforms.)
