PHP Internals Tutorial: How the PHP Engine Implements Opcodes
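1. Introduction to opcodes
An opcode is the part of a machine instruction that specifies the operation to perform; the instruction format is defined by the processor's instruction set specification. Besides the opcode itself, an instruction usually carries the operands it works on, although some instructions need no explicit operands. An operand may be a value in a register, on the stack, at a memory location, or at an I/O port.
Opcodes that are executed by a virtual machine rather than by hardware are commonly called bytecode. Examples include the bytecode of the Java Virtual Machine (JVM) and .NET's Common Intermediate Language (CIL).
PHP's opcodes belong to this second kind. PHP is built on the Zend virtual machine (Zend VM), and a PHP opcode is an instruction of that virtual machine, that is, Zend's intermediate code.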
Relevant Link:
http://www.luocong.com/learningopcode/doc/1._%E4%BB%80%E4%B9%88%E6%98%AFOpCode%EF%BC%9F.htm
2. Opcodes in PHP
0x1: Data structure
Inside the PHP implementation, an opcode is represented by the following structure.
\php-5.6.17\Zend\zend_compile.h
struct _zend_op {
    opcode_handler_t handler;  // function called to execute this opcode
    znode_op op1;              // first operand of the opcode
    znode_op op2;              // second operand of the opcode
    znode_op result;
    ulong extended_value;
    uint lineno;
    zend_uchar opcode;         // opcode number
    zend_uchar op1_type;
    zend_uchar op2_type;
    zend_uchar result_type;
};
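As with a CPU instruction, there is an opcode field that identifies the instruction, plus the operands it acts on. PHP is not as low-level as assembly, so a script may need additional information at run time; the extended_value field holds that kind of information. The result field identifies where the instruction's result is stored once it has executed.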
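To make the opcode/operands/result split concrete, here is a minimal sketch of a toy instruction with the same overall shape. Every name in it (toy_op, TOY_ADD, toy_exec) is hypothetical and not part of the Zend Engine, and unlike zend_op the toy stores the result value directly instead of naming a variable slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set; not part of the Zend Engine. */
enum toy_opcode { TOY_ADD = 1, TOY_SUB = 2, TOY_MUL = 3 };

/* Like zend_op in outline: one opcode, two input operands, one result. */
typedef struct toy_op {
    unsigned char opcode;  /* which operation to perform */
    long op1;              /* first operand (an immediate value here) */
    long op2;              /* second operand */
    long result;           /* filled in when the instruction runs */
} toy_op;

/* Execute a single instruction and store its result in the op itself. */
static long toy_exec(toy_op *op)
{
    assert(op != NULL);
    switch (op->opcode) {
    case TOY_ADD: op->result = op->op1 + op->op2; break;
    case TOY_SUB: op->result = op->op1 - op->op2; break;
    case TOY_MUL: op->result = op->op1 * op->op2; break;
    default:      op->result = 0; break;  /* unknown opcode: treated as a no-op */
    }
    return op->result;
}
```

Calling toy_exec on an op initialised as { TOY_MUL, 6, 7, 0 } multiplies the two operands and leaves the product in its result field.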
For example, the following function is what the compiler calls when it encounters a print statement.
\php-5.6.17\Zend\zend_compile.c
void zend_do_print(znode *result, const znode *arg TSRMLS_DC) /* {{{ */
{
    // Create a new zend_op
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    // The result type is a temporary variable (IS_TMP_VAR): the value
    // produced by print is only needed transiently and is not kept
    opline->result_type = IS_TMP_VAR;
    // Reserve a slot for the temporary variable
    opline->result.var = get_temporary_variable(CG(active_op_array));
    // Set the opcode to ZEND_PRINT
    opline->opcode = ZEND_PRINT;
    // The argument passed in becomes the first operand of this opcode
    SET_NODE(opline->op1, arg);
    SET_UNUSED(opline->op2);
    GET_NODE(result, opline->result);
}
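The pattern in zend_do_print (take the next free instruction slot, then fill in its fields) can be sketched in isolation. The following is a simplified model under my own assumptions, not Zend code: toy_op_array, toy_get_next_op and toy_do_print are hypothetical names, and the growth policy and temporary numbering are chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PRINT 1  /* hypothetical opcode number */

typedef struct toy_op {
    unsigned char opcode;
    int op1;         /* the argument being printed */
    int result_var;  /* index of the temporary that receives the result */
} toy_op;

typedef struct toy_op_array {
    toy_op *opcodes;  /* growable instruction buffer */
    size_t last;      /* number of instructions emitted so far */
    size_t size;      /* allocated capacity */
    int T;            /* number of temporaries handed out so far */
} toy_op_array;

/* Append a blank instruction and return a pointer to it (cf. get_next_op). */
static toy_op *toy_get_next_op(toy_op_array *arr)
{
    if (arr->last == arr->size) {
        size_t new_size = arr->size ? arr->size * 2 : 4;
        toy_op *grown = realloc(arr->opcodes, new_size * sizeof *grown);
        assert(grown != NULL);  /* sketch only: abort on allocation failure */
        arr->opcodes = grown;
        arr->size = new_size;
    }
    toy_op *op = &arr->opcodes[arr->last++];
    op->opcode = 0;
    op->op1 = 0;
    op->result_var = -1;
    return op;
}

/* Emit one "print" instruction whose result lives in a fresh temporary. */
static int toy_do_print(toy_op_array *arr, int arg)
{
    toy_op *op = toy_get_next_op(arr);
    op->opcode = TOY_PRINT;
    op->op1 = arg;
    op->result_var = arr->T++;  /* cf. get_temporary_variable */
    return op->result_var;
}
```

The point of the sketch is that compiling a statement only appends to the array; nothing is printed until some later stage walks the emitted instructions.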
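0x2: Opcode type: zend_op->opcode (zend_uchar)
By analogy with assembly language, every op has a type that says which operation it performs. The opcode field has type zend_uchar, which is simply unsigned char. The integer stored in it is the op's number and distinguishes one kind of op from another. All possible values are defined as macros.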
/Zend/zend_vm_opcodes.h
#define ZEND_NOP 0
#define ZEND_ADD 1
#define ZEND_SUB 2
#define ZEND_MUL 3
#define ZEND_DIV 4
#define ZEND_MOD 5
#define ZEND_SL 6
#define ZEND_SR 7
#define ZEND_CONCAT 8
#define ZEND_BW_OR 9
#define ZEND_BW_AND 10
#define ZEND_BW_XOR 11
#define ZEND_BW_NOT 12
#define ZEND_BOOL_NOT 13
#define ZEND_BOOL_XOR 14
#define ZEND_IS_IDENTICAL 15
#define ZEND_IS_NOT_IDENTICAL 16
#define ZEND_IS_EQUAL 17
#define ZEND_IS_NOT_EQUAL 18
#define ZEND_IS_SMALLER 19
#define ZEND_IS_SMALLER_OR_EQUAL 20
#define ZEND_CAST 21
#define ZEND_QM_ASSIGN 22
#define ZEND_ASSIGN_ADD 23
#define ZEND_ASSIGN_SUB 24
#define ZEND_ASSIGN_MUL 25
#define ZEND_ASSIGN_DIV 26
#define ZEND_ASSIGN_MOD 27
#define ZEND_ASSIGN_SL 28
#define ZEND_ASSIGN_SR 29
#define ZEND_ASSIGN_CONCAT 30
#define ZEND_ASSIGN_BW_OR 31
#define ZEND_ASSIGN_BW_AND 32
#define ZEND_ASSIGN_BW_XOR 33
#define ZEND_PRE_INC 34
#define ZEND_PRE_DEC 35
#define ZEND_POST_INC 36
#define ZEND_POST_DEC 37
#define ZEND_ASSIGN 38
#define ZEND_ASSIGN_REF 39
#define ZEND_ECHO 40
#define ZEND_PRINT 41
#define ZEND_JMP 42
#define ZEND_JMPZ 43
#define ZEND_JMPNZ 44
#define ZEND_JMPZNZ 45
#define ZEND_JMPZ_EX 46
#define ZEND_JMPNZ_EX 47
#define ZEND_CASE 48
#define ZEND_SWITCH_FREE 49
#define ZEND_BRK 50
#define ZEND_CONT 51
#define ZEND_BOOL 52
#define ZEND_INIT_STRING 53
#define ZEND_ADD_CHAR 54
#define ZEND_ADD_STRING 55
#define ZEND_ADD_VAR 56
#define ZEND_BEGIN_SILENCE 57
#define ZEND_END_SILENCE 58
#define ZEND_INIT_FCALL_BY_NAME 59
#define ZEND_DO_FCALL 60
#define ZEND_DO_FCALL_BY_NAME 61
#define ZEND_RETURN 62
#define ZEND_RECV 63
#define ZEND_RECV_INIT 64
#define ZEND_SEND_VAL 65
#define ZEND_SEND_VAR 66
#define ZEND_SEND_REF 67
#define ZEND_NEW 68
#define ZEND_INIT_NS_FCALL_BY_NAME 69
#define ZEND_FREE 70
#define ZEND_INIT_ARRAY 71
#define ZEND_ADD_ARRAY_ELEMENT 72
#define ZEND_INCLUDE_OR_EVAL 73
#define ZEND_UNSET_VAR 74
#define ZEND_UNSET_DIM 75
#define ZEND_UNSET_OBJ 76
#define ZEND_FE_RESET 77
#define ZEND_FE_FETCH 78
#define ZEND_EXIT 79
#define ZEND_FETCH_R 80
#define ZEND_FETCH_DIM_R 81
#define ZEND_FETCH_OBJ_R 82
#define ZEND_FETCH_W 83
#define ZEND_FETCH_DIM_W 84
#define ZEND_FETCH_OBJ_W 85
#define ZEND_FETCH_RW 86
#define ZEND_FETCH_DIM_RW 87
#define ZEND_FETCH_OBJ_RW 88
#define ZEND_FETCH_IS 89
#define ZEND_FETCH_DIM_IS 90
#define ZEND_FETCH_OBJ_IS 91
#define ZEND_FETCH_FUNC_ARG 92
#define ZEND_FETCH_DIM_FUNC_ARG 93
#define ZEND_FETCH_OBJ_FUNC_ARG 94
#define ZEND_FETCH_UNSET 95
#define ZEND_FETCH_DIM_UNSET 96
#define ZEND_FETCH_OBJ_UNSET 97
#define ZEND_FETCH_DIM_TMP_VAR 98
#define ZEND_FETCH_CONSTANT 99
#define ZEND_GOTO 100
#define ZEND_EXT_STMT 101
#define ZEND_EXT_FCALL_BEGIN 102
#define ZEND_EXT_FCALL_END 103
#define ZEND_EXT_NOP 104
#define ZEND_TICKS 105
#define ZEND_SEND_VAR_NO_REF 106
#define ZEND_CATCH 107
#define ZEND_THROW 108
#define ZEND_FETCH_CLASS 109
#define ZEND_CLONE 110
#define ZEND_RETURN_BY_REF 111
#define ZEND_INIT_METHOD_CALL 112
#define ZEND_INIT_STATIC_METHOD_CALL 113
#define ZEND_ISSET_ISEMPTY_VAR 114
#define ZEND_ISSET_ISEMPTY_DIM_OBJ 115
#define ZEND_PRE_INC_OBJ 132
#define ZEND_PRE_DEC_OBJ 133
#define ZEND_POST_INC_OBJ 134
#define ZEND_POST_DEC_OBJ 135
#define ZEND_ASSIGN_OBJ 136
#define ZEND_INSTANCEOF 138
#define ZEND_DECLARE_CLASS 139
#define ZEND_DECLARE_INHERITED_CLASS 140
#define ZEND_DECLARE_FUNCTION 141
#define ZEND_RAISE_ABSTRACT_ERROR 142
#define ZEND_DECLARE_CONST 143
#define ZEND_ADD_INTERFACE 144
#define ZEND_DECLARE_INHERITED_CLASS_DELAYED 145
#define ZEND_VERIFY_ABSTRACT_CLASS 146
#define ZEND_ASSIGN_DIM 147
#define ZEND_ISSET_ISEMPTY_PROP_OBJ 148
#define ZEND_HANDLE_EXCEPTION 149
#define ZEND_USER_OPCODE 150
#define ZEND_JMP_SET 152
#define ZEND_DECLARE_LAMBDA_FUNCTION 153
#define ZEND_ADD_TRAIT 154
#define ZEND_BIND_TRAITS 155
#define ZEND_SEPARATE 156
#define ZEND_QM_ASSIGN_VAR 157
#define ZEND_JMP_SET_VAR 158
#define ZEND_DISCARD_EXCEPTION 159
#define ZEND_YIELD 160
#define ZEND_GENERATOR_RETURN 161
#define ZEND_FAST_CALL 162
#define ZEND_FAST_RET 163
#define ZEND_RECV_VARIADIC 164
#define ZEND_SEND_UNPACK 165
#define ZEND_POW 166
#define ZEND_ASSIGN_POW 167
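Because the opcode is just a small integer, a debugging tool can turn it back into a readable name with a lookup. The sketch below is a hypothetical helper (toy_opcode_name is my own name, not an engine function) covering only a handful of the numbers listed above. It also shows that the numbering has gaps: 137, for instance, is not assigned in this list.

```c
#include <assert.h>

/* Hypothetical helper: map a few opcode numbers from the list above to
 * their macro names. Only a small subset is covered for illustration. */
static const char *toy_opcode_name(int opcode)
{
    assert(opcode >= 0 && opcode <= 255);  /* must fit in a zend_uchar */
    switch (opcode) {
    case 0:  return "ZEND_NOP";
    case 1:  return "ZEND_ADD";
    case 38: return "ZEND_ASSIGN";
    case 40: return "ZEND_ECHO";
    case 41: return "ZEND_PRINT";
    case 62: return "ZEND_RETURN";
    default: return "UNKNOWN";  /* unassigned, or not covered by this sketch */
    }
}
```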
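0x3: Opcode handler: zend_op->handler
The handler field is the op's execution handler; its type is opcode_handler_t.
typedef int (ZEND_FASTCALL *opcode_handler_t) (ZEND_OPCODE_HANDLER_ARGS);
This function pointer determines how the op is executed. The handler is chosen from the opcode together with the types of its operands. For example, the op generated for $a = 1; is a ZEND_ASSIGN whose operands are a CV and a constant, so its handler resolves to the function ZEND_ASSIGN_SPEC_CV_CONST_HANDLER.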
/Zend/zend_vm_execute.h
void zend_init_opcodes_handlers(void)
{
    static const opcode_handler_t labels[] = {
        /* ... */
        ZEND_ASSIGN_SPEC_CV_CONST_HANDLER,
        /* ... */
    };
}
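How an opcode and two operand types could select one entry in a flat handler table can be sketched as below. This is modelled on my understanding of the generated lookup code and should be read as an assumption: the 5 x 5 layout per opcode, the order of the type codes, and the names toy_type_code and toy_handler_index are illustrative, not quoted from the source tree.

```c
#include <assert.h>

/* Operand type flags with the same values as the engine's IS_* macros. */
#define IS_CONST   (1<<0)
#define IS_TMP_VAR (1<<1)
#define IS_VAR     (1<<2)
#define IS_UNUSED  (1<<3)
#define IS_CV      (1<<4)

/* Map a type flag to a dense code 0..4 (CONST, TMP, VAR, UNUSED, CV).
 * The ordering is an assumption made for this sketch. */
static int toy_type_code(int op_type)
{
    switch (op_type) {
    case IS_CONST:   return 0;
    case IS_TMP_VAR: return 1;
    case IS_VAR:     return 2;
    case IS_UNUSED:  return 3;
    case IS_CV:      return 4;
    default:
        assert(!"unknown operand type");
        return 3;
    }
}

/* Index into a flat table that holds 5 x 5 specialised handlers per opcode. */
static int toy_handler_index(int opcode, int op1_type, int op2_type)
{
    return opcode * 25 + toy_type_code(op1_type) * 5 + toy_type_code(op2_type);
}
```

Under these assumptions, ZEND_ASSIGN (38) with a CV first operand and a constant second operand selects a different slot from the same opcode with, say, two CV operands, which is how one opcode can have many specialised handlers.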
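0x4: Opcode operands: znode
Operands are among the most important parts of _zend_op. In the structure shown in 0x1, op1, op2 and result are stored as znode_op, with their types kept separately in op1_type, op2_type and result_type. While compiling, the compiler passes operands around as znode, which bundles an operand with its type.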
\php-5.6.17\Zend\zend_compile.h
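typedef struct _znode { /* used only during compilation */
    /*
     * This int field gives the type of the znode operand.
     *
     * #define IS_CONST   (1<<0)
     *     A constant. In $a = 123; $b = "hello"; both 123 and "hello"
     *     become constant operands once the ops are generated.
     * #define IS_TMP_VAR (1<<1)
     *     A temporary variable, usually written with a leading ~. These
     *     are intermediate values needed while ops execute. Initialising
     *     an array, for example, needs a temporary to hold the array zval
     *     before it is assigned to the variable.
     * #define IS_VAR     (1<<2)
     *     A variable in the ordinary sense, written with a leading $.
     * #define IS_UNUSED  (1<<3)
     *     Unused variable.
     * #define IS_CV      (1<<4)
     *     Compiled variable. This is an important operand type that only
     *     appeared in later PHP versions (around 5.1). Variables are
     *     stored in a symbol table, which is a hash table, and searching
     *     that table on every read or write costs performance. The
     *     execution context therefore caches the variables that are known
     *     at compile time. Such operands are usually written with a
     *     leading !. For $a = 123; $b = "hello"; the operands for $a and
     *     $b might be !0 and !1, where 0 and 1 are indexes used to fetch
     *     the value from the cache.
     */
    int op_type;
    /*
     * This field is a union; which member of u is used depends on op_type.
     * 1. When op_type is IS_CONST, u.constant holds the zval for the operand.
     * 2. For example, in $a = 123 the operand 123 has a u.constant of type
     *    IS_LONG whose lval is 123.
     */
    union {
        znode_op op;
        zval constant; /* replaced by literal/zv */
        zend_op_array *op_array;
        zend_ast *ast;
    } u;
    zend_uint EA; /* extended attributes */
} znode;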
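The prefix notation described in the comment (~ for temporaries, $ for variables, ! for compiled variables) is easy to reproduce when dumping operands. The helper below is a hypothetical sketch: toy_format_operand is not an engine function, and the "const#" and "<unused>" spellings are my own placeholders for the two operand kinds the comment gives no prefix for.

```c
#include <assert.h>
#include <stdio.h>

/* Operand type flags with the same values as in the znode comment. */
#define IS_CONST   (1<<0)
#define IS_TMP_VAR (1<<1)
#define IS_VAR     (1<<2)
#define IS_UNUSED  (1<<3)
#define IS_CV      (1<<4)

/* Write a short textual form of an operand into buf, e.g. "!0" for the
 * first compiled variable. Returns what snprintf returns. */
static int toy_format_operand(char *buf, size_t len, int op_type, int index)
{
    assert(buf != NULL && len > 0);
    switch (op_type) {
    case IS_TMP_VAR: return snprintf(buf, len, "~%d", index);
    case IS_VAR:     return snprintf(buf, len, "$%d", index);
    case IS_CV:      return snprintf(buf, len, "!%d", index);
    case IS_CONST:   return snprintf(buf, len, "const#%d", index);  /* placeholder spelling */
    default:         return snprintf(buf, len, "<unused>");         /* placeholder spelling */
    }
}
```

With this helper, the two compiled variables from the $a = 123; $b = "hello"; example would be rendered as !0 and !1.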
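0x5: The compiled opcode array: op_array
The first line of the zend_do_print function contains this statement:
zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);
The opcodes produced by compiling PHP script code are stored in an op_array, whose internal layout is as follows.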
\php-5.6.17\Zend\zend_compile.h
struct _zend_op_array {
    /* Common elements */
    zend_uchar type;
    const char *function_name;  // for a user-defined function, the function's name
    zend_class_entry *scope;
    zend_uint fn_flags;
    union _zend_function *prototype;
    zend_uint num_args;
    zend_uint required_num_args;
    zend_arg_info *arg_info;
    /* END of common elements */

    zend_uint *refcount;

    zend_op *opcodes;           // the opcode array
    zend_uint last;

    zend_compiled_variable *vars;
    int last_var;

    zend_uint T;

    zend_uint nested_calls;
    zend_uint used_stack;

    zend_brk_cont_element *brk_cont_array;
    int last_brk_cont;

    zend_try_catch_element *try_catch_array;
    int last_try_catch;
    zend_bool has_finally_block;

    /* static variables support */
    HashTable *static_variables;

    zend_uint this_var;

    const char *filename;
    zend_uint line_start;
    zend_uint line_end;
    const char *doc_comment;
    zend_uint doc_comment_len;
    zend_uint early_binding; /* the linked list of delayed declarations */

    zend_literal *literals;
    int last_literal;

    void **run_time_cache;
    int last_cache_slot;

    void *reserved[ZEND_MAX_RESERVED_RESOURCES];
};
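The opcodes of the whole compiled PHP script are stored here, and at run time the execute function below runs them. (execute is the name used up to PHP 5.4; in PHP 5.5 and later the corresponding entry point appears to be named zend_execute.)
ZEND_API void execute(zend_op_array *op_array TSRMLS_DC)
{
    // ... loop over the opcodes in op_array, or run the opcodes of other op_arrays
}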
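The loop that the elided body stands for can be sketched with a toy machine in which every instruction carries its own handler pointer. This is only an illustration of the idea, not the engine's executor: toy_vm, toy_op, the three handlers and the "nonzero return stops the loop" convention are all assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vm;
typedef struct toy_op toy_op;

/* Each handler executes one instruction; a nonzero return stops the loop. */
typedef int (*toy_handler_t)(struct toy_vm *vm, const toy_op *op);

struct toy_op {
    toy_handler_t handler;  /* resolved ahead of time, like zend_op->handler */
    long operand;
};

typedef struct toy_vm { long acc; } toy_vm;  /* one accumulator as machine state */

static int toy_add_handler(toy_vm *vm, const toy_op *op) { vm->acc += op->operand; return 0; }
static int toy_mul_handler(toy_vm *vm, const toy_op *op) { vm->acc *= op->operand; return 0; }
static int toy_return_handler(toy_vm *vm, const toy_op *op) { (void)vm; (void)op; return 1; }

/* Walk the instruction array and call each instruction's handler in turn. */
static long toy_execute(const toy_op *opcodes, size_t last)
{
    toy_vm vm = { 0 };
    assert(opcodes != NULL);
    for (size_t i = 0; i < last; i++) {
        if (opcodes[i].handler(&vm, &opcodes[i]) != 0) {
            break;  /* the handler asked to stop, e.g. a return instruction */
        }
    }
    return vm.acc;
}
```

A program made of add 5, multiply by 4, return leaves 20 in the accumulator, and any instruction placed after the return is never reached.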
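Every opcode carries an opcode_handler_t function pointer that is used to execute it. PHP supports three ways of dispatching opcodes:
1. CALL: the default. Each handler is an ordinary function, invoked through the function pointer.
2. SWITCH: all handlers live in a single switch statement. Dispatch is one of the most frequent operations in any PHP program, so the engine offers SWITCH and GOTO as alternatives to one function call per opcode.
3. GOTO: the executor jumps directly to the label of the next handler. GOTO is usually somewhat faster, but whether it actually helps depends on the CPU.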
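For contrast with CALL, SWITCH-style dispatch can be sketched as a single loop that branches on the opcode number instead of calling through a function pointer. The toy opcodes and toy_execute_switch below are hypothetical and stand in for the idea only. GOTO-style dispatch relies on computed goto, a compiler extension rather than standard C, so it is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes; the numbers carry no special meaning here. */
enum { TOY_ADD = 1, TOY_MUL = 2, TOY_RETURN = 3 };

typedef struct toy_op {
    unsigned char opcode;
    long operand;
} toy_op;

/* SWITCH-style dispatch: one loop, one switch, no per-opcode function call. */
static long toy_execute_switch(const toy_op *opcodes, size_t last)
{
    long acc = 0;
    assert(opcodes != NULL);
    for (size_t i = 0; i < last; i++) {
        switch (opcodes[i].opcode) {
        case TOY_ADD:    acc += opcodes[i].operand; break;
        case TOY_MUL:    acc *= opcodes[i].operand; break;
        case TOY_RETURN: return acc;
        default:         break;  /* unknown opcode: skipped in this sketch */
        }
    }
    return acc;
}
```

The trade-off is the usual one: the switch avoids call overhead and keeps machine state in local variables, while the CALL form keeps each handler a separate, independently readable function.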
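The code for all three dispatch methods is in Zend/zend_vm_execute.h, which the script Zend/zend_vm_gen.php generates from the handler templates in Zend/zend_vm_def.h; the method is chosen when that file is generated. Zend/zend_language_parser.c, by contrast, is the parser that Bison generates from zend_language_parser.y. The switch and goto statements in it belong to the parser's own state machine and do not execute opcodes.
This is the sense in which PHP is an interpreted language. After lexical analysis, the parser's grammar actions call compiler functions such as zend_do_print, which append opcodes to the op_array rather than executing anything. Once the script has been compiled, the executor walks the op_array and dispatches each opcode to its handler by CALL, SWITCH or GOTO. No native machine code is generated.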
Relevant Link:
http://www.nowamagic.net/librarys/veda/detail/1325
http://php.net/manual/zh/internals2.opcodes.list.php
http://www.nowamagic.net/librarys/veda/detail/1543
http://www.nowamagic.net/librarys/veda/detail/1324
http://www.laruence.com/2008/06/18/221.html
http://www.php-internals.com/book/?p=chapt02/02-03-02-opcode
3. Opcode translation and execution (on-the-fly interpretation)
Relevant Link:
http://www.php-internals.com/book/?p=chapt02/02-03-03-from-opcode-to-handler