[Studying the Python Source] Bytecode
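Generating .pyc files
Any .py file that a program pulls in through import is compiled to a .pyc file automatically. How do we compile one by hand?

From the interactive interpreter:

    >>> import py_compile
    >>> py_compile.compile('hello.py')
    >>>

Or from the command line:

    python3 -m py_compile hello.py

The generated file (the result on my machine):

    __pycache__/hello.cpython-32.pyc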
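The same step can be driven from a script. The following is a minimal sketch, assuming a current Python 3 where py_compile.compile() returns the path of the file it wrote; the helper name compile_to_pyc and the temporary hello.py are made up for illustration.

```python
import os
import py_compile
import tempfile


def compile_to_pyc(source_text):
    """Write source_text to a temporary hello.py and byte-compile it.

    Hypothetical helper for illustration; returns the path of the .pyc file.
    """
    workdir = tempfile.mkdtemp()
    src_path = os.path.join(workdir, "hello.py")
    with open(src_path, "w") as f:
        f.write(source_text)
    # doraise=True turns compile errors into exceptions instead of
    # printing them to stderr.
    return py_compile.compile(src_path, doraise=True)


pyc_path = compile_to_pyc('print("hello")\n')
print(pyc_path)  # by default a file under __pycache__/ next to hello.py
```

The exact file name depends on the interpreter version, which is why the result above is described as machine-specific.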
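To compile every .py file under a directory:

    python -m compileall .

Both py_compile and compileall ultimately rely on the compile() function of the builtins module.

builtins
In the Python runtime, compile() lives in the builtins module, next to eval() and exec().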
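An example:

    >>> a = "1+2"
    >>> b = compile(a, "test.py", 'single')
    >>> type(b)
    <class 'code'>
    >>> eval(b)
    3

At the C level these correspond to two groups of functions in the high-level interface: ones that compile source into a code object (Py_CompileStringExFlags, for instance) and ones that run it (PyRun_StringFlags and PyEval_EvalCode, for instance).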
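The third argument of compile() in the example above is the mode, and it changes what the resulting code object does when it runs. A small sketch comparing the three modes, with output captured so the difference is visible (the variable names are only illustrative):

```python
import contextlib
import io

source = "1+2"

# 'eval' mode: the code object evaluates to the value of the expression.
expr_code = compile(source, "test.py", "eval")
eval_result = eval(expr_code)

# 'single' mode: the code object behaves like a line typed at the prompt.
# It prints the repr of the value itself and eval() returns None.
single_code = compile(source, "test.py", "single")
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    single_result = eval(single_code)
printed = captured.getvalue()

# 'exec' mode: the code object is a sequence of statements; results are
# only visible through the namespace it ran in.
module_code = compile("x = 1+2", "test.py", "exec")
namespace = {}
exec(module_code, namespace)

print(eval_result, repr(printed), single_result, namespace["x"])
```

This is why the interactive session shows 3 after eval(b): the value is printed by the 'single' code object, not returned by eval().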
Code
compile(), eval() and exec() are functions of the builtins module, so let's look at the methods defined in its C source:

    static PyMethodDef builtin_methods[] = {
        //...
        {"compile", (PyCFunction)builtin_compile, METH_VARARGS|METH_KEYWORDS, compile_doc},
        //...
        {"eval", builtin_eval, METH_VARARGS, eval_doc},
        {"exec", builtin_exec, METH_VARARGS, exec_doc},
        //...
        {NULL, NULL},
    };

Among these:
    static PyObject *
    builtin_compile(PyObject *self, PyObject *args, PyObject *kwds)
    {
        ....
        is_ast = PyAST_Check(cmd);
        if (is_ast) {
            ...
            result = (PyObject*)PyAST_CompileEx(mod, filename, ...
            goto finally;
        }
        ...
        result = Py_CompileStringExFlags(str, filename, start[mode], &cf, optimize);
        goto finally;

    finally:
        Py_DECREF(filename_obj);
        return result;
    }
    static PyObject *
    builtin_eval(PyObject *self, PyObject *args)
    {
        ...
        if (PyCode_Check(cmd)) {
            return PyEval_EvalCode(cmd, globals, locals);
        }

        cf.cf_flags = PyCF_SOURCE_IS_UTF8;
        str = source_as_string(cmd, "eval", "string, bytes or code", &cf);
        ...
        (void)PyEval_MergeCompilerFlags(&cf);
        result = PyRun_StringFlags(str, Py_eval_input, globals, locals, &cf);
        Py_XDECREF(tmp);
        return result;
    }

With that, the C code and the Python code are finally tied together.

PyCodeObject
The bytecode mentioned earlier is, in the source, a PyCodeObject (the code type on the Python side).

Definition
First, the definition of the struct:
    /* Bytecode object */
    typedef struct {
        PyObject_HEAD
        int co_argcount;            /* #arguments, except *args */
        int co_kwonlyargcount;      /* #keyword only arguments */
        int co_nlocals;             /* #local variables */
        int co_stacksize;           /* #entries needed for evaluation stack */
        int co_flags;               /* CO_..., see below */
        PyObject *co_code;          /* instruction opcodes */
        PyObject *co_consts;        /* list (constants used) */
        PyObject *co_names;         /* list of strings (names used) */
        PyObject *co_varnames;      /* tuple of strings (local variable names) */
        PyObject *co_freevars;      /* tuple of strings (free variable names) */
        PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
        /* The rest doesn't count for hash or comparisons */
        PyObject *co_filename;      /* unicode (where it was loaded from) */
        PyObject *co_name;          /* unicode (name, for reference) */
        int co_firstlineno;         /* first source line number */
        PyObject *co_lnotab;        /* string (encoding addr<->lineno mapping)
                                       See Objects/lnotab_notes.txt for details. */
        void *co_zombieframe;       /* for optimization only (see frameobject.c) */
        PyObject *co_weakreflist;   /* to support weakrefs to code objects */
    } PyCodeObject;
Inspecting the members of code
Python exposes these fields through a thin wrapper, so we can look at them directly. An example:

    >>> c = compile("1+2", "test.py", "single")
    >>> c.co_argcount
    0
    >>> c.co_code
    b'd\x03\x00Fd\x02\x00S'
    >>> c.co_consts
    (1, 2, None, 3)
    >>> c.co_name
    '<module>'
    >>> c.co_filename
    'test.py'

Here co_code is the bytecode itself: d\x03\x00Fd\x02\x00S
So how do we read it?

Bytecode
co_code written out in decimal: 100 3 0 70 100 2 0 83
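To see how those eight numbers split into instructions, here is a rough hand decoder for exactly the byte string shown above. It is only a sketch under stated assumptions: it uses the old layout (before Python 3.6) in which an opcode of 90 or more is followed by a two-byte little-endian argument, and its OPNAMES table is a hand-written excerpt holding just the three opcodes this example needs. Newer interpreters use a different layout and different opcode numbers, so this will not decode the co_code they produce.

```python
# Hand-written excerpt of the opcode table for the old (Python 3.2 era)
# format; only the three opcodes used by this example are listed.
OPNAMES = {100: "LOAD_CONST", 70: "PRINT_EXPR", 83: "RETURN_VALUE"}

# In the old format, opcodes >= 90 are followed by a 2-byte argument.
HAVE_ARGUMENT = 90


def decode(co_code):
    """Split old-format bytecode into (offset, name, argument) tuples."""
    instructions = []
    i = 0
    while i < len(co_code):
        op = co_code[i]
        name = OPNAMES.get(op, "<%d>" % op)
        if op >= HAVE_ARGUMENT:
            # The argument is stored low byte first.
            arg = co_code[i + 1] | (co_code[i + 2] << 8)
            instructions.append((i, name, arg))
            i += 3
        else:
            instructions.append((i, name, None))
            i += 1
    return instructions


raw = b'd\x03\x00Fd\x02\x00S'
print(list(raw))  # the decimal form of the bytes
for instruction in decode(raw):
    print(instruction)
```

The argument of each LOAD_CONST is an index into co_consts, so 3 selects the constant 3 and 2 selects None in the tuple (1, 2, None, 3).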
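The opcodes are defined in the file Include/opcode.h.

Reading raw opcode numbers like this is painful, though. Fortunately, Python provides the dis module.

dis
Let's use it on the earlier example:

    >>> c = compile("1+2", "test.py", "single")
    >>> import dis
    >>> dis.dis(c)
      1           0 LOAD_CONST               3 (3)
                  3 PRINT_EXPR
                  4 LOAD_CONST               2 (None)
                  7 RETURN_VALUE

Clear at a glance. The leading 1 is the source line number, and the number in front of each opcode is its index in co_code.

dis is a very useful tool, though I have not yet learned how to make good use of it.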
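One possible way to go beyond printing a listing is dis.get_instructions() (available since Python 3.4), which yields the same information as objects that a script can inspect. A small sketch; the function add is a made-up example, and the exact opcode names it compiles to vary between Python versions.

```python
import dis


def add(a, b):
    """Made-up function to disassemble."""
    return a + b


# Each item is a dis.Instruction with fields such as offset, opname, argrepr.
instructions = list(dis.get_instructions(add))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

opnames = [ins.opname for ins in instructions]
```

Collecting the names this way makes it easy to compare, for instance, how two versions of a function differ at the bytecode level.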