| Compiler design |
| =============== |
| |
| Abstract |
| -------- |
| |
| In CPython, the compilation from source code to bytecode involves several steps: |
| |
| 1. Tokenize the source code [Parser/lexer/](../Parser/lexer) |
| and [Parser/tokenizer/](../Parser/tokenizer). |
| 2. Parse the stream of tokens into an Abstract Syntax Tree |
| [Parser/parser.c](../Parser/parser.c). |
| 3. Transform AST into an instruction sequence |
| [Python/compile.c](../Python/compile.c). |
| 4. Construct a Control Flow Graph and apply optimizations to it |
| [Python/flowgraph.c](../Python/flowgraph.c). |
| 5. Emit bytecode based on the Control Flow Graph |
| [Python/assemble.c](../Python/assemble.c). |
| |
| This document outlines how these steps of the process work. |
| |
| This document describes parsing only in enough depth to explain what is needed |
| for understanding compilation. It provides a detailed, though not |
| exhaustive, view of how the entire system works. You will most likely need |
| to read some source code to have an exact understanding of all details. |
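
The stages listed above can be observed from Python itself. The following is a minimal sketch that uses the public `ast.parse()`, `compile()` and `dis.dis()` interfaces rather than the internal C functions; the source string and the `<example>` filename are arbitrary choices made for illustration.

```python
import ast
import dis

source = "x = 1 + 2\n"

# Steps 1-2: tokenize and parse the source into an AST.
tree = ast.parse(source, filename="<example>")

# Steps 3-5: turn the AST into a code object that holds the bytecode.
code = compile(tree, filename="<example>", mode="exec")

print(ast.dump(tree, indent=2))
dis.dis(code)
```

The exact instructions printed by `dis.dis()` differ between Python versions, so the output should be read as a snapshot of one particular interpreter.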
| |
| |
| Parsing |
| ======= |
| |
| As of Python 3.9, Python's parser is a PEG parser of a somewhat |
| unusual design. It is unusual in the sense that the parser's input is a stream |
| of tokens rather than a stream of characters, which is more common with PEG |
| parsers. |
| |
| The grammar file for Python can be found in |
| [Grammar/python.gram](../Grammar/python.gram). |
| The definitions for literal tokens (such as `:`, numbers, etc.) can be found in |
| [Grammar/Tokens](../Grammar/Tokens). Various C files, including |
| [Parser/parser.c](../Parser/parser.c) are generated from these. |
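
To get a feel for the token stream that the parser consumes, the standard library `tokenize` and `token` modules can be used. This is only an approximation: the Python-level tokenizer reports a few token kinds (such as comments) that the C parser does not use, and the sample source line is an arbitrary choice.

```python
import io
import token
import tokenize

source = "if x: y = 1\n"

# exact_type distinguishes individual operators such as ':' and '='.
token_names = [
    token.tok_name[tok.exact_type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
print(token_names)
```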
| |
| See Also: |
| |
| * [Guide to the parser](parser.md) |
| for a detailed description of the parser. |
| |
| * [Changing CPython’s grammar](changing_grammar.md) |
| for a detailed description of the grammar. |
| |
| |
| Abstract syntax trees (AST) |
| =========================== |
| |
| |
| The abstract syntax tree (AST) is a high-level representation of the |
| program structure that does not need to contain the source code; |
| it can be thought of as an abstract representation of the source code. The |
| AST nodes are specified using the Zephyr Abstract |
| Syntax Description Language (ASDL) [^1], [^2]. |
| |
| The definition of the AST nodes for Python is found in the file |
| [Parser/Python.asdl](../Parser/Python.asdl). |
| |
| Each AST node (representing statements, expressions, and several |
| specialized types, like list comprehensions and exception handlers) is |
| defined by the ASDL. Most definitions in the AST correspond to a |
| particular source construct, such as an 'if' statement or an attribute |
| lookup. The definition is independent of its realization in any |
| particular programming language. |
| |
| The following fragment of the Python ASDL construct demonstrates the |
| approach and syntax: |
| |
| ``` |
| module Python |
| { |
| stmt = FunctionDef(identifier name, arguments args, stmt* body, |
| expr* decorators) |
| | Return(expr? value) | Yield(expr? value) |
| attributes (int lineno) |
| } |
| ``` |
| |
| The preceding example describes two different kinds of statements and an |
| expression: function definitions, return statements, and yield expressions. |
| All three kinds are considered of type `stmt` as shown by `|` separating |
| the various kinds. They all take arguments of various kinds and amounts. |
| |
| Modifiers on the argument type specify the number of values needed; `?` |
| means it is optional, `*` means 0 or more, while no modifier means only one |
| value for the argument and it is required. `FunctionDef`, for instance, |
| takes an `identifier` for the *name*, `arguments` for *args*, zero or more |
| `stmt` arguments for *body*, and zero or more `expr` arguments for |
| *decorators*. |
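
The effect of these modifiers is visible on the node classes exposed by the `ast` module. The sketch below assumes the current definition in `Parser/Python.asdl`, where the decorator field is named `decorator_list` rather than `decorators` as in the simplified fragment above.

```python
import ast

tree = ast.parse(
    "@decorator\n"
    "def f(a, b):\n"
    "    return\n"
)
func = tree.body[0]

# `stmt* body` and `expr* decorator_list`: zero or more nodes, exposed as lists.
print(type(func.body).__name__, type(func.decorator_list).__name__)

# `arguments args`: exactly one node, not a sequence.
print(type(func.args).__name__)

# `expr? value`: optional, so a bare `return` stores None.
ret = func.body[0]
print(ret.value)

print(ast.FunctionDef._fields)
```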
| |
| Note that `arguments` is itself a node type, so *args* is |
| represented as a single AST node and not as a sequence of nodes, |
| unlike *body*, which is a sequence of `stmt` nodes. |
| |
| All three kinds also have an 'attributes' argument; this is shown by the |
| fact that 'attributes' lacks a '|' before it. |
| |
| The statement definitions above generate the following C structure type: |
| |
| |
| ``` |
| typedef struct _stmt *stmt_ty; |
| |
| struct _stmt { |
|     enum { FunctionDef_kind=1, Return_kind=2, Yield_kind=3 } kind; |
|     union { |
|         struct { |
|             identifier name; |
|             arguments_ty args; |
|             asdl_seq *body; |
|             asdl_seq *decorators; |
|         } FunctionDef; |
| |
|         struct { |
|             expr_ty value; |
|         } Return; |
| |
|         struct { |
|             expr_ty value; |
|         } Yield; |
|     } v; |
|     int lineno; |
| }; |
| ``` |
| |
| Also generated are a series of constructor functions that allocate (in |
| this case) a `stmt_ty` struct with the appropriate initialization. The |
| `kind` field specifies which component of the union is initialized. The |
| `FunctionDef()` constructor function sets 'kind' to `FunctionDef_kind` and |
| initializes the *name*, *args*, *body*, and *attributes* fields. |
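
The Python-level node classes in the `ast` module play a similar role to the generated C constructors, although they are not the same functions. As a rough sketch, the following builds the tree for `result = 6 * 7` by hand; `ast.fix_missing_locations()` is used here to fill in the location attributes that the C constructors receive as explicit arguments.

```python
import ast

# Build the AST for `result = 6 * 7` by calling node constructors directly.
assign = ast.Assign(
    targets=[ast.Name(id="result", ctx=ast.Store())],
    value=ast.BinOp(
        left=ast.Constant(value=6),
        op=ast.Mult(),
        right=ast.Constant(value=7),
    ),
)
module = ast.Module(body=[assign], type_ignores=[])

# Fill in lineno/col_offset on every node that lacks them.
ast.fix_missing_locations(module)

namespace = {}
exec(compile(module, "<constructed>", "exec"), namespace)
print(namespace["result"])
```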
| |
| See also [Green Tree Snakes - The missing Python AST docs]( |
| https://greentreesnakes.readthedocs.io/en/latest) by Thomas Kluyver. |
| |
| Memory management |
| ================= |
| |
| Before discussing the actual implementation of the compiler, a discussion of |
| how memory is handled is in order. To make memory management simple, an **arena** |
| is used that pools memory in a single location for easy |
| allocation and removal. This removes the need for explicit memory |
| deallocation. Because every allocation the compiler makes |
| is registered with the arena, a single call to free the arena is all |
| that is needed to completely free all memory used by the compiler. |
| |
| In general, unless you are working on the critical core of the compiler, memory |
| management can be completely ignored. But if you are working at either the |
| very beginning of the compiler or the end, you need to care about how the arena |
| works. All code relating to the arena is in either |
| [Include/internal/pycore_pyarena.h](../Include/internal/pycore_pyarena.h) |
| or [Python/pyarena.c](../Python/pyarena.c). |
| |
| `PyArena_New()` will create a new arena. The returned `PyArena` structure |
| will store pointers to all memory given to it. This does the bookkeeping of |
| what memory needs to be freed when the compiler is finished with the memory it |
| used. That freeing is done with `PyArena_Free()`. This only needs to be |
| called in strategic areas where the compiler exits. |
| |
| As stated above, in general you should not have to worry about memory |
| management when working on the compiler. The technical details of memory |
| management have been designed to be hidden from you for most cases. |
| |
| The only exception comes about when managing a PyObject. Since the rest |
| of Python uses reference counting, extra support has been added |
| to the arena to clean up each PyObject that was allocated. These cases |
| are very rare. However, if you've allocated a PyObject, you must tell |
| the arena about it by calling `PyArena_AddPyObject()`. |
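
The idea behind the arena can be modelled in a few lines of Python. This is a conceptual toy only: the `Arena` class and its method names are hypothetical and do not correspond to the C implementation in `Python/pyarena.c`.

```python
class Arena:
    """Toy model of an arena: it remembers every allocation so that
    a single free() call releases all of them (hypothetical class)."""

    def __init__(self):
        self._blocks = []
        self.freed = False

    def malloc(self, size):
        # Hand out a zeroed buffer and record it for later bulk release.
        block = bytearray(size)
        self._blocks.append(block)
        return block

    def add_object(self, obj):
        # Similar in spirit to registering a PyObject with the arena.
        self._blocks.append(obj)
        return obj

    def free(self):
        # One call drops every reference the arena holds.
        self._blocks.clear()
        self.freed = True

    def __len__(self):
        return len(self._blocks)


arena = Arena()
node_a = arena.malloc(16)
node_b = arena.malloc(32)
name = arena.add_object("identifier")
count_before = len(arena)
arena.free()
print(count_before, len(arena))
```

Code that allocates through the arena never frees individual nodes; only the owner of the arena decides when everything goes away.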
| |
| |
| Source code to AST |
| ================== |
| |
| The AST is generated from source code using the function |
| `_PyParser_ASTFromString()` or `_PyParser_ASTFromFile()` |
| [Parser/peg_api.c](../Parser/peg_api.c). |
| |
| After some checks, a helper function in |
| [Parser/parser.c](../Parser/parser.c) |
| begins applying production rules to the source code it receives, converting source |
| code to tokens and matching these tokens recursively to their corresponding rule. The |
| production rule's corresponding rule function is called on every match. These rule |
| functions follow the format `xx_rule`, where *xx* is the grammar rule |
| that the function handles. They are automatically derived from |
| [Grammar/python.gram](../Grammar/python.gram) by |
| [Tools/peg_generator/pegen/c_generator.py](../Tools/peg_generator/pegen/c_generator.py). |
| |
| Each rule function in turn creates an AST node as it goes along. It does this |
| by allocating all the new nodes it needs, calling the proper AST node creation |
| functions for any required supporting functions and connecting them as needed. |
| This continues until all nonterminal symbols are replaced with terminals. If an |
| error occurs, the rule functions backtrack and try another rule function. If |
| there are no more rules, an error is set and the parsing ends. |
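
The backtracking behaviour can be illustrated with a miniature hand-written parser. Everything in this sketch is hypothetical (the `ToyParser` class, its simplified `import_stmt` rule and the tuple it returns); it only mirrors the general shape of a rule function that tries alternatives in order and resets the token position when one fails.

```python
class ToyParser:
    """Hypothetical miniature of a generated PEG parser working on tokens."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, value):
        # Consume the next token if it matches, otherwise fail with None.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == value:
            self.pos += 1
            return value
        return None

    def name(self):
        # Consume the next token if it looks like an identifier.
        if self.pos < len(self.tokens) and self.tokens[self.pos].isidentifier():
            self.pos += 1
            return self.tokens[self.pos - 1]
        return None

    def import_stmt_rule(self):
        # import_stmt: 'import' NAME 'as' NAME | 'import' NAME
        mark = self.pos
        if (self.expect("import") and (mod := self.name())
                and self.expect("as") and (alias := self.name())):
            return ("Import", mod, alias)
        self.pos = mark  # backtrack before trying the next alternative
        if self.expect("import") and (mod := self.name()):
            return ("Import", mod, None)
        self.pos = mark  # no alternative matched: leave the position untouched
        return None


print(ToyParser(["import", "sys"]).import_stmt_rule())
print(ToyParser(["import", "sys", "as", "s"]).import_stmt_rule())
```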
| |
| The AST node creation helper functions have the name `_PyAST_{xx}` |
| where *xx* is the AST node that the function creates. These are defined by the |
| ASDL grammar and contained in [Python/Python-ast.c](../Python/Python-ast.c) |
| (which is generated by [Parser/asdl_c.py](../Parser/asdl_c.py) |
| from [Parser/Python.asdl](../Parser/Python.asdl)). |
| This all leads to a sequence of AST nodes stored in `asdl_seq` structs. |
| |
| To demonstrate everything explained so far, here's the |
| rule function responsible for a simple named import statement such as |
| `import sys`. Note that error-checking and debugging code has been |
| omitted. Removed parts are represented by `...`. |
| Furthermore, some comments have been added for explanation. These comments |
| may not be present in the actual code. |
| |
| |
| ``` |
| // This is the production rule (from python.gram) the rule function |
| // corresponds to: |
| // import_name: 'import' dotted_as_names |
| static stmt_ty |
| import_name_rule(Parser *p) |
| { |
|     ... |
|     stmt_ty _res = NULL; |
|     { // 'import' dotted_as_names |
|         ... |
|         Token * _keyword; |
|         asdl_alias_seq* a; |
|         // The tokenizing steps. |
|         if ( |
|             (_keyword = _PyPegen_expect_token(p, 513))  // token='import' |
|             && |
|             (a = dotted_as_names_rule(p))  // dotted_as_names |
|         ) |
|         { |
|             ... |
|             // Generate an AST for the import statement. |
|             _res = _PyAST_Import ( a , ...); |
|             ... |
|             goto done; |
|         } |
|         ... |
|     } |
|     _res = NULL; |
|   done: |
|     ... |
|     return _res; |
| } |
| ``` |
| |
| |
| To improve backtracking performance, some rules (chosen by applying a |
| `(memo)` flag in the grammar file) are memoized. Each rule function checks if |
| a memoized version exists and returns that if so, else it continues in the |
| manner stated in the previous paragraphs. |
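
Memoization of a rule can be sketched as a cache keyed by rule name and token position that stores both the result and the position reached. The `memoize` decorator and `MemoParser` class below are hypothetical illustrations of that idea, not the mechanism used by the generated C code.

```python
import functools


def memoize(rule):
    """Cache a rule's result per input position (hypothetical decorator)."""
    @functools.wraps(rule)
    def wrapper(parser):
        key = (rule.__name__, parser.pos)
        if key in parser.memo:
            # Reuse the earlier result and jump to where that parse ended.
            result, parser.pos = parser.memo[key]
            return result
        result = rule(parser)
        parser.memo[key] = (result, parser.pos)
        return result
    return wrapper


class MemoParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.memo = {}
        self.calls = 0  # counts real (non-cached) executions of atom_rule

    @memoize
    def atom_rule(self):
        self.calls += 1
        if self.pos < len(self.tokens) and self.tokens[self.pos].isdigit():
            self.pos += 1
            return int(self.tokens[self.pos - 1])
        return None

    def sum_rule(self):
        # sum: atom '+' atom | atom
        mark = self.pos
        left = self.atom_rule()
        if (left is not None and self.pos < len(self.tokens)
                and self.tokens[self.pos] == "+"):
            self.pos += 1
            right = self.atom_rule()
            if right is not None:
                return left + right
        self.pos = mark
        return self.atom_rule()  # second attempt is served from the memo table


parser = MemoParser(["7"])
print(parser.sum_rule(), parser.calls)
```

After the first alternative fails, the second alternative asks for the same `atom` at the same position, and the cached entry spares a second parse.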
| |
| There are macros for creating and using `asdl_xx_seq *` types, where *xx* is |
| a type of the ASDL sequence. Three main types are defined |
| manually -- `generic`, `identifier` and `int`. These types are found in |
| [Python/asdl.c](../Python/asdl.c) and its corresponding header file |
| [Include/internal/pycore_asdl.h](../Include/internal/pycore_asdl.h). |
| Functions and macros for creating `asdl_xx_seq *` types are as follows: |
| |
| * `_Py_asdl_generic_seq_new(Py_ssize_t, PyArena *)`: |
| Allocate memory for an `asdl_generic_seq` of the specified length |
| * `_Py_asdl_identifier_seq_new(Py_ssize_t, PyArena *)`: |
| Allocate memory for an `asdl_identifier_seq` of the specified length |
| * `_Py_asdl_int_seq_new(Py_ssize_t, PyArena *)`: |
| Allocate memory for an `asdl_int_seq` of the specified length |
| |
| In addition to the three types mentioned above, some ASDL sequence types are |
| automatically generated by [Parser/asdl_c.py](../Parser/asdl_c.py) and found in |
| [Include/internal/pycore_ast.h](../Include/internal/pycore_ast.h). |
| Macros for using both manually defined and automatically generated ASDL |
| sequence types are as follows: |
| |
| * `asdl_seq_GET(asdl_xx_seq *, int)`: |
| Get item held at a specific position in an `asdl_xx_seq` |
| * `asdl_seq_SET(asdl_xx_seq *, int, stmt_ty)`: |
| Set a specific index in an `asdl_xx_seq` to the specified value |
| |
| Untyped counterparts exist for some of the typed macros. These are useful |
| when a function needs to manipulate a generic ASDL sequence: |
| |
| * `asdl_seq_GET_UNTYPED(asdl_seq *, int)`: |
| Get item held at a specific position in an `asdl_seq` |
| * `asdl_seq_SET_UNTYPED(asdl_seq *, int, stmt_ty)`: |
| Set a specific index in an `asdl_seq` to the specified value |
| * `asdl_seq_LEN(asdl_seq *)`: |
| Return the length of an `asdl_seq` or `asdl_xx_seq` |
| |
| Note that typed macros and functions are recommended over their untyped |
| counterparts. Typed macros carry out checks in debug mode and aid |
| debugging errors caused by incorrectly casting from `void *`. |
| |
| If you are working with statements, you must also worry about keeping |
| track of what line number generated the statement. Currently the line |
| number is passed as the last parameter to each `stmt_ty` function. |
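
The recorded locations can be inspected from Python through the `lineno`, `col_offset`, `end_lineno` and `end_col_offset` attributes of statement and expression nodes. A small sketch, using an arbitrary source string:

```python
import ast

source = "a = 1\n\nif a:\n    b = a\n"
tree = ast.parse(source)

# Each top-level statement remembers where it starts and ends.
lines = [(type(node).__name__, node.lineno, node.end_lineno) for node in tree.body]
print(lines)

# Nested statements carry their own location, including the column.
inner = tree.body[1].body[0]
print(inner.lineno, inner.col_offset)
```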
| |
| See also [PEP 617: New PEG parser for CPython](https://peps.python.org/pep-0617/). |
| |
| |
| Control flow graphs |
| =================== |
| |
| A **control flow graph** (often referenced by its acronym, **CFG**) is a |
| directed graph that models the flow of a program. A node of a CFG is |
| not an individual bytecode instruction, but instead represents a |
| sequence of bytecode instructions that always execute sequentially. |
| Each node is called a *basic block* and must always execute from |
| start to finish, with a single entry point at the beginning and a |
| single exit point at the end. If some bytecode instruction *a* needs |
| to jump to some other bytecode instruction *b*, then *a* must occur at |
| the end of its basic block, and *b* must occur at the start of its |
| basic block. |
| |
| As an example, consider the following code snippet: |
| |
| ```python |
| if x < 10: |
|     f1() |
|     f2() |
| else: |
|     g() |
| end() |
| ``` |
| |
| The `x < 10` guard is represented by its own basic block that |
| compares `x` with `10` and then ends in a conditional jump based on |
| the result of the comparison. This conditional jump allows the block |
| to point to both the body of the `if` and the body of the `else`. The |
| `if` basic block contains the `f1()` and `f2()` calls and points to |
| the `end()` basic block. The `else` basic block contains the `g()` |
| call and similarly points to the `end()` block. |
| |
| Note that more complex code in the guard, the `if` body, or the `else` |
| body may be represented by multiple basic blocks. For instance, |
| short-circuiting boolean logic in a guard like `if x or y:` |
| will produce one basic block that tests the truth value of `x` |
| and then points both (1) to the start of the `if` body and (2) to |
| a different basic block that tests the truth value of `y`. |
| |
| CFGs are useful as an intermediate representation of the code because |
| they are a convenient data structure for optimizations. |
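
Basic blocks can be approximated from the final bytecode with the `dis` module. The sketch below is a simplification: it works on the emitted bytecode rather than the compiler's internal CFG, treats only jump instructions and jump targets as block boundaries, and the `basic_blocks()` helper is a hypothetical name introduced for illustration.

```python
import dis


def basic_blocks(code):
    """Partition the instructions of `code` into basic blocks."""
    instructions = list(dis.get_instructions(code))
    jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)

    # A leader is the first instruction of a basic block.
    leaders = {instructions[0].offset}
    for index, instr in enumerate(instructions):
        if instr.is_jump_target:
            leaders.add(instr.offset)
        if instr.opcode in jump_opcodes and index + 1 < len(instructions):
            leaders.add(instructions[index + 1].offset)

    blocks = []
    for instr in instructions:
        if instr.offset in leaders:
            blocks.append([])
        blocks[-1].append(instr)
    return blocks


source = """\
if x < 10:
    f1()
    f2()
else:
    g()
end()
"""
blocks = basic_blocks(compile(source, "<example>", "exec"))
for number, block in enumerate(blocks):
    print(number, [instr.opname for instr in block])
```

The number of blocks and the opcodes in them depend on the Python version, since the optimizer is free to rearrange or duplicate blocks.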
| |
| AST to CFG to bytecode |
| ====================== |
| |
| The conversion of an `AST` to bytecode is initiated by a call to the function |
| `_PyAST_Compile()` in [Python/compile.c](../Python/compile.c). |
| |
| The first step is to construct the symbol table. This is implemented by |
| `_PySymtable_Build()` in [Python/symtable.c](../Python/symtable.c). |
| This function begins by entering the starting code block for the AST (passed-in) |
| and then calling the proper `symtable_visit_{xx}` function (with *xx* being the |
| AST node type). Next, the AST is walked, and the various code blocks that |
| delineate the reach of a local variable are entered and exited using |
| `symtable_enter_block()` and `symtable_exit_block()`, respectively. |
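
The resulting symbol table can be examined from Python through the `symtable` module. A short sketch with an arbitrary function, showing how a parameter, a local variable and a global name are classified:

```python
import symtable

source = """\
def f(a):
    b = a
    return g(b)
"""
module_table = symtable.symtable(source, "<example>", "exec")

# The function body is a nested block inside the module block.
function_table = module_table.get_children()[0]

param = function_table.lookup("a")
local = function_table.lookup("b")
global_name = function_table.lookup("g")

print(function_table.get_name(), function_table.get_type())
print(param.is_parameter(), local.is_local(), global_name.is_global())
```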
| |
| Once the symbol table is created, the `AST` is transformed by `compiler_codegen()` |
| in [Python/compile.c](../Python/compile.c) into a sequence of pseudo instructions. |
| These are similar to bytecode, but in some cases they are more abstract, and are |
| resolved later into actual bytecode. The construction of this instruction sequence |
| is handled by several functions that break the task down by various AST node types. |
| The functions are all named `compiler_visit_{xx}` where *xx* is the name of the node |
| type (such as `stmt`, `expr`, etc.). Each function receives a `struct compiler *` |
| and `{xx}_ty` where *xx* is the AST node type. Typically these functions |
| consist of a large 'switch' statement, branching based on the kind of |
| node type passed to it. Simple things are handled inline in the |
| 'switch' statement with more complex transformations farmed out to other |
| functions named `compiler_{xx}` with *xx* being a descriptive name of what is |
| being handled. |
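
The dispatch-by-node-kind structure can be imitated in Python with `ast.NodeVisitor`. The `ToyCodegen` class below is a hypothetical sketch covering only a tiny expression subset, and its opcode names are chosen for readability rather than taken from the real instruction set.

```python
import ast


class ToyCodegen(ast.NodeVisitor):
    """Emit pseudo-instructions for a tiny expression subset."""

    BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUBTRACT", ast.Mult: "MULTIPLY"}

    def __init__(self):
        self.instructions = []

    def addop(self, opname, arg=None):
        self.instructions.append((opname, arg))

    def visit_Expression(self, node):
        self.visit(node.body)

    def visit_Constant(self, node):
        self.addop("LOAD_CONST", node.value)

    def visit_Name(self, node):
        self.addop("LOAD_NAME", node.id)

    def visit_BinOp(self, node):
        # Operands are evaluated first, then the operator is applied.
        self.visit(node.left)
        self.visit(node.right)
        self.addop("BINARY_OP", self.BINARY_OPS[type(node.op)])

    def generic_visit(self, node):
        # Anything outside the supported subset is rejected.
        raise NotImplementedError(type(node).__name__)


codegen = ToyCodegen()
codegen.visit(ast.parse("a + 2 * b", mode="eval"))
for instruction in codegen.instructions:
    print(instruction)
```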
| |
| When transforming an arbitrary AST node, use the `VISIT()` macro. |
| The appropriate `compiler_visit_{xx}` function is called, based on the |
| node type passed as the second argument (so `VISIT({c}, expr, {node})` calls |
| `compiler_visit_expr({c}, {node})`). The `VISIT_SEQ()` macro is very similar, |
| but is called on AST node sequences (those values that were created as |
| arguments to a node that used the '*' modifier). |
| |
| Emission of bytecode is handled by the following macros: |
| |
| * `ADDOP(struct compiler *, location, int)`: |
| add a specified opcode |
| * `ADDOP_IN_SCOPE(struct compiler *, location, int)`: |
| like `ADDOP`, but also exits current scope; used for adding return value |
| opcodes in lambdas and closures |
| * `ADDOP_I(struct compiler *, location, int, Py_ssize_t)`: |
| add an opcode that takes an integer argument |
| * `ADDOP_O(struct compiler *, location, int, PyObject *, TYPE)`: |
| add an opcode with the proper argument based on the position of the |
| specified PyObject in PyObject sequence object, but with no handling of |
| mangled names; used for when you |
| need to do named lookups of objects such as globals, consts, or |
| parameters where name mangling is not possible and the scope of the |
| name is known; *TYPE* is the name of PyObject sequence |
| (`names` or `varnames`) |
| * `ADDOP_N(struct compiler *, location, int, PyObject *, TYPE)`: |
| just like `ADDOP_O`, but steals a reference to PyObject |
| * `ADDOP_NAME(struct compiler *, location, int, PyObject *, TYPE)`: |
| just like `ADDOP_O`, but name mangling is also handled; used for |
| attribute loading or importing based on name |
| * `ADDOP_LOAD_CONST(struct compiler *, location, PyObject *)`: |
| add the `LOAD_CONST` opcode with the proper argument based on the |
| position of the specified PyObject in the consts table. |
| * `ADDOP_LOAD_CONST_NEW(struct compiler *, location, PyObject *)`: |
| just like `ADDOP_LOAD_CONST`, but steals a reference to PyObject |
| * `ADDOP_JUMP(struct compiler *, location, int, basicblock *)`: |
| create a jump to a basic block |
| |
| The `location` argument is a struct with the source location to be |
| associated with this instruction. It is typically extracted from an |
| `AST` node with the `LOC` macro. `NO_LOCATION` can be used |
| for *synthetic* instructions, which we do not associate with a line |
| number at this stage. For example, the implicit `return None` |
| which is added at the end of a function is not associated with any |
| line in the source code. |
| |
| There are several helper functions that will emit pseudo-instructions |
| and are named `compiler_{xx}()` where *xx* is what the function helps |
| with (`list`, `boolop`, etc.). A rather useful one is `compiler_nameop()`. |
| This function looks up the scope of a variable and, based on the |
| expression context, emits the proper opcode to load, store, or delete |
| the variable. |
| |
| Once the instruction sequence is created, it is transformed into a CFG |
| by `_PyCfg_FromInstructionSequence()`. Then `_PyCfg_OptimizeCodeUnit()` |
| applies various peephole optimizations, and |
| `_PyCfg_OptimizedCfgToInstructionSequence()` converts the optimized `CFG` |
| back into an instruction sequence. These conversions and optimizations are |
| implemented in [Python/flowgraph.c](../Python/flowgraph.c). |
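
A peephole optimization looks at a short window of adjacent instructions and replaces it with something cheaper. The following toy pass is a sketch under strong assumptions: the instruction names are hypothetical, the input is taken to be a single basic block with no jump targets inside it, and the pass is not one of the optimizations implemented in `Python/flowgraph.c`.

```python
def fold_constants(instructions):
    """Toy peephole pass: fold LOAD_CONST, LOAD_CONST, BINARY_ADD
    into a single LOAD_CONST and drop NOPs."""
    result = []
    for opname, arg in instructions:
        if (opname == "BINARY_ADD" and len(result) >= 2
                and result[-1][0] == "LOAD_CONST"
                and result[-2][0] == "LOAD_CONST"):
            right = result.pop()[1]
            left = result.pop()[1]
            result.append(("LOAD_CONST", left + right))
        elif opname == "NOP":
            continue  # NOPs have no effect in this toy model
        else:
            result.append((opname, arg))
    return result


block = [
    ("LOAD_CONST", 1),
    ("NOP", None),
    ("LOAD_CONST", 2),
    ("BINARY_ADD", None),
    ("LOAD_CONST", 3),
    ("BINARY_ADD", None),
    ("STORE_NAME", "x"),
]
optimized = fold_constants(block)
print(optimized)
```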
| |
| Finally, the sequence of pseudo-instructions is converted into actual |
| bytecode. This includes transforming pseudo instructions into actual instructions, |
| converting jump targets from logical labels to relative offsets, and |
| construction of the [exception table](exception_handling.md) and |
| [locations table](code_objects.md#source-code-locations). |
| The bytecode and tables are then wrapped into a `PyCodeObject` along with additional |
| metadata, including the `consts` and `names` arrays, information about the function, |
| and a reference to the source code (filename, etc.). All of this is implemented by |
| `_PyAssemble_MakeCodeObject()` in [Python/assemble.c](../Python/assemble.c). |
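
The conversion of jump targets from labels to relative offsets can be sketched as a two-pass procedure. This toy assembler is hypothetical: it assumes every instruction occupies exactly one slot and that offsets are measured from the instruction following the jump, which glosses over details of the real bytecode layout.

```python
def resolve_labels(pseudo_instructions):
    """Replace label operands of jumps with relative offsets (toy assembler)."""
    # First pass: drop label markers, remembering the index each one points at.
    label_to_index = {}
    real = []
    for opname, arg in pseudo_instructions:
        if opname == "LABEL":
            label_to_index[arg] = len(real)
        else:
            real.append((opname, arg))

    # Second pass: a jump's offset is measured from the instruction after it.
    resolved = []
    for index, (opname, arg) in enumerate(real):
        if "JUMP" in opname:
            arg = label_to_index[arg] - (index + 1)
        resolved.append((opname, arg))
    return resolved


program = [
    ("LOAD_NAME", "x"),
    ("POP_JUMP_IF_FALSE", "else"),
    ("LOAD_NAME", "f"),
    ("JUMP", "end"),
    ("LABEL", "else"),
    ("LOAD_NAME", "g"),
    ("LABEL", "end"),
    ("RETURN", None),
]
assembled = resolve_labels(program)
print(assembled)
```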
| |
| |
| Code objects |
| ============ |
| |
| The result of `_PyAST_Compile()` is a `PyCodeObject` which is defined in |
| [Include/cpython/code.h](../Include/cpython/code.h). |
| And with that you now have executable Python bytecode! |
| |
| The code objects (byte code) are executed in `_PyEval_EvalFrameDefault()` |
| in [Python/ceval.c](../Python/ceval.c). |
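
From Python, a code object is what `compile()` returns, and its attributes expose the tables described above. A brief sketch, with an arbitrary source string and namespace:

```python
source = "total = price * 2.5\n"
code = compile(source, "<example>", "exec")

# Names and constants referenced by the bytecode live in side tables.
names = code.co_names
consts = code.co_consts
raw_bytecode = code.co_code

print(names, consts, len(raw_bytecode))

# Executing the code object runs its bytecode in the given namespace.
namespace = {"price": 20}
exec(code, namespace)
print(namespace["total"])
```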
| |
| Important files |
| =============== |
| |
| * [Parser/](../Parser) |
| |
| * [Parser/Python.asdl](../Parser/Python.asdl): |
| ASDL syntax file. |
| |
| * [Parser/asdl.py](../Parser/asdl.py): |
| Parser for ASDL definition files. |
| Reads in an ASDL description and parses it into an AST that describes it. |
| |
| * [Parser/asdl_c.py](../Parser/asdl_c.py): |
| Generate C code from an ASDL description. Generates |
| [Python/Python-ast.c](../Python/Python-ast.c) and |
| [Include/internal/pycore_ast.h](../Include/internal/pycore_ast.h). |
| |
| * [Parser/parser.c](../Parser/parser.c): |
| The new PEG parser introduced in Python 3.9. Generated by |
| [Tools/peg_generator/pegen/c_generator.py](../Tools/peg_generator/pegen/c_generator.py) |
| from the grammar [Grammar/python.gram](../Grammar/python.gram). |
| Creates the AST from source code. Rule functions for their corresponding production |
| rules are found here. |
| |
| * [Parser/peg_api.c](../Parser/peg_api.c): |
| Contains high-level functions which are used by the interpreter to create |
| an AST from source code. |
| |
| * [Parser/pegen.c](../Parser/pegen.c): |
| Contains helper functions which are used by functions in |
| [Parser/parser.c](../Parser/parser.c) to construct the AST. Also contains |
| helper functions which help raise better error messages when parsing source code. |
| |
| * [Parser/pegen.h](../Parser/pegen.h): |
| Header file for the corresponding [Parser/pegen.c](../Parser/pegen.c). |
| Also contains definitions of the `Parser` and `Token` structs. |
| |
| * [Python/](../Python) |
| |
| * [Python/Python-ast.c](../Python/Python-ast.c): |
| Creates C structs corresponding to the ASDL types. Also contains code for |
| marshalling AST nodes (core ASDL types have marshalling code in |
| [Python/asdl.c](../Python/asdl.c)). |
| File automatically generated by [Parser/asdl_c.py](../Parser/asdl_c.py). |
| This file must be committed separately after every grammar change |
| is committed since the `__version__` value is set to the latest |
| grammar change revision number. |
| |
| * [Python/asdl.c](../Python/asdl.c): |
| Contains code to handle the ASDL sequence type. |
| Also has code to handle marshalling the core ASDL types, such as number |
| and identifier. Used by [Python/Python-ast.c](../Python/Python-ast.c) |
| for marshalling AST nodes. |
| |
| * [Python/ast.c](../Python/ast.c): |
| Used for validating the AST. |
| |
| * [Python/ast_preprocess.c](../Python/ast_preprocess.c): |
| Preprocesses the AST before compiling. |
| |
| * [Python/ast_unparse.c](../Python/ast_unparse.c): |
| Converts the AST expression node back into a string (for string annotations). |
| |
| * [Python/ceval.c](../Python/ceval.c): |
| Executes byte code (aka, eval loop). |
| |
| * [Python/symtable.c](../Python/symtable.c): |
| Generates a symbol table from AST. |
| |
| * [Python/pyarena.c](../Python/pyarena.c): |
| Implementation of the arena memory manager. |
| |
| * [Python/compile.c](../Python/compile.c): |
| Emits pseudo bytecode based on the AST. |
| |
| * [Python/flowgraph.c](../Python/flowgraph.c): |
| Implements peephole optimizations. |
| |
| * [Python/assemble.c](../Python/assemble.c): |
| Constructs a code object from a sequence of pseudo instructions. |
| |
| * [Python/instruction_sequence.c](../Python/instruction_sequence.c): |
| A data structure representing a sequence of bytecode-like pseudo-instructions. |
| |
| * [Include/](../Include) |
| |
| * [Include/cpython/code.h](../Include/cpython/code.h) |
| : Header file for [Objects/codeobject.c](../Objects/codeobject.c); |
| contains definition of `PyCodeObject`. |
| |
| * [Include/opcode.h](../Include/opcode.h) |
| : One of the files that must be modified whenever |
| [Lib/opcode.py](../Lib/opcode.py) is. |
| |
| * [Include/internal/pycore_ast.h](../Include/internal/pycore_ast.h) |
| : Contains the actual definitions of the C structs as generated by |
| [Python/Python-ast.c](../Python/Python-ast.c). |
| Automatically generated by [Parser/asdl_c.py](../Parser/asdl_c.py). |
| |
| * [Include/internal/pycore_asdl.h](../Include/internal/pycore_asdl.h) |
| : Header for the corresponding [Python/asdl.c](../Python/asdl.c). |
| |
| * [Include/internal/pycore_ast.h](../Include/internal/pycore_ast.h) |
| : Declares `_PyAST_Validate()` external (from [Python/ast.c](../Python/ast.c)). |
| |
| * [Include/internal/pycore_symtable.h](../Include/internal/pycore_symtable.h) |
| : Header for [Python/symtable.c](../Python/symtable.c). |
| `struct symtable` and `PySTEntryObject` are defined here. |
| |
| * [Include/internal/pycore_parser.h](../Include/internal/pycore_parser.h) |
| : Header for the corresponding [Parser/peg_api.c](../Parser/peg_api.c). |
| |
| * [Include/internal/pycore_pyarena.h](../Include/internal/pycore_pyarena.h) |
| : Header file for the corresponding [Python/pyarena.c](../Python/pyarena.c). |
| |
| * [Include/opcode_ids.h](../Include/opcode_ids.h) |
| : List of opcodes. Generated from [Python/bytecodes.c](../Python/bytecodes.c) |
| by |
| [Tools/cases_generator/opcode_id_generator.py](../Tools/cases_generator/opcode_id_generator.py). |
| |
| * [Objects/](../Objects) |
| |
| * [Objects/codeobject.c](../Objects/codeobject.c) |
| : Contains PyCodeObject-related code. |
| |
| * [Objects/frameobject.c](../Objects/frameobject.c) |
| : Contains the `frame_setlineno()` function which should determine whether it is allowed |
| to make a jump between two points in a bytecode. |
| |
| * [Lib/](../Lib) |
| |
| * [Lib/opcode.py](../Lib/opcode.py) |
| : opcode utilities exposed to Python. |
| |
| * [Include/internal/pycore_magic_number.h](../Include/internal/pycore_magic_number.h) |
| : Home of the magic number (named `MAGIC_NUMBER`) for bytecode versioning. |
| |
| |
| Objects |
| ======= |
| |
| * [Locations](code_objects.md#source-code-locations): Describes the location table |
| * [Frames](frames.md): Describes frames and the frame stack |
| * [Objects/object_layout.md](../Objects/object_layout.md): Describes object layout for 3.11 and later |
| * [Exception Handling](exception_handling.md): Describes the exception table |
| |
| |
| References |
| ========== |
| |
| [^1]: Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Chris |
| S. Serra. "The Zephyr Abstract Syntax Description Language." |
| In Proceedings of the Conference on Domain-Specific Languages, |
| pp. 213--227, 1997. |
| |
| [^2]: The Zephyr Abstract Syntax Description Language: |
| https://www.cs.princeton.edu/research/techreps/254 |