Intro

Python and Static Analysis

/spoiler/: Should be Python or Static Analysis!

Tooling for Static Analysis of Python Programs

Serge « sans paille » Guelton

Compiler Engineer / Wood Craft Lover / RedHat employee

EuroPython'20 — 24th of July 2020

Prelude

For 7 years or so, I've been playing with the Pythran project, building a DSL embedded in Python.

I've been building a bunch of tools in the process:

gast

https://github.com/serge-sans-paille/gast

['gast] f - good time girl

Tool started in 2016 and presented at PyConFR 2016

Originally meant to ease the transition of Pythran to Python3

Core component of AST manipulation:

> Downloads last day: 155,526

(don't trust numbers)

A Simple Expression Across the Ages

> python -c "import ast; \
             print(ast.dump(ast.parse('a[1, ...]')))"
...

In Python2.7

Module(body=[
    Expr(value=
            Subscript(value=Name(id='a', ctx=Load()),
                      slice=
                        ExtSlice(dims=
                                [Index(value=Num(n=1)), Ellipsis()]),
                      ctx=Load()))
])

In Python3.6

Module(body=[
    Expr(value=
            Subscript(value=Name(id='a', ctx=Load()),
                      slice=
                        Index(value=Tuple(elts=[Num(n=1), Ellipsis()],
                              ctx=Load())),
                      ctx=Load()))
])

In Python3.9

Module(body=[
    Expr(value=
            Subscript(value=Name(id='a', ctx=Load()),
                      slice=
                        Tuple(elts=[Constant(value=1), Constant(value=Ellipsis)],
                              ctx=Load()),
                      ctx=Load()))
], type_ignores=[])

With gast

Whatever the Python version

> python -c "import gast as ast; \
             print(ast.dump(ast.parse('a[1, ...]')))"
Module(body=[Expr(value=Subscript(value=Name(id='a', ctx=Load(),
annotation=None, type_comment=None), slice=Tuple(elts=[Constant(value=1,
kind=None), Constant(value=Ellipsis, kind=None)], ctx=Load()),
ctx=Load()))], type_ignores=[])

Tradeoffs

  1. Slightly more verbose than Python 3.9 because of Python2 compatibility
  2. Extra translation step, slight performance impact when parsing the world
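To make the motivation concrete, here is a minimal sketch of the per-version branching that code has to carry when it works on the standard `ast` module directly. It is not gast's implementation; `index_elements` is a hypothetical helper, and it only covers the subscript shapes shown in the dumps above.

```python
import ast

def index_elements(expr_source):
    """Return the AST nodes used as indices in a subscript expression."""
    node = ast.parse(expr_source, mode="eval").body
    if not isinstance(node, ast.Subscript):
        raise ValueError("expected a subscript expression")
    index = node.slice
    # Python 3.8 and earlier wrap plain indices in an Index node.
    # Class names are compared as strings so that the deprecated
    # ast.Index / ast.ExtSlice attributes are never accessed.
    if type(index).__name__ == "Index":
        index = index.value
    # Older versions use ExtSlice when at least one element is a slice.
    if type(index).__name__ == "ExtSlice":
        return list(index.dims)
    if isinstance(index, ast.Tuple):
        return list(index.elts)
    return [index]

print(len(index_elements("a[1, ...]")))
```

With gast, the same code would only need the last two branches, since every supported Python version is presented with the same node layout.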

beniget

https://github.com/serge-sans-paille/beniget

['beniget] adj. - blessed

Compute use-def chains for Python

Foundation of several Pythran analyses

About Use-Def Chains

From https://en.wikipedia.org/wiki/Use-define_chain

A Use-Definition Chain (UD Chain) is a data structure that consists of a use, U, of a variable, and all the definitions, D, of that variable that can reach that use without any other intervening definitions
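As an illustration of the definition, here is a minimal sketch that computes use-def chains with the standard `ast` module. It is deliberately restricted to straight-line, module-level code (no branches, loops, functions or augmented assignments), so each use has at most one reaching definition. It is not how beniget works; `use_def_chains` is a hypothetical name.

```python
import ast

def use_def_chains(source):
    """Map each use (name, line) to the line of the definition reaching it.

    A use with no reaching definition maps to None.
    """
    last_def = {}  # name -> line of the most recent definition
    chains = {}
    for stmt in ast.parse(source).body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # The right-hand side is evaluated before the targets are bound,
        # so uses are resolved against the definitions seen so far.
        for n in names:
            if isinstance(n.ctx, ast.Load):
                chains[(n.id, n.lineno)] = last_def.get(n.id)
        for n in names:
            if isinstance(n.ctx, ast.Store):
                last_def[n.id] = n.lineno
    return chains

chains = use_def_chains("x = 1\ny = x\nx = 2\nz = x + y\n")
print(chains[("x", 4)])  # the use of x on line 4 is reached by the def on line 3
```

As soon as control flow enters the picture, a use may be reached by several definitions, which is where a dedicated tool becomes necessary.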

Typical Usage

A def without a use means a useless def:

tip: _ is often used to state a useless assignment
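A small illustration of both points, with hypothetical function names: in the first function `key` is defined but never used, in the second the `_` convention states that this is intentional.

```python
def total_with_unused(pairs):
    acc = 0
    for key, value in pairs:  # `key` is defined here but never used
        acc += value
    return acc

def total(pairs):
    acc = 0
    for _, value in pairs:  # `_` marks the binding as deliberately ignored
        acc += value
    return acc

print(total([("a", 1), ("b", 2)]))
```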

Being Pedantic

In Python, one does not assign a value to a variable,

one binds an identifier to a value.

Tricky cases (0)

for i in l:
    if i:
        print(j)
    else:
        j = i

Is the print statement faulty?
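One way to probe the question is to run the loop on different inputs. The sketch below wraps it in a hypothetical function `run_loop` and collects values in a list instead of printing them. Inside a function the failure is an `UnboundLocalError`, which is a subclass of the `NameError` the module-level version would raise.

```python
def run_loop(l):
    printed = []
    for i in l:
        if i:
            printed.append(j)  # stands in for print(j)
        else:
            j = i
    return printed

# A falsy element comes first: j is bound before it is read.
print(run_loop([0, 5]))

# A truthy element comes first: j is read before any binding.
try:
    run_loop([5, 0])
except NameError as exc:
    print(type(exc).__name__)
```

So the answer depends on the data: statically, the use of `j` is reached both by the definition in the `else` branch and by no definition at all.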

Tricky cases (1)

def foo():
    global x
    x = 1

def bar():
    print(x)

Does calling bar raise an exception?
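The answer depends on whether `foo` ran first. A minimal sketch, executing the two definitions in a fresh namespace so that no earlier `x` interferes; `call_bar` is a hypothetical helper and `bar` returns `x` instead of printing it.

```python
SOURCE = '''
def foo():
    global x
    x = 1

def bar():
    return x
'''

def call_bar(call_foo_first):
    namespace = {}
    exec(SOURCE, namespace)  # both functions get `namespace` as their globals
    if call_foo_first:
        namespace["foo"]()
    try:
        return namespace["bar"]()
    except NameError:
        return "NameError"

print(call_bar(call_foo_first=False))
print(call_bar(call_foo_first=True))
```

The definition of `x` lives in one function and its use in another, so an analysis has to relate the two through the module scope.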

Tricky cases (2)

x = 1
for x in y:
    pass
print(x)

Which value is x bound to?
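Again the answer depends on the data, as this small sketch shows (the snippet is wrapped in a hypothetical function `last_binding` that returns `x` instead of printing it).

```python
def last_binding(y):
    x = 1
    for x in y:
        pass
    return x

print(last_binding([]))      # the loop body never runs, x keeps its first binding
print(last_binding([7, 8]))  # the loop rebinds x on each iteration
```

Statically, both definitions of `x` reach the final use.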

Application: a simple linter

for local_def in self.defuses.locals[node]:
    if local_def.users():
        continue

    if local_def.name() == "_":
        continue  # typical naming by-pass

    # [...]

    print(
        "W: '{}' is defined but not used at {}:{}:{}".format(
            local_def.name(),
            self.filename,
            location.lineno,
            location.col_offset,
        )
    )
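The excerpt above relies on beniget's chains. For readers who want something runnable without it, here is a cruder, name-based approximation using only the standard `ast` module. It is an assumption-laden sketch, not the linter from the talk: `unused_locals` is a hypothetical name, and nested functions or `global`/`nonlocal` declarations are not treated specially.

```python
import ast

def unused_locals(source, filename="<string>"):
    """Report names stored in a function but never loaded in it."""
    warnings = []
    for func in ast.walk(ast.parse(source, filename)):
        if not isinstance(func, ast.FunctionDef):
            continue
        names = [n for n in ast.walk(func) if isinstance(n, ast.Name)]
        loaded = {n.id for n in names if isinstance(n.ctx, ast.Load)}
        for n in names:
            if not isinstance(n.ctx, ast.Store):
                continue
            if n.id in loaded or n.id == "_":  # same naming by-pass
                continue
            warnings.append(
                "W: '{}' is defined but not used at {}:{}:{}".format(
                    n.id, filename, n.lineno, n.col_offset
                )
            )
    return warnings

for warning in unused_locals("def f(a):\n    b = a\n    _ = a\n    c = 1\n    return c\n"):
    print(warning)
```

Because it matches names rather than following chains, this version would miss a definition that is overwritten before being read, which is exactly what use-def chains catch.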

Limitations: Did You Say Static?

eval(expr)
globals()[name] = 1

And, by extension, any method call…
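A minimal sketch of the problem, using only the standard library: the definition of `answer` is created at run time, so a purely syntactic scan for stores never sees it. `static_definitions` and `execute` are hypothetical helpers.

```python
import ast

SOURCE = 'globals()["answer"] = 42\nresult = answer + 1\n'

def static_definitions(source):
    """Names that appear syntactically as assignment targets."""
    return {
        n.id
        for n in ast.walk(ast.parse(source))
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }

def execute(source):
    """Run the source in a fresh namespace and return that namespace."""
    namespace = {}
    exec(source, namespace)
    return namespace

print(sorted(static_definitions(SOURCE)))  # `answer` is missing here
print(execute(SOURCE)["result"])           # yet the program runs fine
```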

memestra

https://github.com/QuantStack/memestra

['memestra] adv. - Oh, please!

Memestra checks code for places where deprecated functions are called.

How would you do that after that talk?

Finding Deprecated Usage

Simple!

  1. Track a given decorator usage
  2. Track usage of decorated definitions
  3. Print
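The three steps can be sketched with the standard `ast` module alone. This is a simplified, name-based stand-in and not memestra's implementation, which builds on use-def chains; `dotted_name` and `deprecated_uses` are hypothetical names, and shadowing, aliases and methods are ignored.

```python
import ast

def dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, None for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return None if base is None else base + "." + node.attr
    return None

def deprecated_uses(source, filename, decorator="decorator.deprecated"):
    tree = ast.parse(source, filename)
    # 1. Track the decorator: collect the functions it is applied to.
    deprecated = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
        and any(dotted_name(d) == decorator for d in node.decorator_list)
    }
    # 2. Track uses: calls whose callee is one of those names.
    calls = [
        node
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in deprecated
    ]
    calls.sort(key=lambda call: (call.lineno, call.col_offset))
    # 3. Print (returned as strings here); columns are reported 1-based.
    return [
        "{} used at {}:{}:{}".format(
            call.func.id, filename, call.lineno, call.col_offset + 1
        )
        for call in calls
    ]

SAMPLE = "import decorator\n\n@decorator.deprecated\ndef foo(): pass\n"
print(deprecated_uses(SAMPLE + "\nfoo()\n", "sample.py"))
```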

Example

> cat test.py
import decorator

@decorator.deprecated
def foo(): pass

def bar():
    foo()

foo()

> python memestra.py test.py
foo used at test.py:7:5
foo used at test.py:9:1

Cross-Module Exploration

When we import a function from a module, is that function deprecated?

→ Statically resolve imports, and walk them recursively

→ Quickly end up parsing hundreds of Python packages

→ Use a caching mechanism

Advertising Deprecated Usage

> pip install deprecated
from deprecated import deprecated
@deprecated(reason="You should use another function")
def some_old_function(x, y):
    pass

Limitations: Did You Say Typing?

class Foo:

    @deprecated
    def foo(self):
        pass

def bar(f):
    return f.foo()  # Is this call deprecated?

Postlude

o/ Thanks Sylvain Corlay for the memestra idea, Mariana Meierles for the code reviews and Lancelot Six for the proofreading \o
