Lisp in Python
Paul Tagliamonte
(@paultag)
Lisp is a pretty great language. (no, really!)
It was one of the first high-level languages (like, ever)
I also really love Python. So much so, I don't really like using other
languages, since I always write API bindings (for my own projects)
in Python.
Thing is, I also really like Lisp (in particular, Clojure).
It's a pretty neat language full of some powerful ideas. It's a language
steeped in culture, and the de-facto language of AI.
So, might as well make them play nice, right?
So, I just spent my winter break hacking on Hy.
Hy is a few things. It's a front-end lexer (something that breaks
the code apart) and also a back-end Python AST generator.
This means, I can (basically) convert Lisp expressions into
Python.
Anyway, some of you might be wondering: why Lisp?
It's fully-functional, extremely regular, & simple. The language
is made up of so-called "s-expressions".
S-expressions look something like:
(foo bar (baz foo))
Some folks (such as myself) prefer the "look" of Lisp-style
languages, and it's commonly used in AI, so it's easier to "speak"
Lisp for some things.
Also, Macros. Sweet, sweet macros.
Before we go on, I should also note for the non-Lispers that in Lisp,
everything's done in Polish notation — that is to say,
1 + 1 + 2
is written as
+ 1 1 2
If you were to write that in an s-expression, it'd look a bit like
(+ 1 1 2)
I also added lists and maps (dicts) to Hy (not found in some variants)
[1 2 3 4]
{key1 value1
key2 value2}
(print [1 2 3 4 5])
(print {"one" "onekey"
"two" "twokey"})
You can even declare functions and stuff.
(defn square-number [x]
"This is a (valid!) docstring"
(* x x))
(def foo 2)
; and, you call it, like so:
(print (square-number 3))
I added support for for loops (some academics may prefer recusion with
TCO (tail call optimization), but Python has good for loops, so this
ends up just as fast and easy on RAM)
(def my-list ["this" "is" "a" "test"])
(for [x my-list]
(print x))
Imports, and index access work too
(import sys)
(print (get sys.argv 1))
Code branching also works pretty well.
(if (= 1 2)
(print "This is the 'true' branch")
(print "This is the 'false' branch"))
And, now, for some of the advanced type stuff it can do (for the
hardcore lispians)
(decorate-with foo (bar "baz")
(defn my-function [arg1]
" This is a decorated function "
(kwapply (.method-name obj arg1) {"kwarg1" "kwval1"})))
(my-function "foo!")
Given the following code block:
(def foo [1 2 3 4])
(print (get foo 100)) ; out of bounds
Even more magic, like proper tracebacks, totally work.
Traceback (most recent call last)
File "frob.hy", line 4, in module
(print (get foo 100)) ; out of bounds
IndexError
Even imports (really!) work. Hy can import Python as if it were lisp, and
Python can import Hy as if it were Python (after Hy is shimed in)
Hy can do this via some newish magic, via
PEP 302 and
some clever hackery.
PEP 302 allows for "New Import Hooks", which basically were intended to
be used to search non-standard places (like the Internet, in one example)
if the module isn't on the Python path.
Basically, my
meta-importer
(as it's called) just searches the normal
sys.path for modules and packages ending with .hy, just like the
built-in stuff searches for .py (or .pyc, etc). The loader-class is then
appended to sys.meta_path.
If it finds a .hy file, the lexer and AST translation routine(s) kick in,
and insert a completely fake AST into sys.modules, then returns it back
out.
Which is, yes, amazing. PEP 302 is pretty much the coolest thing, like,
ever.
The importer consists of two parts, the Importer and the Finder.
The finder is in charge of finding the file to import, and returns
a Loader object that is able to load that file into sys.modules. The
function looks a bit like:
finder.find_module(fullname, path=None)
where "fullname" is something like
os.path
The Loader object (which the finder returned up above) can be then invoked
and is expected to have the following signature:
loader.load_module(fullname)
(where "fullname" is again, something like
os.path)
The importer basically looks like:
class MetaImporter(HyFinder):
def find_module(self, fullname, path=None):
pth = self.find_on_path(fullname)
if pth is None:
return
return MetaLoader()
And the finder basically looks like:
class MetaLoader(HyFinder):
def load_module(self, fullname):
# [snip]
mod = _hy_import_file(pth, fullname)
ispkg = self.is_package(fullname)
mod.__file__ = pth
mod.__loader__ = self
mod.__name__ = fullname
if ispkg:
mod.__path__ = []
mod.__package__ = fullname
else:
mod.__package__ = fullname.rpartition('.')[0]
sys.modules[fullname] = mod
return mod
But, wait! Paul! You totally glossed over the AST stuff! Talk more about
that!
ASTs (or Abstract Syntax Trees) are the internal representation
if Python code.
They can get pretty fugly, but they're extremely powerful. Basically, this
means that Python (like Lisp before it, for the record) can
programmatically alter itself.
Given the following Python:
def hello_world(x):
print "Hello,", x
We get the following AST:
FunctionDef(
name='hello_world',
args=arguments(args=[Name(id='x', ctx=Param())],
vararg=None, kwarg=None, defaults=[]),
body=[
Print(dest=None, values=[Str(s='Hello,'),
Name(id='x', ctx=Load())], nl=True)
],
decorator_list=[])
Baiscally, Hy acts as an AST "compiler" just takes the parsed &
lexed Lisp, and "renders" it out to a Python AST.
To demystify some of what Hy is doing, let's take the following Hython
(lisp) expression:
(defn hello-world [x]
(print "Hello," x))
And check out the generated AST back in Python-land.
FunctionDef(
name='hello_world',
args=arguments(args=[Name(id='x', ctx=Param())],
vararg=None, kwarg=None, defaults=[]),
body=[
Print(dest=None, values=[Str(s='Hello,'),
Name(id='x', ctx=Load())], nl=True)
],
decorator_list=[])
(hint: It's the same as that last slide!)
Alright, so we've got ASTs, let's check out how the internals of Hy work
to generate the AST from Lexed Lisp
Basically, hy internal Lexed Objects (stuff like HYExpression, HYString,
HYBool, HYList, etc) get passed to Hy's AST Generator's "render" method,
and get passed to one of a few methods depending on it's type. (helps
keep things clear and neat, well, as neat as they can be).
Let's start simple -- the String generator:
# [snip]
import ast
# [snip]
def render_string(self, node):
# (node is a string, in this case)
return ast.Str(s=str(node))
# [snip]
Which produces (when given "foo!") the following block:
Str(s='foo!')
OK, let's get more advanced here, the List render-er.
# [snip]
import ast
# [snip]
def render_list(self, node):
ret = []
for c in node:
ret.append(self.render(c))
return ast.List(elts=ret, ctx=ast.Load())
# [snip]
Which produces (when given
[1, 2, "foo"]) the following block:
List(elts=[Num(n=1), Num(n=2), Str(s='foo')], ctx=Load())
Alright, let's move up to Dicts (slighty more hackey)
def render_map(self, node):
keys = []
values = []
for key in node:
keys.append(self.render(key))
values.append(self.render(node[key]))
return ast.Dict(keys=keys, values=values)
Which produces (when given
{"one": "oneval", "two": "twoval"})
the following block:
Dict(
keys=[Str(s='one'), Str(s='two')],
values=[Str(s='oneval'), Str(s='twoval')]
)
Now, here's the Symbol generator:
def render_symbol(self, node):
# (node is a string, in this case)
if "." in node:
glob, local = node.rsplit(".", 1)
ret = ast.Attribute(
value=self.render_symbol(glob),
attr=str(local),
ctx=ast.Load()
)
return ret
return ast.Name(id=str(node), ctx=ast.Load())
It produces something like (for "foo"):
Name(id='foo', ctx=Load())
Or (for "foo.bar"):
Attribute(value=Name(id='foo', ctx=Load()),
attr='bar', ctx=Load())
In addition to the basics, we also need to render expressions (function
calls, etc), and friends.
def render_expression(self, node):
# node is a list, like ['fn', 'arg', 'arg', 'arg']
inv = node.get_invocation()
if inv['function'] in self.native_cases:
return self.native_cases[inv['function']](node)
for key in self.startswith:
if inv['function'].startswith(key):
return self.startswith[key](node)
c = []
for child in node.get_children():
c.append(self.render(child))
if inv['function'] in special_cases:
return special_cases[inv['function']](node, c, self)
ret = value=ast.Call(
func=self.render_symbol(inv['function']),
args=c,
keywords=[],
starargs=None,
kwargs=None
)
return ret
Defining a function
def _defn(self, node):
inv = node.get_invocation()
args = inv['args']
name = args.pop(0)
sig = args.pop(0)
doc = None
for child in args:
c.append(self.render(child))
cont = c[-1]
body = cont if isinstance(cont, list) else [cont]
ret = ast.FunctionDef(
name=str(name),
args=ast.arguments(
args=[
ast.Name(arg=str(x), id=str(x), ctx=ast.Param()) for x in sig
],
vararg=None,
kwarg=None,
kwonlyargs=[],
kw_defaults=[],
defaults=[]
),
body=body,
decorator_list=[])
return ret
Now, let's go back and look at another bit of generated
AST off Lisp -- let's look at:
def square(number):
return number * number
Which produces this:
FunctionDef(
name='square',
args=arguments(
args=[Name(id='number', ctx=Param())],
vararg=None,
kwarg=None,
defaults=[]
),
body=[
Return(
value=BinOp(
left=Name(id='number', ctx=Load()),
op=Mult(),
right=Name(id='number', ctx=Load())
)
)
],
decorator_list=[]
)
Now, for how that decorator magic works:
(remember, a FunctionDef looks like:)
ast.FunctionDef(
name=str(name),
args=ast.arguments(
# [snip]
),
body=body,
decorator_list=[]
)
So, our decorate "macro" in Python does the following:
def _ast_decorate(self, node):
i = node.get_invocation()
c = i['args']
meth = c.pop(-1)
fn = self.render(meth)
for child in c:
fn.decorator_list.append(self.render(child))
return fn
The AST is converted to code via the built-in compile() function.
the compile() function takes an AST (or just plain Python), a filename,
and a "mode" (what to do with it), and returns a code object.
That code object is then passed to eval(), to let the Python VM evaluate
(and actually run!) the code.
The forge_module bits (which take Hython and evaulate it into
a Python module fit for use in sys.modules) looks like:
def forge_module(name, fpath, forest):
mod = imp.new_module(name)
mod.__file__ = fpath
ast = forge_ast(name, forest) # lex, produce AST
eval(compile(ast, fpath, "exec"), mod.__dict__) # compile, produce module
return mod
Quick questions (& a demo)?