In this post, we explain how we added a Python feature to an existing Elixir app. We have also published a small Elixir app on GitHub so you can download it and take a closer look at what is needed.
Who we are and what we do
Stuart is a last mile delivery company. We connect businesses, customers and couriers to speed up the way goods are transported within cities.
Our clients mostly interact with us using our public REST API. However, there are certain clients with very specific requirements that need an ad-hoc solution.
This is where Monocle, our Elixir application, comes to the rescue. Monocle works as an API gateway: mapping and modifying client payloads and inserting them into our system.
One of these clients wanted the ability for us to group certain deliveries under certain conditions (proximity in time, places, etc).
This automatic stacking solution was a problem that our data team was happy to build. They soon started working on the algorithm and a Python implementation and we (the backend team) started looking for ways to glue Elixir and Python together.
Things we will cover:
- Different ways of connecting external apps: ErlPort, Pyrlang
- Use of Poolboy
- Our implementation
- Overview of a sample project
Ways of connecting Python programs into Elixir
We had a look at these 2 ways of connecting Python and Elixir:
- Using the Pyrlang library
- Using the ErlPort library
Pyrlang is a Python library which implements the Erlang distribution protocol and creates an Erlang-compatible node in your Erlang cluster.
ErlPort, on the other hand, is a library for Erlang which helps connect Erlang to a number of other programming languages. The library uses the Erlang port protocol to simplify connection between languages and the Erlang external term format to set the common data type mapping.
Using Pyrlang meant adding an Erlang node only for Python execution. We wanted to keep it simple and didn’t want to manage multiple nodes, so we went for the ErlPort approach.
First, we will need to add ErlPort to our mix.exs file:
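As a rough sketch, the dependency entry could look like the fragment below. The version requirement is an assumption, not taken from the original project, so check hex.pm for the release you want.

```elixir
# mix.exs (fragment): only the deps function is shown.
defp deps do
  [
    # Assumed version requirement; adjust to the release you actually use.
    {:erlport, "~> 0.10"}
  ]
end
```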
And then just run “mix deps.get”.
We are installing our Python module in a directory called “python” inside our application’s private directory. Since our application is called monocle, our python app goes as follows:
Now all that’s left is to start a Python instance, grab the returned “pid” and start calling functions in our Python module using MFA format: module, function and arguments.
Our Python module is called “optimize.py” (2nd parameter) and our Python function is also called “optimize” (3rd parameter). The fourth parameter is the list of arguments we pass to the Python “optimize” function.
Since our Python function returns data, it is stored in the Elixir “result” variable, and we can now continue to process it within Elixir.
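The steps above can be sketched roughly as follows. The wrapper module `PythonExample` and its function name are hypothetical and only exist to make the snippet self-contained; the `:python.start/1`, `:python.call/4` and `:python.stop/1` calls come from ErlPort, and the snippet assumes `optimize.py` lives in `priv/python` of the `:monocle` application.

```elixir
defmodule PythonExample do
  # Hypothetical helper module, not part of the original code base.

  def optimize(payload) do
    # Python modules are assumed to live in priv/python of the :monocle app.
    python_path = Path.join(:code.priv_dir(:monocle), "python")

    # Start a Python OS process. ErlPort expects the path as a charlist.
    {:ok, pid} = :python.start([{:python_path, to_charlist(python_path)}])

    # pid, module, function, arguments: runs optimize.optimize(payload) in Python.
    result = :python.call(pid, :optimize, :optimize, [payload])

    # Stop the Python process once we are done with it.
    :python.stop(pid)

    result
  end
end
```

Starting and stopping a Python process on every call keeps the example short, but it is exactly the cost a pool of long-lived workers avoids.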
Every time we run Python code through ErlPort, ErlPort starts an OS process, which is rather inefficient and expensive. To avoid this trap, we went looking for a pooling mechanism and found the right tool for the job: enter Poolboy!
From its own documentation, “Poolboy is a lightweight, generic pooling library for Erlang with a focus on simplicity, performance, and rock-solid disaster recovery.”
Poolboy gives us the ability to manage a pool of Elixir processes (which in our case are connected to Python OS processes) and then reuse them without starting a new process for each request.
We will need to add Poolboy to our mix.exs file:
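A possible version of the dependency list with both libraries is sketched below. Both version requirements are assumptions rather than the ones used in the original project.

```elixir
# mix.exs (fragment): only the deps function is shown.
defp deps do
  [
    # Assumed version requirements; adjust to the releases you actually use.
    {:erlport, "~> 0.10"},
    {:poolboy, "~> 1.5"}
  ]
end
```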
Then, we will create our worker. We will go into the details later on. For now, here is its name: Monocle.AutoStacking.PythonWorker
And now the configuration under application.ex:
Let’s go through all the configuration options:
- Name: a sub-config tuple
  - Location (:local): the possible values are :global, :local or :via; it determines where the pool is run
  - Pool Name (:worker): an atom that gives the pool a unique name for later reference
- Worker Module (Monocle.AutoStacking.PythonWorker): the name of the module that acts as the Poolboy worker and deals with requests
- Size: the number of workers that can be running at any given time. In our case, it comes from a private function that reads the value from an ENV variable.
- Max Overflow: the number of backup workers that can be used when all existing workers are busy
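Put together, the options above could be wired into the supervision tree roughly like this. It is a minimal sketch: the environment variable name `PYTHON_POOL_SIZE`, its default of 5, the `max_overflow` value and the supervisor name are all assumptions made for illustration; only the pool name `:worker` and the worker module come from the text.

```elixir
defmodule Monocle.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # poolboy builds the child spec for the pool from our configuration.
      :poolboy.child_spec(:worker, poolboy_config())
    ]

    # Supervisor name is an assumption.
    Supervisor.start_link(children, strategy: :one_for_one, name: Monocle.Supervisor)
  end

  defp poolboy_config do
    [
      name: {:local, :worker},
      worker_module: Monocle.AutoStacking.PythonWorker,
      size: pool_size(),
      # Assumed value; the original setting is not given in the post.
      max_overflow: 2
    ]
  end

  # Reads the pool size from an ENV variable (hypothetical name and default).
  defp pool_size do
    "PYTHON_POOL_SIZE"
    |> System.get_env("5")
    |> String.to_integer()
  end
end
```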
Worker Module: Monocle.AutoStacking.PythonWorker
Our PythonWorker implements a GenServer. Let’s have a look first at the init callback:
If you recall from the usage of ErlPort in the section above, this init callback returns a tuple with :ok, and the process ID of the Python call for further use.
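A minimal sketch of such a worker is shown below, assuming the Python code lives in `priv/python` of the `:monocle` application. The `start_link/1` shape is what Poolboy expects from a worker module; everything else about the original worker is an assumption.

```elixir
defmodule Monocle.AutoStacking.PythonWorker do
  use GenServer

  # Poolboy calls start_link/1 on the worker module for every pool member.
  def start_link(_args) do
    GenServer.start_link(__MODULE__, nil)
  end

  @impl true
  def init(_args) do
    # Assumed location of the Python modules.
    python_path = Path.join(:code.priv_dir(:monocle), "python")

    # Start one Python OS process per worker and keep its pid as the state.
    {:ok, pid} = :python.start([{:python_path, to_charlist(python_path)}])

    {:ok, pid}
  end
end
```

Because the Python pid becomes the GenServer state, every later callback receives it without starting a new OS process.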
Now, in our “call” function (which is the function that is going to be called within the Elixir code) we make use of two interesting things:
- A call to a poolboy.transaction
- The Async/Await pattern.
A Poolboy transaction checks a worker (in our case, one connected to a Python process) out of the pool, runs our function with it, and checks it back in when the function finishes.
The async/await pattern is a syntactic feature that allows an asynchronous, non-blocking function to be structured in a way similar to an ordinary synchronous function.
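The two ideas can be combined along these lines. This is a sketch under assumptions: the timeout value and the `{:optimize, payload}` message shape are hypothetical, and only the public function is shown, not the GenServer callbacks.

```elixir
defmodule Monocle.AutoStacking.PythonWorker do
  # Only the public entry point is sketched here.

  @pool :worker
  # Hypothetical timeout in milliseconds.
  @timeout 10_000

  def call(payload) do
    task =
      Task.async(fn ->
        # Check a worker out of the pool, use it, and check it back in.
        :poolboy.transaction(
          @pool,
          fn worker -> GenServer.call(worker, {:optimize, payload}, @timeout) end,
          @timeout
        )
      end)

    # Wait for the result, giving up after the timeout.
    Task.await(task, @timeout)
  end
end
```

Running the transaction inside a task lets the caller start other work before awaiting the result.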
The handle_call callback goes as follows:
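A sketch of what this callback might look like is given below. The `{:optimize, payload}` message shape is an assumption, and `init/1` (not shown) is assumed to have stored the Python pid as the GenServer state.

```elixir
defmodule Monocle.AutoStacking.PythonWorker do
  use GenServer

  # init/1 is omitted here; the state is assumed to be the Python pid.

  @impl true
  def handle_call({:optimize, payload}, _from, pid) do
    # Run optimize.optimize(payload) in the already running Python process.
    result = :python.call(pid, :optimize, :optimize, [payload])

    # Reply with the Python result and keep the same pid as state.
    {:reply, result, pid}
  end
end
```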
Our call to PythonWorker inside the Elixir code goes like this:
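For illustration only, a caller could look like the sketch below. The `Monocle.AutoStacking` module and its `stack/1` function are hypothetical names; the only thing taken from the text is that the rest of the code goes through the worker's “call” function.

```elixir
defmodule Monocle.AutoStacking do
  # Hypothetical caller module, not taken from the original code base.
  alias Monocle.AutoStacking.PythonWorker

  def stack(json_payload) when is_binary(json_payload) do
    # Blocks until a pooled Python worker has replied.
    PythonWorker.call(json_payload)
  end
end
```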
Conclusion — TL;DR
The main application is coded in Elixir.
Our data team created a solution in Python that takes a JSON document and returns its calculations as another JSON document.
We used ErlPort to be able to call this Python program and fetch its results.
We used Poolboy to spawn a pool of initial Python processes ready to do the work.
We’ve created a small Elixir application to demonstrate how easy it is to add Python.
You can find the repository here: https://github.com/martincabrera/python_elixir
Let’s have a look at some of the commits.
For the sake of simplicity, our Python feature is just a small Python module with 1 function that returns a string.
After commit 1e899c9, we can run an iex session and execute the following:
Then, after our Python Worker has been added, we could play in an iex session to achieve the same result, like this:
As you can see, we have added some logging to check when the GenServer starts and also when handle_call is in use.
And finally, once we configure Poolboy (https://github.com/martincabrera/python_elixir/commit/be1298188b23f295e383f030ed7f4a8c05e93fe9) and run an iex session again, we can do the same in this other way:
Notice how “Started python worker” appears 5 times, once for each connection we told Poolboy to create when starting.
And that is all! I hope you enjoyed this post and learned something useful along the way! 🙂
A lot of inspiration was found reading these fine articles:
- Elixir, Poolboy, and Little’s Law
- Mixing Python with Elixir
- Mixing Python with Elixir II
- Elixir, Mix, Erlport, poolboy, and Anaconda Python
- Building a City Search with Elixir and Python
- ErlPort: Using Python from Erlang/LFE
- Mixing Python with Elixir with Export (Erlport)