This is a brain dump of stuff I've learned that I don't want to forget and hopefully will be useful for someone else.
I'm mainly going to cover the general Python development process I've used in small to medium projects. It might not be perfect and can always be improved, but it gives you a good base to start from.
To make this a bit easier to digest, as always, I prepared a sandbox you can find here:
Note: I developed this sandbox in Linux but I'm happy to create a Windows version if there's demand for it. The principles are exactly the same, just some slight differences in paths when it comes to virtual environments, etc.
github.com/zom-pro/sandbox/tree/master/python/packaging
Bear in mind this project has been structured to show the details I want to talk about in this blog. You might want to structure your project differently, but don't reinvent the wheel. If you want to look at a great example, one of my favorite projects when it comes to structure and design is the requests library.
First things first: virtual environments
You probably know a lot about these guys (and there's plenty of material around if you don't). Mainly, you want to use them to isolate your Python environment and, especially, your dependencies. When I'm developing Python, I keep two virtual environments running at all times.
One is my development environment, where I run my unit tests, smoke/integration tests, etc. This is also the interpreter I use for debugging and general development tasks like building my packages. Take a look at my poor man's CI bash script (remember, for demo purposes only; it needs a lot more around the edges to become something usable) here: https://github.com/zom-pro/sandbox/blob/master/python/packaging/build.sh.
As you can see, I can pass in my virtual environment path, build my package and then run my smoke test. It would be easy, and advisable, to start adding quality gates: for example, run integration and unit tests before packaging and stop if they fail.
Just by doing this, you have a good workflow where you can make changes, run your tests and make sure the package coming out the other side works as expected.
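As a rough illustration of what such a script can look like (a minimal sketch only: the venv path handling, the test locations tests/unit and tests/smoke, and the build commands are assumptions, not taken from the actual build.sh):

```shell
#!/usr/bin/env bash
# Hypothetical "poor man's CI" sketch; any failing step aborts the build.
set -e

VENV_PATH="${1:-venv}"   # virtual environment path, passed as first argument

if [ -d "$VENV_PATH" ]; then
    # shellcheck disable=SC1091
    . "$VENV_PATH/bin/activate"
    python -m pytest tests/unit         # quality gate: unit tests first
    python setup.py sdist bdist_wheel   # build the package into dist/
    pip install --force-reinstall dist/*.whl
    python -m pytest tests/smoke        # smoke-test the installed package
else
    echo "no virtual environment at $VENV_PATH, skipping build"
fi
```

Because of set -e, a failing unit test stops the run before anything gets packaged, which is the simplest possible quality gate.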
A couple of interesting things to notice at this point are:
- Always make sure your pip and python paths are pointing to the right place. Activating a virtual environment should achieve this, but check! You can see where pip is pointing with pip --version. Similarly, you can inspect sys.executable from within a Python console to see which interpreter is running.
- Pip can install from local files (like I'm doing there) and from other network locations. You can sign your packages and include, for example, a package hash as part of your requirements file to check automatically that the package is what you expect.
- Notice the trick I'm using to run the smoke test as if it were run from the command line (I like using entry points for command-line tools; look in the setup.py file for more details), but be careful: if you use subprocess, for example, it can take you out of your current virtual environment. In general, be careful with threads and make sure the context in Python is always where you expect it to be. You probably know this by now, but getting that right in Python is one of the trickiest things to achieve.
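A quick way to verify which interpreter and environment you are actually running in (a small sketch; the getattr guard is there because very old interpreters and virtualenv-created environments expose the base installation differently):

```python
import sys

print(sys.executable)  # full path of the interpreter currently running
print(sys.prefix)      # root of the active environment

# In a virtual environment, sys.prefix differs from the base installation.
in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
print("inside a virtual environment:", in_venv)
```

Running this from a subprocess or a different shell is exactly how you catch the context mix-ups mentioned above.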
Then, when developing, I would have another virtual environment where I can deploy the packages I'm building. I'm not using one of these in my sandbox, just to keep it simple, but you want this environment to test that your requirements work as expected, etc.
Let's talk about requirements(.txt)
I control requirements mainly in two places. One is the setup.py file, where you can specify them. Here, the library will be aware of its requirements when you are installing it and will attempt to fetch them from a source if it knows one.
But what if your requirements are local, or you want to deploy from a local source (a good idea if you want to keep control over the security of your packages, for example)? Then you can use a combination of a requirements.txt file and the bit in setup.py. You can handcraft your requirements.txt file or generate it from a virtual environment where the packages are already in place by doing something like:
pip freeze > requirements.txt
So the process would be: you create your requirements file and point the paths at your local requirements. Then install them in your virtual environment before you install the library you're developing (which declares these requirements in setup.py). By the time the library is installed, the requirements will have been fulfilled.
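For illustration, a requirements.txt mixing local and indexed packages might look like this (the file names and versions are made up):

```
# local wheel, installed directly from disk
./packages/mylib-0.1.0-py3-none-any.whl
# regular pinned dependency from an index
requests==2.25.1
```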
Take a look here for more info about how setup.py and requirements interact.
Another cool feature of setuptools requirements is extras_require (again in the setup.py file). This feature allows you to create flags so you can have different requirements for test, prod, etc. Very commonly you will have a unit test runner like nose, but it wouldn't make sense to make it a requirement of your production library.
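A sketch of how that looks in setup.py (the package and dependency names here are illustrative, not the sandbox's actual ones):

```python
from setuptools import setup, find_packages

setup(
    name="mylib",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["requests"],     # needed by every install
    extras_require={
        # only installed when asked for: pip install mylib[test]
        "test": ["nose", "coverage"],
    },
)
```

A production install (pip install mylib) skips the test runner entirely; pip install mylib[test] pulls it in.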
You can also point pip to local sources by doing something like:
pip install mypackage --no-index --find-links file:///srv/pkg/mypackage
Script entry point
Entry points are really useful when delivering command-line applications (normally small and contained). They also go very well with virtual environments. Take a look at the setup.py file again and find the entry point of my sandbox. This will create an executable in the bin or Scripts folder (depending on Linux or Windows) that is accessible from anywhere while that virtual environment is activated.
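For reference, a console_scripts entry point is declared in setup.py roughly like this (mycli and mylib.cli:main are hypothetical names, not the sandbox's actual entry point):

```python
from setuptools import setup, find_packages

setup(
    name="mylib",
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # installs an executable called "mycli" into the venv's
            # bin/ (Linux) or Scripts/ (Windows) directory
            "mycli = mylib.cli:main",
        ],
    },
)
```

Here mylib/cli.py would define a main() function that pip wires up as the mycli command on install.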