I joined iib in January, and from the beginning it was clear that I'd have a lot of fun automating a few manual processes. The development team is made up of three people: the CTO, another developer and me. The team is small, but step by step we're moving towards the technical excellence we want. :)
One of the things that needed automation was the release of an internal library. The process to install it was basically `git pull` + `python setup.py install`. Since we had many other things to do, this problem waited a little longer to be solved. But things got ugly when I had to set up a test pipeline for Airflow (I must say that I now understand why the most famous blog post on this topic is called Data's Inferno: 7 Circles of Data Testing Hell with Airflow, but I'll leave that discussion for another day).
At iib we use a self-hosted Gitlab instance, recently updated to version 13. Gitlab offers a private Package Registry out of the box. The package registry was there, and we already had our project using poetry and a test pipeline. Publishing the package would be a piece of cake, right?
Where is this Package Registry after all?
I'll start with the most obvious thing, which still took me a few minutes to figure out: you have to enable the Package Registry in your project before you can use it. Go to Settings > General > Visibility, project features, permissions > Repository > Packages.
Forbidden
As you can imagine, you'll need the right permissions to upload the package to your own PyPI registry. This can be done using `CI_JOB_TOKEN`, a variable provided by Gitlab during a pipeline run. Below you can see how we did it:
release:
  stage: deploy
  image: python:3.8
  script:
    - pip install poetry twine
    - poetry version ${BUMP_RULE} # missing a step to commit this change!
    - poetry build
    - TWINE_PASSWORD=${CI_JOB_TOKEN} TWINE_USERNAME=gitlab-ci-token python -m twine upload --verbose --repository-url https://gitlab.com/api/v4/projects/${CI_PROJECT_ID}/packages/pypi dist/*
  only:
    - tags
The username for the token `CI_JOB_TOKEN` is `gitlab-ci-token`. Another important variable provided during the pipeline execution is `CI_PROJECT_ID` (you can see the number on the repository page too).
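To make the role of each CI variable easier to see, the upload step can also be sketched in Python. This is only a sketch under a few assumptions: the helper name `build_twine_upload` is hypothetical, and it relies on `CI_API_V4_URL`, a predefined Gitlab CI variable holding the API root of the instance, so that the URL also works on a self-hosted instance instead of hard-coding gitlab.com.

```python
def build_twine_upload(env, dist_files):
    """Return (command, extra_env) for uploading built files to the project's registry.

    `env` is a mapping with the Gitlab CI variables and `dist_files` a list of
    paths to upload. Hypothetical helper, not part of Twine or Gitlab.
    """
    # CI_API_V4_URL looks like https://gitlab.example.com/api/v4
    repository_url = "{}/projects/{}/packages/pypi".format(
        env["CI_API_V4_URL"].rstrip("/"), env["CI_PROJECT_ID"]
    )
    command = [
        "python", "-m", "twine", "upload",
        "--verbose",
        "--repository-url", repository_url,
    ] + list(dist_files)
    # Twine reads the credentials from these two environment variables.
    extra_env = {
        "TWINE_USERNAME": "gitlab-ci-token",
        "TWINE_PASSWORD": env["CI_JOB_TOKEN"],
    }
    return command, extra_env
```

Inside a pipeline job this could be called with `os.environ` and `glob.glob("dist/*")`, and the result passed to `subprocess.run(command, env={**os.environ, **extra_env}, check=True)`.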
Automation FTW
The thing that got me really excited was the automation flow for our releases. With the job shown previously, we only need to create a new tag and a new version gets released. I added `BUMP_RULE` as an environment variable so that we can control the bump rule just by changing the repository environment variable. Poetry provides different rules, such as patch, minor and major (following the SemVer style).
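To illustrate what the bump rule controls, here is a minimal sketch of SemVer-style bumping. It is a simplified illustration, not Poetry's actual implementation: the function name `bump_version` is hypothetical and only the three basic rules are handled, with pre-release rules left out.

```python
def bump_version(version, rule):
    """Bump a MAJOR.MINOR.PATCH version string according to `rule`.

    Simplified illustration of SemVer bumping; it ignores pre-release
    rules and is not Poetry's implementation.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        # A major bump resets minor and patch.
        return "{}.0.0".format(major + 1)
    if rule == "minor":
        # A minor bump resets patch only.
        return "{}.{}.0".format(major, minor + 1)
    if rule == "patch":
        return "{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("unsupported bump rule: {!r}".format(rule))


for rule in ("patch", "minor", "major"):
    print(rule, "->", bump_version("1.2.3", rule))
# patch -> 1.2.4
# minor -> 1.3.0
# major -> 2.0.0
```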
Poetry publish #fail
Note that we used Twine instead of Poetry. Unfortunately it wasn't possible to use `poetry publish` because of two different errors:
poetry publish --repository gitlab -u gitlab-ci-token -p $env.CI_JOB_TOKEN
`RuntimeError Repository gitlab is not defined` (when having it configured in `pyproject.toml`)
`UploadError HTTP Error 404: Not Found` (when configuring the repository via CLI)
There are open issues for it, so I hope it will be possible in the future. For now, Twine is doing the trick just fine.
pip install
Since the package is published in the private package registry, we'll need to adapt the install command a little bit.
The trick here is that you'll need both the index URL of the private registry and an extra index URL for the dependencies that come from PyPI. In the end the command will look like this:
pip install \
  --index-url https://$PERSONAL_ACCESS_TOKEN_NAME:$PERSONAL_ACCESS_TOKEN@gitlab.com/api/v4/projects/44/packages/pypi/simple \
  --extra-index-url https://pypi.org/simple \
  mypackage
Without the `--extra-index-url` you'll get errors like `ERROR: Could not find a version that satisfies the requirement`.
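If the install command has to be assembled in a script, the same pieces can be put together as in the sketch below. The helper name `build_pip_install` and all of its arguments are hypothetical placeholders; the only addition over the command above is that the credentials are percent-encoded, on the assumption that a token name or token might contain characters that would otherwise break the URL.

```python
from urllib.parse import quote


def build_pip_install(package, gitlab_host, project_id, token_name, token):
    """Return the pip command for installing `package` from a Gitlab PyPI registry.

    Hypothetical helper; every argument is a placeholder supplied by the caller.
    """
    # Percent-encode the credentials so special characters cannot break the URL.
    index_url = "https://{}:{}@{}/api/v4/projects/{}/packages/pypi/simple".format(
        quote(token_name, safe=""), quote(token, safe=""), gitlab_host, project_id
    )
    return [
        "pip", "install",
        "--index-url", index_url,
        # Public PyPI serves the dependencies the private registry does not host.
        "--extra-index-url", "https://pypi.org/simple",
        package,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Keep in mind that the token ends up in the process arguments, so this is better suited to CI jobs than to shared machines.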
Unfortunately, Gitlab's documentation is missing some details, but you can find their step-by-step guide here.
Hope it helps! See ya next time.