Hacking on cloud-init
This document describes how to contribute changes to cloud-init.
It assumes you have a GitHub account, and refers to your GitHub user
Submitting your first pull request
Follow these steps to submit your first pull request to cloud-init:
To contribute to cloud-init, you must sign the Canonical contributor license agreement
- If you have already signed it as an individual, your Launchpad user will be listed in the contributor-agreement-canonical group. (Unfortunately there is no easy way to check if an organization or company you are doing work for has signed.)
- When signing it:
- ensure that you fill in the GitHub username field.
- when prompted for ‘Project contact’ or ‘Canonical Project Manager’, enter ‘Rick Harding’.
- If your company has signed the CLA for you, please contact us to help in verifying which Launchpad/GitHub accounts are associated with the company.
- For any questions or help with the process, please email Rick Harding with the subject, “Cloud-Init CLA”
- You also may contact user
#cloud-init channel on the Freenode IRC network.
Configure git with your email and name for commit messages.
Your name will appear in commit messages and will also be used in changelogs or release notes. Give yourself credit!:
git config user.name "Your Name"
git config user.email "Your Email"
Sign into your GitHub account
Fork the upstream repository on GitHub by clicking on the
Create a new remote pointing to your personal GitHub repository.
git clone git://github.com/canonical/cloud-init
cd cloud-init
git remote add GH_USER git@github.com:GH_USER/cloud-init.git
git push GH_USER master
Read through the cloud-init Code Review Process, so you understand how your changes will end up in cloud-init’s codebase.
Submit your first cloud-init pull request, adding yourself to the in-repository list that we use to track CLA signatures: tools/.github-cla-signers
- See PR #344 and PR #345 for examples of what this pull request should look like.
- Note that .github-cla-signers is sorted alphabetically.
- (If you already have a change that you want to submit, you can also include the change to tools/.github-cla-signers in that pull request; there is no need for two separate PRs.)
Transferring CLA Signatures from Launchpad to GitHub
For existing contributors who have signed the agreement in Launchpad
before the GitHub username field was included, we need to verify the
link between your Launchpad account and your GitHub account. To
enable us to do this, we ask that you create a branch with both your
Launchpad and GitHub usernames against both the Launchpad and GitHub
cloud-init repositories. We’ve added a tool
(tools/migrate-lp-user-to-github) to the cloud-init repository to
handle this migration as automatically as possible.
The cloud-init team will review the two merge proposals and verify that the CLA has been signed for the Launchpad user and record the associated GitHub account.
Do these things for each feature or bug
Create a new topic branch for your work:
git checkout -b my-topic-branch
Make and commit your changes (note, you can make multiple commits, fixes, more commits.):
Run unit tests and lint/formatting checks with tox:
Push your changes to your personal GitHub repository:
git push -u GH_USER my-topic-branch
Use your browser to create a merge request:
Open the branch on GitHub
You can see a web view of your repository and navigate to the branch at:
Click ‘Pull Request’
Fill out the pull request title, summarizing the change, and a longer message indicating important details about the changes included, like:
Activate the frobnicator.

The frobnicator was previously inactive and now runs by default.
This may save the world some day. Then, list the bugs you fixed
as footers with syntax as shown here.

The commit message should be one summary line of less than
74 characters followed by a blank line, and then one or more
paragraphs describing the change and why it was needed.

This is the message that will be used on the commit when it
is squashed and merged into trunk.

LP: #1
Note that the project continues to use LP: #NNNNN format for closing launchpad bugs rather than GitHub Issues.
Click ‘Create Pull Request’
Then, someone in the Ubuntu Server team will review your changes and follow up in the pull request. Look at the Code Review Process doc to understand the following steps.
Feel free to ping and/or join #cloud-init on Freenode IRC if you
have any questions.
This section captures design decisions that are helpful to know when hacking on cloud-init.
Cloud Config Modules
- Any new modules should use underscores in any new config options and not hyphens (e.g. new_option and not new-option).
cloud-init has both unit tests and integration tests. Unit tests can
be found in-tree alongside the source code, as well as in
tests/unittests. Integration tests can be found at
tests/integration_tests. Documentation specifically for integration
tests can be found on the Integration Testing page, but
the guidelines specified below apply to both types of tests.
cloud-init uses pytest to run its tests, and has tests written both as
unittest.TestCase sub-classes and as un-subclassed pytest tests.
The following guidelines should be followed:
For ease of organisation and greater accessibility for developers not familiar with pytest, all cloud-init unit tests must be contained within test classes
- Put another way, module-level test functions should not be used
pytest test classes should use pytest fixtures to share functionality instead of inheritance
As all tests are contained within classes, it is acceptable to mix TestCase test classes and pytest test classes within the same test file
- These can be easily distinguished by their definition: pytest classes will not use inheritance at all (e.g. TestGetPackageMirrorInfo), whereas TestCase classes will subclass (indirectly) from
pytest tests should use bare assert statements, to take advantage of pytest’s assertion introspection
- For == and other commutative assertions, the expected value should be placed before the value under test:
assert expected_value == function_under_test()
As we still support Ubuntu 16.04 (Xenial Xerus), we can only use pytest features that are available in v2.8.7. This is an inexhaustive list of ways in which this may catch you out:
- Support for using
pytest.fixture functions was only introduced in pytest 3.0. Such functions must instead use the
- Only the following built-in fixtures are available
- On xenial, the objects returned by the tmpdir fixture cannot be used where paths are required; they are rejected as invalid paths. You must instead use their
- For example, instead of util.write_file(tmpdir.join("some_file"), ...), you should write
- The pytest.param function cannot be used. It was introduced in pytest 3.1, which means it is not available on xenial. The more limited mechanism it replaced was removed in pytest 4.0, so is not available in focal or later. The only available alternatives are to write mark-requiring test instances as completely separate tests, without utilising parameterisation, or to apply the mark to the entire parameterized test (and therefore every test instance).
Variables/parameter names for MagicMock instances should start with m_ to clearly distinguish them from non-mock variables
- For example, m_readurl (which would be a mock for
assert_* methods that are available on MagicMock objects should be avoided, as typos in these method names may not raise AttributeError (and so can cause tests to silently pass). An important exception: if a Mock is autospecced then misspelled assertion methods will raise an AttributeError, so these assertion methods may be used on autospecced
Mocks, these substitutions can be used (
m is assumed to be a
assert mock.call(*args, **kwargs) in m.call_args_list
assert 0 != m.call_count
assert 1 == m.call_count
assert [mock.call(*args, **kwargs)] == m.call_args_list
assert mock.call(*args, **kwargs) == m.call_args_list[-1]
for call in call_list: assert call in m.call_args_list
m.assert_has_calls(..., any_order=False) are not easily replicated in a single statement, so their use when appropriate is acceptable.
assert 0 == m.call_count
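As a minimal sketch of the bare-assert style listed above, the following uses a plain (non-autospecced) Mock. The `notify` function and the `m_send` name are hypothetical and exist only for this illustration; only `unittest.mock` from the standard library is used.

```python
from unittest import mock


def notify(send, recipients):
    """Hypothetical function under test: call send() once per recipient."""
    for recipient in recipients:
        send(recipient, subject="hello")


# m_ prefix marks this as a mock, per the naming guideline above.
m_send = mock.Mock()
notify(m_send, ["alice", "bob"])

# A particular call happened at some point.
assert mock.call("alice", subject="hello") in m_send.call_args_list
# The mock was called at least once.
assert 0 != m_send.call_count
# The mock was called exactly twice.
assert 2 == m_send.call_count
# The complete, ordered list of calls.
expected_calls = [
    mock.call("alice", subject="hello"),
    mock.call("bob", subject="hello"),
]
assert expected_calls == m_send.call_args_list
# The most recent call.
assert mock.call("bob", subject="hello") == m_send.call_args_list[-1]
```

Each statement compares against `call_count` or `call_args_list`, which are ordinary attributes, so a failing comparison is reported by the assert itself rather than depending on an assertion method name being spelled correctly.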
Test arguments should be ordered as follows:
- mock.patch arguments. When used as a decorator, mock.patch partially applies its generated Mock object as the first argument, so these arguments must go first.
- pytest.mark.parametrize arguments, in the order specified to the parametrize decorator. These arguments are also provided by a decorator, so it’s natural that they sit next to the
- Fixture arguments, alphabetically. These are not provided by a decorator, so they are last, and their order has no defined meaning, so we default to alphabetical.
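A minimal sketch of the mock.patch part of this ordering, using only the standard library. The test class and the choice of `os.getcwd` and `os.getpid` as patch targets are assumptions made for illustration. In a real pytest run, parametrize arguments and then fixture arguments would follow the mock arguments in the signature.

```python
import os
from unittest import mock


class TestWhereAmI:
    # Stacked patch decorators are applied bottom-to-top, so the mock from
    # the bottom decorator is the first mock argument after self.
    @mock.patch("os.getpid")  # top decorator: second mock argument
    @mock.patch("os.getcwd")  # bottom decorator: first mock argument
    def test_patched_values(self, m_getcwd, m_getpid):
        m_getcwd.return_value = "/fake/dir"
        m_getpid.return_value = 4242

        assert "/fake/dir" == os.getcwd()
        assert 4242 == os.getpid()
        assert 1 == m_getcwd.call_count
        assert 1 == m_getpid.call_count


# Called directly here so the sketch runs without a test runner.
TestWhereAmI().test_patched_values()
```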
It follows from this ordering of test arguments (so that we retain the property that arguments left-to-right correspond to decorators bottom-to-top) that test decorators should be ordered as follows:
When there are multiple patch calls in a test file for the module it is testing, it may be desirable to capture the shared string prefix for these patch calls in a module-level variable. If used, such variables should be named
M_PATH or, for datasource tests,
The cloud-init codebase uses Python’s annotation support for storing
type annotations in the style specified by PEP-484. Their use in
the codebase is encouraged but with one important caveat: types from
the typing module cannot be used.
cloud-init still supports Python 3.4, which doesn’t have the
module in the stdlib. This means that the use of any types from the
typing module in the codebase would require installation of an
additional Python module on platforms using Python 3.4. As such
platforms are generally in maintenance mode, the introduction of a new
dependency may act as a break in compatibility in practical terms.
Similarly, only function annotations are appropriate for use, as the variable annotations specified in PEP-526 were introduced in Python 3.6.
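A minimal sketch of annotations that stay within these constraints: function annotations only, using built-in types rather than anything from the typing module. The function name, its parameters, and the example URL are hypothetical.

```python
def mirror_for_region(region: str, templates: list, default: str = "") -> str:
    """Return the first template that uses a region placeholder, filled in."""
    # Deliberately no variable annotations here (e.g. "found: str = ..."),
    # since those are the PEP-526 syntax mentioned above.
    for template in templates:
        if "%(region)s" in template:
            return template % {"region": region}
    return default


result = mirror_for_region("eu", ["http://%(region)s.example.invalid/"])
```

The trade-off is that `list` cannot say what the list contains, which is one way in which annotations written under these constraints may be incomplete.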
This list of fixtures (with markup) can be reproduced by running:
py.test-3 --fixtures -q | grep "^[^ -]" | grep -v '\(no\|capturelog\)' | sort | sed 's/.*/* ``\0``/'
in a xenial lxd container with python3-pytest-catchlog installed.
Feature flags are used as a way to easily toggle configuration at build time. They are provided to accommodate feature deprecation and downstream configuration changes.
Currently used upstream values for feature flags are set in cloudinit/features.py. Overrides to these values (typically via quilt patch) can be placed in a file called feature_overrides.py in the same directory. Any value set in feature_overrides.py will override the original value set
Each flag should include a short comment regarding the reason for the flag and intended lifetime.
Tests are required for new feature flags, and tests must verify all valid states of a flag, not just the default state.
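The override and testing requirements can be sketched as follows. This is a self-contained illustration and not cloud-init's actual mechanism: the flag name `EXAMPLE_ERROR_ON_FAILURE`, the `load_features` helper, and `handle_failure` are all hypothetical, and a dictionary stands in for the two Python files.

```python
from types import SimpleNamespace

# Hypothetical upstream default, standing in for a value in features.py.
# Reason for the flag and its intended lifetime would be noted here.
UPSTREAM_DEFAULTS = {"EXAMPLE_ERROR_ON_FAILURE": True}


def load_features(overrides=None):
    """Return flag values, with any override replacing the default."""
    values = dict(UPSTREAM_DEFAULTS)
    values.update(overrides or {})
    return SimpleNamespace(**values)


def handle_failure(features, message):
    """Raise when the flag is True, otherwise return a warning string."""
    if features.EXAMPLE_ERROR_ON_FAILURE:
        raise RuntimeError(message)
    return "WARNING: " + message


class TestHandleFailure:
    """Covers both valid states of the flag, not just the default."""

    def test_default_state_raises(self):
        features = load_features()
        try:
            handle_failure(features, "boom")
        except RuntimeError as exc:
            assert "boom" == str(exc)
        else:
            raise AssertionError("expected RuntimeError")

    def test_overridden_state_warns(self):
        features = load_features({"EXAMPLE_ERROR_ON_FAILURE": False})
        assert "WARNING: boom" == handle_failure(features, "boom")
```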
When configuring apt mirrors, if
True cloud-init will detect that a datasource’s availability_zone property looks like an EC2 availability zone and set the ec2_region variable when generating mirror URLs; this can lead to incorrect mirrors being configured in clouds whose AZs follow EC2’s naming pattern.
As of 20.3,
False so we no longer include ec2_region in mirror determination on non-AWS cloud platforms.
If the old behavior is desired, users can provide the appropriate mirrors via apt: directives in cloud-config.
If there is a failure in obtaining user data (i.e., #include or decompress fails) and
False, cloud-init will log a warning and proceed. If it is
True, cloud-init will instead raise an exception.
As of 20.3,
(This flag can be removed after Focal is no longer supported.)
This captures ongoing refactoring projects in the codebase. This is intended as documentation for developers involved in the refactoring, but also for other developers who may interact with the code being refactored in the meantime.
cloudinit.net was imported from the curtin codebase as a chunk, and
then modified enough that it integrated with the rest of the cloud-init
codebase. Over the ~4 years since, the fact that it is not fully
integrated into the
Distro hierarchy has caused several issues.
The common pattern of these problems is that the commands used for
networking are different across distributions and operating systems.
This has led to
cloudinit.net developing its own “distro
determination” logic: get_interfaces_by_mac is probably the clearest
example of this. Currently, these differences are primarily split
along Linux/BSD lines. However, it would be short-sighted to only
refactor in a way that captures this difference: we can anticipate that
differences will develop between Linux-based distros in future, or
there may already be differences in tooling that we currently
work around in less obvious ways.
The high-level plan is to introduce a hierarchy of networking classes
cloudinit.distros.networking, which each
will reference. These will capture the differences between networking
on our various distros, while still allowing easy reuse of code between
distros that share functionality (e.g. most of the Linux networking
Distro objects will instantiate the networking classes
self.networking, so callers will call
distro.networking.<func> instead of
will necessitate access to an instantiated
An implementation note: there may be external consumers of the
cloudinit.net module. We don’t consider this a public API, so we
will be removing it as part of this refactor. However, we will ensure
that the new API is complete from its introduction, so that any such
consumers can move over to it wholesale. (Note, however, that this new
API is still not considered public or stable, and may not replicate the
existing API exactly.)
In more detail:
- The root of this hierarchy will be the cloudinit.distros.networking.Networking class. This class will have a corresponding method for every cloudinit.net function that we identify to be involved in refactoring. Initially, these methods’ implementations will simply call the corresponding cloudinit.net function. (This gives us the complete API from day one, for existing consumers.)
- As the biggest differentiator in behaviour, the next layer of the hierarchy will be two subclasses:
BSDNetworking. These will be introduced in the initial PR.
- When a difference in behaviour for a particular distro is identified,
Networking subclass will be created. This new class should generally subclass either
- To be clear: Networking subclasses will only be created when needed; we will not create a full hierarchy of per-
Distro class will have a class variable (cls.networking_cls) which points at the appropriate networking class (initially this will be either
Distro classes are instantiated, they will instantiate cls.networking_cls and store the instance at self.networking. (This will be implemented in
- A helper function will be added which will determine the appropriate Distro subclass for the current system, instantiate it and return its networking attribute. (This is the entry point for existing consumers to migrate to.)
- Callers of refactored functions will change from calling
distro is an instance of the appropriate Distro class for this system. (This will require making such an instance available to callers, which will constitute a large part of the work in this project.)
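The structure described in this list can be sketched roughly as below. This is a simplified, self-contained illustration and not the real cloud-init code: `legacy_is_up` is a hypothetical stand-in for a cloudinit.net function, `LinuxNetworking` is an assumed name for the Linux-side subclass, and `ExampleBSDDistro` is invented for the example.

```python
def legacy_is_up(devname):
    """Hypothetical stand-in for an existing cloudinit.net function."""
    return devname == "eth0"


class Networking:
    """Root of the hierarchy: one method per function being refactored."""

    def is_up(self, devname: str) -> bool:
        # Initially the method only delegates to the legacy function.
        return legacy_is_up(devname)


class LinuxNetworking(Networking):  # name assumed for this sketch
    pass


class BSDNetworking(Networking):
    pass


class Distro:
    # Class variable pointing at the networking class for this distro.
    networking_cls = LinuxNetworking

    def __init__(self):
        # Instantiated once per Distro instance, stored at self.networking.
        self.networking = self.networking_cls()


class ExampleBSDDistro(Distro):  # hypothetical distro for illustration
    networking_cls = BSDNetworking


distro = ExampleBSDDistro()
is_up = distro.networking.is_up("eth0")
```

Callers then reach networking behaviour through a Distro instance (`distro.networking.is_up(...)` here), which is why making such an instance available is a large part of the work.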
After the initial structure is in place, the work in this refactor will
consist of replacing the
cloudinit.net.some_func call in each
cloudinit.distros.networking.Networking method with the actual
implementation. This can be done incrementally, one function at a time:
- pick an unmigrated
- find it in the list of bugs tagged net-refactor and assign yourself to it (see Managing Work/Tracking Progress below for more details)
- refactor all of its callers to call the
Distro instead of the cloudinit.net.<func> function. (This is likely to be the most time-consuming step, as it may require plumbing Distro objects through to places that previously have not consumed them.)
- refactor its implementation from
Networking hierarchy (e.g. if it has an if/else on BSD, this is the time to put the implementations in their respective subclasses)
- if part of the method contains distro-independent logic, then you may need to create new methods to capture this distro-specific logic; we don’t want to replicate common logic in different
- if after the refactor, the method on the root Networking class no longer has any implementation, it should be converted to an abstractmethod
- ensure that the new implementation has unit tests (either by moving existing tests, or by writing new ones)
- ensure that the new implementation has a docstring
- add any appropriate type annotations
- note that we must follow the constraints described in the “Type Annotations” section above, so you may not be able to write complete annotations
- we have type aliases defined in cloudinit.distros.networking which should be used when applicable
- finally, remove it (and any other now-unused functions) from cloudinit.net (to avoid having two parallel implementations)
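As a rough sketch of where one migrated function might end up after these steps: the distro-specific part lives in each subclass, the shared part stays on the root class, and the root method with no remaining implementation becomes an abstractmethod. The method names, the `LinuxNetworking` class name, and the placeholder prefix checks are all assumptions for illustration, not real cloud-init logic.

```python
import abc


class Networking(abc.ABC):
    def describe(self, devname: str) -> str:
        """Distro-independent logic, written once on the root class."""
        state = "up" if self.is_up(devname) else "down"
        return "{} is {}".format(devname, state)

    @abc.abstractmethod
    def is_up(self, devname: str) -> bool:
        """Return True if the named device is up."""


class LinuxNetworking(Networking):  # name assumed for this sketch
    def is_up(self, devname: str) -> bool:
        """Placeholder Linux behaviour: treat eth* devices as up."""
        return devname.startswith("eth")


class BSDNetworking(Networking):
    def is_up(self, devname: str) -> bool:
        """Placeholder BSD behaviour: treat vtnet* devices as up."""
        return devname.startswith("vtnet")


linux_view = LinuxNetworking().describe("eth0")
bsd_view = BSDNetworking().describe("eth0")
```

Because `is_up` is abstract, instantiating `Networking` directly raises `TypeError`, so a subclass that forgets to provide an implementation fails early.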
The functions/classes that need refactoring break down into some broad categories:
- helpers for accessing /sys (that should not be on the top-level Networking class as they are Linux-specific):
- those that directly access /sys (via helpers) and should (IMO) be included in the API of the
config_driver parameter is used and passed as a boolean, so we can change the default value to
- those that directly access /sys (via helpers) but may be Linux-specific concepts or names:
- those that directly use
- this has non-distro-specific logic so should potentially be refactored to use helpers on
ip directly (rather than being wholesale reimplemented in each of
- we can also remove the check_downable argument, it’s never specified so is always
- this has several internal helper functions which use ip directly, and it calls
_get_current_rename_info. That said, there appears to be a lot of non-distro-specific logic that could live in a function on Networking, so this will require some careful refactoring to avoid duplicating that logic in each of
- only the
current_info parameters are ever passed in (and current_info only by tests), so we can remove the others from the definition
- this is another case where it mixes distro-specific and non-specific functionality. Specifically,
__exit__ are non-specific, and the remaining methods are distro-specific.
- when refactoring this, the need to track cleanup_cmds likely means that the distro-specific behaviour cannot be captured only in the Networking class. See this comment in PR #363 for more thoughts.
- those that implicitly use /sys via their call dependencies:
- appends to
get_master return value, which is a
- there is already a Distro.apply_network_config_names which in the default implementation calls this function; this and its BSD subclass implementations should be refactored at the same time
strict_busy parameters are never passed, nor are they used in the function definition, so they can be removed
- those that may fall into the above categories, but whose use is only
related to netfailover (which relies on a Linux-specific network
driver, so is unlikely to be relevant elsewhere without a substantial
refactor; these probably only need implementing in
- this is called from
- N.B. all of these take an optional driver argument which is used to pass around a value to avoid having to look it up by calling device_driver every time. This is something of a leaky abstraction, and is better served by caching on device_driver or storing the cached value on self, so we can drop the parameter from the new API.
- those that use /sys (via helpers) and have non-exhaustive BSD logic:
- those that already have separate Linux/BSD implementations:
- those that have no OS-specific functionality (so do not need to be
Note that the functions in
cloudinit.net use inconsistent parameter
names for “string that contains a device name”; we can standardise on
devname (the most common one) in the refactor.
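The N.B. above about dropping the optional driver argument suggests storing the looked-up value on self. One way that could look is sketched below. The class, the injected `lookup_driver` callable, and the `has_driver` method are hypothetical, and the sketch uses devname as the parameter name in line with the standardisation just mentioned.

```python
class ExampleNetworking:
    """Hypothetical networking class that caches driver lookups on self."""

    def __init__(self, lookup_driver):
        # lookup_driver stands in for whatever actually reads the driver.
        self._lookup_driver = lookup_driver
        self._driver_cache = {}

    def device_driver(self, devname: str) -> str:
        """Return the driver for devname, looking it up at most once."""
        if devname not in self._driver_cache:
            self._driver_cache[devname] = self._lookup_driver(devname)
        return self._driver_cache[devname]

    def has_driver(self, devname: str, wanted: str) -> bool:
        """No driver parameter is needed: the cached lookup is cheap."""
        return wanted == self.device_driver(devname)


lookups = []


def fake_lookup(devname):
    """Record each real lookup so the caching is observable."""
    lookups.append(devname)
    return "virtio_net"


net = ExampleNetworking(fake_lookup)
first = net.has_driver("eth0", "virtio_net")
second = net.has_driver("eth0", "e1000")
```

The cache here lives for the lifetime of the instance; whether that is appropriate for devices whose driver can change is a design question this sketch does not address.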
Managing Work/Tracking Progress
To ensure that we won’t have multiple people working on the same part of the refactor at the same time, there is a bug for each function. You can see the current status by looking at the list of bugs tagged net-refactor.
When you’re working on refactoring a particular method, ensure that you have assigned yourself to the corresponding bug, to avoid duplicate work.
Generally, when considering what to pick up to refactor, it is best to
start with functions in
cloudinit.net which are not called by
anything else in
cloudinit.net. This allows you to focus only on
refactoring that function and its callsites, rather than having to
update the other
cloudinit.net functions also.
- Mina Galić’s email to the cloud-init ML in 2018 (plus its thread)
- Mina Galić’s email to the cloud-init ML in 2019 (plus its thread)
- PR #363, the discussion which prompted finally starting this refactor (and where a lot of the above details were hashed out)