| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |/ /
| |
| |
| |
| |
| |
| |
| |
| | |
We don't have a really working way to detect root UUID for whole
disk images at the moment, which results in an ignored traceback
every time install_bootloader is called with whole disk images in
UEFI mode. Avoid it by skipping GRUB2 if root UUID is unknown.
Change-Id: I84245538f59c664b72d1cafbca8d61be0978f489
|
| |\ \
| |/
|/| |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A transitory connection failure, such as one caused by
a port being held down for traffic forwarding, can experience
intermittent connectivity failures which result in failed
introspections.
Now the agent retries.
Change-Id: I72c5e3aca000d3854a17f8a461b1a2935e5c0d9b
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Scanning the output of mdadm commands for RAID members will
miss component devices which are currently not part of the
RAID. For proper cleaning it is better to scan block devices
for a signature of the md device for which we would like to
get the components.
Story: #2008186
Task: #40947
Change-Id: Ib46612697851e36a16d272ccaeb0115106253863
|
| |\ \ \ |
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Heartbeating in IPA has used select.poll() for years to workaround
a bug where changing the time in the ramdisk could cause heartbeats
to stop and never resume.
Now that IPA syncs time at start and exit, this workaround is no
longer needed. So instead, we'll revert to using threading.Event()
in order to make the code simpler and easier to understand.
Since we need this to be an eventlet-event, and not a standard-thread
event, also monkey_patch threading.
Additionally, there were a few completely unused backoff interval
values set, that were never applied. In respect of maintaining the
5+ years old behavior of not doing error backoffs, that code was
removed instead of being made to work.
Change-Id: Ibcde99de64bb7e95d5df63a42a4ca4999f0c4c9b
|
| | |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Partions on the holder disk should only be deleted after
all RAID devices have been deleted. Otherwise, super blocks
on partitions which reside on the same disks cannot be cleaned.
Story: #2008199
Task: #40979
Change-Id: I19293f5b992cd1fa68957d6f306dcec8f3b7a820
|
| |\ \ \ |
|
| | |/ /
| | |
| | |
| | |
| | |
| | | |
Also make this API return a proper HTTP code (409 instead of 500).
Change-Id: I5d86878b5ed6142ed2630adee78c0867c49b663f
|
| |\ \ \ |
|
| | | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently, IntelCnaHardwareManager inherits GenericHardwareManager
which makes it a new "GenericHardwareManager" with "MAINLINE" priority.
This causes all other hardware-managers with lower priority than
"MAINLINE" never be used. To fix this, make IntelCnaHardwareManager
inherit basic HardwareManager.
Change-Id: I28b665d8841b0b2e83b132e1f25df95e03e7ba10
Story: 2008142
Task: 40882
|
| |\ \ \
| |/ /
|/| | |
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Adds a new flag (on by default) that enables generating a TLS
certificate and sending it to ironic via heartbeat. Whether
ironic supports auto-generated certificates is determined by
checking its API version.
Change-Id: I01f83dd04cfec2adc9e2a6b9c531391773ed36e5
Depends-On: https://review.opendev.org/747136
Depends-On: https://review.opendev.org/749975
Story: #2007214
Task: #40604
|
| |\ \ |
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The node lookup code added in change
I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
was slightly broken in that we call a method
with a keyword arguemnt which doesn't exist.
uuid versus node_uuid.
It happens, it is a quick fix!
Spotted on a metalsmith job:
[-] Agent is requesting to perform an explicit node cache update.
This is to pickup any chanages in the cache before deployment.
[-] Failed to update node cache. Error lookup_node() got an
unexpected keyword argument 'uuid'
Change-Id: I59ecec65707a2f03918b233f1925395ebe59b8c4
|
| |\ \
| |/
|/| |
|
| | |
| |
| |
| |
| |
| |
| | |
The latter has a more natural API and does not have a hard requirement
of eventlet. It is already a dependency of ironic-lib.
Change-Id: I68de9e989af137b34c19bbaf9b7c0a5ba6e1d4e3
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
Apparently, functional-py36 just runs unit tests.
Fix the test that has regressed in the meantime and make it voting
so that we don't regress again.
Change-Id: Id5efe89a12a00c27e6299380a51cdb840285d691
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This change enables operators to set [DEFAULT]listen_tls to
true configure IPA to be host its WSGI server over TLS using
existing SSL support in oslo.service.
In addition to configuring this in IPA, a deployer will need to
also set [ssl]cert_file, [ssl]key_file, and optionally
[ssl]ca_file in their ipa config, in addition to embedding those
files into the IPA ramdisk in order for this to be functional.
In order to make this change work, we also need to monkey patch
socket library early, or else oslo.service will end up passing an
unpatched socket to the eventlet wsgi server, which causes
deadlocks.
Change-Id: Ib7decae410915f3c27b045ee08538c94d455b030
|
| |\ \ \
| |/ /
|/| | |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Or at least try to.
Some deployments just don't use root device hints, and this is okay.
However, other deployments need root device hints, and with fast
track mode in ramdisks, we created a situation where the node cache
could be updated by a human or software between the time the agent
was started, and the deployment was requested.
As a result, the agent has been updated to check if we have a hint
and if we don't, update the cache from the node lookup endpoint.
This is not needed when the inband deploy steps are executed, as
the process of updating the steps does force the node cache to be
updated.
Change-Id: I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
Story: 2008039
Task: 40701
|
| | |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The listen_port and listen_host directives are intended to allow
deployers of IPA to change the port and host IPA listens on. These
configs have not been obeyed since the migration to the oslo.service
wsgi server.
Story: 2008016
Task: 40668
Change-Id: I76235a6e6ffdf80a0f5476f577b055223cdf1585
|
| |\ \ |
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Heartbeat connection errors are often a sign of a transitory
network failures which may resolve themselves. But an operator
looking at the screen doesn't necessarilly know that.
They don't understand that there could have been a network
failure, or a misconfiguration that caused the connectivity
failure and soft of kind of default to "well it failed"
without further clarification.
As such, this patch adds explicit catching of the requests
ConnectionError exception and rasies a new internal error
with a more verbose error message in that event to provide
operators with additional clarity.
Change-Id: I4cb2c0d1f577df1c4451308bd86efa8f94390b0c
Story: 2008046
Task: 40709
|
| |/
|
|
|
|
|
| |
It's incredibly helpful when debugging and most of consumers seem
to enable and rely on it.
Change-Id: I33bf58b3eb16b63b70f2a23e8a04449dc88fd94c
|
| |
|
|
|
|
|
|
| |
It can be done via ipa-global-request-id kernel commandline parameter.
Story: 2007681
Task: 39792
Change-Id: I6f544327d310c976a1625cfb411947591867882a
|
| |\ |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
Adds a new kernel parameter for manual configuration and also creates
foundation for automatic TLS support later.
Change-Id: If341c3a8a268fc8cab6bd6be04b12ca32b31c8d8
Story: #2007214
Task: #40619
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Agent lookups can fail as we presently use logging.exception,
better known in our code as LOG.exception, which can also generate
other fun issues on journald based systems where additional errors
could be raised resulting in us being unable to troubleshoot the
the actual issue.
Because of the mis-use of LOG.exception and the default behavior
of the backoff retry handler, the retry logic was also not
functional as any error no matter how small caused IPA to
just exit.
Change-Id: Ic4608b7c6ff9773d1403926efb3d59869c71343b
Story: 2007968
Task: 40465
|
| |/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Collects PCI class, revision, and bus information for the pci-devices
collector, these metrics as well as vendor id and device id are
components which can be used to construct device information like
lspci output, which is how cyborg agent collects accelerator devices.
Accelerator device based scheduling is possible after ironic has such
information in place.
Change-Id: I6c37c554f37dd5f1d21c8fd4fad2a4f44a3c75d7
Story: 2007971
Task: 40474
|
| |\ \
| |/
|/| |
|
| | |
| |
| |
| |
| |
| |
| | |
AgentRAID expects it and fails with TypeError if it's not provided.
Change-Id: Id84ac129bba97540338e25f0027aa0a0f51bde52
Story: #2006963
|
| |/
|
|
|
| |
Change-Id: I75f156dd76b0e3aaa1592ba24fe42fb2a7057cc8
Story: #2006963
|
| |\ |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When we added software raid support, we started calling bootloader
installation. As time went on, we ehnanced that code path for non
RAID cases in order to ensure that UEFI nvram was setup
for the instance to boot properly.
Somewhere in this process, we missed a possible failure case where
the iscsi client tgtadm may return failures. Obviously, the correct
path is to not call iscsi teardown if we don't need to.
Since it was always semi-opportunistic teardown, we can't blindly
catch any error, and if we started iSCSI and failed to tear the
connection down, we might want to still fail, so this change
moves the logic over to use a flag on the agent object which
one extension to set the flag and the other to read it and take
action based upon that.
Change-Id: Id3b1ae5e59282f4109f6246d5614d44c93aefa7c
Story: 2007937
Task: 40395
|
| |\ \ |
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When no root_device hint is set, an MDRAID partition can be incorrectly
selected as the root device which causes installation of the bootloader
to the physical disks behind the MDRAID volume to fail. See the notes
in the referenced Story for more detail.
This change adds a little more specificity to the listing of block
devices.
Change-Id: I66db457e71a0586723ee753bef961aec5bf58827
Story: 2007905
Task: 40303
|
| |\ \
| |/
|/| |
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Adds a new poll extension to provide get_hardware_info and get_node_info
interfaces.
get_hardware_info will be used for node validation by ironic deploy
drivers.
get_node_info will be used for sending lookup data to IPA.
standalone mode is assumed as debug only, but it's not the case
considering the poll mode will be introduced, slightly updates the
description, also prevents the mdns lookup when standalone is true.
Story: 1526486
Task: 28724
Change-Id: I5ad772a18cc4584585c5a7b6fb127547cece1998
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
delete_configuration still fetches all devices as it needs to clean
ones with broken RAID.
Story: #2007907
Task: #40307
Change-Id: I4b0be2b0755108490f9cd3c4f3b71a5e036761a1
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Caches hardware information collected during inspection
so that the initial lookup can occur without any delay.
Also adds logging to track how long inventory collection takes.
Co-Authored-By: Dmitry Tantsur <dtantsur@protonmail.com>
Change-Id: I3e0d237d37219e783d81913fa6cc490492b3f96a
|
| |\ \ \
| |/ /
|/| | |
|
| | | |
| | |
| | |
| | |
| | |
| | | |
Change-Id: If1408e4b81d263c56b4bbab618dd0737db5f762e
Story: #2007889
Task: #40268
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | | |
This has been a popular guidance, and diskimage-builder has recently
started following it.
Change-Id: I794c846fb191c15b0a30546bf64d624dfbde0fd4
|
| |\ \ \
| |/ /
|/| | |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In order to ensure grub2 finds all files it needs, mount all
vfat partitions specified in the deployed image.
Story: #2007618
Task: #39629
Change-Id: Ie5b6e0abc3f266409562f9ecb26538126b667056
|