| Commit message (Collapse) | Author | Age | Files | Lines |
| |
The tl;dr is that UEFI NVRAM is encoded
in UTF-16, and when we run the efibootmgr command,
we can get unicode characters back.
Except we previously were forcing everything to be
treated as UTF-8 due to the way oslo.concurrency's
processutils module works.
This could be observed with UTF character 0x00FF
which raises up a nice exception when we try to
decode it.
Anyhow! While fixing the handling of this, we discovered
we could get what is basically cruft out of the NVRAM,
by getting what was most likely a truncated string
out of our own test VMs. As such, we also need to
permit decoding to be tolerant of failures.
This could be binary data or as simple as flipped
bits which get interpreted as invalid characters.
As such, we have introduced such data into one of our
tests involving UEFI record de-duplication.
NOTE: One of the unit tests from the stable/xena backport
was removed, as software raid was still in-flight at the
end of Wallaby.
Closes-Bug: 2015602
Change-Id: I006535bf124379ed65443c7b283bc99ecc95568b
(cherry picked from commit 76accfb880474445a5dcb07825889123b3dd0237)
(cherry picked from commit 9f84c8b3d1fa0e08bf1f799f37a11698f8da07a4)
(cherry picked from commit d77424d7315e24390d6b159eab8dd9b3d4c56942)
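A minimal sketch of the tolerant decoding described in this commit, assuming the raw efibootmgr output is available as bytes and is little endian without a BOM. The function name decode_efibootmgr_output is hypothetical and is not the agent's actual helper.

```python
def decode_efibootmgr_output(raw: bytes) -> str:
    """Decode UTF-16 output, dropping byte sequences that cannot be decoded."""
    # errors='ignore' skips truncated or otherwise invalid code units
    # instead of raising UnicodeDecodeError.
    # Assumption: little-endian data without a byte order mark.
    return raw.decode('utf-16-le', errors='ignore')


# U+00FF encodes to the bytes FF 00 in UTF-16-LE; 0xFF is never valid UTF-8.
raw = 'Boot0001* \u00ff'.encode('utf-16-le')
try:
    raw.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# A trailing odd byte simulates a truncated NVRAM string.
truncated = 'Boot0001'.encode('utf-16-le') + b'\xff'
decoded = decode_efibootmgr_output(truncated)
```

Whether to ignore or replace undecodable data is a design choice; ignoring keeps the remaining text comparable when de-duplicating records.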
|
In case no BOM is present in the CSV file the utf-16 codec won't work.
We fail over to utf-16-le as Little Endian is commonly used.
NOTE: The original change landed this fix in efi_utils.py, however
that was introduced after the Xena development cycle, so this
backport moves the original change to where the code originally
came from to populate the efi_utils.py file.
Change-Id: I3e25ce4997f5dd3df87caba753daced65838f85a
(cherry picked from commit 697fa6f3b6db10408eaadd57450456de87f13519)
(cherry picked from commit 99b9d1403cacd3bbd489a6cf6913f32746ef6083)
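The fallback described in this commit can be sketched roughly as follows. The function name read_bootloader_csv, the file name and the sample line are illustrative assumptions, not the agent's code.

```python
import os
import tempfile


def read_bootloader_csv(path: str) -> str:
    """Read a UTF-16 CSV file, falling back to little endian without a BOM."""
    try:
        with open(path, 'r', encoding='utf-16') as csv_file:
            return csv_file.read()
    except UnicodeError:
        # The utf-16 stream decoder refuses input that lacks a BOM.
        with open(path, 'r', encoding='utf-16-le') as csv_file:
            return csv_file.read()


# Write a BOM-less little-endian file to exercise the fallback.
line = 'shimx64.efi,Example Linux,,Example boot entry\n'
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'BOOTX64.CSV')
    with open(csv_path, 'wb') as f:
        f.write(line.encode('utf-16-le'))
    contents = read_bootloader_csv(csv_path)
```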
|
While investigating another grub issue, I was confused by the path
taken in the logs reported, and noticed that on a ramdisk, we might
not actually have a valid response to os.path.ismount, I'm guessing
depending on which in-memory filesystem is in use while also coupled
with attempting to check a filesystem.
Adds a test to validate that exceptions raised on these commands
where this issue can be encountered, are properly bypassed, and also
adds additional logging to make it easier to figure out what is
going on in the entire bootloader setup sequence.
Change-Id: Ibd3060bef2e56468ada6b1a5c1cc1632a42803c3
(cherry picked from commit e5d552474b21137ae2a66f17bdab5fc1bbf31ec6)
|
Recent releases of redhat grub2 will always fail when installing to
EFI paths, to encourage a transition to the signed shim bootloader.
Partition image deploys avoid calling grub2-install with the
preserve-efi-assets functions. Deploying whole disk images doesn't
require grub2-install. This leaves whole disk images installed onto
softraid devices as the case which still attempts to call grub2-install.
This change will still attempt to run grub2-install in this
one remaining case, but will ignore any failure.
A future enhancement can avoid calling grub2-install entirely so that
non-redhat secure-boot capable images can keep their signed
bootloaders.
Story: 2008923
Task: 42521
Change-Id: If432ef795d64d76442d739eb4f7d155ff847041e
(cherry picked from commit a057be7dadc898ec813b2cac14913cd8523fbbcc)
|
The logic of adding a partition number to the device path does not work
for devicemapper devices (e.g. a multipath storage device).
Conflicts:
ironic_python_agent/efi_utils.py
ironic_python_agent/extensions/image.py
ironic_python_agent/tests/unit/extensions/test_image.py
ironic_python_agent/tests/unit/test_efi_utils.py
Change-Id: I9a445e847d282c50adfa4bad5e7136776861005d
(cherry picked from commit f09f6c9f1a09c7062d0450b3e0a4d3164fd53f7f)
|
Depending on how the stars align with partition images
being written to a remote system, we *may* end up with
*either* a Partition UUID value, or a Partition's UUID value.
Which are distinctly different.
This is because the value, when collected as a result of writing
an image to disk *falls* back and passes the value to enable
partition discovery and matching.
Later on, when we realized we ought to create an fstab entry,
we blindly re-used the value thinking it was, indeed, always
a Partition's UUID and not the Partition UUID. Obviously,
the label type is quite explicit, either UUID or PARTUUID
respectively, when initial ramdisk utilities such as dracut
are searching and mounting filesystems.
Adds capability to identify the correct label to utilize
based upon the current state of the block devices on disk.
Granted, we are likely only exposed to this because of IO
race conditions under high concurrency load operations.
Normally this would only be seen on test VMs, but
systems being backed by a Storage Area Network *can*
exhibit the same IO race conditions as virtual machines.
Change-Id: I953c936cbf8fad889108cbf4e50b1a15f511b38c
Resolves: rhbz#2058717
Story: #2009881
Task: 44623
(cherry picked from commit 99ca1086dbfc7b6e41cf800b0bd899565e2e8922)
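A rough sketch of choosing the fstab label from the current block device state. The device dictionaries, their keys, the function names and the fstab mount options are assumptions for illustration only.

```python
from typing import Dict, List, Optional


def fstab_label_for(value: str,
                    devices: List[Dict[str, str]]) -> Optional[str]:
    """Return 'UUID' or 'PARTUUID' depending on which field matches value."""
    for dev in devices:
        if dev.get('uuid') == value:
            return 'UUID'        # filesystem UUID
        if dev.get('partuuid') == value:
            return 'PARTUUID'    # partition table entry UUID
    return None


def efi_fstab_line(value: str,
                   devices: List[Dict[str, str]]) -> Optional[str]:
    """Build an fstab line for the ESP, or None when the value is unknown."""
    label = fstab_label_for(value, devices)
    if label is None:
        return None
    return '%s=%s\t/boot/efi\tvfat\tumask=0077\t0\t1\n' % (label, value)


# Hypothetical device listing, shaped like data a tool such as lsblk reports.
devices = [{'name': 'sda1', 'uuid': 'ABCD-1234', 'partuuid': '0f1e2d3c-01'}]
```

With this data, a value of ABCD-1234 yields a UUID= line, while 0f1e2d3c-01 yields a PARTUUID= line.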
|
I accidentally put colons on the test data and remembered taking the
colon character out of the regex I was working on, but apparently
left it in, and accounted for the active entry indicator flag
which appears to have inconsistent support across vendors.
The regex has been fixed, and a test added from a Lenovo SR650
which has some additional string entry data in the UEFI output
which may separate entries.
Change-Id: I1f67b0fb1f645fa82e98bd7c7bba3ffc7755cc74
(cherry picked from commit e10f052c06c03016b0ff4d9c1f3191c79fc50a1a)
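A sketch of a pattern that treats the active entry indicator as optional. The regular expression, the function name and the sample output are hypothetical; they are not the agent's actual regex or real firmware output.

```python
import re

# Hypothetical pattern: "Boot" + four hex digits, an optional "*" marking an
# active entry, whitespace, then the rest of the line.
ENTRY_RE = re.compile(r'^Boot([0-9a-fA-F]{4})(\*?)\s+(.*)$')


def parse_entries(output: str):
    """Return (number, active, label) tuples for lines that look like entries."""
    entries = []
    for line in output.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue  # BootCurrent:, BootOrder: and other lines
        number, star, rest = match.groups()
        # Assumption: the label is separated from the device path by a tab.
        label = rest.split('\t', 1)[0].strip()
        entries.append((number, star == '*', label))
    return entries


sample = (
    'BootCurrent: 0001\n'
    'BootOrder: 0001,0002\n'
    'Boot0001* example-os\tHD(1,GPT)/File(\\EFI\\example\\shimx64.efi)\n'
    'Boot0002 example-net\tPciRoot(0x0)/Pci(0x1,0x0)\n'
)
```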
|
Some firmware seems to take an objection with EFI nvram
entries being deleted after one is added, resulting in the
entire entry table being reset to the last known good state.
This is problematic, as ultimately deployments can time out
if we previously booted with Networking, and the machine, while
commanded to do otherwise, reboots back to networking regardless.
We will now delete entries first, before proceeding.
Additionally, for general use, this pattern may serve the
community better by avoiding cases where we would have
previously just relied upon efibootmgr[0] to warn us of duplicate
entries.
[0]: https://github.com/rhboot/efibootmgr/blob/103aa22ece98f09fe3ea2a0c83988f0ee2d0e5a8/src/efibootmgr.c#L228
Change-Id: Ib61a7100a059e79a8b0901fd8f46b9bc41d657dc
Story: 2009649
Task: 43808
(cherry picked from commit 67eddfa7e3fedbb530045f5b43a2c89db832fa2a)
(cherry picked from commit 33b39705a50513c5af411216b48e2a6f6ac9ab14)
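The delete-first ordering can be sketched as below. This only builds argument lists and runs nothing; the function name and the sample data are hypothetical, and the flags shown are the standard efibootmgr options for selecting, deleting and creating entries.

```python
from typing import Dict, List


def plan_efibootmgr_calls(existing: Dict[str, str], label: str,
                          device: str, part_num: str,
                          loader: str) -> List[List[str]]:
    """Return command lines: duplicate deletions first, then the creation."""
    commands = []
    for boot_num, boot_label in sorted(existing.items()):
        if boot_label == label:
            # -b selects the entry, -B deletes it.
            commands.append(['efibootmgr', '-b', boot_num, '-B'])
    # -c creates a new entry on the given disk (-d) and partition (-p).
    commands.append(['efibootmgr', '-c', '-d', device, '-p', part_num,
                     '-L', label, '-l', loader])
    return commands


plan = plan_efibootmgr_calls(
    {'0001': 'example-os', '0002': 'example-net', '0003': 'example-os'},
    'example-os', '/dev/sda', '1', '\\EFI\\example\\shimx64.efi')
```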
|
When debugging boot manager problems it can be advantageous to
see all the full entries rather than just their labels.
Change-Id: I6a1bb78acaf5a4284727bdf533d4be6db2099f50
(cherry picked from commit caf695f70ab366498b46cb6f07f6751369c67e30)
|
Re-read the partition table with 'partx -a', rather than 'partx -u'.
This should fix a timing issue where the bootloader installation
fails to mount the EFI partition from a whole disk image since it
is not yet aware of the new partitions (observed with both the
iscsi and the direct deploy interfaces).
Change-Id: If5da3075e813ae01df3decf8f0647aba111b0515
(cherry picked from commit dc8c1f16f9a00e2bff21612d1a9cf0ea0f3addf0)
|
The EFI partition UUID may be None and this will break
the fstab editing. While this is not necessarily fatal when
instantiating a node, it creates an exception at the end of
bootloader installation, so only attempt to add a line to
fstab when the UUID is not None.
Change-Id: I68799980e67c05afe4ca68ca9733605dd166d54d
(cherry picked from commit 333ed70c94e366f16d8f2633f74a5ef05aa5fadb)
|
Check if the ESP is already mounted before attempting to mount it
for the bootloader installation.
Change-Id: Ifd738b2c5663f1a211d7e13b5ba386be631d8db1
(cherry picked from commit 27568204aeb7f063bf236ad7f2f8043db627baa9)
|
Adds support to identify and utilize a CSV file to signal which
bootloader to utilize, and set it when the OS is running as opposed
to when EFI is running. This works around the EFI loader potentially
crashing some vendors' hardware types when the entry stored in the
image does not match the EFI loader record which was utilized to
boot.
Grub2+shim specifically needs the CSV file name
and entry label to match what the system was booted with in order
to prevent the machine from potentially crashing.
See https://storyboard.openstack.org/#!/story/2008962
and https://bugzilla.redhat.com/show_bug.cgi?id=1966129#c37
for more information.
Change-Id: Ibf1ef4fe0764c0a6f1a39cb7eebc23ecc0ee177d
Story: 2008962
Task: 42598
Co-Authored-By: Bob Fournier <bfournie@redhat.com>
(cherry picked from commit 2fab70c36ba40a345a9dd01aeb5019681e567aa5)
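A minimal sketch of reading the loader file name and entry label from such a CSV file. It assumes a "file name,label,options,description" layout on the first line; the function name and the sample line are hypothetical.

```python
def parse_bootloader_csv(contents: str):
    """Split the first CSV line into (loader file name, entry label)."""
    lines = contents.strip().splitlines()
    if not lines:
        raise ValueError('CSV file is empty')
    # Assumed layout: file name, label, options, description.
    fields = lines[0].split(',', maxsplit=3)
    if len(fields) < 2:
        raise ValueError('expected at least a file name and a label')
    return fields[0].strip(), fields[1].strip()


loader, label = parse_bootloader_csv(
    'shimx64.efi,Example Linux,,This is the boot entry for Example Linux\n')
```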
|
To make this function useful for purposes other than efibootmgr
entries, this change moves the path manipulation to _run_efibootmgr.
This change also adds boot*.efi entries to BOOTLOADERS_EFI so that it
includes every entry in the UEFI Spec 2.9[1] Table 3-2 UEFI Image
Types.
[1] https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
Story: 2008923
Task: 42521
Change-Id: Ibe02786609aa0de65115897d8f4a9b4f36c8aed2
(cherry picked from commit 10d18c41136cc645ee99d41acfb6031b9158e1fb)
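To illustrate how a list of known loader names might be matched against a mounted ESP, here is a sketch. KNOWN_LOADERS is a hypothetical subset, not the contents of BOOTLOADERS_EFI, and find_efi_loaders is not the agent's function.

```python
import os
import tempfile

# Hypothetical subset of removable-media loader names, compared in lower case.
KNOWN_LOADERS = ('bootx64.efi', 'bootia32.efi', 'bootaa64.efi', 'bootarm.efi')


def find_efi_loaders(esp_root: str):
    """Return loader paths relative to esp_root, matched case-insensitively."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(esp_root):
        for name in filenames:
            if name.lower() in KNOWN_LOADERS:
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, esp_root))
    return sorted(found)


# Build a throwaway directory tree that mimics a minimal ESP.
with tempfile.TemporaryDirectory() as esp:
    boot_dir = os.path.join(esp, 'EFI', 'BOOT')
    os.makedirs(boot_dir)
    open(os.path.join(boot_dir, 'BOOTAA64.EFI'), 'wb').close()
    loaders = find_efi_loaders(esp)
```

Returning paths relative to the ESP root keeps the search result independent of where the partition happens to be mounted.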
|
For software RAID in UEFI mode, we create ESPs on all holder disks
and copy the bootloader there. Since there is no mechanism to keep
the ESPs in sync, e.g. on kernel upgrades or when kernel parameters
are updated, the ESPs will get out of sync eventually. This may lead
to a situation where a node boots with outdated parameters or does
not have any of the installed kernels in the boot menu anymore.
This change proposes to RAID the ESPs. While the UEFI firmware will
find an ESP partition (one leg of the mirror), the node will see
an md device and all subsequent updates will go to all member disks.
Also, remove the source ESP after copying in order to avoid mount
confusion (same UUID!).
Story: #2008745
Task: #42103
Change-Id: I9078ef37f1e94382c645ae98ce724ac9ed87c287
(cherry picked from commit c2d04dc1566bb947d0e6afd040b82be55c925b11)
|
The _manage_uefi code has a check where it attempts to just
identify the precise partition number of the device, in order
for configuration to be parsed and passed. However, the same code
did not handle the existence of a `p1` partition instead of just a
partition #1. This is because the device naming format is different
with NVMe and Software RAID.
Likely, this wasn't an issue with software raid due to how complex the
code interaction is, but the docs also indicate to use only whole disk
images in that case.
This patch was pulled down by one of RH's professional services folks
who has confirmed it does indeed fix the issue at hand. This is noted
as a public comment on the Red Hat bugzilla.
https://bugzilla.redhat.com/show_bug.cgi?id=1954096
Story: 2008881
Task: 42426
Related: rhbz#1954096
Change-Id: Ie3bd49add9a57fabbcdcbae4b73309066b620d02
(cherry picked from commit fe825fa97ed1f3c9fa8b1461b63ab133fec20b72)
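The naming difference can be sketched as follows. The helper name partition_path is hypothetical, and the sketch covers only the kernel's convention for plain disks, NVMe and md devices; it does not handle device-mapper names.

```python
def partition_path(device: str, number: int) -> str:
    """Build a partition device path following the kernel naming scheme."""
    # When the disk name ends in a digit (nvme0n1, md0), a "p" is placed
    # before the partition number to keep the name unambiguous.
    separator = 'p' if device[-1].isdigit() else ''
    return '%s%s%d' % (device, separator, number)
```

For example, /dev/sda becomes /dev/sda1, while /dev/nvme0n1 becomes /dev/nvme0n1p1.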
|
The line we're looking for is not there when IPA is in a container, at least
for CentOS based containers. Just fall back to sysrq on errors.
Change-Id: Ie4ee605ad9c6cda58808512a563247175859c71e
(cherry picked from commit b395181b1b1381ff0802744807a981df8453bc40)
|
The root UUID changes after a streamed partition image is written to
the block device, causing later deployment failure when assuming the
old UUID.
This change updates the root UUID after streaming the partition image
is complete.
This issue may have been missed in local testing because deploying the
same image repeatedly will result in stable root UUID across runs.
Change-Id: Ice4630c16fc216980488d1427f3b02e1b8a417fa
|
The check_exit_code param of execute in the processutils module
already defaults to [0].
See:
https://opendev.org/openstack/oslo.concurrency/src/branch/master/oslo_concurrency/processutils.py#L214
Change-Id: Iedff5325e0737556d5eb3da601c984ddfc633873
|
IPA is not properly checking if the root partition is already
mounted. Device is being passed to os.path.ismount() instead
of the mount point.
Story: 2008631
Task: 41839
Change-Id: I37a6e7e6bbe0bbbb0317c6e55bb822dafe7cce20
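A small sketch of why the argument matters. The helper name needs_mount is hypothetical and the temporary directory stands in for a real mount point.

```python
import os
import tempfile


def needs_mount(mount_point: str) -> bool:
    """Return True when nothing is mounted at mount_point yet."""
    # os.path.ismount() expects the directory where a filesystem is attached.
    # A device node such as /dev/sda2 is not a mount point itself, so passing
    # the device would always report "not mounted".
    return not os.path.ismount(mount_point)


# A freshly created directory has nothing mounted on it.
with tempfile.TemporaryDirectory() as path:
    fresh_dir_needs_mount = needs_mount(path)
```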
|
It's somewhat confusing at the moment, since we're trying to find
a UEFI partition by UUID "None". Don't search for partition if
we don't know its UUID, and provide a better error message.
Change-Id: Ief874084132797a445ddae8009264712a05facfd
|
Diskimage-builder installs grub with the '--removable' option[1], so on
aarch64 there is no 'grubaa64.efi' file in the efi directory, only 'BOOTAA64.EFI':
linaro@bm-ubuntu:~$ tree /boot/efi
/boot/efi
└── EFI
    └── BOOT
        └── BOOTAA64.EFI
2 directories, 1 file
[1]: https://github.com/openstack/diskimage-builder/blob/8f12d9530ed79359fb988688caadfa6dc318f7a5/diskimage_builder/elements/bootloader/finalise.d/50-bootloader#L158
Task: #41698
Story: #2008560
Change-Id: I9fc55c068ea980beae273411db9d3568eec25eb8
|
Previously, partition images were hard coded to be bios based
as opposed to consulting all of the values AND the node itself
before making the most appropriate determination. Now the agent
utilises the internal helper to properly determine the boot
mode when calling ironic-lib.
Story: 2008070
Task: 41265
Change-Id: Id5eeda69d5b9de2b393af414472d57b0d4380c43
|
The partition image support has been telling ironic-lib
that the machine will be local booted. While this is likely
harmless, and doesn't seem to break anything, we should have
it match moving forward just to be on the safe side so we don't
accidentally break things down the road.
Change-Id: I33e5d583964ef8c21aa04d7427bcd3957b89d449
|
Adds support for the EFI partition to be appended to fstab so the
filesystem can be automounted and EFI loader updated should the
deployed operating system need to do so.
This should enable bootloaders to be upgraded by linux based
operating systems after the instance has been deployed when
a partition image was utilized for the initial deployment.
Change-Id: Iec28a8841cc01ec8b01a3f5cca070c934c7a2531
Story: 2008070
Task: 40754
|
Partition images can sometimes contain a /boot folder structure
and even the assets for EFI booting on that filesystem, which is a
good thing. The conundrum is that Ironic does not handle this
properly and potentially replaces the bootloader in this sequence
such that grub2-install is used instead of signed bootloader assets.
As such, we should be preserving the assets and using them from
a partition image much like we do when we have a wholedisk
image and can identify the assets.
Now we will preserve the EFI boot assets, copy them to the new EFI
boot partition, and call the EFI setup methods to manage the EFI
nvram.
Note, this change also splits the logic path out that performs the
end call of the EFI boot manager into a reusable method but does
not retool all of the testing as it is intertwined in the
install_grub2 testing.
Also adds some additional debug logging, as much of the bootloader
installation code has multiple fallback/cleanup points which makes
it difficult to debug from logs.
Story: 2008070
Task: 40753
Change-Id: If17d4b4c06df5504987e61a1fde6662e9acd6989
|
Partition images through the agent have the unfortunate
side effect of being executed without full node context
by default. Luckily we've had a similar problem and
cache the node.
This patch changes the lookup from a default of msdos
partitions to use the cached node object.
Change-Id: I002816c9372fdf1cc32f3c67f420073551479fd9
|
Some hardware is very well intentioned. However this intention
can result in the UEFI NVRAM table being full which prevents us
from adding new records to the table. We can't be sure what to
delete, so in this case some operators just need the ability to
tell ironic "it is okay if this fails, it will still work."
The added ``ignore_bootloader_failure`` option adds
this capability which can be set per-node either in the agent
configuration via the ramdisk image, or in the pxe_append_params
configuration parameter for the node itself with a
``ipa-ignore-bootloader-failure`` option in order to prevent
the failure from being raised.
Change-Id: If3c83fb2ea2025fce092d495a64f32077c70d2d6
Story: 2008386
Task: 41309
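The effect of such an option can be sketched like this. The exception class, the function names and the boolean parameter are stand-ins for illustration and do not mirror the agent's code.

```python
class BootloaderError(Exception):
    """Stand-in for the error raised when NVRAM cannot be updated."""


def install_bootloader(update_nvram, ignore_failure: bool = False) -> bool:
    """Run update_nvram(); return False if a failure was ignored."""
    try:
        update_nvram()
    except BootloaderError:
        if not ignore_failure:
            raise  # default behaviour: the deployment fails
        return False
    return True


def full_nvram():
    """Simulate firmware whose NVRAM table has no room for a new record."""
    raise BootloaderError('no space left in NVRAM')


ignored = install_bootloader(full_nvram, ignore_failure=True)
```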
|
Adds the ability to use the disk LABEL to identify the rootfs UUID for
Software RAID deployments.
Change-Id: I77f36e70ddc539af0190db1c1abe0fb2c66f34b4
Story: 2008303
Task: 41188
|
By default, grub2-mkconfig scans everything to look for other
environments and then load those into the grub configuration.
It makes sense, but on newer versions of grub2 in distribution
images, os-prober is taking an exceptionally long time in some
cases where more than one storage device exists with other
filesystems.
As a result of the os-prober execution by grub2-mkconfig, the
bootloader installation can completely time out and fail the
deployment. This is presently experienced with metalsmith on
centos8.
There are numerous sporadic reports of issues like this one
where grub2-mkconfig hangs for some period of time, and this is
observable on Centos8.2 in our CI. While one report[0] mentions
this issue, another bug [1] has the dialog that actually helps us
frame the context as to what we likely should do.
Also, fixes the unit testing so we actually test if we're running
with grub2. :\
[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1744693
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1709682
Depends-On: https://review.opendev.org/#/c/748315
Change-Id: I14bf299afef3a1ddb2006fe5f182d7f0d249e734
|
Calling join() does not raise, so we need to explicitly check the result.
Change-Id: I81d3d727af220c2b50358edab8139f07874611f0
Story: #2008240
Task: #41083
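The general pitfall can be illustrated with Python's threading module, where join() also returns normally after the worker has failed. The Worker class below is a hypothetical example and is not the agent's command result type.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical worker that records its failure instead of losing it."""

    def __init__(self, func):
        super().__init__()
        self._func = func
        self.error = None

    def run(self):
        try:
            self._func()
        except Exception as exc:  # keep the failure for the caller
            self.error = exc


def failing_step():
    raise RuntimeError('step failed')


worker = Worker(failing_step)
worker.start()
worker.join()  # returns normally even though failing_step() raised

# The caller has to inspect the recorded result explicitly.
step_failed = worker.error is not None
```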
|
This is not a normal situation and is likely to cause problems.
Change-Id: Id0668fd160ac0539d85997e985f8c43d9da75c90
|
We don't have a reliable way to detect root UUID for whole
disk images at the moment, which results in an ignored traceback
every time install_bootloader is called with whole disk images in
UEFI mode. Avoid it by skipping GRUB2 if root UUID is unknown.
Change-Id: I84245538f59c664b72d1cafbca8d61be0978f489
|
Also make this API return a proper HTTP code (409 instead of 500).
Change-Id: I5d86878b5ed6142ed2630adee78c0867c49b663f
|
Or at least try to.
Some deployments just don't use root device hints, and this is okay.
However, other deployments need root device hints, and with fast
track mode in ramdisks, we created a situation where the node cache
could be updated by a human or software between the time the agent
was started, and the deployment was requested.
As a result, the agent has been updated to check if we have a hint
and if we don't, update the cache from the node lookup endpoint.
This is not needed when the inband deploy steps are executed, as
the process of updating the steps does force the node cache to be
updated.
Change-Id: I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
Story: 2008039
Task: 40701
|
Introducing new function _umount_all_partitions to reduce the size
of _install_grub2
Change-Id: I304468d57b10d677f2a9d58aec42a1bf414c6cba
|
When we added software raid support, we started calling bootloader
installation. As time went on, we enhanced that code path for non
RAID cases in order to ensure that UEFI nvram was setup
for the instance to boot properly.
Somewhere in this process, we missed a possible failure case where
the iscsi client tgtadm may return failures. Obviously, the correct
path is to not call iscsi teardown if we don't need to.
Since it was always semi-opportunistic teardown, we can't blindly
catch any error, and if we started iSCSI and failed to tear the
connection down, we might want to still fail, so this change
moves the logic over to use a flag on the agent object, which allows
one extension to set the flag and the other to read it and take
action based upon that.
Change-Id: Id3b1ae5e59282f4109f6246d5614d44c93aefa7c
Story: 2007937
Task: 40395
|
Adds a new poll extension to provide get_hardware_info and get_node_info
interfaces.
get_hardware_info will be used for node validation by ironic deploy
drivers.
get_node_info will be used for sending lookup data to IPA.
Standalone mode was assumed to be debug only, but that is no longer
the case once poll mode is introduced. This change slightly updates the
description and also prevents the mdns lookup when standalone is true.
Story: 1526486
Task: 28724
Change-Id: I5ad772a18cc4584585c5a7b6fb127547cece1998
|
Shuffle some functions around and reduce size of _is_bootloader_loaded
moving logic out to a new function.
Change-Id: I9c10bf05186dcebb37f175d61bf4ac9ff86b6510
|
This has been a popular guidance, and diskimage-builder has recently
started following it.
Change-Id: I794c846fb191c15b0a30546bf64d624dfbde0fd4
|
In order to ensure grub2 finds all files it needs, mount all
vfat partitions specified in the deployed image.
Story: #2007618
Task: #39629
Change-Id: Ie5b6e0abc3f266409562f9ecb26538126b667056
|