• Re: how to install grub to non-bootable RAID-formatted drive?

    From Stefan Monnier@3:633/10 to All on Thursday, January 22, 2026 18:50:01
    I've scoured the Internet, but have been unable to find any clear, unambiguous, step-by-step guide as to how to make this remaining functioning drive bootable, either from the "grub-rescue>" prompt or by some other mechanism. I tried a couple of rescue disks that I located on the Internet, but they both errored out when I attempted to "rescue" the drive. So I've given up, at least for now, on trying to fix the problem from the "grub-rescue>" prompt.

    The grub-rescue prompt suggests that the drive *does* have its own copy
    of Grub (or at least a part of it), so maybe the problem is that the
    drive doesn't have a copy of `/boot`?


    - Stefan

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From alain williams@3:633/10 to All on Thursday, January 22, 2026 19:50:02
    On Thu, Jan 22, 2026 at 10:03:11AM -0700, D. R. Evans wrote:
    Due to a cascading series of failures (some of hardware, some of my brain),
    I find myself in the following situation:

    You do not say what sort of RAID you are using, but you have 2 disks so I assume RAID-1 (mirrored disks).

    I also assume that you have a root file system that is a primary disk partition, ie you do not use LVM.

    The easiest way to recover is to boot from a bootable Debian USB memory stick or CD-ROM. Do not try to "rescue" it; instead do something like the steps below - it needs a bit of work at the command line.

    Boot the machine to give you a desktop.

    Open a terminal, become root

    Identify the hard disk that contains your system, look at /proc/partitions.
    I am going to assume that it is /dev/sda with the root file system as /dev/sda1

    mkdir /tmp/RFS

    mount /dev/sda1 /tmp/RFS

    Copy /dev/ to /tmp/RFS/dev/

    chroot /tmp/RFS /usr/bin/bash

    grub-install /dev/sda

    sync

    exit

    reboot

    The above just typed, not tested.
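
    The same steps written out with brief notes (still just typed, not tested, and assuming /dev/sda1 really is the root filesystem; the cp line is just one literal way to do the "copy /dev" step):

    mkdir /tmp/RFS
    mount /dev/sda1 /tmp/RFS        # mount the surviving drive's root filesystem
    cp -a /dev/* /tmp/RFS/dev/      # make device nodes visible inside the chroot
    chroot /tmp/RFS /usr/bin/bash   # work inside the installed system
    grub-install /dev/sda           # write GRUB to that drive's MBR
    sync
    exit                            # leave the chroot
    reboot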

    I had a linux-raid two-drive system that was working fine for many years.
    The system uses legacy BIOS booting. My notes from long ago say that both drives had a working GRUB; but it seems that my notes were wrong: one of the drives died without warning, leaving me with a drive with a
    fully-functioning trixie (and all the user data, etc.) present, but that drive seems to have no working GRUB in the MBR. Trying to boot it gives me a "grub-rescue>" prompt.

    I've scoured the Internet, but have been unable to find any clear, unambiguous, step-by-step guide as to how to make this remaining functioning drive bootable, either from the "grub-rescue>" prompt or by some other mechanism. I tried a couple of rescue disks that I located on the Internet, but they both errored out when I attempted to "rescue" the drive. So I've given up, at least for now, on trying to fix the problem from the "grub-rescue>" prompt.

    I can physically remove the drive and place it on a functioning machine, and have done so. With the drive in the functioning machine, I have checked that indeed all the data on it (that were in the original "/" hierarchy) are readable. So I just want to find a way to install GRUB on the MBR in a way that will cause the disk to be bootable into the system that was on it. That is, I want to be able to remove the disk from the functioning machine that it's currently (temporarily) on, put the drive back in the original machine, power on, and have the system come up as it used to (except now with just
    one active drive in the RAID array).

    From there I can add a new drive to the array and get myself back a fully-functioning two-drive RAID-based system.

    I hope that that's a pretty clear description of the problem. If more information is needed, I can of course provide it.

    I hope that someone here understands all this GRUB-and-boot stuff better
    than I do, and can provide steps that my child-like brain can follow to get me back to a working system.

    Doc

    --
    Web: http://enginehousebooks.com/drevans


    --
    Alain Williams
    Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
    +44 (0) 787 668 0256 https://www.phcomp.co.uk/
    Parliament Hill Computers. Registration Information: https://www.phcomp.co.uk/Contact.html
    #include <std_disclaimer.h>

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From alain williams@3:633/10 to All on Thursday, January 22, 2026 23:10:01
    On Thu, Jan 22, 2026 at 02:40:10PM -0700, D. R. Evans wrote:

    All the system info is in that second partition. I don't rightly recall why the first partition is present (it's been an awfully long time since I installed this disk). I suspect that it's reserved for swap, although I
    doubt that swapping has ever occurred.

    As it says: it is a boot partition.

    So when you get into the chrooted environment you also should do:

    mount /dev/sda1 /boot

    This means that the root filesystem is /dev/sda2, rather than /dev/sda1 as you assumed.

    Correct.

    Identify the hard disk that contains your system, look at /proc/partitions.

    By "the hard disk that contains your system", I assume you mean the RAID disk.

    Yes

    But should I be using /dev/sda2 or /dev/md126 (as listed in /proc/partitions)??

    Prolly /dev/md126 - what works.

    So here I have a question. This looks like it will try to copy the /dev/
    from the running OS (i.e., the non-RAID drive) and overwrite the /dev that
    is on the RAID disk.

    Why would one do that? The /dev that was on the RAID disk worked fine until the other drive of the pair failed; so why does it need to be overwritten by the
    /dev from the running system?

    If you type the following it will tell you that /dev is a udev file system:

    df -h

    This is where device files are created on the fly as needed.

    You need /dev/sda* and a few others - the easiest way is to copy from the live system. When you reboot into your recovered system the contents that you copy should be wiped out (or mounted over).

    I'm sorry if I'm being dense. In this situation, I'm very nervous about running commands whose purpose I don't understand.

    Good to be nervous!

    --
    Alain Williams
    Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
    +44 (0) 787 668 0256 https://www.phcomp.co.uk/
    Parliament Hill Computers. Registration Information: https://www.phcomp.co.uk/Contact.html
    #include <std_disclaimer.h>

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Alexander V. Makartsev@3:633/10 to All on Friday, January 23, 2026 00:00:01
    On 1/23/26 03:02, alain williams wrote:
    On Thu, Jan 22, 2026 at 02:40:10PM -0700, D. R. Evans wrote:

    All the system info is in that second partition. I don't rightly recall why the first partition is present (it's been an awfully long time since I installed this disk). I suspect that it's reserved for swap, although I doubt that swapping has ever occurred.

    As it says: it is a boot partition.

    So when you get into the chrooted environment you also should do:

    mount /dev/sda1 /boot

    This means that the root filesystem is /dev/sda2, rather than /dev/sda1 as you assumed.

    Correct.

    Identify the hard disk that contains your system, look at /proc/partitions.

    By "the hard disk that contains your system", I assume you mean the RAID disk.

    Yes

    But should I be using /dev/sda2 or /dev/md126 (as listed in /proc/partitions)??

    Prolly /dev/md126 - what works.

    So here I have a question. This looks like it will try to copy the /dev/ from the running OS (i.e., the non-RAID drive) and overwrite the /dev that is on the RAID disk.

    Why would one do that? The /dev that was on the RAID disk worked fine until the other drive of the pair failed; so why does it need to be overwritten by the /dev from the running system?
    If you type the following it will tell you that /dev is a udev file system:

    df -h

    This is where device files are created on the fly as needed.

    You need /dev/sda* and a few others - the easiest way is to copy from the live
    system. When you reboot into your recovered system the contents that you copy should be wiped out (or mounted over).
    You got the right idea, but wrong method.
    You need to "mount --bind" it onto that directory, not copy it.
    # mount --bind /dev /mnt/chrootdir/dev

    I'm sorry if I'm being dense. In this situation, I'm very nervous about
    running commands whose purpose I don't understand.
    Good to be nervous!


    --
    With kindest regards, Alexander.
    Debian - The universal operating system
    https://www.debian.org


    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From alain williams@3:633/10 to All on Friday, January 23, 2026 00:00:01
    On Thu, Jan 22, 2026 at 03:48:47PM -0700, D. R. Evans wrote:
    alain williams wrote on 1/22/26 11:01 AM:
    Copy /dev/ to /tmp/RFS/dev/

    so is the actual command:
    cp -r /dev/ /tmp/RFS/dev/

    If I try without the -r I get the error/warning message:
    cp: -r not specified; omitting directory /dev/

    Do an ls -l of the target - does it look right?

    --
    Alain Williams
    Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
    +44 (0) 787 668 0256 https://www.phcomp.co.uk/
    Parliament Hill Computers. Registration Information: https://www.phcomp.co.uk/Contact.html
    #include <std_disclaimer.h>

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Stefan Monnier@3:633/10 to All on Friday, January 23, 2026 00:20:01
    Copy /dev/ to /tmp/RFS/dev/

    so is the actual command:
    cp -r /dev/ /tmp/RFS/dev/

    If I try without the -r I get the error/warning message:
    cp: -r not specified; omitting directory /dev/

    AFAIK nowadays `/dev` is normally stored in another filesystem,
    dynamically populated. So you don't want to copy it to your
    root filesystem.
    Instead you want to "make it appear" there.
    The way I usually do that is with:

    mount --bind /dev /tmp/RFS/dev

    I usually use that same approach to populate the `/proc` and `/sys`
    directories before doing the `chroot`.
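
    As a sketch, assuming the old root filesystem is mounted at /tmp/RFS as in the earlier recipe, the preparation before the chroot would look like:

    mount --bind /dev  /tmp/RFS/dev
    mount --bind /proc /tmp/RFS/proc
    mount --bind /sys  /tmp/RFS/sys
    chroot /tmp/RFS /usr/bin/bash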


    = Stefan

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David@3:633/10 to All on Friday, January 23, 2026 10:10:01
    On Thu, 22 Jan 2026 at 17:11, D. R. Evans <doc.evans@gmail.com> wrote:

    Due to a cascading series of failures (some of hardware, some of my brain), I find myself in the following situation:

    I had a linux-raid two-drive system that was working fine for many years. The system uses legacy BIOS booting. My notes from long ago say that both drives had a working GRUB; but it seems that my notes were wrong: one of the drives died without warning, leaving me with a drive with a fully-functioning trixie (and all the user data, etc.) present, but that drive seems to have no working
    GRUB in the MBR. Trying to boot it gives me a "grub-rescue>" prompt.

    I've scoured the Internet, but have been unable to find any clear, unambiguous, step-by-step guide as to how to make this remaining functioning drive bootable, either from the "grub-rescue>" prompt or by some other mechanism.

    Hi,

    Because you see a grub-rescue shell when you try to boot, that confirms
    that grub is installed on your disk. However it is not installed
    "correctly" for the situation you find yourself in, probably because of the missing/failed disk.

    What is likely failing is that the tiny code installed into the boot sector
    by grub is currently unable to perform its intended function of locating
    the actual bulk of the grub code, which is always stored somewhere in
    a filesystem on the drive, usually under /boot. The failure might be
    occurring because the boot sector code contains instructions to look for
    the missing/failed disk, instead of the one you have.

    But, crucially, if RAID made the two disks identical, then the code that
    grub is looking for is very likely already present on the drive you have.

    So, I think it is likely that you should be able to use grub-rescue to
    boot that drive without "installing" grub first. And I would recommend attempting that. It does require following a sequence of grub-rescue commands that might look scary and unfamiliar, but it also avoids a lot of risk.
    Because if you can get it to boot by that method (without any other disks involved), then once booted you will be able to run a simple
    "grub-install <drive>" from the same booted system that you want to boot in future, and that makes the final grub-install simple.

    While it is more fiddly, this grub-rescue method is less risky than using
    some other drive to boot and then trying to run "grub-install". Because
    doing that introduces the extra variable factor of another boot disk in
    that running environment, and involves using a foreign grub instead of the
    same grub that is on the disk in question, to attempt to repair the grub
    that is on the disk in question. When doing a "grub-install" command in
    that situation, you would need to specify extra parameters to "grub-install"
    to ensure that you're actually fixing the problem, rather than recreating almost the same situation again.
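
    For example, run from a foreign rescue environment with the old root mounted at, say, /mnt/target (both the mount point and the device name here are placeholders), the command would typically need to look more like:

    grub-install --boot-directory=/mnt/target/boot /dev/sdX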

    The grub-rescue instructions you need to follow are at [1].

    At step 2:
    ls # Find out which devices are available:

    we hope that pressing the Tab key after you type "ls" will activate "tab-completion" which is described at the bottom of [2] as follows:

    ... if the cursor is after the first word, the TAB will provide
    a completion listing of disks, partitions, and file names depending on
    the context. Note that to obtain a list of drives, one must open
    a parenthesis, for example:
    ls (
    and then press the Tab key.

    Occasionally tab-completion does not work in grub-rescue. But it usually
    does, in which case it can be used to explore the disk and thereby
    reveal what values you will need to specify in the "set prefix" and "set
    root" commands.

    If you need guidance during this step, reply here with what you see.

    If you can get these grub-rescue instructions correct, your system should
    boot, and then you can run a simple "grub-install <device>" which should resolve your issue and make the drive bootable.

    I can't guarantee this will work, but it is definitely the approach that
    I would try first, because grub-rescue operations have always been
    successful for me, and following this grub-rescue method is a cautious
    approach because it does not modify the disk in any way.

    [1] https://www.gnu.org/software/grub/manual/grub/grub.html#GRUB-only-offers-a-rescue-shell
    [2] https://www.gnu.org/software/grub/manual/grub/grub.html#Command_002dline-interface

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From alain williams@3:633/10 to All on Friday, January 23, 2026 12:10:01
    On Fri, Jan 23, 2026 at 09:05:20AM +0000, David wrote:

    But, crucially, if RAID made the two disks identical, then the code that
    grub is looking for is very likely already present on the drive you have.

    Be careful: there are 2 ways of setting up RAID-1 (mirror) for a partitioned disk:

    • mirror the entire disk, ie sda & sdb

    • mirror partition by partition, ie sda1 & sdb1; sda2 & sdb2; ...

    If it is set up the second way, then unless grub-install is run on both disks it might only be present on one of them.
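
    If in doubt, something like this (run as root on a system that can see the disk; device names are just examples) shows which layout is in use:

    cat /proc/mdstat            # which md arrays exist and which devices/partitions belong to them
    mdadm --examine /dev/sda    # an md superblock here suggests a whole-disk mirror
    mdadm --examine /dev/sda1   # an md superblock here suggests partition-by-partition mirroring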

    --
    Alain Williams
    Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
    +44 (0) 787 668 0256 https://www.phcomp.co.uk/
    Parliament Hill Computers. Registration Information: https://www.phcomp.co.uk/Contact.html
    #include <std_disclaimer.h>

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David@3:633/10 to All on Friday, January 23, 2026 13:20:01
    On Fri, 23 Jan 2026 at 11:08, alain williams <addw@phcomp.co.uk> wrote:
    On Fri, Jan 23, 2026 at 09:05:20AM +0000, David wrote:

    But, crucially, if RAID made the two disks identical, then the code that
    grub is looking for is very likely already present on the drive you have.

    Be careful: there are 2 ways of setting up RAID-1 (mirror) for a partitioned disk:

    • mirror the entire disk, ie sda & sdb

    • mirror partition by partition, ie sda1 & sdb1; sda2 & sdb2; ...

    If it is set up the second way, then unless grub-install is run on both disks it might only be present on one of them.

    That would be a concern regarding the grub boot sector code.

    So I wrote above "the code that grub is looking for", to mean NOT
    the grub boot sector code, but rather the grub modules in the filesystem.

    And we know that the grub boot sector code is present,
    because it provides the grub-rescue prompt.

    Also, there's not much need to be "careful" when using those grub-rescue commands, because they don't modify anything. There's a much greater
    need to "be careful" when using other rescue methods.

    --- PyGate Linux v1.5.2
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)