So hear me out...
There are still tons of PCIe 3.0 x4 NVMe M.2 drives out there on the market.
Modern CPUs (aside from outrageous ones like Threadripper or server-level parts) have a severe shortage of PCIe lanes to allocate directly to devices. The best I found for the latest Ryzen 9 7950X is a board with 2x PCIe 5.0 x16 slots running x16/x0 or x8/x8 - and both can have bifurcation enabled, but only as x4x4x4x4 / x0 or x4x4 / x4x4.
I want to run a GPU (x8 is fine), but that leaves me with only one other x8 slot to split into x4x4 with bifurcation enabled. That means I could only connect a card like the CP073 and get 2 NVMe drives running on it (disregarding the onboard M.2 slots on the motherboard for the moment). I WANT the 4-slot card, like the ASUS Hyper M.2 X16 Gen 4 Card.
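Just to make the lane budget concrete, here's a rough sketch in Python (nothing vendor-specific; `m2_slots_left` is a made-up helper, and the layouts are just the splits this particular board offers):

```python
# Rough lane-budget arithmetic: with the GPU pinned to x8, how many x4 M.2
# links are left over under each slot layout the board supports?

def m2_slots_left(slot_layout, gpu_lanes=8, lanes_per_drive=4):
    """slot_layout: lanes per physical slot after choosing a split, e.g. [8, 8]."""
    remaining = []
    gpu_placed = False
    for lanes in slot_layout:
        if not gpu_placed and lanes >= gpu_lanes:
            gpu_placed = True          # GPU occupies this slot
            continue
        remaining.append(lanes)
    if not gpu_placed:
        return 0                       # layout can't even host the GPU
    # Each leftover slot bifurcates down to x4 chunks at best
    return sum(lanes // lanes_per_drive for lanes in remaining)

for layout in ([16, 0], [8, 8]):
    print(layout, "->", m2_slots_left(layout), "x4 M.2 drives possible")
# [16, 0] -> 0  (GPU takes the x16, nothing left for a bifurcated riser)
# [8, 8]  -> 2  (the spare x8 splits x4x4, i.e. a 2-slot card like the CP073)
```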
So this brought up an idea... PCIe 4.0 runs at double the per-lane rate of PCIe 3.0 (at least on paper). So theoretically, a PCIe 4.0 x8 slot should have the capacity to carry the bandwidth of four PCIe 3.0 x4 NVMe M.2 drives.
Of course you cannot map PCIe lanes across generations directly like this, so there should be an AIC that plugs into a plain PCIe 4.0 x8 slot, NO bifurcation required: it takes the whole link. Onboard, a PCIe switch (ideally maybe even a lightweight RAID-0/1 controller) would give each mounted PCIe 3.0 x4 NVMe M.2 drive its own x4 downstream link, so each drive effectively consumes the bandwidth of 2 lanes of the PCIe 4.0 x8 host slot. There could be 4 M.2 NVMe slots on said AIC, and theoretically this would not exceed the uplink's bandwidth, since 2 lanes of PCIe 4.0 equate to 4 lanes of PCIe 3.0 in throughput.
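To sanity-check that claim, a minimal back-of-the-envelope sketch (plain Python; the per-lane rates are the standard headline GT/s figures with 128b/130b encoding, not measured throughput):

```python
# Paper bandwidth of a PCIe link: GT/s per lane, minus 128b/130b encoding
# overhead (applies to Gen 3.0 and newer). Real-world numbers are lower.

GT_PER_LANE = {"3.0": 8, "4.0": 16, "5.0": 32}    # giga-transfers per second

def lane_gbps(gen):
    """Usable GB/s per lane after 128b/130b encoding."""
    return GT_PER_LANE[gen] * (128 / 130) / 8      # bits -> bytes

def link_gbps(gen, lanes):
    return lane_gbps(gen) * lanes

uplink = link_gbps("4.0", 8)                       # the AIC's host slot
drives = 4 * link_gbps("3.0", 4)                   # four Gen3 x4 M.2 drives
print(f"PCIe 4.0 x8 uplink : {uplink:.2f} GB/s")   # ~15.75 GB/s
print(f"4x PCIe 3.0 x4     : {drives:.2f} GB/s")   # ~15.75 GB/s
```

The totals match, which is the whole point: the switch's x8 Gen4 uplink has exactly the aggregate bandwidth of four Gen3 x4 downstream links.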
This doesn't create any record-breaking storage of course, but that is not always what it's about. Alas, as I searched again for PCIe 3.0 x4 drives, I realized their cost isn't much more competitive than their PCIe 4.0 x4 counterparts... However, this principle could be applied one level higher: take a PCIe 5.0 x8 slot and map 4x PCIe 4.0 x4 drives onto it, at the bandwidth of 2 host lanes per drive. Four PCIe 4.0 x4 drives could make full use of the capacity of 8 lanes of PCIe 5.0.
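Same arithmetic one generation up, reusing `link_gbps` from the sketch above:

```python
uplink5 = link_gbps("5.0", 8)                      # ~31.51 GB/s
drives4 = 4 * link_gbps("4.0", 4)                  # ~31.51 GB/s
print(f"PCIe 5.0 x8 uplink : {uplink5:.2f} GB/s")
print(f"4x PCIe 4.0 x4     : {drives4:.2f} GB/s")
```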