To avoid confusion, I think we should be careful about the name used for this problem. The phrase “data availability problem” is currently used to refer to the problem of verifying that the data behind a block header was published at all. The problem you’re describing seems to be about how efficiently light clients can retrieve that data when only very few nodes are distributing it. I propose that we call that problem the data distribution problem.
I strongly agree with the approach of allowing each node to contribute a small amount of storage to the network, creating a BitTorrent-like peer-to-peer data-sharing network. This is the approach we’ve taken in the LazyLedger proposal, and other researchers have taken this approach too.
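To make the idea concrete, here is a hypothetical sketch of how a node might deterministically pick which block-data chunks it stores. Nothing here is from the LazyLedger spec; the function name, the use of SHA-256, and the idea of seeding the selection with the node’s ID and the block hash are all my own assumptions. The point is only that the assignment can be derived locally yet verified by any peer, so no coordination is needed to know who should be serving which chunks.

```python
import hashlib

def assigned_chunks(node_id: bytes, block_hash: bytes,
                    num_chunks: int, k: int) -> list[int]:
    """Deterministically pick up to k of num_chunks chunk indices.

    Hypothetical sketch: the selection is seeded with the node's ID and
    the block hash, so any peer can recompute which chunks this node is
    expected to store and request them directly.
    """
    picked: set[int] = set()
    counter = 0
    while len(picked) < min(k, num_chunks):
        h = hashlib.sha256(
            node_id + block_hash + counter.to_bytes(4, "big")
        ).digest()
        picked.add(int.from_bytes(h[:4], "big") % num_chunks)
        counter += 1
    return sorted(picked)
```

With an assignment like this, a light client looking for chunk `i` can compute which nearby node IDs are expected to hold it, rather than flooding the network with requests.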
While this is fairly simple to do for distributed storage of block data, I’m not sure how it can be done efficiently for the computed state of the chain, because the state changes every block. Do you have any thoughts about this? I suppose if the state was committed as a Sparse Merkle tree, clients could download “diffs” of the subtree they are storing.
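The subtree-diff idea above could be sketched roughly as follows. This is a deliberately simplified model, not a real Sparse Merkle tree implementation: state is a flat key-value map, a “subtree” is just the set of keys under a common prefix, and both function names are made up for illustration. A client storing one subtree would only need to fetch and apply the diff entries under its prefix each block.

```python
def subtree_diff(old_state: dict, new_state: dict, prefix: str) -> dict:
    """Changed leaves under `prefix` between two state versions.

    A key mapped to None means the leaf was deleted. Hypothetical
    sketch; a real SMT diff would also carry sibling hashes so the
    client can recompute its subtree root.
    """
    diff = {}
    for key in set(old_state) | set(new_state):
        if not key.startswith(prefix):
            continue
        if old_state.get(key) != new_state.get(key):
            diff[key] = new_state.get(key)
    return diff

def apply_diff(local_subtree: dict, diff: dict) -> None:
    """Apply a per-block diff to the client's locally stored subtree."""
    for key, value in diff.items():
        if value is None:
            local_subtree.pop(key, None)
        else:
            local_subtree[key] = value
```

The bandwidth cost per block is then proportional to the number of changed leaves in the client’s subtree, not to the total state size, which is what makes per-subtree storage plausible even though the full state changes every block.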