The paper https://eprint.iacr.org/2015/718.pdf looks very similar to merkle mountain ranges (see here). So that category of thing is definitely not a new idea, and there are other ways to do this as well, for example by adding a patricia tree of previous state roots into each state root.
There definitely are going to be many history-focused use cases inside sharding. Particularly, asynchronous cross-shard calls pretty much have to be done with this paradigm of “create a receipt on shard A, then later prove that the receipt exists to shard B”, and if you then use a model where users delay sending receipts as long as possible and only use receipts to create new receipts then you’ve basically reinvented the UTXO model.
The idea of having explicit data structures in the system to make low-witness-update-frequency history objects easily usable is definitely an interesting one. That said, there are limits: in general, any application that allows you to reference objects from the history is very often also going to require some kind of stateful mechanism for efficiently proving whether or not those objects have already been consumed.