You can spill in SSA form, they said. SSA-based spilling is amazing, they said. Linear time, they said. How do you handle phis? Apparently, that's an exercise lift to the reader. 🙄

@gfxstrand why would you do register allocation in ssa form?, I mean register allocation should be one of the last step, as it is very hardware dependent.

@ekg @gfxstrand You'd be surprised how much stuff happens after register allocation. Coming from an LLVM backend perspective, staying in SSA form is very appealing.

Follow

@nh @gfxstrand I imagine so. but handling phi nodes is certainly something that would be beyond me, and I would much rather strip the ssa form. Unless of course there existed spelled out compelling reasons to stay in ssa form.

· Librem Social · 1 · 0 · 1

@ekg @nh To sum up:

- Graph coloring is NP-hard
- Conventional spilling happens as a side-effect of graph coloring failing so you do the NP hard thing multiple times if you spill
- Graph coloring RA is well understand and in every compiler textbook

On the other hand...

- SSA-based RA is O(n).
- SSA-based RA is optimal and you know how many registers it needs up-front.
- Known pressure means you can spill before RA, also in linear(ish) time.
- SSA-based spilling in really frickin' hard

@gfxstrand @ekg Yes. Though to be fair, I don't think you strictly need SSA form to stay within register pressure bounds.

@nh @ekg Maybe not. Standard graph coloring certainly doesn't have predictable pressure. Naive linear scan is predictable but suffers from inefficiencies in register utilization. The linear scan modifications I'm aware of that efficiently use the run register file and give predictable pressure (such as second-chance binpacking) are roughly equivalent to going into SSA as part of RA. I don't claim to have a particularly complete knowledge of the register allocation literature, though.

@gfxstrand @nh @ekg

I'm increasingly of the opinion that SSA doesn't really matter. If you're willing to introduce parallel moves/split live ranges you can get good RA.

Just SSA is never good enough for actual constraints and if you use parallel moves for those your RA also needs to be able to insert "phis". Hence a lack of initial phis/SSA is only a minor inconvenience.

Which kinda brings us back to the tradeoff of reg pressure vs. number of moves/coalescing.

@gfxstrand @nh @ekg

memory->memory phis are actually one example where you really want coalescing of the memory locations to work and is a great example of how the SSA splitting might cause more moves unless lots of care is taken.

I don't know what you had trouble with, but my guess for an easy implementation is just not considering the memory locations as SSA, lower the memory phis at the end of the spiller and then rerun the spiller to deal with introduced temporaries in your phi lowering.

@bas @nh @ekg I'm still working through the details of spilling and memory phis. Over-all, I don't think we fundamentally disagree, actually. Once you introduce shuffling and resolving across edges in your linear scan RA, it's basically the same algorithm at that point. It's just a matter of what you call things and how you handle range splitting. You can do it with or without SSA as the input and you have the same headache worth of problems either way. 🤷🏻‍♀️

Sign in to participate in the conversation
Librem Social

Librem Social is an opt-in public network. Messages are shared under Creative Commons BY-SA 4.0 license terms. Policy.

Stay safe. Please abide by our code of conduct.

(Source code)

image/svg+xml Librem Chat image/svg+xml