Why bother making it "watertight" (manifold)?
The engines I have worked with don't care if you have intersecting parts, and neither does the GPU. For example, the roof could be just two big quads.
There might be good technical reasons in some engines (transparent objects and such), so please correct me if I'm wrong. I mean, surely no one is using vertex lighting or the like nowadays (and if they were, the ugly triangles you are forced to keep would ruin it anyway).
For sharp corners (such as the 90-degree ones you have here for environment assets), Z-ordering will not be an issue either.
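
To be clear about what I mean by "watertight": roughly, every edge is shared by exactly two faces, so the surface has no open borders. Here is a minimal sketch of that check, assuming the mesh is given as a list of triangles of vertex indices. The function name and the sample data are made up for illustration, and a full manifold test would check more than this (vertex fans, for instance).

```python
from collections import Counter

def is_watertight(triangles):
    """Return True if every undirected edge is used by exactly two triangles."""
    edge_counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so direction does not matter.
            edge_counts[(min(u, v), max(u, v))] += 1
    return bool(edge_counts) and all(n == 2 for n in edge_counts.values())

# Hypothetical roof: two free-standing quads, each split into two triangles.
roof = [(0, 1, 2), (0, 2, 3), (4, 5, 6), (4, 6, 7)]

# A closed tetrahedron, where every edge is shared by two faces.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]

print(is_watertight(roof))
print(is_watertight(tetra))
```

The two-quad roof fails this check because its border edges belong to only one triangle each, yet it would still render fine, which is the point above.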