Disclaimer: I'm a noob at ML. These are just some random thoughts of mine.
Maybe you could try predicting the differences instead? For example, predict a, a-b and a-d, then apply something like ReLU to the differences so they stay nonnegative. I don't know how to deal with c, since you have to ensure it is <= both b and d. I was thinking of predicting both b-c and d-c, but that may cause problems: you'd have two equations for c giving different answers. I thought of just taking the min, but it seems very sketchy. If only there were the additional constraint b >= d lol
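A rough numpy sketch of that reparameterization (function and variable names are mine; the `min` for c is exactly the sketchy part mentioned above):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def decode(raw):
    """Map 4 unconstrained network outputs to (a, b, c, d) with
    a >= b, a >= d, b >= c, d >= c, by predicting nonnegative gaps."""
    a = raw[0]
    b = a - relu(raw[1])          # a minus a nonnegative gap  =>  b <= a
    d = a - relu(raw[2])          # a minus a nonnegative gap  =>  d <= a
    c = min(b, d) - relu(raw[3])  # the "sketchy" min: c <= b and c <= d
    return a, b, c, d

a, b, c, d = decode(np.array([2.0, 0.5, -1.0, 0.3]))
```

Whatever the raw outputs are, the inequalities hold by construction, so no loss penalty is needed for them.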

In order to enforce these dependencies you can:
1) add extra loss terms that penalize the network for breaking the constraints
2) predict the output sequentially, starting from the smallest element c, then predict the differences to the other elements and apply max(diff, 0)
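Both options could look something like this numpy sketch (names are mine, and `raw` stands in for the network's unconstrained outputs):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def constraint_penalty(a, b, c, d, weight=1.0):
    """Option 1: hinge penalties, one per inequality; zero when all
    constraints hold, added to the main loss otherwise."""
    return weight * (relu(b - a) + relu(d - a) + relu(c - b) + relu(c - d))

def decode_sequential(raw):
    """Option 2: build up from the smallest element c; every other
    element is c (or the running max) plus a ReLU'd difference."""
    c = raw[0]
    b = c + relu(raw[1])          # b >= c by construction
    d = c + relu(raw[2])          # d >= c by construction
    a = max(b, d) + relu(raw[3])  # a >= b and a >= d
    return a, b, c, d
```

Option 1 only discourages violations (the penalty can be traded off against the main loss), while option 2 makes violations impossible.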

Why not? You can use convs for this; they handle a square matrix easily.

You don't need to use matrices for this task; you can flatten them into 4-element vectors and then reshape the output into a matrix.
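The flatten/reshape round trip is just this (numpy sketch; the 4-output network in the middle is omitted):

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = M.reshape(-1)           # 2x2 matrix -> length-4 input vector
out = v                     # stand-in for an MLP that emits 4 numbers
M_pred = out.reshape(2, 2)  # length-4 output -> 2x2 matrix again
```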
