Constraint-Aware, Scalable, and Efficient Algorithms for Multi-Chip Power Module Layout Optimization