I suppose they "dropped the ball" in the sense that those instructions cannot be assumed to be available, so the compiler will not emit them by default. Any future processors that include the instructions may not benefit until developers recompile for them and do the extra work required to use them conditionally, only when they are available.
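For anyone wondering what that "extra work" tends to look like, here is a rough sketch of runtime dispatch in C. It assumes the feature can be queried through `sysctlbyname` on Apple platforms; the key `hw.optional.arm.FEAT_EXAMPLE` is a made-up placeholder and not a real feature name, and `dot_extended` is a hypothetical stand-in for a routine you would build separately with the newer instructions enabled.

```c
#include <stddef.h>
#if defined(__APPLE__)
#include <sys/sysctl.h>
#endif

/* Returns 1 if the OS reports the named CPU feature, 0 otherwise.
   On non-Apple systems, or if the key is unknown, it reports "absent". */
static int cpu_has_feature(const char *sysctl_name) {
#if defined(__APPLE__)
    int value = 0;
    size_t size = sizeof(value);
    if (sysctlbyname(sysctl_name, &value, &size, NULL, 0) == 0)
        return value != 0;
#else
    (void)sysctl_name;
#endif
    return 0;
}

/* Baseline version: safe to run on every supported processor. */
static float dot_baseline(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical stand-in for a version compiled with the newer instructions.
   It simply forwards to the baseline here so the sketch stays runnable. */
static float dot_extended(const float *a, const float *b, size_t n) {
    return dot_baseline(a, b, n);
}

typedef float (*dot_fn)(const float *, const float *, size_t);

/* "hw.optional.arm.FEAT_EXAMPLE" is a placeholder key, not a real one. */
static dot_fn select_dot(void) {
    return cpu_has_feature("hw.optional.arm.FEAT_EXAMPLE") ? dot_extended
                                                           : dot_baseline;
}

/* Public entry point: picks an implementation on first use, then reuses it. */
float dot(const float *a, const float *b, size_t n) {
    static dot_fn impl = NULL;
    if (impl == NULL)
        impl = select_dot();
    return impl(a, b, n);
}
```

Callers just use `dot(a, b, n)` and never see which path was taken. The lazy initialization is not synchronized, which is harmless here because every thread would pick the same function, but real code may prefer to select once at startup.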
That said, to get the best performance on vector math it has long been recommended to use Apple's own Accelerate.framework, which has the benefit of enabling use of their proprietary matrix math coprocessor. One can expect the framework to always take maximum advantage of the hardware wherever it runs with no extra development effort required.
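As a small illustration of the Accelerate route, here is a sketch of a dot product using `vDSP_dotpr` from C. The wrapper name `vec_dot` and the plain-loop fallback for non-Apple builds are my own additions for the example; on macOS you would link with `-framework Accelerate`.

```c
#include <stddef.h>
#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>
#endif

/* Dot product of two float arrays of length n.
   On Apple platforms this defers to Accelerate's vDSP_dotpr, which chooses
   its own implementation for the hardware it runs on. Elsewhere it falls
   back to a plain loop so the example still builds. */
float vec_dot(const float *a, const float *b, size_t n) {
    float result = 0.0f;
#if defined(__APPLE__)
    /* Stride 1 means the elements are contiguous in memory. */
    vDSP_dotpr(a, 1, b, 1, &result, (vDSP_Length)n);
#else
    for (size_t i = 0; i < n; ++i)
        result += a[i] * b[i];
#endif
    return result;
}
```

The point is that there is no feature detection in the caller's code at all; any hardware-specific choice happens inside the framework.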