The exact method for determining our spots is something we will never know. A Z-score method would rely heavily on the standard deviation of each category they assess, so differences in spread between categories could have more of an effect than the actual weighting they give each category. The raw scores would be correlated with the Z-scores (how strongly is something we can't say), but since a raw score doesn't directly mean anything on its own, I do question the reason for even providing it.
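To make the spread point concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the categories "A" and "B", their means, standard deviations, and weights, and the assumption that the final score is a weighted sum of per-category Z-scores. None of it comes from the actual process.

```python
def weighted_z(raw, means, sds, weights):
    """Weighted sum of per-category Z-scores: sum of w * (x - mean) / sd."""
    return sum(weights[c] * (raw[c] - means[c]) / sds[c] for c in raw)


# Hypothetical numbers: A carries the bigger weight but has a wide spread,
# B carries the smaller weight but has a narrow spread.
means = {"A": 50.0, "B": 50.0}
sds = {"A": 10.0, "B": 2.0}
weights = {"A": 0.6, "B": 0.4}

baseline = {"A": 50.0, "B": 50.0}
plus_five_a = {"A": 55.0, "B": 50.0}  # 5 raw points above average in A only
plus_five_b = {"A": 50.0, "B": 55.0}  # 5 raw points above average in B only

base_score = weighted_z(baseline, means, sds, weights)
gain_a = weighted_z(plus_five_a, means, sds, weights) - base_score
gain_b = weighted_z(plus_five_b, means, sds, weights) - base_score

print(round(gain_a, 3), round(gain_b, 3))
```

With these invented numbers, the same 5 raw points are worth about 0.3 in category A and about 1.0 in category B, even though A has the larger weight. The effective value of a raw point is the weight divided by the standard deviation, which is why the spread can matter more than the weighting.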
This is all speculation, and just based on things I've read here. Take what I say with a grain of salt.