There are two generally accepted methods of parameter estimation: least-squares estimation (LSE) and maximum likelihood estimation (MLE). The former is well known to us, as many familiar statistical concepts such as linear regression, the sum of squares error, the proportion of variance accounted for (i.e., r^2), and the root mean squared deviation are tied to the method. On the other hand, MLE is not widely recognized among modelers in psychology, though it is, by far, the most commonly used method of parameter estimation in the statistics community. LSE might be useful for obtaining a descriptive measure for the purpose of summarizing observed data, but MLE is more suitable for statistical inference such as model comparison (i.e., which model should we choose?). LSE has no basis for constructing confidence intervals or testing hypotheses, whereas both are naturally built into MLE.
OLS has no basis for constructing confidence intervals or testing hypotheses

Go away, undergrad, and read the proofs in any graduate econometrics textbook.
Hint: you can express the least-squares estimator as a function of sample moments, which under relatively weak assumptions are asymptotically Normal by the CLT (and with White's standard errors, which are robust to arbitrary heteroskedasticity, you even obtain tests of null hypotheses with the correct size).
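To make the point concrete, here is a minimal numpy sketch (simulated data, illustrative numbers) of OLS with White's heteroskedasticity-robust (HC0) standard errors and the resulting asymptotic confidence interval for the slope:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate y = 1 + 2x + u with heteroskedastic errors: sd(u|x) = 0.5 + |x|.
n = 10_000
x = rng.normal(size=n)
u = rng.normal(size=n) * (0.5 + np.abs(x))
y = 1.0 + 2.0 * x + u

# OLS via the normal equations: beta_hat = (X'X)^{-1} X'y
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat

# White's (HC0) sandwich estimator:
# Avar(beta_hat) = (X'X)^{-1} [sum_i resid_i^2 x_i x_i'] (X'X)^{-1}
bread = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (resid ** 2)[:, None])
V = bread @ meat @ bread
se = np.sqrt(np.diag(V))

# Asymptotic 95% confidence interval for the slope:
ci = (beta_hat[1] - 1.96 * se[1], beta_hat[1] + 1.96 * se[1])
```

Despite the heteroskedasticity, the robust interval is valid asymptotically; classical (homoskedastic) standard errors would not be.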

E(xu) = 0 is not a weak assumption. Neither is E(zu) = 0 for any instrument z.

It is weaker than E(u|x) = 0, which is needed to think about causality, and it is also weaker than full independence between x and u. By construction it is always the case that E(xû) = 0, so what you really need is û → u.
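The distinction between E(xu) = 0 and E(xû) = 0 can be checked numerically. A small numpy sketch (simulated, illustrative numbers) where the error is deliberately correlated with x:

```python
import numpy as np

rng = np.random.default_rng(1)

# Deliberately endogenous design: u is correlated with x, so E(xu) != 0
# and OLS is inconsistent for the structural slope of 2.
n = 5_000
x = rng.normal(size=n)
u = 0.8 * x + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat

# The fitted residuals are orthogonal to the regressors by construction
# (normal equations: X'u_hat = 0), regardless of whether E(xu) = 0 holds.
sample_Exu = np.mean(x * u)         # close to 0.8, not zero
sample_Exuhat = np.mean(x * u_hat)  # zero up to floating-point error
```

The slope estimate converges to 2.8, not 2: the residuals are orthogonal to x no matter what, which is why orthogonality of the fitted residuals tells you nothing about whether the identifying assumption holds.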
Are you an undergrad?
I assume his point is that psychologists are morons.
However, the paper was written 17 years ago, so you would hope that the methodology in psychology has improved since then (spoiler: it hasn't).

Any groundbreaking developments in OLS since 2000?