There's a stata program for doing 2SLS without instruments!
http://ideas.repec.org/c/boc/bocode/s457555.html
Not a joke; apparently it works by using second moment information.

I've used it; it worked pretty well to increase efficiency. I wouldn't trust it as my only source of identification, though. Best is to use it along with real instruments to get overidentification. Then you can do standard tests of whether all the instruments (the real ones and the ones this creates) are valid.
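For anyone wondering what "2SLS without instruments" actually means mechanically, here is a minimal simulation sketch (my own toy Python/numpy example, not the code behind the Stata package): regress the endogenous variable on the exogenous regressors, then use the demeaned exogenous regressor times those residuals as a generated instrument. It works only because the first-stage error is heteroskedastic in x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Toy data: `common` is an unobserved shock that makes y2 endogenous.
x = rng.normal(size=n)
common = rng.normal(size=n)

# First-stage error is heteroskedastic in x (the key requirement)
# and correlated with the structural error through `common`.
e2 = np.exp(0.5 * x) * rng.normal(size=n) + common
y2 = 1.0 + 0.5 * x + e2                       # endogenous regressor
e1 = rng.normal(size=n) + common
y1 = 2.0 + 1.0 * x + 1.5 * y2 + e1            # true coefficient on y2 is 1.5

# Step 1: residuals from regressing y2 on the exogenous regressors.
X = np.column_stack([np.ones(n), x])
e2_hat = y2 - X @ np.linalg.lstsq(X, y2, rcond=None)[0]

# Step 2: the generated instrument is (x - xbar) * e2_hat.
z = (x - x.mean()) * e2_hat

# Step 3: IV with instruments [1, x, z] for regressors [1, x, y2].
W = np.column_stack([np.ones(n), x, y2])
Z = np.column_stack([np.ones(n), x, z])
beta_iv = np.linalg.solve(Z.T @ W, Z.T @ y1)

beta_ols = np.linalg.lstsq(W, y1, rcond=None)[0]
print("OLS coef on y2:   ", beta_ols[2])   # biased away from 1.5
print("Lewbel IV coef on y2:", beta_iv[2])  # should be near 1.5 in this DGP
```

If you set the scale term in e2 to a constant (kill the heteroskedasticity), the instrument z becomes worthless, which is exactly the fragility people are complaining about below.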

Several years ago, when I was finishing my PhD, I didn't have a good instrument and my adviser suggested I look into Lewbel's estimator (he had used it himself and coded an algorithm, and no, it wasn't Baum). I read the paper then but never incorporated the method. I don't know, it seemed kind of voodoo to me.

OP is basically correct: no exogenous source of variation in the endogenous explanatory variable is needed. That's the sense in which no IV is needed.
This sort of research is BS. It's an example of a poorly identified model: a certain kind of nonlinearity is needed (heteroskedasticity, in this case) or the method breaks down.
Lewbel has an earlier paper suggesting using higher-moment restrictions to handle measurement error, and it was published in Econometrica. Lewbel could do himself a favor and stop pushing this crap. It detracts from his other high-quality work.

Are functional restrictions the same as voodoo now? It's testable, so why not try it and test it?
You can't test the functional form in any meaningful sense if you are using it to achieve identification. For example, if I conclude that the square of the error is correlated with the regressors, is that because the measurement error is heteroskedastic, or because the "structural" error in the original model is?

Of course it can be tested. It's overidentified: it generates more moments than parameters. Ever heard of a J test?
Youngster, let me help you out with an analogy. Consider a linear model
y = x'b + u
where I know the key assumption for consistency of OLS is E(xu) = 0. But suppose I want to test whether u is independent of x. Then I can, for example, test for heteroskedasticity using the OLS residuals. What if I reject? Well, it might mean u and x are not independent, but it tells me nothing about E(xu) = 0. What if I don't reject? It still tells me nothing about E(xu) = 0. Thus, the test of the second moment is useless for determining whether b is identified.
If I now require the first two moments for just identification, then to get an overidentification test I have to add another, say, the third moment. But then the exact same logic applies: if E(u^3 | x) depends on x, it tells me nothing about whether the other two moments do. And if E(u^3 | x) doesn't depend on x, how good should I feel about my other assumptions?
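The point is easy to see in a toy simulation (my own sketch, not from anyone's paper): one DGP where u is correlated with x but homoskedastic, one where u is mean-independent of x but heteroskedastic. The Breusch-Pagan statistic is quiet in exactly the case where OLS is inconsistent, and screams in the case where OLS is fine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10000

def bp_stat(x, resid):
    # Breusch-Pagan LM statistic: n * R^2 from regressing squared residuals on x.
    X = np.column_stack([np.ones_like(x), x])
    r2 = resid ** 2
    fitted = X @ np.linalg.lstsq(X, r2, rcond=None)[0]
    r_squared = 1.0 - np.sum((r2 - fitted) ** 2) / np.sum((r2 - r2.mean()) ** 2)
    return len(resid) * r_squared

x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Case 1: u correlated with x (E[xu] != 0) but homoskedastic given x.
u1 = 0.5 * x + rng.normal(size=n)
y1 = 1.0 + 2.0 * x + u1                       # true slope is 2.0
b1 = np.linalg.lstsq(X, y1, rcond=None)[0]
resid1 = y1 - X @ b1
# Slope converges to 2.5 (biased), yet the BP statistic is an ordinary
# chi-square(1)-sized number: the test sees nothing wrong.

# Case 2: u mean-independent of x (E[xu] = 0) but heteroskedastic.
u2 = np.exp(0.5 * x) * rng.normal(size=n)
y2 = 1.0 + 2.0 * x + u2
b2 = np.linalg.lstsq(X, y2, rcond=None)[0]
resid2 = y2 - X @ b2
# BP statistic is enormous, yet OLS is consistent for the true slope 2.0.

print(b1[1], bp_stat(x, resid1))
print(b2[1], bp_stat(x, resid2))
```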
2f9f is an absolute idiot who doesn't like Econometrics as a study in itself because he doesn't understand it in enough depth and hence is afraid of it. Anybody wishing to do serious and well-argued empirical economic research cannot do so without a solid understanding of econometric theory: being able to choose a form of estimation given what is known at a theoretical level, and how likely whatever assumptions are made are to hold for the problem at hand. This cannot be achieved without research on methods of identification under an array of different identifying assumptions. It may indeed be that just running OLS is preferable in many ways to more seemingly convoluted methods of identification, but again, the search for a "best" estimator and the comparison of estimator properties must all be studied first. To discount new econometric theory like this is nonsense.

The 2012 Lewbel paper is hardly original anyway; identification through covariance restrictions has a literature going back to at least the 1960s. Just another example where the key theory/idea was expressed in a paper written many moons ago, though barely cited or acknowledged due to lack of relevance or applicability at the time of publication.

As far as I can tell, no one in this thread actually read the paper. The estimator is based on conditions involving the correlation of the product of the errors in the two equations with the regressors. So the claim that it is based on 1960s-type covariance restrictions is wrong. Also, it generates as many moments as there are exogenous regressors, so there is overidentification that can be tested without going to higher moments.
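A rough sketch of that overidentification test (again my own toy simulation, not the Stata implementation, and ignoring the generated-regressor correction): with two exogenous regressors you get two generated instruments for one endogenous regressor, so the Hansen J statistic has one degree of freedom.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000

# Two exogenous regressors -> two generated instruments -> one overid restriction.
x1, x2 = rng.normal(size=n), rng.normal(size=n)
c = rng.normal(size=n)                                   # common shock -> endogeneity
e2 = np.exp(0.4 * (x1 + x2)) * rng.normal(size=n) + c    # heteroskedastic first-stage error
y2 = 1.0 + 0.5 * x1 - 0.5 * x2 + e2
y1 = 2.0 + x1 + x2 + 1.5 * y2 + rng.normal(size=n) + c   # true coef on y2 is 1.5

X = np.column_stack([np.ones(n), x1, x2])
e2_hat = y2 - X @ np.linalg.lstsq(X, y2, rcond=None)[0]
Z = np.column_stack([X, (x1 - x1.mean()) * e2_hat, (x2 - x2.mean()) * e2_hat])
W = np.column_stack([X, y2])             # 4 parameters, 5 moment conditions

# First step: 2SLS, just to get residuals for the weight matrix.
b2sls = np.linalg.lstsq(Z @ np.linalg.lstsq(Z, W, rcond=None)[0], y1, rcond=None)[0]
g = Z * (y1 - W @ b2sls)[:, None]
S = np.cov(g, rowvar=False)              # robust variance of the moments

# Second step: efficient GMM and the Hansen J statistic (1 df here).
ZW, Zy = Z.T @ W / n, Z.T @ y1 / n
Sinv = np.linalg.inv(S)
beta = np.linalg.solve(ZW.T @ Sinv @ ZW, ZW.T @ Sinv @ Zy)
gbar = (Z * (y1 - W @ beta)[:, None]).mean(axis=0)
J = n * gbar @ Sinv @ gbar
print(beta[3], J)   # coef near 1.5; compare J against chi-square(1) critical values
```

Of course, a small J here only tells you the moments are mutually consistent, which is precisely what the skeptics upthread are unimpressed by.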

Whew! Good thing I found this thread. Now I can just pop in this lil' Stata program like I do with OLS and pump out "causal" results. Yay!
It'll be an Oklahoma Land Rush to AER to find causal effects for all those pesky questions we've been asking over the years.