The Volokh Conspiracy
Mostly law professors | Sometimes contrarian | Often libertarian | Always independent
Zestimates, Estimates, and Opinions
Zillow has no obligation to take down (or revise) property value estimates to which the property owners object.
From last week's Seventh Circuit decision in Patel v. Zillow—an interesting case that I was fortunate enough to consult on (on Zillow's side)—the facts:
A Zestimate is an estimated value for real estate, available on the Zillow web site for about 100 million parcels. Zillow generates Zestimates by applying a proprietary algorithm to public data, such as a building's location, tax assessment, number of rooms, and the recent selling prices for nearby parcels. But because Zillow does not inspect the building, it cannot adjust for the fact that any given parcel may be more attractive and better maintained, raising its likely selling price, or the reverse. Zillow states that its median error (comparing a Zestimate with a later transaction price) is less than 6%, though the Zestimate is off by more than 20% in about 15% of all sales. Zillow informs users that none of the parcels has been inspected and that Zestimates may be inaccurate, though Zillow touts them as useful starting points.
Plaintiffs filed this suit after learning that the Zestimates for their parcels were below the amounts they hoped to realize. For example, Vipul Patel listed his home with an asking price of $1.495 million and contends that the Zestimate of $1,333,350 scared away potential buyers. Plaintiffs asked Zillow either to increase the Zestimates for their parcels or remove them from the database. When it declined to take either step, they filed this suit, under the diversity jurisdiction, invoking [among other things] the Illinois Uniform Deceptive Trade Practices Act, which forbids unfair or misleading trade practices….
The court rejected the deceptive trade practices claim, in an analysis related to the Illinois statute, but one that is parallel to the approach used in libel cases under the common law and under the First Amendment:
[T]he statute deals with statements of fact, while Zestimates are opinions, which canonically are not actionable. Plaintiffs want us to brush this rule aside because, they say, Zillow refuses to alter or remove Zestimates on request. This does not make a Zestimate less an opinion, however….
That Zillow sells ads to real estate brokers [also] does not affect the statutory analysis. Having labeled Zestimates as estimates (something built into the word "Zestimate"), Zillow is outside the scope of the trade practices act. Almost all web sites, like almost all newspapers and magazines, try to finance their operations by selling ads. That they do so without telling customers exactly what pitches are being made to potential advertisers does not convert a declared estimate into an inaccurate statement of fact….
And the court also noted (perhaps because the author of the opinion, Judge Frank Easterbrook, is a noted law and economics scholar):
[P]laintiffs are mistaken to think that the accuracy of an algorithmic appraisal system can be improved by changing or removing particular estimates.
Suppose plaintiffs are right to think that the Zestimates for their properties are too low. Removing them from the database would skew the distribution, because all mistakes that favored property owners would remain, not offset by errors in the other direction. Potential buyers would be made worse off.
Suppose instead that plaintiffs are wrong—that they have overestimated the value of their properties, while the Zestimates are closer to the truth. Then removing them from the database would not just skew the distribution but also increase the average error of estimates. Potential buyers of plaintiffs' properties would be deprived of valuable knowledge.
Finally, suppose that plaintiffs are behaving strategically—that they know the Zestimates to be accurate (or at least closer to the likely sales price than are plaintiffs' asking prices). Then removing their parcels from the database, or "correcting" the Zestimates to match plaintiffs' asking prices, would degrade the accuracy of the database as a whole without any offsetting benefits to the real-estate market. In general, the accuracy of algorithmic estimates cannot be improved by plucking some numbers out of the distribution or "improving" others in ways that depart from the algorithm's output. The process is more accurate, overall, when errors are not biased to favor sellers or buyers.
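The court's first scenario can be illustrated with a minimal Python sketch. The error figures below are invented for illustration (they are not Zillow data), and the rule that only too-low estimates get removed is an assumption standing in for owners' incentives. The sketch measures only signed bias, that is, whether the remaining estimates lean toward sellers or buyers, not every sense of "accuracy."

```python
# Hypothetical signed errors, in percent: (estimate - sale price) / sale price.
# Negative means the estimate was too low. The values are made up and symmetric,
# so the full set is unbiased.
errors_pct = [-20, -10, -5, 0, 5, 10, 20]


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Assumption: owners object only when the estimate is too low,
# so only the negative-error estimates are taken down.
after_removal = [e for e in errors_pct if e >= 0]

bias_before = mean(errors_pct)    # 0.0: errors cancel out
bias_after = mean(after_removal)  # 8.75: what remains leans high

print(bias_before, bias_after)
```

With these made-up numbers, the estimates start out unbiased and end up running 8.75% high on average, because the too-high errors stay in while the too-low errors that used to offset them are gone.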
The trial court judge whose opinion was affirmed is now one of Easterbrook's colleagues on the 7th Circuit.
I can't believe someone filed this lawsuit over a 12% difference in price, or about $160,000 out of over a million.
If it were MY $160,000, I might be pissed to not get it. Especially if the dude is close to breaking even on the sale, or underwater.
Right? And think about how mad he'd be if the listing price were $9M? Then it would be HIS $7.66M.
Huh?
I'm saying if he had listed the price for more than $1.495M, he'd be even madder, since he wouldn't even be getting HIS money.
Ah. No, that still doesn't make any sense.
So he lost $160K of his money because that was the amount he listed it above Zillow's price. He'd lose even more money if he just listed his price for higher. We can keep doing this, I'm enjoying it.
" We can keep doing this, I'm enjoying it."
Knock yourself out, I guess.
Damn it NToJ,
You know that Jimmy the Fish can't follow logic, reason, and critical thinking!
Ooh, what a zinger!
If the house were worth more, people would be willing to pay it.
Nobody's buying a house with a price tag of over a million dollars without having their own appraisal done. Nobody's looking at the difference between the Zestimate and the asking price and getting scared off making an offer. They're looking at other properties in the area and seeing that he's over-valued his home.
And he's not lost 'his' $160k. It was never his. All that was ever his is whatever he and the buyer can agree on. If that's 30k then it's 30k. If it's 2 mil then it's 2 mil.
Exactly.
This guy didn't consult an experienced realtor in the area or get an appraisal done?
This seems clearly correct to me.
An interesting question is whether Zillow's estimates are unbiased - that is, the average estimate is correct, neither high nor low. I'm guessing they are, because bias should be easy to spot and fix, and Zillow has every incentive to do so.
But if not, then Easterbrook's argument is wrong.
Suppose plaintiffs are right to think that the Zestimates for their properties are too low. Removing them from the database would skew the distribution, because all mistakes that favored property owners would remain, not offset by errors in the other direction. Potential buyers would be made worse off.
Not so if the Zestimates run low. Then removing a low estimate makes the distribution more accurate.
Zillow only provides Zestimates on properties in states where the sale price is publicly reported. I have property in Georgia where Mint automatically updates the value with Zestimates. I have property in Louisiana which does not publicly report sale prices and no Zestimate is available. I can go to the Zillow site and get an estimated value for my property in Louisiana.
Zestimates are updated based on the reported sales according to their algorithm. It ought to be self-correcting.
"Not so if the Zestimates run low. Then removing a low estimate makes the distribution more accurate."
I don't think that's necessarily correct. Take a simplified situation--all houses are worth $1 million, but the Zestimates are $900,000. The average error is -10%.
If I remove one house from the group, the average error is still -10%.
The removal of some homes only improves the accuracy if the removal is of homes with greater than normal inaccuracies. So it is possible to improve the error through selective removal, but by no means is it certain. And, as the judge noted, the greater incentive to remove homes with under-estimated values than those with over-estimated values would clearly skew the results.
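The arithmetic in this comment can be checked with a short Python sketch. The ten-house market, the $700,000 outlier, and the helper name mean_error_pct are all hypothetical, chosen only to mirror the simplified situation described above.

```python
def mean_error_pct(values, estimates):
    """Average signed error in percent; negative means estimates run low."""
    errors = [(est - val) * 100 / val for val, est in zip(values, estimates)]
    return sum(errors) / len(errors)


# Ten hypothetical houses, each worth $1 million and each estimated at $900,000.
values = [1_000_000] * 10
estimates = [900_000] * 10

uniform_before = mean_error_pct(values, estimates)         # -10.0
uniform_after = mean_error_pct(values[1:], estimates[1:])  # -10.0, one house dropped

# Same market, but the first house is estimated at $700,000 (a -30% error).
outlier_estimates = [700_000] + [900_000] * 9
outlier_before = mean_error_pct(values, outlier_estimates)         # -12.0
outlier_after = mean_error_pct(values[1:], outlier_estimates[1:])  # -10.0, outlier dropped

print(uniform_before, uniform_after)
print(outlier_before, outlier_after)
```

Dropping a house whose error matches the average leaves the average error at -10%. Only dropping the house with the larger-than-normal error moves the average, here from -12% to -10%.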
If the estimates are systematically skewed, that shows that the algorithm that generates them is flawed. Exposing this improves likelihood that Zillow will be motivated to find and correct the flaw(s) in their proprietary algorithm.
Since only sellers would be able to sue Zillow, allowing such lawsuits will virtually guarantee that Zillow will skew high. There's no allegation that Zillow is systematically skewed low. The plaintiff wants to cause Zillow's estimates to skew high.
"The plaintiff wants to cause Zillow's estimates to skew high."
No, the plaintiff wanted to cause Zillow's estimates to skew high on HIS property (from his point of view, of course, he wants it to accurately reflect the higher value of his property) or stop listing an estimate on his property at all.
Strictly speaking they want Zillow to list their individual house higher. The result (of liability) would be Zillow would skew high.
"Strictly speaking they want Zillow to list their individual house higher."
Or not at all.
Yes but as the court noted, if they only pulled listings that were too low, Zillow's results would not get more accurate.
Try reading ALL the sentences in my comment.
"Exposing this improves likelihood that Zillow will be motivated to find and correct the flaw(s) in their proprietary algorithm."
Would THAT make Zillow's results get more accurate?
No, for several reasons. First, as already noted, allowing plaintiffs to sue will only cause Zillow to "correct the flaw(s) in their proprietary algorithm" to the extent they undervalue the property. If Zillow takes corrective action, that will result in them removing from their database only one set of data, which will skew the results higher, but not make them (in the aggregate) any more accurate.
Second, more fundamentally, the "flaw(s)" is that Zillow doesn't have perfect information from homesellers. Requiring Zillow to change only when homeowners fill those gaps will have the exact same result--skewing higher--because the people who will submit more accurate data are only the people trying to increase the size of their listing. Zillow's numbers are the most accurate when there are equal sets of bad data on each side of perfectly accurate.
"No, for several reasons."
Is there a reason you're declining to address what I actually said?
I've answered it. The flaw is not in the algorithm, it's in the data. If you allow the data that you want submitted, submitted, the results will be less accurate. That's what the "Second,..." is saying.
So, it's because you just don't care, then?
I don't know how much clearer I can be. Here is your question:
"Would THAT make Zillow's results get more accurate?"
My answer is no.
Where is this assumption that Zillow is incorrect? The case information doesn't provide any evidence for this - only the fact that there's a difference between the price the seller listed the house at and Zillow's estimate. You seem to be assuming that the seller's estimate (and that's all it is) of his home's market value is correct when there's no evidence to suggest that he is (or isn't).
Duck916,
You are correct if there are no high estimates.
In your example of million-dollar houses, suppose there were nine houses valued by Zillow at $900K, and one at $1.1 million. Now if you take out one of the low ones, Zillow's average estimate goes from $920K to about $922K, and the average error goes from -8% to about -7.8%.
When you don't know if it runs low or runs high then the best thing to do is leave it alone until you do. Otherwise you're skewing the results and have a 50% chance of making them less accurate.
Don't assume that Patel's asking price is the 'right' price. There's no evidence that Patel's 'algorithm' is any more accurate at determining his home's market value than Zillow's - and a lot of reasons why it might be skewed high.
"Don't assume that Patel's asking price is the 'right' price."
The right price, of course, is whatever the closing price is in an open market with willing buyer and willing seller.
"There's no evidence that Patel's 'algorithm' is any more accurate at determining his home's market value than Zillow's"
You're confusing "I don't know what the evidence is or might be" for "there is no evidence".
" - and a lot of reasons why it might be skewed high."
...speaking of making an assumption with no evidence...
1. So what is the evidence? I don't see any in the court case.
2. I'm not making an assumption. I am deliberately avoiding making an assumption. You seem to be assuming the seller's valuation is the 'correct' one and that Zillow's is 'incorrect'. But you're not providing anything to support that.
Yes, people who want to spend 1.5 million on a house use Zillow, not a broker, to identify likely properties. Of course.
"(something built into the word "Zestimate")"
Good snark
And, amazingly, Mr. Patel appears to have dropped the asking price on his house from $1,495,000 to $395,000, days after the Seventh Circuit's opinion came out. Perhaps Mr. Patel should have been thanking Zillow for the $1mm error in the "Zestimate" in his favor instead of suing them?
Frankly, the plaintiffs sound like litigious zassholes. [trademark!!!]
Zillow will probably sue you for trademark infringement.
You mean, trademark "zinfringement"
[this same joke never gets old]
It started old, and isn't getting younger.
Ugh. zyounger.
I'm curious... with the general trend towards laws favoring cursory "privacy"*, will it become a new "right" to have your property delisted with such sites?
For example, I recently bought a place that is listed on Zillow. Months later, while it does report as "sold", it also still shows all the photos that the previous owners put up. Photos that show the interior of my house. Since it is no longer for sale, and most likely won't be for some time, any "public interest" in these photos is now gone.
As the owner of the property, do I have a "right" to tell Zillow to delist the house and stop sharing photos of my bathroom? I suspect that, currently, any such claim would be a very difficult one to make in court. But I wonder how long that'll stay true. While some matters of public interest may be beyond my control (property taxes, assessment, etc.), specific lists of home features and non-street view photographs may reasonably be the subject of "take down" requests.
________
*Not real privacy, mind you. Just the appearance thereof.
I'd guess the previous owner or their real estate agent holds the copyright on those photographs and you have no rights in them. Whether Zillow has permission to use them is an interesting question.
"As the owner of the property, do I have a "right" to tell Zillow to delist the house and stop sharing photos of my bathroom?"
No. Just like all those young women who made a porn film and then decided that they'd rather NOT be remembered as a porn performer can't demand that people stop having copies of their stolen youth and innocence.
James, are you talking about any particular examples?...I'm unfamiliar with this phenomenon (other than one or two really well-publicized cases where the actress was underage at the time the flicks were made, which obviously is a separate issue).
I don't doubt at all that tons of adult film actors and actresses--having moved on to other careers--would rather that the public just forgot about their early work in front of the camera. But I am not familiar with actual attempts to force consumers to give up their hard (heh) earned porn.
"I don't doubt at all that tons of adult film actors and actresses--having moved on to other careers--would rather that the public just forgot about their early work in front of the camera. But I am not familiar with actual attempts to force consumers to give up their hard (heh) earned porn."
Because they have no legal right to make the attempt.
Look at the question I answered.
Yeah. It was rhetorical, as was obvious if you'd read the next line. The real "question" was about how notions of privacy rights and the internet would continue to evolve.
"The real 'question' was about how notions of privacy rights and the internet would continue to evolve."
You're free to address that question, if you want to.
EE, you can go into the Zillow record, demonstrate your ownership of the property, and remove everything that was added from sources other than public records. The house will still be listed by address, but the description will be similar to:
------------
xxxx xxxth St S, Tacoma, WA is a single family home that contains 1,364 sq ft
and was built in 1951. It contains 3 bedrooms and 1 bathroom. This home last
sold for $115,000 in June 2013.
The Zestimate for this house is $255,063, which has increased by $797 in the
last 30 days. The Rent Zestimate for this home is $1,595/mo, which has
increased by $75/mo in the last 30 days.
---------------
Seems reasonable, and simple enough.
"[T]he statute deals with statements of fact, while Zestimates are opinions, which canonically are not actionable."
Except, that's not true. Opinions based on undisclosed facts are actionable. What's the basis for Zillow's opinion? A proprietary algorithm whose details are not disclosed? Uh.... don't we have a match, there?
It seems to me that the problem for plaintiff is that plaintiff can't prove damages.
"A proprietary algorithm..."
based on "public data, such as a building's location, tax assessment, number of rooms, and the recent selling prices for nearby parcels."
I thought the disclosed facts issue was for defamation only.
"I thought the disclosed facts issue was for defamation only."
Yeah, but...
"an analysis related to the Illinois statute, but one that is parallel to the approach used in libel cases under the common law and under the First Amendment"
"based on "public data," should be prefaced by "allegedly"
Zillow no doubt will not reveal the source code because that is the proprietary part.
Come to think of it, Zillow only allegedly uses an algorithm.
Wrong. You've confused the statement "Opinions based on disclosed facts are not actionable" with "Opinions based on undisclosed facts are actionable."
It's right both ways. What's your point?
How can an opinion based on undisclosed facts be actionable?
Suppose I write a stock market newsletter, and recommend selling XYZ because I think their business is declining. Can XYZ sue unless I disclose all the information I used to come to that conclusion?
(To head off nitpickers, assume I came by the information legally, and have no particular reason to want to harm XYZ.)
"How can an opinion based on undisclosed facts be actionable?"
Ordinarily, opinion can't form the basis of a defamation claim. But opinions based on undisclosed facts sometimes can, because they imply defamatory claims of fact.
Allow me to offer an example.
If I claim that my opinion is that X is a sex pervert, that's opinion, and not actionable, true or not. But if I claim that, based on facts that I will not disclose here, X is a sex pervert, that's opinion but it's based on undisclosed fact. The opinion implies that there are facts that support that opinion. Even though I didn't say them out loud, if the claim I made implies defamation (facts that aren't true) and the other elements of defamation also exist, then the opinion based on non-disclosed facts can fulfill the elements of a defamation claim.
To circle back to your stock newsletter, if you say "it feels to me like company X is in decline", that's an opinion. If you claim to have done extensive research into the company, and therefore believe that the company is in decline, you may have a problem. This is why actual financial advisors who offer stock tips make all their research available. Opinion based on disclosed facts is not actionable, even if incorrect or false.
No. This is wrong. Opinions that actually imply defamatory facts are actionable. Not opinions based on undisclosed facts, which all opinions are.
Nope. That's the canonical example of a purported opinion that absolutely is actionable, because it implies defamatory facts. (You don't need to say "based on facts I won't disclose here." That wouldn't be an implication, but an express statement.) What brings it to the opinion safe harbor is if one states the basis for one's assertion. "IMO X is a sex pervert because of A, B, and C." As long as one is not lying about A/B/C, one is okay, even if A/B/C don't really prove one's claim.
No, it's not right. If I say that you are a jerk (or you say it of me), that's opinion whether or not we cite to the comments (facts) that led us to that opinion. And the opinion is completely unactionable either way.
Opinions based on undisclosed facts can form part of a defamation action. If that means "completely unactionable" to you, then you are not very bright.
Nope. You're still getting the law wrong.
My point is that you understand neither the applicable law nor logic.
"Suppose plaintiffs are right to think that the Zestimates for their properties are too low. Removing them from the database would skew the distribution, because all mistakes that favored property owners would remain, not offset by errors in the other direction. Potential buyers would be made worse off."
Judge Easterbrook needs to take a longer view. If the situation he describes were to arise, then it would be strongly in Zillow's interest to get the Zestimates right on the first try, thus benefitting all parties.
This isn't true. If Zillow faces more punishment for having its prices too low rather than too high, it will skew high if the lawsuits persist.
" it will skew high if the lawsuits persist."
If Zillow's estimates are demonstrably accurate, the lawsuits won't persist.
That would be true only if there was zero cost to Zillow to defend the lawsuits. Zillow would have an incentive (in the amount of its defense costs) to skew its results high to avoid lawsuits. As I said, buyers can't sue, so Zillow would have an incentive to placate only one side, thus guaranteeing a skewed, higher, result.
"Zillow would have an incentive (in the amount of its defense costs) to skew its results high to avoid lawsuits."
If they are shown to be intentionally publishing inaccurate data, there's no reason to go to them, and the advertising revenue evaporates.
But who will show them to be intentionally publishing inaccurate data if they skew high? There won't be any lawsuits.
There are sources of information that aren't lawsuits.
Right, but since Zillow doesn't have control over who is going to submit information versus who is not, it can't improve the accuracy of its estimates by incorporating select sets of data from only certain homeowners.
" since Zillow doesn't have control over who is going to submit information versus who is not"
They don't have this now.
" it can't improve the accuracy of its estimates by incorporating select sets of data from only certain homeowners."
Assertion without evidence.
"They don't have this now."
Well they don't need it, since their algorithm isn't based on information submitted by sellers. It's based on public records. That's why it's useful. If they took in private data only from motivated homesellers, Zillow would become a tool for increasing home prices, rather than a tool for assessing home prices based on public records.
"Well they don't need it, since their algorithm isn't based on information submitted by sellers"
Nor, and I think it's important to point this out to you... has anyone suggested it should be.
The plaintiff has said that their data set--on which the algorithm operates--should be modified based on information submitted by sellers.
"The plaintiff has said"
Then go argue with HIM about what HE said, instead of arguing with ME about what I DIDN'T SAY.
Troll. You said "nor ... has anyone suggested" that. Then when he pointed out that someone actually has, you try to change the subject.
I think you're actually more wrong than my response indicates.
So since Zillow's estimates can't be based off certain information that it doesn't know, and since only sellers would ask Zillow to raise the price (based on this information that the seller has, but that Zillow doesn't have), that would necessarily skew results high since Zillow would not be lowering prices based on information that other sellers have that would lower the price. Sellers who are benefiting from a higher Zillow list price than is realistic would never sue to get it lowered. There's just no way to improve Zillow's algorithm by allowing a small subset of sellers to threaten suit, and have Zillow change its price based on the demand.
"Sellers who are benefiting from a higher Zillow list price than is realistic would never sue to get it lowered."
People who are paying real-estate taxes on the assessed values might prefer to see a lower number, though.
Zillow uses the tax assessment value as one input to its algorithm but there is no connection whatsoever going in the other direction. Zillow could drop all their numbers to zero or raise them all to $1 googolplex and it would have no effect in either the long or short term on real estate taxes.
And, of course, all real-estate owners are completely rational.
Just so we understand your point, you think Zillow's estimates of property values will get more accurate if it takes into account data submitted by admittedly irrational homeowners who are trying to artificially lower the value of their property, outside of the sales context for which Zillow exists?
"Just so we understand your point"
You do not. You insist on substituting your own, repetitively and despite correction.
"you think Zillow's estimates of property values will get more accurate if it takes into account data submitted by admittedly irrational homeowners who are trying to artificially lower the value of their property, outside of the sales context for which Zillow exists?"
This is entirely your own invention.
Ok, let's break it down. You said people who want the value of their homes to go down (because they pay property taxes) might manipulate Zillow. Rossami pointed out that's not how property value taxes work--and you didn't disagree. Instead, you suggested that homeowners would still submit the information because they aren't rational.
My question to you is: If homeowners who irrationally believe that lowering their property value on Zillow will keep their taxes down--or want it lower for any reason--submit information to Zillow with the intent to lower their property values, do you think this will improve the accuracy of Zillow's algorithm?
"Ok, let's break it down."
You've BEEN broken down, and seem to prefer it that way.
"You said people who want the value of their homes to go down (because they pay property taxes) might manipulate Zillow."
No, I didn't say this.
" you suggested that homeowners would still submit the information because they aren't rational."
I didn't suggest this, and when I asked where you got the idea that anyone was submitting anything to anyone, you just repeated yourself.
" If homeowners who irrationally believe that lowering their property value on Zillow will keep their taxes down--or want it lower for any reason--submit information to Zillow with the intent to lower their property values"
You're REALLY fixated on submitting information to Zillow. Where did you get this idea from?
" you think this will improve the accuracy of Zillow's algorithm?"
I have no opinion on your theory.
"No, I didn't say this."
Wonderful. Your statement about homeowners not being rational, and about them wanting Zillow to lower their home prices because of tax appraisals, was just a pointless fucking nothing. Thank you for clarifying.
"You're REALLY fixated on submitting information to Zillow. Where did you get this idea from?"
From the plaintiff in the case that this post is about, who was asking Zillow to increase (or remove) the valuation based on information the plaintiff had that it didn't think Zillow was considering.
So, the plaintiff wanted something, therefore, anyone who comments on the story wants the same thing?
James, in a thread about response bias (where the hypothesis is that only those who want to see the Zillow listing higher will complain, therefore acceding to complaints will lead to high-side bias), you replied that some users (those seeking lower property taxes) would value lower listings.
The implication to me (and I suspect to pretty much everyone else) was that you thought those users would also complain to Zillow, tending to cause the biases to cancel out.
If you did not mean to imply that by your comment, then I must say that your comment was very ambiguously written and your actual meaning was entirely unclear.
This analysis is fine, but doesn't go far enough.
Once it becomes known that Zillow is biased high, anyone using it will discount the estimate by some amount.
But if you don't know how much to discount, their number is useless.
If word gets around that their numbers are useless, what does the company have to offer? Specifically, to advertisers?
That's the point, Zillow is only useful if it doesn't take into account the information you and the plaintiff insist it should take into account.
WTF are you high on?
Upstream you pull an "Oh what a zinger" on somebody for not making a substantive point. And then you post this? You're a real piece of shit. If you don't understand the point I'm making, ask questions for clarity. If you aren't interested in clarity, move on.
You didn't answer the question.
They're estimates, James - there is no 'right'. It's just someone saying 'Based on the difference between your home and theirs, your home gets devalued for some things, increased value for other things, and this is about where we think other people will value it.'
"They're estimates, James"
Thanks for stating something obvious.
I live in a rural area. My son is looking at buying a house. Zillow is one of the sites we've been looking on.
Houses that sell in the area sell pretty quickly. The Zestimates on houses on Zillow are, well, overestimates because they're based on "similar" property that's been sold. "Similar" being the key word.
Ex: Both have plumbing (Note- not all houses in this area do.) The one that sold is copper or has been completely gutted and replaced with PEX. The one that didn't has a mixture of galvanized, PVC, and CPVC, replaced piecemeal. Both houses have electricity. The one that sold has circuit breakers, the one that hasn't has a fuse box. Both houses have wells, grandfathered. The one that sold meets today's standards for well location. The one that's still on the market doesn't. That last is why I told him to reject one house. The well location is totally unacceptable.
Of course, the more real estate transactions in an area, the more accurate Zillow's, or anyone else's, algorithms are likely to be, remembering that in commerce, something is worth only what someone else is willing to pay for it, regardless of estimates or anything else; the purchaser determines the price. In the 2 zip codes that make up our school district, Zillow has a whopping 11 homes listed. 2 of them were originally built after 1900. I know that most recent home sales are of newer homes. I don't know if Zillow's proprietary algorithm takes "1870" for year built as a significant factor.
" In general, the accuracy of algorithmic estimates cannot be improved by plucking some numbers out of the distribution or "improving" others in ways that depart from the algorithm's output."
This is flatly incorrect.
Here's my algorithm: All properties are valued at exactly $500,000.
Now, here's my correction. Take all the properties out of the data set that aren't worth exactly $500,000. When you finish doing this, my data set will consist exclusively of properties that are valued at exactly $500,000, or 100% accuracy.
Congratulations! You've discovered that the term "In general" does not apply to all the absurd non-realistic edge cases you can make up.
Congratulations! In general, you're stupid.
A retort as in line with the rest of your argument.
Do you truly not realize what the words "in general" mean?
Do you truly not understand that a statement that is incorrect in all cases is not correct "in general"?
Does it hurt being that ignorant Jimmy?
Or do you have to work hard to achieve that level of ignorance?
"Does it hurt being that ignorant Jimmy?"
Apparently, not enough to prevent you from using a computer.
And your algorithm will also be useless as a predictive tool.
"And your algorithm will also be useless as a predictive tool."
No, it just has a sharply limited data set. Within the data set, it still works.
Now, this is also true of ANY data set. If you remove the data points where the algorithm doesn't work, the algorithm gets more accurate.
Here's a prediction algorithm: In January, it will be cold outside. Now, exclude all the data points where the algorithm produces inaccurate results. For the points that remain, the algorithm becomes more accurate. For Australia or Tahiti, it's useless. But there's still plenty of places where it's accurate.
Nope, useless.
There are no use cases or scenarios where your "algorithm" has any utility at all.
And your "January" example isn't even comparable. That one is still predictive. "In the month of January, outside of Australia or Tahiti, it will be cold outside". We can ask it about a future date/location pair, and it can give us a "cold" or "don't know" answer. Without having perfect knowledge of the future or past.
Yours? Give it an address. If we're before the pruning step, it'll give us $500,000 for every address. Useless. If we're after the pruning step, then we have the full data set of actual valuations already, and there's no reason to use an algorithm.
Simply put, since your algorithm is only accurate after we have perfect data, at which point we have no reason to use it.
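The point in this exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the addresses and prices are made up, and `constant_estimate` and `hit_rate` are hypothetical helper names. It shows that pruning raises accuracy on the retained set, but the pruning step itself needs the true prices, and the hit rate on unseen listings does not change.

```python
# All addresses and prices below are hypothetical, for illustration only.
def constant_estimate(address):
    """The deliberately oversimplified 'algorithm': every home is $500,000."""
    return 500_000

def hit_rate(data):
    """Share of homes whose true price equals the constant estimate."""
    return sum(constant_estimate(a) == p for a, p in data.items()) / len(data)

known_sales = {"1 Elm": 500_000, "2 Oak": 350_000, "3 Pine": 500_000, "4 Ash": 720_000}

# Pruning requires already knowing each true price.
pruned = {a: p for a, p in known_sales.items() if constant_estimate(a) == p}

# Homes not yet sold: pruning the old data does nothing for these.
new_listings = {"5 Birch": 410_000, "6 Cedar": 500_000, "7 Fir": 610_000}

print(hit_rate(known_sales), hit_rate(pruned), hit_rate(new_listings))
```

The pruned set scores perfectly only because the answers were used to build it.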
"Nope, useless.
There are no use cases or scenarios where your "algorithm" has any utility at all."
That's your lack of imagination at work.
"And your "January" example isn't even comparable. That one is still predictive."
Duh. It was intended to be. But the algorithm has a limited domain, and produces known-to-be-inaccurate results when applied outside its domain. Because it's illustrating the existence of those features. It also produces inaccurate results for lunar locations, which may be cold or hot, but depending on time of month, not time of year.
" We can ask it about a future date/location pair, and it can give us a "cold" or "don't know" answer. Without having perfect knowledge of the future or past."
In other words, it works, WHERE IT WORKS, and it doesn't work where it doesn't work. Just as it was intended to illustrate.
"Simply put, since your algorithm is only accurate after we have perfect data, at which point we have no reason to use it."
You get that it was deliberately oversimplified, right?
So, it's useless as a predictive tool then.
James, how do you know which homes are not worth $500,000?
Because if you make a mistake and remove the homes *not* worth $500,000 - then you're introducing errors into your results.
If you're erroneously removing an equal amount of high/low/equal datapoints then your error doesn't increase - but it's still there and it's still massive. If you don't do that - and you can't remove the high/lows exclusively because you don't know which ones they are (if you did you wouldn't need the algorithm in the first place) - then you increase the error.
"James, how do you know which homes are not worth $500,000?"
Sorry, that algorithm is proprietary.
" if you make a mistake and remove the homes *not* worth $500,000 - then you're introducing errors into your results."
I think you made a mistake here. If I remove the cases where the algorithm produces incorrect results, I'm removing errors, not introducing them.
"If you're erroneously removing an equal amount of high/low/equal datapoints then your error doesn't increase - but it's still there and it's still massive"
Again, removing the datapoints that are high or low isn't erroneous. The error doesn't increase because it is decreasing. Once the bad data is pruned, the error rate is 0. If you consider that "massive"... (Now, the range of datapoints that produce no result in the algorithm is quite massive, decreasing, as noted above, the utility of the algorithm. But controlling an algorithm's range of input is normal.)
Only if you know what the errors are - you don't know which homes are not $500,000.
You don't know which data points are high or low.
[P]laintiffs are mistaken to think that the accuracy of an algorithmic appraisal system can be improved by changing or removing particular estimates.
Well, this is a hard statement to evaluate. Presumably the algorithm may sometimes go astray, and produce an obvious outlier. Remove it and the accuracy improves. OTOH, you can argue that identifying and removing obvious outliers is part of the algorithm.
Suppose a seller demonstrates to Zillow's satisfaction that the estimate, while not stupidly wrong, is nonetheless inaccurate for some reason not picked up by the algorithm. I don't see how it can hurt to change the estimate. Yes, if the average price in the database is correct, then this would make it incorrect - Easterbrook's narrow point - but why is that the measure of the accuracy of the algorithm?
I'd say something like mean squared error is a better measure, and making the correction will improve this. Easterbrook is not quite as clever as he thinks. (Unlike the rest of us.)
"OTOH, you can argue that identifying and removing obvious outliers is part of the algorithm."
You can't always detect them from within the algorithm. For example, if the algorithm doesn't know that the structure burned to the ground last night, it's going to produce inaccurate results for the value of the property.
"Suppose a seller demonstrates to Zillow's satisfaction that the estimate, while not stupidly wrong, is nonetheless inaccurate for some reason not picked up by the algorithm. I don't see how it can hurt to change the estimate"
The processing costs might consume your profit margin. From Zillow's point of view, that would hurt. If they set up an "opt-out" that could be handled automatically by computer, then the processing cost for that might be kept low enough to be cost-effective. But if human labor is involved, I don't see it ever being cost-effective.
"...but why is that the measure of the accuracy of the algorithm?"
Since the information is only based on publicly available information, it isn't subject to manipulation by homeowners. Allowing the manipulation would render the service less reliable, even if the cost is that some homes are under or overpriced.
Allowing the manipulation would render the service less reliable, even if the cost is that some homes are under or overpriced.
That doesn't answer the question.
And if the "manipulation" is justified then the service would be more reliable, not less.
Of course James has a point that allowing it is likely to be too costly, but I am speaking more abstractly, trying to evaluate Easterbrook's point. The more I think about it, the more I think he is wrong.
Suppose we have four houses, valued by Zillow at $800K, $900K, $1.1M, and $1.2M respectively. Unbeknownst to Zillow, the high and low estimates are both off by $200K, as those two houses should each be valued at $1M. The $900K and $1.1M figures are accurate.
Well, they got the average value - $1M - right. Now suppose the owner of the $800K place objects, shows lots of internal improvements, maybe points out some other things that Zillow didn't consider, and convinces them that $1M is the right number (as we are assuming it is.) Now their average is wrong - it goes to $1.05M, but the overall accuracy has improved. They have 3/4 right, and the MSE has been cut in half.
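The four-house arithmetic above can be checked with a short Python sketch. The figures are the hypothetical ones from the comment, in thousands of dollars; the helper names `mean`, `mse`, and `share_exact` are illustrative and not anything Zillow is known to use.

```python
# Hypothetical four-house example from the comment, in thousands of dollars.
true_values = [1000, 900, 1100, 1000]
before = [800, 900, 1100, 1200]   # original estimates
after = [1000, 900, 1100, 1200]   # low estimate corrected to $1M

def mean(xs):
    return sum(xs) / len(xs)

def mse(estimates, truth):
    """Mean squared error between estimates and true values."""
    return mean([(e - t) ** 2 for e, t in zip(estimates, truth)])

def share_exact(estimates, truth):
    """Fraction of estimates that match the true value exactly."""
    return mean([1 if e == t else 0 for e, t in zip(estimates, truth)])

print("average:", mean(true_values), mean(before), mean(after))
print("MSE:", mse(before, true_values), mse(after, true_values))
print("share exact:", share_exact(before, true_values), share_exact(after, true_values))
```

On these numbers the average moves away from the true $1M to $1.05M while the MSE halves, which is the disagreement in a nutshell: the two measures of accuracy point in opposite directions.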
Again, I am speaking abstractly, but so is Easterbrook and, law and economics scholar or not, I think he's wrong.
Yes but the net overvalue of the houses in the data set has increased from $0 to $200K. If we assume that the changes/revisions will cluster on undervalued houses, we're going to have a drastic and continued increase in the net overvalue of houses the larger the data set. To avoid systematically overvaluing the housing market, you need the undervalued data to offset. The solution to the accuracy problem would be to change both the $800K and the $1.2M. But there's only a feedback mechanism for increasing the price of the $800K, not decreasing the $1.2M house. Also, if the $800K was just removed (rather than modified) your average goes up to about $1.067M, and the number of houses you have correct goes from 75% to 66.66%.
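The correction and removal scenarios in this reply can be tabulated with a small Python sketch. It reuses the thread's hypothetical four houses (values in thousands of dollars); the labels A to D and the `summarize` helper are assumptions made for illustration.

```python
# Hypothetical houses from the thread, in thousands of dollars.
truth = {"A": 1000, "B": 900, "C": 1100, "D": 1000}
estimates = {"A": 800, "B": 900, "C": 1100, "D": 1200}

def summarize(est):
    """Return (average estimate, net signed error, share exactly right)."""
    n = len(est)
    average = sum(est.values()) / n
    net_error = sum(est[k] - truth[k] for k in est)
    exact = sum(1 for k in est if est[k] == truth[k]) / n
    return average, net_error, exact

original = summarize(estimates)
corrected = summarize({**estimates, "A": 1000})                        # raise A to $1M
removed = summarize({k: v for k, v in estimates.items() if k != "A"})  # drop A entirely

print(original, corrected, removed)
```

Both changes push the net signed error from 0 to +200 (that is, $200K), and removal leaves an average of roughly $1.067M with two of three houses exactly right.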
If the purpose of Zillow is to have an exact market price for the most number of houses, your argument would make more sense. But I don't see that as what Zillow is intended to do. It appears that you and Easterbrook agree on the math, you just disagree about the definition of accuracy.
So the "net overvalue" goes up. Who cares? The "net undervalue" goes down, and the total overvalue stays the same. The MSE, which combines undervalue and overvalue, goes down. Overall error is reduced.
Yes. Easterbrook and I disagree, and I think he is wrong. At a minimum, he needs to be a lot more precise in his claims. That he wasn't suggests to me that he really doesn't understand what's going on and is doing a poor job of playing economist.
Look, if the goal is to get the correct average price across all houses then he is right. But I don't think that's the objective. It should, IMO, be to get as close as possible to the correct individual values. I think that's what users of Zillow are looking for. Getting the overall average right doesn't matter. Is Zillow's objective to measure the total value of the US housing stock, or to give guidance on the value of individual properties? If the latter, then he is just wrong.
Suppose you and I are asked to value two houses, both of which are in fact worth $1M. You come up with $1.1M and $1M. I come up with $1.1M and $900K. Do you really think that I did a better job than you, that my estimates are more accurate? That's ludicrous.
"So the "net overvalue" goes up. Who cares? The "net undervalue" goes down."
This is getting confusing. Originally there was no net overvalue or net undervalue. (So the net undervalue doesn't go down.) When you make the proposed change, the net value-error change goes from $0 to $200K. I said "overvalue" because it happened to bias the numbers up. With the original numbers, there's no overvalue of the market. When you remove the $800K, or increase $800K to $1M, you have an overvalue of the entire data set.
"I come up with $1.1M and $900K."
If we remove the $900K, my data set becomes less accurate. Your $1.1M and $1M estimate makes an assumption about the data set that is inconsistent with what we're analyzing. If the data set contains no low estimates, the problem Easterbrook is talking about, goes away. But so does the plaintiff.
Originally there was no net overvalue or net undervalue. (So the net undervalue doesn't go down.) When you make the proposed change, the net value-error change goes from $0 to $200K. I said "overvalue" because it happened to bias the numbers up. With the original numbers, there's no overvalue of the market. When you remove the $800K, or increase $800K to $1M, you have an overvalue of the entire data set.
So what? The issue is the appropriate measure of error. You, like Easterbrook, seem to think that what is all-important is getting the total value of the housing stock, or the average value of an individual house, right.
It's not all-important. It's not even very important. It might be, if Zillow were being used to make some types of decisions that depended on that value, but it's not primarily used that way, if at all.
If we remove the $900K, my data set becomes less accurate.
You mean mine, not yours, but the statement is only true for a not very useful definition of "accurate."
(Continued)
(Continuation)
Your $1.1M and $1M estimate makes an assumption about the data set that is inconsistent with what we're analyzing. If the data set contains no low estimates, the problem Easterbrook is talking about, goes away. But so does the plaintiff.
Again, you have the example backwards as to who estimated what. I gave you credit for the $1M and $1.1M estimates, and took the $900K and $1.1M for myself.
Regardless, what assumption does either pair of estimates make about the data set that is inconsistent with anything? In the example we are just two people, maybe real-estate brokers, asked to estimate the value of two houses. Anyway, I'm not talking about plaintiffs and lawsuits and manipulation. I'm talking about his views as to what would or would not make the data more accurate.
Suppose Zillow, on its own, discovered an error in its algorithm that caused some houses to be overvalued. Should they not make corrections to those estimates? If they do, should they arbitrarily increase the estimates for other homes to keep their total the same? That would be absurd, wouldn't it, yet that is what Easterbrook's claim amounts to.
You are in the position of arguing that, given two houses each worth $1M, someone who estimates their value at $900K and $1.1M is more accurate than someone who estimates $1M and $1.1M.
I think you should reconsider.
"That would be absurd, wouldn't it, yet that is what Easterbrook's claim amounts to."
I don't think you're accurately capturing Easterbrook's claim. He's operating on an assumption that the error rate for Zillow is approximately equal on both sides of the equation, or at least we don't have information to show otherwise. When you change that assumption (by saying there are actually more systematic errors on one side), presumably Easterbrook would agree that Zillow should address those.
"You are in the position of arguing that..."
if you allow only undervalued properties to challenge errors, the overall data set will bias upwards, in a way that won't increase the accuracy of the data set. That's true whether the $900K is modified upward, or removed. We're really arguing over the purpose of the algorithm. If I'm a home buyer, though, and the data set skews high (as your $1M and $1.1M set does, more so than the $900K and $1.1M set), home buyers are going to believe that the value of homes is higher than it is. That hurts them. The reason the algorithm works best at the data set level is because presumably it serves both home sellers and home buyers. A bias in the data set in either direction will harm sellers or buyers.
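A toy Python sketch of the one-sided correction scenario may help separate the two measures being argued over. It assumes, purely for illustration, that half the estimates are $100K too low and half $100K too high, and that only the low ones get corrected; `one_sided_correction` is a hypothetical helper, not a model of Zillow's data.

```python
def one_sided_correction(n_houses, error=100):
    """Compare (net signed error, MSE) before and after fixing only low estimates.

    Assumption: errors alternate -error, +error (in thousands of dollars).
    """
    signed_errors = [-error if i % 2 == 0 else error for i in range(n_houses)]
    # Only undervalued homes are corrected; overvalued ones are left alone.
    corrected = [0 if e < 0 else e for e in signed_errors]

    def stats(errs):
        net = sum(errs)
        mse = sum(e * e for e in errs) / len(errs)
        return net, mse

    return stats(signed_errors), stats(corrected)

for n in (4, 40, 400):
    print(n, one_sided_correction(n))
```

Under these assumptions the net overvaluation grows in proportion to the size of the data set, while the per-house mean squared error is cut in half, so which result counts as "more accurate" depends on which measure is chosen.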
I don't think you're accurately capturing Easterbrook's claim. He's operating on an assumption that the error rate for Zillow is approximately equal on both sides of the equation, or at least we don't have information to show otherwise.
Sorry, that's exactly what he's saying. And you too. It's not just an assumption. It's a statement that this equality is the most important measure of accuracy. You want the data set to be unbiased, at the cost of accuracy on individual houses. You prefer the set of estimates that has both houses off by $100K to the one that has one right and one off by that amount. That's exactly the import of what you are saying - that Zillow shouldn't correct an error. I think that's nonsense.
Suppose there are lots of houses worth $1M. Is it better to value half at $500K and half at $1.5M than to value most at $1M and a few at $1.1 or $1.2M? Of course not.
(Cont.)
(continuation)
if you allow only undervalued properties to challenge errors,
I'm not talking about challenges or lawsuits. I'm talking about changing estimates, in either direction when they are known to be wrong, as in my example of Zillow discovering an error on its own. I think you are too focused on this litigation, and not on the broader claim by Easterbrook.
If I'm a home buyer, though, and the data set skews high (as your $1M and $1.1M set does, more so than the $900K and $1.1M set), home buyers are going to believe that the value of homes is higher than it is.
There is no reason to think that correcting errors will predictably cause the values to get higher. In the example I could just as easily have used $900K and $1M as the estimates and made the same (correct) argument. Indeed, correcting errors should have the long-run effect of making the database more accurate even by your standard.
A bias in the data set in either direction will harm sellers or buyers.
An error in the data set will do more harm.
A low error may hurt the seller, but it may also hurt the buyer by causing him to waste time on a property he really can't afford. Similarly, a high estimate may cause a buyer to overpay, but it may also hurt the seller by discouraging some potential buyers from looking at the house. Even by your definition the unbiased pair of estimates hurts one buyer and one seller, while the biased one hurts only one buyer (maybe).
The thing is - there's no one way to do an estimate. There's no one set of data that everyone uses and no one procedure.
So you can show Zillow that *your* estimator got result Y using technique Z out of dataset X - even though that's different from what Zillow got using their stuff, that doesn't mean Zillow is wrong. Doesn't mean they're right either.
In order to measure the accuracy of the estimating algorithm, you have to have an objective standard. And that's pretty much impossible to do with real estate because it's so unstandardized.
You could compare the estimates to actual selling prices.
Right - so you look at the Zestimate and the final sales prices of other homes and see how accurate it is.
Zillow seems to think it's accurate enough. Patel doesn't. How accurate is accurate enough?
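Comparing estimates with later sale prices is how the court's opinion describes Zillow's own figures (a median error under 6%, and misses of more than 20% in about 15% of sales). A minimal Python sketch of that kind of check follows; the estimate and sale pairs are invented for illustration and are not Zillow data, and `pct_errors` is a hypothetical helper.

```python
from statistics import median

# Invented (estimate, later sale price) pairs, in thousands of dollars.
pairs = [(300, 310), (450, 440), (200, 260), (525, 500), (610, 600)]

def pct_errors(pairs):
    """Absolute error of each estimate as a fraction of the sale price."""
    return [abs(est - sale) / sale for est, sale in pairs]

errors = pct_errors(pairs)
median_error = median(errors)
share_over_20 = sum(e > 0.20 for e in errors) / len(errors)

print(f"median error: {median_error:.1%}")
print(f"share off by more than 20%: {share_over_20:.0%}")
```

Numbers like these describe how often the estimates miss and by how much, but they cannot by themselves say how accurate is accurate enough.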
Whether professional appraiser, Zillow, or any other method of divining the value of real estate, in the end it comes down to bona fide transactions.
I'm thinking of a number. If you write a number larger than that on a cashier's check, then the house is yours, and the value of the house is the mean of those two numbers. If your number is lower than my number, we don't yet know the value, as no sale has occurred.