PART II: MARKET MODELING REGRESSION AND PAIRED SALES
David Braun, MAI, SRA (a la mode labs Fellow)
In "Part I: Introduction to Market Modeling", I made a strong argument that there is really no difference in the mathematics of market modeling whether it is performed by a computer or by hand. Admittedly, the science of market modeling is moving in the direction of the computer, as it should. In conjunction with this series of articles, I made available a simple regression application designed for use by the real property appraiser. This was done to give appraisers some hands-on experience with regression analysis. After performing a few analyses with it, you will have a pretty good idea of what regression does well and where its limitations lie.
For over a decade the real property appraiser has been blessed with the promise of regression analysis tools that would allow a much more accurate appraisal to be performed in a fraction of the time, and cursed by the prophecies of automated valuation models (AVMs) that would replace him or her. While data is more prevalent than ever and computer science has evolved, neither of these scenarios has occurred to the degree predicted. Has the appraiser been duped? In theory, much of what the appraiser has been told is true. However, the rigors of the real world have kept the prophecies (theory) from becoming fact. One of the goals of this series of articles is to educate the appraiser on the practical issues of market modeling while avoiding the theoretical rhetoric about the nature of its mathematical methods. An array of well-written articles on the theoretical issues can be found by typing the word "Regression" into almost any web search engine.
This article is being originated in the a la mode labs, which by definition is an environment guided by trial and error, not theory. What "rigors" of the real world are dampening the successful implementation of the recognized mathematical methods of market modeling? Let's define the appraiser's goals and demands of market modeling tools.
The successful tool will do one or both of the following:
- Produce a more reliable (accurate) opinion of value without adding too much time to perform the appraisal.
- Shorten the time to perform an appraisal of equal quality.
Basically, the market of users of appraisals requires inexpensive and quick appraisals. There are exceptions to this, but the mass of appraisers do only a small fraction of their practice in this upper-fee arena. The long and short of why the use and success of modeling tools have been lagging, with the exception of the low-end AVMs for lenders, is that developers have not been designing modeling tools that meet one or both of the requirements presented above.
Product 1
The tool appraisers need to produce a more accurate value opinion would be one that develops justification and support for the adjustments made and for the final value opinion. Regression has been the "talked about" method for accomplishing this. It is my conclusion that regression might be able to achieve this goal, but only with a lot of sales data, knowledge of how best to present the data for the analysis, and an unacceptable increase in time. Those of you who have experimented with the regression application that I provided on the Lab's web site know that even when the model resulting from regression analysis produces a reasonable value opinion, the specific adjustments are often questionable and would not support an appraiser's adjustments.
Product 2
Regression can serve as a tool for forming a value opinion even if the individual market adjustments appear questionable, when the assignment requires low accuracy in the final value opinion. But in my experience, regression analysis does not appear to shorten the time it takes to perform an appraisal based on a modest to extensive scope of work. Examples of assignments for which a low level of scope of work is acceptable include appraisals done in conjunction with low-risk mortgage transactions, preliminary values (comp checks), and assistance with the review function.
Regression and paired sales techniques work great when there is accurate data, the independent variables (property characteristics) are not correlated with each other, and the market forces are constant. The following is an example of such a market. In this case we have defined the market forces as $100 per square foot of gross living area, $3,000 per rating point for car storage, $5,000 per rating point for location, and $4,000 per bedroom. Fifteen records (sales) have been created with sales prices based on exactly that model. For example, the sales price for Record 1 was found by the following calculation: (2,100 x $100) + (1 x $3,000) + (5 x $5,000) + (2 x $4,000) = $246,000.
PERFECT MARKET

| Record | SP | GLA | Car Stg | Location | BR |
|---|---|---|---|---|---|
| Rating | | | 1-3 | 1-5 | |
| Value/unit | | $100 | $3,000 | $5,000 | $4,000 |
| 1 | $246,000 | 2,100 | 1 | 5 | 2 |
| 2 | $214,000 | 1,800 | 2 | 4 | 2 |
| 3 | $197,000 | 1,650 | 3 | 3 | 2 |
| 4 | $215,000 | 1,900 | 1 | 2 | 3 |
| 5 | $243,000 | 2,200 | 2 | 1 | 3 |
| 6 | $266,000 | 2,200 | 3 | 5 | 3 |
| 7 | $304,000 | 2,650 | 1 | 4 | 4 |
| 8 | $357,000 | 3,200 | 2 | 3 | 4 |
| 9 | $215,000 | 1,800 | 3 | 2 | 4 |
| 10 | $210,300 | 1,903 | 1 | 1 | 3 |
| 11 | $256,400 | 2,134 | 2 | 5 | 3 |
| 12 | $581,000 | 5,400 | 3 | 4 | 3 |
| 13 | $207,700 | 1,817 | 1 | 3 | 2 |
| 14 | $288,300 | 2,643 | 2 | 2 | 2 |
| 15 | $247,500 | 2,255 | 3 | 1 | 2 |
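The pricing rule behind the "Perfect Market" can be written out as a few lines of Python. This is only an illustrative sketch: the names `UNIT_VALUES` and `model_price` are hypothetical and are not part of the regression application.

```python
# Unit values stated in the article for the "Perfect Market".
UNIT_VALUES = {"gla": 100, "car_stg": 3000, "location": 5000, "br": 4000}


def model_price(gla, car_stg, location, br):
    """Sale price implied by the constant market forces (no intercept, no noise)."""
    return (
        gla * UNIT_VALUES["gla"]
        + car_stg * UNIT_VALUES["car_stg"]
        + location * UNIT_VALUES["location"]
        + br * UNIT_VALUES["br"]
    )


# Record 1: 2,100 SqFt, car storage rating 1, location rating 5, 2 bedrooms
print(model_price(gla=2100, car_stg=1, location=5, br=2))  # 246000
```

Running the same function over the other fourteen records reproduces each SP in the table, which is what makes this market "perfect".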
The multiple linear regression analysis of this data returns a market model that perfectly matches the market created above. The regression model returned follows:
| GLA | Car Stg | Location | BR | Intercept | R2 | Confidence |
|---|---|---|---|---|---|---|
| $100 | $3,000 | $5,000 | $4,000 | $0 | 1.000 | 1.000 |
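For readers who want to see the fit outside the lab's application, the following is a minimal ordinary least squares sketch in plain Python. It assumes the normal-equations approach solved by Gaussian elimination, which may differ from whatever the regression application does internally; the names `SALES`, `solve`, and `fit_ols` are hypothetical.

```python
# Perfect Market records: (SP, GLA, car storage rating, location rating, bedrooms)
SALES = [
    (246000, 2100, 1, 5, 2), (214000, 1800, 2, 4, 2), (197000, 1650, 3, 3, 2),
    (215000, 1900, 1, 2, 3), (243000, 2200, 2, 1, 3), (266000, 2200, 3, 5, 3),
    (304000, 2650, 1, 4, 4), (357000, 3200, 2, 3, 4), (215000, 1800, 3, 2, 4),
    (210300, 1903, 1, 1, 3), (256400, 2134, 2, 5, 3), (581000, 5400, 3, 4, 3),
    (207700, 1817, 1, 3, 2), (288300, 2643, 2, 2, 2), (247500, 2255, 3, 1, 2),
]


def solve(a, b):
    """Solve the square system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, intercept=True):
    """Least-squares coefficients for price on the characteristics (normal equations)."""
    xs = [list(r[1:]) + ([1.0] if intercept else []) for r in rows]
    ys = [r[0] for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


coefs = fit_ols(SALES)
for name, value in zip(["GLA", "Car Stg", "Location", "BR", "Intercept"], coefs):
    print(f"{name:>9}: {value:,.2f}")
```

Up to floating-point rounding, the printed coefficients should match the unit values used to build the market. Passing `intercept=False` forces the fit through zero, the same choice discussed for the later models.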
Why do techniques such as regression and paired sales have such a hard time interpreting market behavior in a real market? Put another way, why is it so hard to extract adjustments from the market using these techniques? There are at least three reasons for this:
- These methods attempt to measure market activity with no knowledge of the causation of that activity.
- There is a great deal of correlation among the property characteristics often used as the independent variables.
- Real property markets are imperfect.
While the appraiser understands that a plant closing caused a slowdown in the market leading to a loss in value, the mathematical methods recognize only a decrease in value. They do not have enough data on which to base a prediction of the real property cycle that will result from the plant closing. The appraiser may also lack good data, but will isolate the cause and effect and analyze it with common sense and other information not available to the mathematical analysis.
Real property characteristics are highly correlated with one another. The number of bathrooms is typically related to the number of bedrooms, which is typically related to the overall gross living area. Countless examples of correlation can be cited in real property.
The human nature of buyers and sellers results in imperfect markets. This means that, for numerous reasons, a property with a real or intrinsic value of $200,000 will sometimes sell for more or less than that amount.
Example 1
A developer/builder develops a residential subdivision and builds three models of residential units.
| Model | GLA | BR's | Baths | Car Stg | Kitchen | List |
|---|---|---|---|---|---|---|
| A | 1,200 | 2 | 1.5 | Sgl Carport | Standard | $150,000 |
| B | 1,500 | 2 | 2 | Dbl Carport | Standard | $202,500 |
| C | 1,800 | 3 | 2 | Dbl Garage | Deluxe | $270,000 |
In the first month a couple of each model sold at the asking price. Then the developer had some medical problems that resulted in a personal cash drain, causing the next several to sell considerably below the asking price. As the developer's health returned, a new sales firm was employed and the next group sold considerably above the original asking price. All of this occurred within a 6-month period. Keep in mind that the closing dates were not the same as the off-market dates. In addition, the original sales firm made multiple errors in the listing sheets, including understating the gross living areas and listing some of the bedroom and bath counts incorrectly. The new sales firm listed this information more accurately on the listings they initiated.
This example is not atypical of an actual real property sub-market and has some elements of causation, correlation, and market inconsistency. Can you identify them? Can you see how these issues could play havoc with a purely mathematical analysis such as regression or paired sales?
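The correlation element can be put into numbers even with only the three models in Example 1. The sketch below computes the Pearson correlation coefficient by hand. The names `MODELS` and `pearson` are hypothetical, and three data points are far too few to draw statistical conclusions, so this only illustrates the idea.

```python
from math import sqrt

# Characteristics of the three models from Example 1.
MODELS = {
    "A": {"gla": 1200, "br": 2, "baths": 1.5, "list": 150000},
    "B": {"gla": 1500, "br": 2, "baths": 2.0, "list": 202500},
    "C": {"gla": 1800, "br": 3, "baths": 2.0, "list": 270000},
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


gla = [m["gla"] for m in MODELS.values()]
br = [m["br"] for m in MODELS.values()]
baths = [m["baths"] for m in MODELS.values()]

print(f"GLA vs bedrooms: {pearson(gla, br):.3f}")   # 0.866
print(f"GLA vs baths:    {pearson(gla, baths):.3f}")  # 0.866

# List price per square foot rises with size as the finish level improves.
for name, m in MODELS.items():
    print(name, m["list"] / m["gla"])  # A 125.0, B 135.0, C 150.0
```

Because size, room counts, car storage, and kitchen quality all step up together from Model A to Model C, a purely mathematical method has little basis for deciding which characteristic is responsible for the price difference.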
Market Model 2 uses the same data as the "Perfect Market" presented earlier, except that the value of the square footage is not constant. The sale prices below reflect a rate of $100.00 per SqFt for the first 1,000 SqFt and $125.00 for each SqFt above 1,000.

MARKET MODEL 2

| Record | SP | GLA | Car Stg | Location | BR |
|---|---|---|---|---|---|
| Rating | | | 1-3 | 1-5 | |
| Value/unit | | * | $3,000 | $5,000 | $4,000 |
| 1 | $273,500 | 2,100 | 1 | 5 | 2 |
| 2 | $234,000 | 1,800 | 2 | 4 | 2 |
| 3 | $213,250 | 1,650 | 3 | 3 | 2 |
| 4 | $237,500 | 1,900 | 1 | 2 | 3 |
| 5 | $273,000 | 2,200 | 2 | 1 | 3 |
| 6 | $296,000 | 2,200 | 3 | 5 | 3 |
| 7 | $345,250 | 2,650 | 1 | 4 | 4 |
| 8 | $412,000 | 3,200 | 2 | 3 | 4 |
| 9 | $235,000 | 1,800 | 3 | 2 | 4 |
| 10 | $232,875 | 1,903 | 1 | 1 | 3 |
| 11 | $284,750 | 2,134 | 2 | 5 | 3 |
| 12 | $691,000 | 5,400 | 3 | 4 | 3 |
| 13 | $228,125 | 1,817 | 1 | 3 | 2 |
| 14 | $329,375 | 2,643 | 2 | 2 | 2 |
| 15 | $278,875 | 2,255 | 3 | 1 | 2 |

* GLA rate: $100 per SqFt up to and including 1,000 SqFt; $125 per SqFt above 1,000 SqFt.

The outputs of the model for the independent variables are listed below. Note that the output is different when the intercept is set to "0". It may be best not to set the intercept to zero. If the output for the intercept is very large, this is a clue that something is probably not right and the data should be checked.

| GLA | Car Stg | Location | BR | Intercept | R2 | Conf |
|---|---|---|---|---|---|---|
| $125 | $3,000 | $5,000 | $4,000 | -$25,000 | 1.000 | 1.000 |
| $124 | $362 | $3,660 | -$555 | $0 | 0.977 | 0.999 |
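The -$25,000 intercept in the unconstrained output has a simple arithmetic explanation. The sketch below assumes a two-tier living-area rate of $100 per SqFt for the first 1,000 SqFt and $125 per SqFt above that, which is the arrangement that reproduces the tabulated Market Model 2 sale prices. The function names are hypothetical.

```python
def tiered_gla_value(gla, threshold=1000, low_rate=100, high_rate=125):
    """Living-area value with one rate up to the threshold and another above it."""
    return min(gla, threshold) * low_rate + max(gla - threshold, 0) * high_rate


def tiered_price(gla, car_stg, location, br):
    """Sale price under the assumed two-tier living-area rate."""
    return tiered_gla_value(gla) + 3000 * car_stg + 5000 * location + 4000 * br


def fitted_price(gla, car_stg, location, br):
    """Straight-line model from the unconstrained regression output."""
    return 125 * gla + 3000 * car_stg + 5000 * location + 4000 * br - 25000


# (SP, GLA, car storage, location, bedrooms) for Records 1, 2 and 12
SAMPLE = [
    (273500, 2100, 1, 5, 2),
    (234000, 1800, 2, 4, 2),
    (691000, 5400, 3, 4, 3),
]
for sp, *traits in SAMPLE:
    print(sp, tiered_price(*traits), fitted_price(*traits))

# Below the threshold the straight line no longer describes the market.
print(tiered_price(800, 1, 1, 1), fitted_price(800, 1, 1, 1))  # 92000 87000
```

Every sale in the data set is larger than 1,000 SqFt, so 100 x 1,000 + 125 x (GLA - 1,000) is identical to 125 x GLA - 25,000 for all fifteen records. The regression therefore fits perfectly with a negative intercept, yet the straight line would misprice a house smaller than any in the sample.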
Market Model 3 uses the same data as the "Perfect Market" presented earlier, except that some market randomness has been added. A column was added and the Excel function RANDBETWEEN was used to generate random whole numbers. The limits were set as -8 to +8 to simulate a market that has a typical variation of ±8%. This percentage was converted to a dollar amount by multiplying the percent variation by the sales price. Market noise shows up in the sales prices, so the noise is added to the sales prices.
MARKET MODEL 3 (MM 1 + Noise)

| Record | SP | GLA | Car Stg | Location | BR | Noise (%) | Noise ($) |
|---|---|---|---|---|---|---|---|
| Rating | | | 1-3 | 1-5 | | -8% to +8% | |
| Value/unit | | $100 | $3,000 | $5,000 | $4,000 | | |
| 1 | $244,020 | 2,100 | 2 | 5 | 2 | -2 | -$4,880 |
| 2 | $184,450 | 1,800 | 3 | 4 | 2 | -15 | -$27,668 |
| 3 | $174,000 | 1,650 | 4 | 3 | 2 | -13 | -$22,620 |
| 4 | $187,480 | 1,900 | 2 | 2 | 3 | -14 | -$26,247 |
| 5 | $243,540 | 2,200 | 3 | 1 | 3 | -1 | -$2,435 |
| 6 | $252,860 | 2,200 | 4 | 5 | 3 | -6 | -$15,172 |
| 7 | $270,160 | 2,650 | 2 | 4 | 4 | -12 | -$32,419 |
| 8 | $327,600 | 3,200 | 3 | 3 | 4 | -9 | -$29,484 |
| 9 | $185,300 | 1,800 | 4 | 2 | 4 | -15 | -$27,795 |
| 10 | $194,103 | 1,903 | 2 | 1 | 3 | -9 | -$17,469 |
| 11 | $254,212 | 2,134 | 3 | 5 | 3 | -2 | -$5,084 |
| 12 | $543,120 | 5,400 | 4 | 4 | 3 | -7 | -$38,018 |
| 13 | $210,700 | 1,817 | 2 | 3 | 2 | 0 | $0 |
| 14 | $291,300 | 2,643 | 3 | 2 | 2 | 0 | $0 |
| 15 | $240,480 | 2,255 | 4 | 1 | 2 | -4 | -$9,619 |
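The noise step described for Market Model 3 can be imitated outside Excel. The sketch below uses Python's `random.randint`, which, like RANDBETWEEN, returns a whole number between two inclusive limits. The function name `add_market_noise`, the fixed seed, and the rounding to whole dollars are assumptions made for illustration, and the draws will not match the table above.

```python
import random


def add_market_noise(price, rng, limit_pct=8):
    """Return (percent drawn, noisy price) for a whole-percent draw within +/- limit_pct."""
    pct = rng.randint(-limit_pct, limit_pct)  # inclusive on both ends
    return pct, round(price * (1 + pct / 100))


rng = random.Random(2024)  # fixed seed (an assumption) so a run can be repeated

# Noise-free model prices for three example properties
for base in [246000, 214000, 197000]:
    pct, noisy = add_market_noise(base, rng)
    print(f"base {base:,}  noise {pct:+d}%  sale price {noisy:,}")
```

Unlike a spreadsheet cell that recalculates on every change, a seeded generator returns the same draws on every run, which makes it easier to compare regression outputs between sessions.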
There is a chance that the random numbers changed before I copied the results in, so, do not be surprised if you get slightly different outputs.
| GLA | Car Stg | Location | BR | Intercept | R2 | Conf |
|---|---|---|---|---|---|---|
| $85 | $4,205 | $2,568 | -$1,555 | $30,504 | 0.975 | 0.928 |
| $88 | $5,488 | $5,885 | $3,405 | $0 | 0.997 | 0.926 |
You can see that the model's output varied as the market data became more challenging. Note that when the intercept was not forced through zero, it returned $30,504. Does this mean it would be inappropriate to rely on the adjustment rate of $85.00 per square foot on the sales grid? Does it matter that more property characteristics were used on the sales grid than were considered in the regression analysis when using the adjustment for square feet extracted by regression analysis? Appraisers have a lot of questions to get answered.
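One way to start exploring these questions is to apply both sets of Market Model 3 coefficients to the same property and compare the results. The sketch below uses the rounded coefficients from the output table and the characteristics of Record 13, chosen because the table shows no noise for that record. The names `FREE_INTERCEPT`, `ZERO_INTERCEPT`, and `predict` are hypothetical.

```python
# Rounded coefficients from the Market Model 3 output table:
# (GLA, car storage, location, bedrooms, intercept)
FREE_INTERCEPT = (85, 4205, 2568, -1555, 30504)
ZERO_INTERCEPT = (88, 5488, 5885, 3405, 0)


def predict(coefs, gla, car_stg, location, br):
    """Value indicated by a fitted model for one property."""
    b_gla, b_car, b_loc, b_br, intercept = coefs
    return b_gla * gla + b_car * car_stg + b_loc * location + b_br * br + intercept


# Record 13 in Market Model 3: SP $210,700, 1,817 SqFt, car 2, location 3, 2 bedrooms
record_13 = dict(gla=1817, car_stg=2, location=3, br=2)

print(predict(FREE_INTERCEPT, **record_13))  # 197953
print(predict(ZERO_INTERCEPT, **record_13))  # 195337
```

The two models indicate values within a few thousand dollars of each other for this property, even though their individual adjustment rates, bedrooms in particular, differ sharply. The overall value indication is more stable than the individual adjustments.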
CONCLUSION
Download the regression application and replicate these examples. Investigate some of the questions I have posed and let the labs know what you found out.
In Part III (the final part) we will discuss what specific tools and applications are available to help appraisers produce a more reliable (accurate) opinion of value and/or shorten the time to perform an appraisal of equal quality.




