Research proposal: causal relationship between vertical greening and house prices

This essay proposes a fixed effects regression design to investigate the causal relationship between vertical greening and house prices in Singapore, and evaluates the strengths and limitations of this causal inference technique.

This is a research design proposal written as part of the assignment requirements for the POLS0013: Causal Analysis module at UCL. I will outline how this research may be carried out, how to obtain the required data, a justification of my proposed methodology, and an overview of the potential limitations of this proposal.

Comprising vertical plantscapes (Fig 1, 2), vertical greening is a novel urban design tool with transformative potential for the urban landscape. Beyond environmental and health benefits, vertical greening also enhances the aesthetic of buildings and remarkably improves the visual amenity of a city (Timur & Karaca, 2013). However, vertical greening can also contribute to the vertical secession of elites (Graham & Hewitt, 2013) by increasing property values, causing a perpendicular splintering of urban residents through differentiated property prices.

Fig 1. Illustration of vertical greenery (NParks, 2020)
Fig 2. Vertically greened building in Singapore (NParks, 2019)

This research investigates whether vertical greening increases property prices of the greened building, using a fixed-effects regression design for an observational study. The study will be conducted in Singapore, a pioneer of vertical greening due to its land scarcity and simultaneous focus on green space conservation and creation, culminating in widespread adoption of vertical greenery. The findings will provide a basis to incentivise vertical greening in other urban areas.

Methodology

Repeated observations of property prices for each building over time will be collected. Although the time dimension requires more data collection, the trade-off is that data on time-invariant covariates need not be collected.

Data on vertically greened buildings can be obtained from data.gov.sg. Data on the controls, which are all other buildings in Singapore, can be extracted from openstreetmap.org. The temporal dimension, year of greening, is obtainable from either NParks or the Building & Construction Authority. These are public agencies in Singapore with well-maintained, reliable datasets. Table 8 shows a sample.

Property transaction data dated from 2000 to 2020 will be scraped from real estate websites. The web scraping will focus on gathering unstructured, temporal price data (Table 9) from a variety of local real estate resources, such as 99.co.

The two datasets will be joined in Table 10, which will subsequently be used to fit the models. For buildings with multiple property sales in a given year, the mean price per square foot (psf) will be calculated. A new variable, years since greening, will be created.
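The join and aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the actual pipeline: the field names (building_id, year, psf), the toy records, and the convention that a building counts as treated from its greening year onwards are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scraped transactions, one row per sale (field names are assumptions).
transactions = [
    {"building_id": "A", "year": 2010, "psf": 1000.0},
    {"building_id": "A", "year": 2010, "psf": 1100.0},
    {"building_id": "A", "year": 2012, "psf": 1300.0},
    {"building_id": "B", "year": 2010, "psf": 900.0},
]

# Hypothetical greening records: year of greening per building (None = never greened).
greened_year = {"A": 2011, "B": None}


def build_panel(transactions, greened_year):
    """Collapse sales to one mean psf per building-year and add treatment variables."""
    sales = defaultdict(list)
    for row in transactions:
        sales[(row["building_id"], row["year"])].append(row["psf"])

    panel = []
    for (building, year), prices in sorted(sales.items()):
        g_year = greened_year.get(building)
        # Assumption: a building is treated from its greening year onwards.
        treated = g_year is not None and year >= g_year
        panel.append({
            "building_id": building,
            "year": year,
            "mean_psf": mean(prices),
            "green": int(treated),
            # Years since greening; None for never-greened buildings.
            "years_since_greening": None if g_year is None else year - g_year,
        })
    return panel


panel = build_panel(transactions, greened_year)
```

Each element of panel is one building-year observation, which is the unit of analysis the models below would be fitted on.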

Analysis

The fixed-effects design generalises difference-in-differences to panel data with multiple groups and time periods. Since buildings are nested within neighbourhoods, the inclusion of fixed-effects also controls for cluster-level components that may cause correlation between buildings. Using each individual unit as its own control (Allison, 2009) enables the fixed-effects design to ‘purge the estimating equation of all characteristics, measured or unmeasured, that are constant over time or constant within groups’ (Treiman, 2009: 363). Therefore, fixed-effects regression accounts for both observed and unobserved time-invariant confounders by focusing on within-unit changes.

This design is suitable when faced with the possibility of time-invariant unobserved heterogeneity concerning selection into treatment. For example, buildings with higher priced units would be able to self-select into vertical greening by virtue of having wealthier owners or property managers. Furthermore, the fixed-effects estimation is not biased by time-invariant building-specific heterogeneity. The effect of time-invariant covariates cannot be estimated, which aids in estimating causal effects by significantly reducing the number of potential confounders to be measured, such as building height. Finally, although reverse causality cannot be tested for in this model specification, there is little to no possibility of it here, which makes fixed-effects appropriate. The direction of the causal effect of vertical greenery on property prices is theoretically clear, and higher-priced individual properties are unlikely to have any causal effect on the installation of vertical greenery.

The fixed-effects estimator performs a pure within-individual comparison: mean property prices of a building are compared before and after vertical greening. This within-individual comparison is not biased by any unobserved heterogeneity at the individual level. Hence, the fixed-effects estimation infers the causal effect by comparing the within-building change that is ‘induced by a treatment event’ (Brüderl & Ludwig, 2015: 331).

A fixed-effects model will be specified for a single building i in year t:

Y_it = y_i + α_t + δ · GREEN_it + ε_it

· Y_it denotes the observed outcome of property price of building i at time t
· y_i is the fixed-effect for groups (dummy for each building, i, for k buildings)
· α_t is the fixed-effect for time periods (dummy for each year, t)
· δ is the difference-in-differences estimate based on GREEN_it (1 for treated unit-period observations and 0 otherwise)
· ε_it is the idiosyncratic error term

The building effects are the coefficients on the building dummies, y_i. The time effects, α_t, are the coefficients on the year dummies. This analysis uses a dataset with k buildings and 20 years in a building-year panel format. The coefficient δ will identify the average treatment effect on the treated (ATT), that is, the effect on vertically greened buildings. The value of δ will be interpreted and its statistical significance stated.
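To make the estimator concrete, here is a minimal sketch in Python. It assumes a balanced panel, where including building and year dummies is equivalent to regressing the double-demeaned outcome on the double-demeaned treatment indicator. The toy panel, the function names and the effect size of 50 are invented for illustration and are not results.

```python
def twoway_demean(values):
    """Double-demean a balanced panel stored as values[building][year]."""
    n_units, n_periods = len(values), len(values[0])
    grand = sum(map(sum, values)) / (n_units * n_periods)
    unit_means = [sum(row) / n_periods for row in values]
    time_means = [sum(col) / n_units for col in zip(*values)]
    return [
        [v - unit_means[i] - time_means[t] + grand for t, v in enumerate(row)]
        for i, row in enumerate(values)
    ]


def twfe_delta(y, green):
    """Two-way fixed-effects estimate of delta for a balanced panel."""
    y_dm, g_dm = twoway_demean(y), twoway_demean(green)
    num = sum(g * v for g_row, y_row in zip(g_dm, y_dm) for g, v in zip(g_row, y_row))
    den = sum(g * g for g_row in g_dm for g in g_row)
    return num / den


# Toy balanced panel (hypothetical): 3 buildings, 4 years, staggered greening.
green = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]]
building_effect = [1000.0, 1200.0, 900.0]   # plays the role of y_i
year_effect = [0.0, 20.0, 40.0, 60.0]       # plays the role of alpha_t
true_delta = 50.0                           # assumed effect, for illustration only
y = [
    [building_effect[i] + year_effect[t] + true_delta * green[i][t] for t in range(4)]
    for i in range(3)
]

delta_hat = twfe_delta(y, green)  # recovers true_delta up to floating-point error
```

Because the toy outcome is built without noise and with a homogeneous effect, the estimate coincides with the assumed effect. With real transaction data the estimate would carry sampling error, and an unbalanced panel would require the dummy-variable regression itself rather than this shortcut.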

However, given that time-series data typically exhibit some form of autocorrelation or heteroskedasticity (Zeileis, 2004), the errors may be serially correlated. Consequently, the conventional standard errors may be too optimistic (too small), increasing the likelihood of a Type I error. To mitigate this, vcovCL() in R will be used to report cluster-robust standard errors at the group level. A statistically significant δ indicates that a difference in the treatment of vertical greening between these two periods of time is associated with a within-building difference in property prices. This estimate posits a more credible argument for causality, relative to ‘static comparisons between [units] with different levels of the focal dependent independent variables (between variation) by differencing out all unobserved confounders as long as they are constant over the observation period’ (Krug & Fuchs, 2019: 9).
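The idea behind clustering standard errors at the building level can be sketched in a few lines of Python. This is an illustration of the basic sandwich formula for a single regressor, not a reimplementation of vcovCL(): it omits the finite-sample adjustments that library implementations typically apply, and the function name and toy numbers are assumptions chosen so the result can be checked by hand.

```python
import math


def delta_and_cluster_se(y_tilde, g_tilde):
    """OLS slope of y_tilde on g_tilde with a building-clustered standard error.

    Inputs are lists indexed [building][year]; in practice they would be the
    within-transformed outcome and treatment indicator.
    """
    sxx = sum(g * g for row in g_tilde for g in row)
    sxy = sum(g * v for g_row, y_row in zip(g_tilde, y_tilde) for g, v in zip(g_row, y_row))
    delta = sxy / sxx

    # Sum the score contributions within each building before squaring, so that
    # arbitrary correlation of errors within a building is allowed for.
    meat = 0.0
    for g_row, y_row in zip(g_tilde, y_tilde):
        score = sum(g * (v - delta * g) for g, v in zip(g_row, y_row))
        meat += score ** 2

    return delta, math.sqrt(meat) / sxx


# Toy inputs (hypothetical, not real within-transformed data).
g_tilde = [[1.0, -1.0], [-1.0, 1.0]]
y_tilde = [[2.5, -2.5], [-1.5, 1.5]]
delta_hat, se_cluster = delta_and_cluster_se(y_tilde, g_tilde)
```

The key design choice is that residuals are aggregated per building before being squared, which is what distinguishes a clustered estimator from one that treats every building-year observation as independent.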

The time fixed-effect controls for national-level shocks, common across all buildings, that affect property values. Relevant economic shocks would include any global financial crises, changes in GDP, income growth rates, employment rates, and inflation rates. Specifically for Singapore, the government is able to intervene heavily in the real estate market. These shocks would therefore also include any nationwide policy enacted as property cooling measures.

A second fixed-effects model will be specified to investigate the time path of the causal effect, extending the first model with lead and lag dummies for the timing of greening.

This model doubles as a robustness check for lags and leads. Five years-after-greening dummies and one year-before dummy are added. The coefficient on year-before, δ_6, should be 0 if there is no anticipation effect. Checking the coefficients associated with years-after-greening allows us to identify the causal effect’s time path in a differentiated way (Brüderl & Ludwig, 2015). A time path of vertical greening will be plotted to show the change in property price over years since vertical greening (Fig 3).
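One plausible way to write this specification, using the same symbols as the first model, is given below. The dummy names AFTER and BEFORE are assumptions introduced here for illustration: AFTER^k_it equals 1 when building i is in its k-th year after greening at time t, and BEFORE_it equals 1 in the year immediately before greening.

```latex
Y_{it} = y_i + \alpha_t + \sum_{k=1}^{5} \delta_k \, \mathrm{AFTER}^{k}_{it} + \delta_6 \, \mathrm{BEFORE}_{it} + \varepsilon_{it}
```

Under this form, δ_1 to δ_5 trace the effect over the five years after greening, and δ_6 captures any anticipation effect.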

Fig 3. Sample time path plot (edited from source: Brüderl & Ludwig, 2015)

This model removes heterogeneity across groups by controlling for unmeasured or unobservable confounders, and focusing on within-unit variation in property prices to calculate the causal effect. The building fixed-effect controls for any time-invariant differences across buildings. As buildings vary in geographical location, physical, environmental and socioeconomic attributes will also vary spatially, which will affect property valuation. These attributes include crime rates, socioeconomic strata of residents, etc.

The validity of this research design relies on a few modelling assumptions. Firstly, the parallel trends assumption posits that the important unmeasured confounders are either time-invariant individual factors, or time-varying factors that affect all buildings equally. In the absence of treatment, treated and control buildings are expected to follow similar property price trends over time. Property prices are a function of macro factors that vary nationally such as the economy, population growth, supply and demand, and government policies. Other driving factors vary at a higher level of spatial aggregation than the building level: proximity to the Central Business District, presence of transport nodes, etc. Thus, within a neighbourhood, property prices of treated and control buildings tend to follow similar trends over time. The parallel trends assumption is likely to hold.

To empirically assess the parallel trends assumption, a t-test of the difference in house prices between treatment and control groups during the pre-treatment period can be conducted. A statistically insignificant test statistic is consistent with the parallel trends assumption (though it could also reflect low power of the test). A graph of the levels of the two groups over the pre-treatment period will be plotted to allow visual inspection of whether the two groups are similar ex ante in levels and in distribution (Fig 4).
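A pre-treatment comparison of this kind could be sketched as below. The choice of Welch's unequal-variance form of the t-test is an assumption, as are the function name and the toy pre-treatment prices; none of the numbers are results.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical pre-treatment mean psf values for the two groups.
treated_pre = [1010.0, 1030.0, 1050.0, 1075.0]
control_pre = [1000.0, 1020.0, 1045.0, 1060.0]
t_stat, dof = welch_t(treated_pre, control_pre)
```

The same function could be applied to year-on-year price changes instead of levels, which would speak more directly to whether the two groups share a trend before treatment.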

Fig 4. Sample plot showing parallel pre-treatment trends (edited, source: StackExchange)

Secondly, while this fixed-effects model is able to limit the sources of bias to time-varying confounders, unobserved time-varying heterogeneity may still occur, violating the assumption of temporal homogeneity: that no factors change over time apart from treatment and national-level attributes (Collischon & Eberl, 2020). These time-varying omitted variables cannot be accounted for within the fixed-effects model. If implemented for only a subset of properties, property cooling measures are likely to affect buildings differentially. For example, the introduction of the Minimum Occupation Period (MOP) for public flats in the 1990s, wherein flat owners can sell their properties only after meeting the MOP requirement, may have affected the selling price of public flats during the MOP. Such a nationwide policy may therefore be an individual-varying and time-varying shock. Fixed-effects estimates are unbiased only if treated and untreated units are the same, on average, with respect to the trend in the outcome over time (Vaisey and Miles, 2017). To account for this and to control for building-specific trends, interaction effects can be included in a fixed-effects individual slope model to detrend the data (Brüderl & Ludwig, 2015: 337).
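The detrending idea behind an individual slope model can be sketched as follows. This is a simplified illustration, assuming a linear building-specific trend and omitting the year fixed-effects for brevity: each building's outcome and treatment series are residualised on an intercept and a time index, and the residuals are then pooled. The function names and toy data are hypothetical.

```python
def detrend(series):
    """Residuals from an OLS fit of a series on an intercept and a linear time index."""
    n = len(series)
    t_bar = (n - 1) / 2
    s_bar = sum(series) / n
    stt = sum((t - t_bar) ** 2 for t in range(n))
    slope = sum((t - t_bar) * (s - s_bar) for t, s in enumerate(series)) / stt
    return [s - s_bar - slope * (t - t_bar) for t, s in enumerate(series)]


def feis_delta(y, green):
    """Pooled slope of detrended outcome on detrended treatment, building by building."""
    num = den = 0.0
    for y_i, g_i in zip(y, green):
        y_res, g_res = detrend(y_i), detrend(g_i)
        num += sum(g * v for g, v in zip(g_res, y_res))
        den += sum(g * g for g in g_res)
    return num / den


# Toy panel (hypothetical): each building has its own level and its own linear trend.
green = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]]
levels, slopes, true_delta = [1000.0, 900.0, 1100.0], [10.0, 30.0, 5.0], 50.0
y = [
    [levels[i] + slopes[i] * t + true_delta * green[i][t] for t in range(6)]
    for i in range(3)
]

delta_hat = feis_delta(y, green)  # recovers true_delta despite the differing trends
```

In this noise-free toy example the building-specific trends would bias a plain before-and-after comparison, whereas the detrended estimate recovers the assumed effect.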

The ability of the fixed-effects model to adjust for unobserved unit-specific and time-specific confounders ‘critically’ depends on ‘the assumption of linear additive effects’ (Imai and Kim, 2020: 1). This can be checked visually by plotting a graph between these variables. Another threat to inference would be a possible spillover effect. Vertical greening of one building may affect the property prices of adjacent buildings through the visual amenity that greening provides to the surrounding landscape, thus biasing the estimate of the ATT (most plausibly downwards, since prices in nearby control buildings would also rise). Additionally, compound treatments may have been implemented during vertical greening, which may threaten the validity of the causal inference. The decision to select into treatment may not be made in a vacuum and may be accompanied by other upgrades and renovations, which will have an effect on property prices. As such, it may not be possible to isolate the causal effect of vertical greening completely. A possible solution would be to include these additional upgrades as covariates. A last potential limitation of note is the OpenStreetMap data source, which is a community-maintained map and may contain outdated or inaccurate data.

References

Brüderl, J., & Ludwig, V. (2015). Fixed-Effects Panel Regression. In H. Best & C. Wolf (Eds.), The SAGE Handbook of Regression Analysis and Causal Inference (pp. 327–358). SAGE.

Collischon, M., & Eberl, A. (2020). Let’s Talk About Fixed Effects: Let’s Talk About All the Good Things and the Bad Things. KZfSS Kölner Zeitschrift Für Soziologie Und Sozialpsychologie, 72(2), 289–299. https://doi.org/10.1007/s11577-020-00699-8

Treiman, D. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. San Francisco, CA: Jossey-Bass.

Graham, S., & Hewitt, L. (2013). Getting off the ground: On the politics of urban verticality. Progress in Human Geography, 37(1), 72–92. https://doi.org/10.1177/0309132512443147

Glennon, D., Kiefer, H., & Mayock, T. (2018). Measurement error in residential property valuation: An application of forecast combination. Journal of Housing Economics, 41, 1–29. https://doi.org/10.1016/j.jhe.2018.02.002

Imai, K., & Kim, I. S. (2020). On the Use of Two-Way Fixed Effects Regression Models for Causal Inference with Panel Data. Political Analysis, 1–11. https://doi.org/10.1017/pan.2020.33

Hill, T. D., Davis, A. P., Roos, J. M., & French, M. T. (2020). Limitations of Fixed-Effects Models for Panel Data. Retrieved 8 January 2021, from https://journals.sagepub.com/doi/full/10.1177/0731121419863785

Lousdal, M. L. (2018). An introduction to instrumental variable assumptions, validation and estimation. Emerging Themes in Epidemiology, 15. https://doi.org/10.1186/s12982-018-0069-7

NParks (2019). ‘Greenery That Defies Gravity’ in Gardening, 41, 2. https://www.nparks.gov.sg/nparksbuzz/issue-41-vol-2-2019/gardening/greenery-that-defies-gravity

NParks (2020). ‘Types of vertical greenery systems’ https://www.nparks.gov.sg/skyrisegreenery/explore/vertical-greenery

Timur, Ö., & Karaca, E. (2013). Vertical Gardens. https://doi.org/10.5772/55763

Vertical Greenery — Moving Upwards. (n.d.). National Parks Board. Retrieved 4 January 2021, from /skyrisegreenery/explore/vertical-greenery

Zeileis, A. (2004). Econometric Computing with HC and HAC Covariance Matrix Estimators. Journal of Statistical Software, 11(10).

Geographic data scientist and undergraduate at UCL. https://www.linkedin.com/in/zenn-wong/