Wolfram Summer School

Alumni

Natasha Dada

Technology and Innovation

Class of 2018

Bio

Natasha Dada just finished her freshman year at Columbia University, where she is studying computer science and applied math. In addition to participating in the Wolfram Summer School, she will be a summer intern at Johns Hopkins University’s Extreme Materials Institute. Natasha hopes to apply what she learns at the Wolfram Summer School to her work at Johns Hopkins and to future research. In her free time, Natasha enjoys participating in the Columbia Space Initiative, reading and playing chess.

Computational Essay

Applications of Zipf’s Law beyond Language

Project: An Analysis of Word Choice in the Real Estate Industry

Goal

The goal of this project is to analyze real estate property descriptions and identify words that positively or negatively affect the success of a sale. Success is measured as the percent difference between a property’s listing price and its selling price. Data for this project was gathered from MLS through redfin.com and analyzed using Wolfram Mathematica’s text analysis functions and neural nets from the Wolfram Neural Net Repository. Correlations between word content and sale success were identified through a TF-IDF analysis of the data, clustering functions and an analysis of frequently used words.
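As a rough illustration of the success measure described above, the following Python sketch computes a percent difference between listing and selling prices. The sign convention (positive when a property sells above its listing price), the function name and the sample numbers are all assumptions made for illustration; the project itself was carried out in Mathematica and may have defined the measure differently.

```python
def percent_difference(list_price: float, sale_price: float) -> float:
    """Percent difference of the sale price relative to the listing price.

    Assumed convention (not stated in the source): a positive value means
    the property sold above its listing price.
    """
    if list_price <= 0:
        raise ValueError("list_price must be positive")
    return (sale_price - list_price) / list_price * 100.0


# Hypothetical listings with made-up prices, for illustration only.
listings = [
    {"list_price": 500_000, "sale_price": 525_000},  # sold above list
    {"list_price": 400_000, "sale_price": 380_000},  # sold below list
]

scores = [percent_difference(l["list_price"], l["sale_price"]) for l in listings]
```

Normalizing by the listing price keeps the measure comparable across properties in very different price ranges.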

Main Results in Detail

A TF-IDF analysis of the data found that words with slashes or hyphens, misspelled words, most abbreviations and words associated with measurement, water and public transportation were associated with less successful sales, while words associated with time in a house and with locations were associated with more successful sales. A feature space plot of all the words, mapped by a neural net, shows that similar words often performed very differently; for example, the words “wifi,” “gracious” and “fantastic” did well while the words “wireless,” “gorgeous” and “awesome” did poorly. Though more data is needed to confirm this trend, these results suggest that word choice may be extremely important.
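One way a TF-IDF analysis could be tied to sale outcomes is sketched below in Python. This is not the project's actual method, which used Mathematica's text analysis functions: the tokenizer, the TF-IDF variant (term frequency times the log of inverse document frequency), the weighted-mean scoring and the sample descriptions are all assumptions chosen to keep the example small.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return one {word: tf-idf weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each word once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            w: (c / total) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        })
    return weights


def word_scores(docs, outcomes):
    """TF-IDF-weighted mean sale outcome for each word (assumed scoring rule)."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for weights, outcome in zip(tf_idf(docs), outcomes):
        for word, w in weights.items():
            weighted_sum[word] += w * outcome
            weight_total[word] += w
    # Words that appear in every document have zero weight and are skipped.
    return {w: weighted_sum[w] / weight_total[w]
            for w in weight_total if weight_total[w] > 0}


# Made-up descriptions and outcomes (percent difference), for illustration only.
docs = ["sunny home near park", "cozy home near station"]
outcomes = [5.0, -5.0]
scores = word_scores(docs, outcomes)
```

In this toy input, "home" and "near" appear in every description, so their inverse document frequency is zero and they drop out, leaving only the words that distinguish one listing from another.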

Future Work

This analysis used data from a three-month period in Boston. Future research could examine different cities, rural locations and time frames longer than three months. Additionally, the description of a home is only one of many factors in the success of a sale. Future work could identify these other factors and examine the relative importance of each.