Case study examples of machine learning we have applied to real estate ad technology

## Machine Learning Case Studies

#### Study 1: Demographic Clustering

#### Cost-effective targeting of segmented markets.

**Introduction:**

In online marketing, little is known about an individual’s property preferences at the first ad impression.

**Problem Statement:**

- To whom do you market which property?
- How do you market the correct property to the correct person and maximize Return On Investment (ROI)?

**Study Summary:**

**The Data:**

A large amount of data is available on the US property market, some of it in the US Census data. A data set of this size is too large for humans to analyze in a meaningful time frame, and conventional statistical analysis techniques would produce less meaningful results.

**The Process:**

- Select data with key characteristics such as property values and mean income. Census and neighborhood data are suitable sources.
- Run the selected data through k-means clustering to produce market segment clusters, which include property archetypes and demographic segments.
- Group the resulting market segments together in advertisement campaigns.
- Test the resulting advertisement groups using methods such as A/B testing or multi-armed bandits.
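The clustering step can be sketched as follows. This is a minimal, illustrative k-means in plain Python, not the implementation used in the study. It assumes each area is described by two hypothetical features (median property value and mean income, both in thousands of dollars). The area names, the feature values, the choice of `k = 3`, and the deterministic farthest-first seeding are all assumptions made for the example.

```python
import math


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; returns one cluster index per point."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented example data: [median property value, mean income], in $1000s.
areas = {
    "area_a": [950.0, 140.0],
    "area_b": [900.0, 150.0],
    "area_c": [310.0, 62.0],
    "area_d": [290.0, 58.0],
    "area_e": [120.0, 35.0],
    "area_f": [135.0, 38.0],
}
names = list(areas)
labels = kmeans(standardize([areas[n] for n in names]), k=3)

# Collect the areas that fall into each market segment.
segments = {}
for name, label in zip(names, labels):
    segments.setdefault(label, []).append(name)
print(segments)
```

Standardizing the features first keeps the property value column from dominating the distance calculation simply because its numbers are larger. For real data volumes, a library implementation such as scikit-learn's `KMeans` would normally be a better choice than hand-written code.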

**Resulting Information:**

Following the process above, an advertisement showing a specific property archetype was served to several demographic clusters, represented by the color groups green, orange, and red. The green group, which contained ZIP codes 10464, 32832, and others, performed well with this ad, while the red group, which contained ZIP codes 10526, 10501, and others, performed poorly.
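A comparison like this, in which one ad is served to several clusters, could be run as a multi-armed bandit instead of a fixed split. The sketch below uses Beta-Bernoulli Thompson sampling, which is one common bandit strategy; the source does not say which strategy was used. The conversion rates are simulated values chosen for illustration and are not results from the study.

```python
import random


def thompson_sampling(true_rates, rounds=5000, seed=0):
    """Allocate impressions across arms with Beta-Bernoulli Thompson sampling.

    true_rates maps an arm name to its simulated conversion probability.
    Returns the number of impressions served to each arm.
    """
    rng = random.Random(seed)
    successes = {arm: 0 for arm in true_rates}
    failures = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        # Draw a plausible rate for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior) and serve the arm with the highest draw.
        arm = max(
            true_rates,
            key=lambda a: rng.betavariate(successes[a] + 1, failures[a] + 1),
        )
        # Simulate whether this impression converted.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return {arm: successes[arm] + failures[arm] for arm in true_rates}


# Simulated (invented) conversion rates for three demographic clusters.
simulated_rates = {"green": 0.10, "orange": 0.05, "red": 0.01}
impressions = thompson_sampling(simulated_rates)
print(impressions)
```

Compared with an even A/B split, this approach shifts impressions toward the better-performing cluster while the test is still running, which limits spend on clusters that respond poorly.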

**Findings Summary:**

- It is possible to predict the property archetypes a person would be interested in upon first contact with the advertisement.
- Proper clustering can rapidly improve ROI on advertising campaigns.
- Coincidental finding: The key variable that influences people’s perception of a property archetype is the image that represents the property in an ad. Price and square footage do not influence this to a significant degree.

#### Study 2: Polynomial Modeling of Auctions

#### Optimal bidding that adapts.

**Introduction:**

In Search Engine and Display Network Marketing, determining the optimal bid for maximum profit on ads can be complex and fraught with pitfalls due to hidden variables.

**Problem Statement:**

- How much should you bid for a particular ad group to get the maximum profit?
- How many bids would need to be placed in order to have an optimal bid?

**Study Summary:**

**The Data:**

Similar data sourcing techniques were applied as in Study 1 above. The vast amount of property data available in the USA (some from USA Census Data) would be too large to be analyzed manually by humans within an economical timeframe, and the application of the conventional statistical approach would supply less accurate results.

**The Process:**

- Simplify the equation in a pre-processing step by calculating the profit for each given bid.
- Generate an initial model that is based on all existing ad groups within the target campaign (this reduces the number of bids required to find the optimal bid).
- Fit each ad group’s data with a polynomial regression model (linear in its coefficients) that accounts for time through weighting and avoids overfitting with a carefully selected amount of regularization.
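The fitting step above can be sketched as a weighted ridge regression on polynomial features of the bid. This is an illustration under stated assumptions, not the study's model: the exponential time-decay weighting, the `half_life` and `ridge` values, the polynomial degree, and the synthetic profit curve (which peaks at a bid of $0.60) are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_profit_curve(bids, profits, ages, degree=2, half_life=5.0, ridge=1e-6):
    """Fit profit as a polynomial in bid, down-weighting older observations.

    An observation that is `half_life` periods old counts half as much as
    a current one. Returns coefficients ordered from the constant term up.
    """
    weights = [0.5 ** (age / half_life) for age in ages]
    n = degree + 1
    # Weighted normal equations: (X^T W X + ridge * I) beta = X^T W y
    xtwx = [[sum(w * b ** (i + j) for w, b in zip(weights, bids))
             for j in range(n)] for i in range(n)]
    xtwy = [sum(w * y * b ** i for w, b, y in zip(weights, bids, profits))
            for i in range(n)]
    for i in range(1, n):  # the intercept is left unpenalized
        xtwx[i][i] += ridge
    return solve(xtwx, xtwy)


def best_bid(coeffs, candidates):
    """Return the candidate bid with the highest predicted profit."""
    def predict(bid):
        return sum(c * bid ** i for i, c in enumerate(coeffs))
    return max(candidates, key=predict)


# Synthetic history for one ad group: bids in dollars, age in time periods.
bids = [0.20, 0.35, 0.50, 0.65, 0.80, 0.95, 1.10, 1.25]
ages = [7, 6, 5, 4, 3, 2, 1, 0]
profits = [-100 * (b - 0.6) ** 2 + 20 for b in bids]  # invented profit curve

coeffs = fit_profit_curve(bids, profits, ages)
candidates = [cents / 100 for cents in range(5, 151, 5)]
recommended = best_bid(coeffs, candidates)
print(recommended)
```

In a live setting the fit would be repeated after each bidding period, so that newer observations gradually outweigh older ones and the recommended bid can follow changes in the auction.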

**Resulting Information:**

As can be seen above in sample Adgroup 47, for time periods T1 – T20, the model learns what the optimal bid would be within a few iterations.

**Findings Summary:**

- The optimal bid for a given ad group can be found by using polynomial regression within a few automated iterations of bidding.
- Coincidental finding: The trough between 0 and 30 cents is caused by a non-linear relationship between conversion rate and bid which is positive at that intersection.

#### Study 3: Search Term NLP (Natural Language Processing)

#### Optimizing ad group structure towards most valuable search terms.

**Introduction:**

Search term report analysis is an important part of evaluating and optimizing Search Engine Marketing account performance. In some market verticals, high dimensionality makes the data in these reports too granular to yield meaningful information. Real estate is one example: searchers often include location-based terms when looking for homes and estate agents.

**Problem Statement:**

- Which search terms result in the conversions that are most sought-after by a marketer?
- Have all the best search terms used by the target audience been covered in closely knit ad groups?

**Study Summary:**

**The Data:**

Many Search Engine Marketing (SEM) platforms provide statistical data about the search terms used by the audience that ads are displayed to. In the real estate industry, the search term often includes location data. NLP techniques make it possible to group the search term report by these locations, as well as by other similar words.

**The Process:**

- Preprocessing: Tokenize the words in the search term report that represent groupings, such as city, county, state, etc.
- Processing: Group the search term report into a simpler, more comprehensive report for easier analysis.
- Analyze the new report to discover more about the target audience’s searches.
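The preprocessing and grouping steps above can be sketched as follows. This minimal example replaces known place names with placeholder tokens and then aggregates rows that become identical. The `LOCATIONS` lookup table, the placeholder tokens, the report columns, and the sample rows are all hypothetical; a real pipeline would draw place names from a full location database and read the report exported by the SEM platform.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer mapping place names to a grouping token.
LOCATIONS = {
    "austin": "<city>",
    "dallas": "<city>",
    "travis county": "<county>",
    "texas": "<state>",
    "tx": "<state>",
}

# Longest names first so multi-word names match before their parts.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(LOCATIONS, key=len, reverse=True))
    + r")\b"
)


def normalize(term):
    """Lower-case a search term and replace known place names with tokens."""
    return _PATTERN.sub(lambda m: LOCATIONS[m.group(1)], term.lower())


def group_report(rows):
    """Aggregate (search term, clicks, conversions) rows by normalized term."""
    totals = defaultdict(lambda: {"clicks": 0, "conversions": 0})
    for term, clicks, conversions in rows:
        key = normalize(term)
        totals[key]["clicks"] += clicks
        totals[key]["conversions"] += conversions
    return dict(totals)


# Invented sample rows: (search term, clicks, conversions).
report = [
    ("best realtors in Austin TX", 40, 4),
    ("best realtors in Dallas TX", 25, 2),
    ("homes for sale Travis County", 60, 1),
    ("homes for sale Austin", 30, 1),
]
grouped = group_report(report)
for term, stats in grouped.items():
    print(term, stats)
```

Collapsing place names into tokens such as `<city>` lets searches that differ only by location be analyzed as a single pattern, which is what makes the grouped report easier to read than the raw one.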

**Resulting Information:**

**Findings Summary:**

- Better ad groups and ads can be created around closely themed searches when the search terms are correctly analyzed.
- Seller leads are considered more valuable than buyer leads. Plural search terms convert to seller leads at up to 15x the rate of singular search terms, so plural and singular terms should be split into separate ad groups and bid on separately.
- Coincidental finding: Home buyers and sellers who convert well tend to use search terms that include words such as “top” and “best”.