Matt's Real Estate Technology Blog
Clareity Consulting: Real Estate Information Technology Consultants



Jul. 6, 2009 - Search Engines and the MLS Data Scraping Question revisited

This is a continuation of Search Engines and the MLS Data Scraping Question (http://www.realtown.com/mattcohen/blog/scraping-and-search-engines). That post started a lot of conversation, and Brian Larson did an excellent job of filling in much of the detail on his blog, starting here (http://www.mlstesseract.com/2009/06/search-engines-indexing-idx-sites.html). So far, he and I seem to be substantially in agreement, and after his five-part series we have generally arrived at the same point:

Our industry has a long way to go in discussing how it relates to the Internet - and topics such as how data is used by search engines, syndicators and others are ripe, perhaps over-ripe, for discussion. Through that discussion, policy around the use of data should be developed. How one broker uses another broker's listings online deserves particular attention. This policy needs to be reflected on web sites in Terms of Use, anti-scraping measures, and other technical details. That said, we must ensure that policy enacted in our own industry does not disadvantage brokers online relative to sites that the policy does not or cannot affect.
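One of the "technical details" where such a policy can be expressed is a site's robots.txt file. The sketch below is only an illustration, not a recommendation of any particular policy: the domain, the /idx/listings/ path, and the crawler names are hypothetical, and it uses Python's standard urllib.robotparser module to show how a well-behaved crawler would read the rules. Keep in mind that robots.txt is voluntary; it states a preference and does not by itself stop a scraper that chooses to ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a broker's IDX site. The paths and the
# "ExampleScraperBot" user agent are illustrative assumptions only.
ROBOTS_TXT = """\
User-agent: ExampleScraperBot
Disallow: /

User-agent: *
Disallow: /idx/listings/
Allow: /
"""


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


policy = build_policy(ROBOTS_TXT)

home = "https://broker.example/"
listing = "https://broker.example/idx/listings/12345"

# A general-purpose crawler may index the home page but not listing pages.
general_home = policy.can_fetch("SomeSearchBot", home)
general_listing = policy.can_fetch("SomeSearchBot", listing)

# The named crawler is asked to stay away from the whole site.
scraper_home = policy.can_fetch("ExampleScraperBot", home)

print(general_home, general_listing, scraper_home)
```

Whether listing pages should be open to search engines at all is exactly the policy question under discussion; the file only records whatever answer the industry and the individual broker settle on.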

When a search engine blurs the line between its traditional role as a "conduit" site and a role as a "destination" in its own right, indexing may be more controversial. See how Google is using real estate data in Australia: http://maps.google.com.au/help/maps/realestate/. Google is now also heading in the same direction in select U.S. cities.

There has been an expectation that, when a search engine crawls your site, its purpose is to allow the public to enter search terms, get back a link with a small amount of text under it, and encourage the public user to click on the link and visit your site. This is referred to as the search engine being a "conduit". When a search engine crawls your site and not only indexes your content, but stores a copy of your data and presents that content - perhaps in conjunction with other content - it can become a "destination" site in its own right.

Though Google still links out to an original source of the data in the example URL provided, getting users to that source may not be the primary focus of the page. What would you think of Google if the focus was on the "More info" link and you only saw a list of traditional destination sites when you clicked on the "Web Pages" tab? How about if the design changed further and there was a LOT more content on Google - public records data, demographics, etc.? Or what if Google added additional functionality - what if users could bookmark their favorite listings and share them with friends? What if they could get email updates or RSS updates via Google Reader when new matches to their criteria were found?

Where does a site cross the line from being a search engine and start seeming like any other 'scraper'?

As per my original posting on this subject, I still believe usage is at the heart of the IDX / search engine policy question. Ideally, there should be rigorous strategic discussions of how the listings are used by various parties today - and how they might be used tomorrow.



Matt Cohen
Matt Cohen has consulted to MLSs, Associations, franchises, brokerages, and many real estate industry software companies for over 12 years. Matt is a well-regarded real estate industry expert on industry trends, software design, product management, project management, and information security. Matt speaks at conferences, workshops and leadership retreats around the country on a wide variety of MLS-related topics.


Disclaimer: The opinions expressed on this blog are the responsibility of the author and do not necessarily reflect the opinion of my employer.