
Matt's Real Estate Technology Blog

Blog by Matt Cohen
Minneapolis, Minnesota

Matt Cohen has consulted to MLSs, Associations, franchises, brokerages, and many real estate industry software companies for over 12 years. Matt is a well-regarded real estate industry expert on industry trends, software design, product management, project management, and information security. Matt speaks at conferences, workshops and leadership retreats around the country on a wide variety of MLS-related topics.

Search Engines and the MLS Data Scraping Question revisited

Jul. 6, 2009
Tagged with: idx, mls, search engines, syndication

This is a continuation of Search Engines and the MLS Data Scraping Question (http://www.realtown.com/mattcohen/blog/scraping-and-search-engines). That blog post started a lot of conversation, and Brian Larson did an excellent job of filling in a lot of the detail on his blog, starting here (http://www.mlstesseract.com/2009/06/search-engines-indexing-idx-sites.html). So far, he and I seem to be substantially in agreement, and after his five part series we are generally to the same point:

Our industry has a long way to go in discussing how it relates to the Internet - and topics such as how data is used by search engines, syndicators and others are ripe, perhaps over-ripe, for discussion. Through that discussion, policy around use of data should be developed. How one broker uses another broker's listings online deserves particular consideration. This policy needs to be reflected on web sites in Terms of Use, anti-scraping measures, and other technical details. That said, we must ensure that policy enacted in our own industry does not disadvantage brokers online in relation to sites that the policy does not or cannot affect.

When a search engine blurs the line between its traditional role as a "conduit" site and a role as a "destination" in its own right, indexing may be more controversial. See how Google is using real estate data in Australia: http://maps.google.com.au/help/maps/realestate/. Google is now also heading in the same direction in select U.S. cities.

The expectation has been that, when a search engine crawls your site, its purpose is to allow the public to enter search terms, get back a link with a small amount of text under it, and encourage the public user to click on the link and visit your site. This is referred to as the search engine being a "conduit". When a search engine crawls your site and not only indexes your content, but stores a copy of your data and presents that content - perhaps in conjunction with other content - it can become a "destination" site in its own right.
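As a rough illustration of the technical side of crawling, here is a minimal Python sketch of how a site can state its crawl preferences in a robots.txt file and how a well-behaved crawler would interpret them. The robots.txt content, the /listings/ path, the "ExampleBot" name and the example.com URLs are all assumptions made up for this example. Note that robots.txt is only a request that cooperative crawlers honor; it does nothing to stop a scraper that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an IDX site: listing detail pages are
# closed to all crawlers, everything else is open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /listings/
"""


def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt above permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://www.example.com/about",
                "https://www.example.com/listings/123"):
        print(url, may_crawl("ExampleBot", url))
```

A rule like this is all-or-nothing per path: it can keep a cooperative crawler out of listing pages, but it cannot say "index this page but do not republish it", which is why the usage question still has to be settled in policy and Terms of Use.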

Though in the example URL provided Google still links out to an original source of the data, getting users to that source may not be the primary focus of the page. What would you think of Google if the focus was on the "More info" link and you only saw a list of traditional destination sites when you clicked on the "Web Pages" tab? How about if the design changed further and there was a LOT more content on Google - public records data, demographics, etc.? Or what if Google added additional functionality - what if users could bookmark their favorite listings and share them with friends? What if they could get email updates or RSS updates via Google Reader when new matches to their criteria were found?

Where does a site cross the line from being a search engine and start seeming like any other 'scraper'?

As per my original posting on this subject, I still believe usage is at the heart of the IDX / search engine policy question. Ideally, there should be rigorous strategic discussions of how the listings are used by various parties today - and how they might be used tomorrow.  

MLS and the Future of Listing Distribution

May. 11, 2007
Tagged with: mls, security, syndication, web 20

The following is a high-level overview of a session from Clareity Consulting's 2007 MLS Executive Workshop. Every year, Clareity's Workshop provides fresh, in-depth updates on the most pressing issues facing MLS executives and leaders and creates an intimate environment for participants to share their knowledge and experience with each other. You can check the dates and/or register for next year's event on the Clareity Consulting web site: www.callclareity.com

Listings and other real estate content were at one time jealously guarded by brokers, and in turn by the MLS – especially when it came to putting that content on the Internet. Though MLS public web sites are still controversial in some corners, the recent trend has been toward wider distribution of such content. The question of the day is no longer "Should my listings be available on multiple Internet sites?" but, more commonly, "Where should I send the listings?" Two trends have contributed to the current environment. The first is the market slowdown, which has increased pressure on brokers and agents to facilitate greater marketing exposure for properties. The second is part of the wider culture: the "Web 2.0" trend of "syndication".

In the "Web 1.0" world, companies wanted all consumers to visit their site and stay as long as possible. In the "Web 2.0" world, the trend is to make content available to other sites or have customized information delivered directly to individual site subscribers. The most common mechanism for this is called "RSS", which stands for Really Simple Syndication.
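To make the mechanism concrete, here is a minimal sketch of what an RSS 2.0 feed of listings might look like when generated with Python's standard library. The function name, the dictionary keys (address, url, summary), and the example.com URLs are assumptions invented for illustration; a real feed would carry whatever fields the broker or MLS chooses to publish.

```python
import xml.etree.ElementTree as ET


def build_listing_feed(title: str, link: str, listings: list) -> str:
    """Build a bare-bones RSS 2.0 document with one <item> per listing."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link and description on the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Newly published listings"
    for listing in listings:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = listing["address"]
        ET.SubElement(item, "link").text = listing["url"]
        ET.SubElement(item, "description").text = listing["summary"]
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample data.
    sample = [
        {"address": "123 Main St",
         "url": "https://www.example.com/listings/123",
         "summary": "3 bed, 2 bath"},
    ]
    print(build_listing_feed("Example Brokerage Listings",
                             "https://www.example.com/", sample))
```

Any feed reader or partner site that understands RSS can poll a document like this and pick up new items, which is what lets content travel to the consumer rather than the other way around.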

So, what we have traditionally referred to as listing distribution could also be referred to as content syndication. Such syndication may benefit all parties. Today, consumers must typically visit multiple sites to see all the listings for their area: Realtor.com/Move.com, Yahoo!, craigslist, Google, NewHome Source, FSBO.com, Trulia, remax.com, coldwellbanker.com and numerous other sites. When listings are fully syndicated, consumers may not need to visit multiple sites. For the broker or agent, syndication drives listing exposure across numerous online platforms, generating new traffic and content exposure, which makes syndication a free and easy form of advertisement.


Between the continued growth of listing content syndication and the evolution of data standards such as RETS (the Real Estate Transaction Standard), the MLS may have an evolving role to fill in the collection of data and its syndication. First, RETS 2.0 will make it more feasible for real estate professionals to manage their listings in a number of systems and have those changes syndicated. This may mean that an agent enters and manages a listing in the MLS and has it syndicated to the broker back-office system, other systems they use, and web sites – much as it is today – but it could also mean that the agent can manage listings directly in the broker system and have them syndicated to the MLS and other systems. Managing that syndication may become a core function of the MLS and other real estate software. This scenario is illustrated below:

While it is clearly up to brokers to determine where their listings are advertised, as an industry it is in our best interest to encourage balancing the benefits of content distribution with the interests of those that have worked to create the content as well as providing appropriate levels of information security and consumer privacy.

While some MLSs have started dealing with their information security responsibilities in terms of MLS system authentication and the hacking threat, less attention has been paid to listing distribution. Some say, "This is information already out there on the Internet – it's 'public' information. We don't publish the consumer's name or phone number. So what's the big deal?" If a consumer provides information to an agent for the purpose of selling their home and, due to uncontrolled distribution or inadequate information security practices, the consumer is immediately overrun with telephone, mail or email marketing from real estate related services, it could lead to a backlash and damage trust in the real estate professional. It's all too easy to scrape the content off of most MLS web sites, and combining that information in a "mash up" with even the most basic reverse telephone directory creates a consumer privacy issue. If industry critics like Dave Barry have their way and open up access to the MLS, we will surely have other attorneys attacking the industry for that breach of consumer privacy.

As the scope of content syndication continues to expand it will be important for MLSs, brokers and others that may distribute content to implement next generation data distribution policies – addressing the security of everything from data exports created directly from the MLS to RETS feeds and other content distribution mechanisms. Such policies must:

  • establish common practices used to evaluate and establish third party relationships
  • establish conditions on those relationships and responsibilities within them
  • determine the data that may be accessed
  • describe how that data may be securely transmitted and stored
  • enumerate the many other detailed steps needed to provide appropriate information security for the content with which third parties are entrusted.
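One of the points above, determining the data that may be accessed, can be sketched in a few lines of Python. This is only an illustration under assumed names: the recipient categories, the field names, and the filter_for_recipient function are all hypothetical, and a real policy would be defined in contracts and MLS rules before it is ever expressed in software.

```python
# Hypothetical per-recipient policy: which listing fields each kind of
# third party is permitted to receive.
ALLOWED_FIELDS = {
    "public_portal": {"listing_id", "address", "price", "photo_url"},
    "broker_back_office": {"listing_id", "address", "price", "photo_url",
                           "seller_name", "seller_phone"},
}


def filter_for_recipient(listing: dict, recipient: str) -> dict:
    """Return only the fields the policy allows for this recipient.

    An unknown recipient gets nothing, so a new third party receives no
    data until someone explicitly adds it to the policy.
    """
    allowed = ALLOWED_FIELDS.get(recipient, set())
    return {key: value for key, value in listing.items() if key in allowed}


if __name__ == "__main__":
    # Hypothetical sample record.
    listing = {
        "listing_id": "A100",
        "address": "123 Main St",
        "price": 250000,
        "photo_url": "https://www.example.com/photos/A100.jpg",
        "seller_name": "Pat Example",
        "seller_phone": "555-0100",
    }
    print(filter_for_recipient(listing, "public_portal"))
```

The deny-by-default choice is deliberate: it keeps consumer contact details out of any feed whose recipient has not been evaluated, which is the privacy concern raised earlier in this post.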

While providing information security assessments to MLSs and brokers, Clareity has found that such policies are rarely thoroughly defined or implemented. Clareity has worked with a number of clients to adopt and implement more robust listing distribution policies, integrate these policies into appropriate contracts, educate staff, members and third parties on those policies, and implement pro-active controls as well as means for monitoring and enforcement. There are hundreds of details to attend to in such a policy, and it is rare that Clareity finds a policy that is better than 'fair' during an assessment. Ask yourself, "Does my policy cover secure coding practices to ensure listings can't be scraped off member or MLS/IDX vendor web sites? The secure transfer of information using encrypted protocols? The encrypted storage of information, in databases and on backups? Security compliance monitoring mechanisms?" Again, there are scores of detailed questions to be asked, all of which need to be addressed in a comprehensive policy.

The current trend toward increasing listing content syndication is going to create new roles for the MLS and new challenges for our industry. Finding a balance between giving consumers access to the information they desire and protecting broker rights, industry interests and consumer privacy will be very important. Clareity encourages its clients to take security of data distribution to the next level through policy, agreements, MLS rules, education, implementation, monitoring, and enforcement so they are prepared for the future of listing content syndication.

About the author:

Matt Cohen is Clareity Consulting's Chief Technologist. Matt has spoken at many conferences, workshops and leadership retreats around the country on a wide variety of MLS-related topics, and is a well-regarded real estate industry expert on software design, product management, project management, data center reliability, scalability, and information security. Clareity Consulting was founded in 1996 to provide information technology consulting to the real estate industry and its related businesses. For more information, visit www.callclareity.com