How Information Scatter Informs Content

Posted by jhullman at 2:45 pm | Filed In Analytics

Not too long ago, I centered a post around a diagram of the web from a network perspective, with components labeled. The point was to take a closer look at the structure of the internet, beyond the simple notion of billions of webpages connected by hyperlinks.

Network metrics like degree, betweenness, etc. have also been discussed, but still at the level of webpages.

This level of analysis ignores the primary function of most webpages: to inform visitors on a given topic. Many company sites intended only to provide marketing information have adopted the strategy of including information on the topic of interest. SEOMoz advises in their SEO for Beginners Guide, “Get out into the forums, blogs, and communities where folks in your industry spend their online discussion time. Note the most frequently asked questions, the most up-to-date topics, and the posts or headlines that generate the most interest. Apply this knowledge when you create high-quality content and directly address your market’s needs.” This works because (in addition to increasing link equity) it expands targeting to visitors earlier in the sales process, those still in the research stage but soon to move into the buying stage.

So how does one design the pages on a site that will provide the information, in a way that increases the likelihood of reaching these visitors? Should you provide rare facts, or more general ones? How dense should the facts per page be? It’s easy to find bloggers championing relevant content, but hard to find anyone addressing these questions online.

Research on the distribution of information across the web, called ‘Information Scatter’ by UM researchers, helps address them. The researchers created a bipartite (or two-mode) network, one with two different types of nodes: those representing facts relevant to a given industry or subject area, and those representing the webpages that contain them. They constructed a graph of this type around melanoma facts online:

Figure: melanoma info web graph
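
To make the two-mode idea concrete, here is a minimal Python sketch of how such a network could be stored and summarized. The page names and fact labels are invented placeholders, not data from the study, and the helper name `build_two_mode` is my own.

```python
from collections import defaultdict

# Hypothetical data (not from the study): which facts each page mentions.
page_facts = {
    "page_a": {"fact_1", "fact_2", "fact_3", "fact_4"},
    "page_b": {"fact_1"},
    "page_c": {"fact_1", "fact_2"},
}


def build_two_mode(page_facts):
    """Return both sides of the network: page -> facts and fact -> pages."""
    fact_pages = defaultdict(set)
    for page, facts in page_facts.items():
        for fact in facts:
            fact_pages[fact].add(page)
    return page_facts, dict(fact_pages)


pages, facts = build_two_mode(page_facts)

# Degree of each node type: how dense a page is, and how common a fact is.
facts_per_page = {page: len(fs) for page, fs in pages.items()}
pages_per_fact = {fact: len(ps) for fact, ps in facts.items()}

print(facts_per_page)
print(pages_per_fact)
```

In this toy data, a fact that appears on only one page (low `pages_per_fact`) would count as rare, and a page with many facts (high `facts_per_page`) would count as dense.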

It has been previously determined that the distribution of facts online is highly skewed, with a few documents containing many facts, and many documents having only a few. Facts can be either general, or rare. This study found that the rare facts are found on the pages with many facts, while general facts tend to appear alone or with only a few other facts.

The common random walk model of online behavior says only that lots of links in to your site will increase the site’s chances of being found. The UM researchers went further than the random walk theory by modeling a potential visitor’s research process: the visitor searches for pages around one fact, starts at one results page, and then incrementally expands the search to another fact encountered on that page. They used the two-mode network to investigate which types of facts and pages are most important in that search process.
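
One rough way to picture that search process is as a walk that alternates between facts and pages. The sketch below is my own simplification, not the researchers’ model: it uses a breadth-first search over made-up pages and facts, and it assumes page names and fact names never collide.

```python
from collections import deque

# Hypothetical pages and the facts they contain (placeholders only).
page_facts = {
    "p1": {"a", "b"},
    "p2": {"b", "c"},
    "p3": {"c", "d"},
}


def search_path(start_fact, target_fact, page_facts):
    """Shortest fact -> page -> fact -> ... chain, or None if unreachable."""
    # Invert the mapping so we can go from a fact to the pages containing it.
    fact_pages = {}
    for page, facts in page_facts.items():
        for fact in facts:
            fact_pages.setdefault(fact, set()).add(page)

    def neighbors(node):
        # A page leads to its facts; a fact leads to the pages that contain it.
        if node in page_facts:
            return sorted(page_facts[node])
        return sorted(fact_pages.get(node, ()))

    previous = {start_fact: None}
    queue = deque([start_fact])
    while queue:
        node = queue.popleft()
        if node == target_fact:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


print(search_path("a", "d", page_facts))
```

Every page on the returned chain is one the visitor had to pass through to get from the first fact to the last.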

The results showed the pages most likely to lie on paths between topics, or in network terms, those with high betweenness, to be most important. One of the most interesting findings of this study involves what these pages look like in terms of fact type and distribution.
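
For readers who want to see what betweenness means here, the following is a small sketch that computes standard shortest-path betweenness (Brandes’ algorithm) on a toy two-mode network. The study may define or restrict betweenness differently; the data, the `hub` page, and the function names are illustrative assumptions.

```python
from collections import deque

# Hypothetical data: one page with a fact from each topic, plus three
# single-fact pages.
page_facts = {
    "hub": {"a", "b", "c"},
    "p_a": {"a"},
    "p_b": {"b"},
    "p_c": {"c"},
}


def two_mode_adjacency(page_facts):
    """Undirected adjacency over pages and facts together."""
    adj = {}
    for page, facts in page_facts.items():
        adj.setdefault(page, set())
        for fact in facts:
            adj[page].add(fact)
            adj.setdefault(fact, set()).add(page)
    return adj


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


scores = betweenness(two_mode_adjacency(page_facts))
for node, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, value)
```

In this toy network the `hub` page scores highest, because every path between facts from different topics runs through it.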

Let’s assume an example around the tire industry. Topics pertinent to those buying tires might include tire care, tire mileage, tire problems, buying tires, etc. The results from this study suggest that a page that included tire facts pertinent to all these categories would be the most important to network traversal. In addition, it was found that a page could address these topics with just one or two facts each without losing its value to an information seeker.

Another option to increase a page’s betweenness, based on this study, is the inclusion on the page of a rare fact that ties two topic areas together. For example, a fact might center around a rarer tire problem, and suggest that the only way to diagnose it properly is by checking your tire air pressure, which pertains to tire care. This would link the tire problems and tire care topics.
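
As a quick check on your own pages, you could label each fact with the topics it touches and count how many topics each page covers. The sketch below does that for the tire example; the fact identifiers, page names, and topic labels are placeholders I made up, and a bridging fact is simply one tagged with two topics.

```python
# Hypothetical labels: each fact is tagged with the topic(s) it belongs to.
fact_topics = {
    "care_1": {"tire care"},
    "mileage_1": {"tire mileage"},
    "problem_1": {"tire problems"},
    "buying_1": {"buying tires"},
    # A rare fact that ties two topic areas together.
    "rare_problem_1": {"tire problems", "tire care"},
}

# Hypothetical pages and the facts they contain.
page_facts = {
    "overview": {"care_1", "mileage_1", "problem_1", "buying_1"},
    "diagnosis": {"rare_problem_1"},
    "care_only": {"care_1"},
}


def topics_covered(page_facts, fact_topics):
    """Map each page to the set of topics its facts touch."""
    return {
        page: set().union(*(fact_topics[f] for f in facts))
        for page, facts in page_facts.items()
    }


coverage = topics_covered(page_facts, fact_topics)
for page, topics in coverage.items():
    print(page, len(topics), sorted(topics))
```

A page like `overview` spans every topic with one fact each, while `diagnosis` spans two topics with a single bridging fact.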

2 Comments

  1. doneil
    Posted August 18, 2008 at 3:53 pm | Permalink

    Hi Jessica,

    So, if you were going to give the “elevator summary” of this excellent article, what would it be?

  2. jhullman
    Posted August 18, 2008 at 3:57 pm | Permalink

    looking at the web as a network of linked pages and linked facts (instead of just linked pages) can give you clues for the best way to lay out info on your site.

Copyright 2005-2009 · Pure Visibility Inc.