Posted on June 18th, 2009
What is Federated Search?
I really feel like an idiot because I was unaware of this term until yesterday but I guess it’s a pretty obvious thing. Eric Ward discusses it in his latest article over at Search Engine Land. After all the reading I’ve done… was I not paying attention or is this topic just not really something most SEOs focus on? It’s just not something I’ve had to really deal with in my job. It is something I’ve used while in college, however.
So do we really need Federated Search for our everyday stuff? Unless we’re down to some really niche-y searches, do we need this type of search, or are we OK with what we use every day?
A comment from Takeshi to Eric:
“…are the SERPs of these so-called “federated” search sites also indexed by Google?
Usually, when I do link building, the purpose is to build inbound links for a specific site in order to help boost its ranking in Google. If the goal of this article is just to show up in these niche meta-search engines, why not just rank highly in Google, and you’ll likely show up in the results anyway?”
Here is how Wikipedia defines it:
“[Federated search is] the simultaneous search of multiple online databases or web resources and is an emerging feature of automated, web-based library and information retrieval systems. It is also often referred to as a portal or a federated search engine.” (Wikipedia)
Then there is what is known as the ‘Deep Web’: basically everything OTHER than what Google, Yahoo, MSN and the rest show us every day. There are search engines serving certain niches that people may or may not know about, and for those, meta-search or federated search will help the user immensely.
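To make the definition a bit more concrete, here is a minimal Python sketch of the basic idea: one query goes out to several sources at once and the answers come back merged into one list. The three "databases" are made-up in-memory lists and the function names are hypothetical; a real federated engine would talk to each source's own search interface instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for separate databases (not real services).
SOURCES = {
    "library_catalog": ["Federated Search Basics", "Card Catalogues: A History"],
    "journal_index": ["Deep Web Crawling", "Federated Search in Libraries"],
    "news_archive": ["Search Engines Today"],
}

def search_source(name, query):
    """Search one source with a simple case-insensitive substring match."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]

def federated_search(query):
    """Send the same query to every source at once and merge the results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        merged = []
        for future in futures:
            merged.extend(future.result())
    return merged

if __name__ == "__main__":
    for source, title in federated_search("federated search"):
        print(f"{source}: {title}")
```

The user types one query and never has to know which source held which result, which is pretty much what happens behind a college library search page.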
So this topic got me thinking about libraries and the types of scholarly searches and journals required for most research. Writing papers in college (and I suppose in high school now) involves this type of search, but the user is most likely unaware of what they are really using. To be a librarian, you must a) really love libraries and b) have the ability to get the student everything they need for their assignment. Those of us who have attended college (some of us for a really long time…) know that certain resources are not approved for use because they are edited by the community at large. For this, we need to be able to find more specific things.
Now I was aware of all the major search engines and even some niche ones. But really, the only time I’ve used federated search is when doing research for a college paper.
How often have you searched for something in Google and not found it? I know that’s happened to me a lot but this is why we modify the search terms. So why on Earth did I not think about how the libraries do their business? Well, I think I’ve fallen victim to search engine laziness. Google is supposed to have all the answers, right? Guess not. That’s not disappointing but it has awakened me from my search engine coma.
Federated Search is an Industry of its own.
On the Federated Search Blog, Sol writes about last year’s 23rd Annual Computers in Libraries Conference. There are companies (of course) that provide federated search products, and in a review of an El Federated Search Web Conference presentation Sol writes:
“The presenters introduced what they considered to be the two major problems of federated search, the “prior knowledge” problem of not knowing what a library has and how it’s organized, and the “duplication of effort” problem where individual searches of different resources yields a number of identical documents.”
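The “duplication of effort” problem is easier to see with a toy example. The sketch below uses made-up records and a hypothetical normalize_title helper to collapse results that differ only in case, punctuation, or spacing. It assumes every result has already been pulled into one place, and it will miss duplicates that differ in any bigger way.

```python
import re

def normalize_title(title):
    """Lowercase, strip punctuation, collapse whitespace (a crude matching key)."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return " ".join(cleaned.split())

def dedupe(results):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for source, title in results:
        key = normalize_title(title)
        if key not in seen:
            seen.add(key)
            unique.append((source, title))
    return unique

# Made-up results: the first two are the same document from different sources.
results = [
    ("journal_index", "Federated Search in Libraries"),
    ("library_catalog", "Federated search in libraries."),
    ("news_archive", "Search Engines Today"),
]
print(dedupe(results))
```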
We all knew about Federated Search but just didn’t know we were using it. Honestly, it’s (in the grand scheme) a pretty new thing we’re all working with here. The libraries I grew up using had card catalogues. Man, did I hate that card catalogue!
Here is an article, “The Truth About Federated Search” by Paula J. Hane, that “humanizes” Federated Search. It’s not quite as perfect as we’d all like it to be, and it is limited. Here are some of the ideas it dispels:
“1. Federated search engines leave no stone unturned.
Reality: Not all federated search engines can search all databases, although most can search Z39.50 and free databases.
2. De-dupe really works.
Reality: For federated search engines, true de-duplication is virtually impossible. In order to de-dupe, the engine would have to download all search results and compare them.
3. Relevancy rankings are totally relevant.
Reality: It’s impossible to perform a relevancy ranking that’s totally relevant. A relevancy ranking basically counts the occurrence of words being searched in a citation. The abstract and full-text data, as well as the indexing that content providers use to relevancy-rank their content, are unavailable to federated search engines.
4. Federated searching is software.
Reality: It certainly is software, but it’s best consumed as a service. A federated search engine searches databases that update and change an average of 2 to 3 times per year. This means that a system accessing 100 databases is subject to between 200 and 300 updates per year—almost one per day!
5. We don’t make your search engine. We make your search engine better.
Reality: You can’t get better results with a federated search engine than you can with the native database search… it’s restricted to the capabilities of the native database’s search function. A federated search can’t do a three-term search with Boolean operators in a native database whose interface doesn’t support it.”
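Two of the points above lend themselves to a quick sketch: the word-counting style of relevancy ranking (point 3) and the update arithmetic (point 4). The citations below are made up, and the scoring function is only an illustration of counting query words in a citation, not how any particular product ranks its results.

```python
def relevancy_score(query, citation):
    """Count how many times each query word occurs in the citation text."""
    words = citation.lower().split()
    return sum(words.count(term) for term in query.lower().split())

# Made-up citations; only this text is visible to the scorer,
# not the abstract or full text behind each one.
citations = [
    "Deep web search tools for libraries",
    "Federated search tools search the deep web",
    "A history of the card catalogue",
]

query = "deep web search"
ranked = sorted(citations, key=lambda c: relevancy_score(query, c), reverse=True)
for c in ranked:
    print(relevancy_score(query, c), c)

# Update arithmetic from point 4: 100 databases, each changing 2 to 3 times a year.
databases = 100
low, high = databases * 2, databases * 3
print(f"{low} to {high} updates per year, about {low / 365:.2f} to {high / 365:.2f} per day")
```

Notice that a citation can score higher just by repeating a word, which is a fair hint as to why a ranking built this way is never “totally relevant.”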
So the lesson here is that Federated Search is not perfect. We wouldn’t expect it to be. It is certainly broader (should you need a search like that) and will yield some results that you perhaps wouldn’t have gotten with the regular search engines. Keep in mind that there is a TON of information returned from places like Google, and that most of the time it’s the user who lacks the ability to properly search.
For those who are attached to Google, they do actually offer a federated search. It’s called Google Scholar (yes, I’ve used that too).
Here is a video about the Deep Web, where Federated Search is present over your everyday search engine like Google or Yahoo! Also, watch this video of Matt Cutts describing Google, the Deep Web, optimization, and Google’s crawling criteria.