Search Engine Terms

Posted on September 10th, 2006 by Admin.
Categories: SEO, Web-Crawler, Page Rank, Googlism, Search Engine.

Hey,

Here are some terms commonly used in Search Engine technologies.

Algorithm : The complex mathematical formulae that Search Engines use to determine the rank of a particular website for a specific keyword. The algorithms evolve constantly to keep ahead of unfair practices and provide more accurate and relevant search results for user queries.

Alt Text : The text that a web browser displays in place of an image when the image cannot be loaded. Some browsers also show it as a tooltip when the mouse cursor is held static on the image.

Anchor Text : The text/copy written on the web page that works as a link.

Back Link : A link from another page which links to your page. It is also called an inbound link.

BLOG : A BLOG (short for Web LOG) is a journal kept on the internet. This journal is often updated daily and contains all the information that the person maintaining the BLOG (the Blogger) wishes to share with the world. The term also applies to websites dedicated to a particular topic that are updated with the latest news, views and trends.

Broken Link : A link that no longer points to an active destination. It is also called a dead link.

Cloaking : A web optimisation technique used to serve one version of a page to Search Engine spiders and a different version to users. Sometimes separate versions of the page are created for different Search Engine spiders to improve rankings.

Crawlers : Programs created by Search Engines that go around the internet collecting information about websites. The process of visiting a website and recording information is known as indexing. The information collected by these crawlers is then used to rank the websites. Crawlers are also known as spiders and robots.

Cross Linking : A simple process where two websites provide links to each other.

Deep Linking : The process of linking to pages deep within the directories of your website from your home page or other pages, to facilitate indexing of those pages by the Search Engine spiders.

Description Tag : A Meta tag that provides certain Search Engine spiders/crawlers with a description of the web page. This description is often displayed alongside the search result for your website.

Header Tag : The tag (<head> in HTML) that contains the Title, Description and other Meta information of the web page.

Hidden Text : The text on the webpage that is difficult to see because it is the same colour as the background. This technique is considered spamming and should be avoided.

Inbound Link : A link from another page that links to your page.

Indexing : The process by which a Search Engine spider visits your web page and collects information about it.

Keyword : A word entered into the search box of a Search Engine or Directory by a user to look for information pertaining to the word on the internet.

Keyword Density : A percentage calculated on the basis of the number of times a keyword occurs in your web page copy against the total number of words on your web page.
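The keyword density calculation can be sketched in a few lines of Python. This is only an illustration: the function name `keyword_density`, the sample copy and the simple word-splitting rule are assumptions made for the example, and real tools may count words differently.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Return how often `keyword` occurs among all words in `text`, as a percentage."""
    # Assumption: a "word" is any run of letters, digits or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)

# Hypothetical page copy: "cheap" occurs 3 times in 11 words.
copy = "Cheap flights to Paris. Book cheap flights today and fly cheap."
print(round(keyword_density(copy, "cheap"), 1))  # prints 27.3
```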

Keyword Stuffing : A technique where too many instances of the keyword are put into a web page without any context or use to make it keyword rich or increase keyword density. This practice is considered spamming and should be avoided at all times.

Keyword Tag : A Meta tag that defines the keywords that the web page is targeting. It is considered, in the current day and age, to be of little or of no use. Search engine spiders no longer assign any importance to this tag.

Link Farms : A scheme in which independent websites create a complex linking structure to build their Link Popularity. Since websites participating in these link farms create links to each other without any context or genuine reason, most Search Engines consider them an infringement, and sites that are caught are penalized.

Link Popularity : It is defined by the quality and quantity of inbound and outbound links of your web page. By quality we mean the reputations of the websites that link to you, the titles of the pages that link back to you, the text used to link to your website and a few other factors. By quantity we simply mean the number of links on the internet which point to your website.

Linking Strategy : The planning process that goes into creating link popularity. It involves deciding who to form associations with, who to exchange links with and who to buy links from.

Meta Tag : A tag created to provide keyword, description and other information to Search Engine spiders and other user agents. This tag is invisible when the page is rendered on the web browser and can be seen by viewing the source of the web page.
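To see what a spider might read from these tags, here is a small sketch that uses Python's standard `html.parser` module. The class name `MetaTagReader` and the sample page are made up for the example; it only collects `name`/`content` pairs and is not how any particular Search Engine works.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Collect name/content pairs from <meta> tags in a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content

# Hypothetical page source with a Description Tag and a Keyword Tag.
page = """<html><head>
<title>Example Travel Guide</title>
<meta name="description" content="A short guide to budget travel.">
<meta name="keywords" content="travel, budget, guide">
</head><body><p>Hello</p></body></html>"""

reader = MetaTagReader()
reader.feed(page)
print(reader.meta["description"])
print(reader.meta["keywords"])
```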

Mirror Sites : Websites or web pages with the same or similar content as another. They could be used to target nearly the same keywords as the original. As they provide no new or useful information, a Search Engine may penalize a mirror website.

Optimisation / Optimization : When used in the context of Search Engine Optimisation, it is a series of steps that promote a webpage or website on the internet and strive to achieve higher rankings on the Search Engines.

Outbound Link : The opposite of inbound link. An outbound link is a link from your web page to another page.

Paid Inclusion : The process of inclusion in a Web Directory or a Search Engine by paying a fee.

Penalty : A violation of Terms of Inclusion of a Search Engine or Web Directory can result in a penalty. The penalty is usually a ban on the website by the Search Engine. It effectively means that the banned web page or web site will no longer be included in the searches within the Search Engine for a particular period or until the violation is corrected. The violation can be triggered by any factor like spamming, participating in Link farms, using hidden text etc.

PPC (Pay Per Click) : A model of Website inclusion where you pay an agreed amount every time a user clicks through to your website from a Search Engine. Usually you buy the position you want to rank at for a particular keyword or keywords, and pay the Search Engine every time it generates a hit to your website.

PR (Google PageRank) : A number of link popularity factors combine to produce a rating that Google assigns to your website. This rating is called PageRank and is a score from 0 to 10. It is an extremely important factor for ranking high in Google.

Query : A search conducted in a Search Engine using a keyword or key phrase.

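Ranking : The position of a web page in the SERP (Search Engine Results Page), plus the number of results displayed prior to it on previous pages. For example, with 10 results per page, the third result on the second page has a ranking of 13.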

Reciprocal Link : A link to your web page by an associate site. When you create a link to another web page (Page B) from your own page (Page A) and the other website does the same, then a link from Page B to Page A is a reciprocal link.

Reciprocal Linking : The process of exchanging links with other websites is called Reciprocal Linking. Since both participating websites get an inbound link, it helps in building link popularity.

Referrer : When a user visits your website by clicking a link from another website, the other website is called a referrer. The referrer could be a Search Engine or an associate website that provides links to your web page.

Robots.txt : A file written and stored in the root directory of a website that restricts the Search Engine spiders from indexing certain pages of the website. This file is used to disallow certain spiders from seeing files that you do not want them to see. You can also prevent a certain spider from looking at any of the web pages through this file.
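The file is conventionally named robots.txt. As a rough illustration of how its rules are read, the sketch below feeds a small made-up rule set to Python's standard `urllib.robotparser` module. The spider names `GoodBot` and `BadBot` and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: BadBot is shut out of the whole site,
# every other spider is kept out of /private/ only.
rules = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("GoodBot", "http://example.com/index.html"))          # True
print(parser.can_fetch("GoodBot", "http://example.com/private/notes.html"))  # False
print(parser.can_fetch("BadBot", "http://example.com/index.html"))           # False
```

Keep in mind that the file is only a request: well-behaved spiders honour it, but nothing forces a spider to do so.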

And many others …

Cheers,

Kyle


Search Engine Glossary

Posted on September 10th, 2006 by Admin.
Categories: SEO, Web-Crawler, Page Rank, Googlism, Search Engine.

Hey,

Here are some of the major Search Engines and Web Directories …

Google (http://google.com)

A Search Engine in which inclusion is free. It is the world’s biggest, and it is highly recommended as a first stop in your hunt for whatever you are looking for.

Yahoo (http://www.yahoo.com)

Launched in 1994, Yahoo is the web’s oldest “directory,” a place where human editors organize web sites into categories. However, in October 2002, Yahoo made a giant shift to crawler-based listings for its main results. These came from Google until February 2004. Now, Yahoo uses its own search technology. Learn more in this recent review from our SearchDay newsletter, which also provides some updated submission details.

The Yahoo Directory still survives. You’ll notice “category” links below some of the sites listed in response to a keyword search. When offered, these will take you to a list of web sites that have been reviewed and approved by a human editor.

It’s also possible to do a pure search of just the human-compiled Yahoo Directory, which is how the old or “classic” Yahoo used to work. To do this, search from the Yahoo Directory home page, as opposed to the regular Yahoo.com home page. Then you’ll get both directory category links (“Related Directory Categories”) and “Directory Results,” which are the top web site matches drawn from all categories of the Yahoo Directory.

Sites pay a fee to be included in the Yahoo Directory’s commercial listings, though they must meet editor approval before being accepted. Non-commercial content is accepted for free. Yahoo’s Content Acquisition Program (CAP) also offers paid inclusion, where sites can pay to be included in Yahoo’s crawler-based results. This doesn’t guarantee ranking, Yahoo promises. The CAP also brings in content from non-profit organizations for free.

Like Google, Yahoo sells paid placement advertising links that appear on its own site and which are distributed to others. These are sold through Overture, a company Yahoo purchased in October 2003.

Overture was called GoTo until late 2001. More about it can be found on the Paid Listings Search Engines page. Overture purchased AllTheWeb in March 2003 and acquired AltaVista in April 2003. Yahoo now owns both, gained through its purchase of Overture.

Technology from AltaVista and AllTheWeb was combined with that of Inktomi, a crawler-based search engine that grew out of UC Berkeley and then launched as its own company in 1996, to make the current Yahoo crawler. Yahoo purchased Inktomi in March 2003.

AOL Search (http://search.aol.com/)
AOL Search provides users with editorial listings that come from Google’s crawler-based index. Indeed, the same search on Google and AOL Search will come up with very similar matches.

Ask Jeeves (AJ) (http://ask.com)
A Search Engine that is part of the Teoma Search Engine Network. AJ has a paid inclusion / paid listing program.

Ask Jeeves initially gained fame in 1998 and 1999 as being the “natural language” search engine that let you search by asking questions and responded with what seemed to be the right answer to everything. Today, Ask Jeeves instead depends on crawler-based technology to provide results to its users.

Ask Jeeves also owns the now-closed Direct Hit service.

DMOZ (http://dmoz.org)
A Web Directory edited by human editors. Also known as ODP (Open Directory Project). It provides directory results to Google and other search engines. It is considered very important from the Search Engine Optimisation point of view, and it is also the most tedious and difficult to get into.

HotBot (http://hotbot.com)
HotBot provides easy access to the web’s three major crawler-based search engines: Yahoo, Google and Teoma. Unlike a meta search engine, it cannot blend the results from all of these crawlers together. Nevertheless, it’s a fast, easy way to get different web search “opinions” in one place.

Lycos has acquired HotBot.

Teoma (http://teoma.com)
Teoma is a crawler-based search engine owned by Ask Jeeves.

LookSmart (http://search.looksmart.com)

LookSmart is primarily a human-compiled directory of web sites, very much like an electronic “Yellow Pages”.

Lycos (http://lycos.com)

Lycos is one of the oldest search engines on the web, launched in 1994. It ceased crawling the web for its own listings in April 1999 and instead provides access to human-powered results from LookSmart for popular queries and crawler-based results from Yahoo for others.

MSN (http://www.msn.com)

Microsoft Network, a Search Engine.

MSN Search is in transition. It provides access to Yahoo listings, but not as much functionality in terms of other types of searches as you’ll find at Yahoo itself. MSN is developing its own crawler-based technology.

ODP (Open Directory Project) : Web Directory edited by human editors

And many others …

Cheers,

Kyle

http://sgugal.com


What is Rank?

Posted on September 10th, 2006 by Admin.
Categories: SEO, Web-Crawler, Page Rank, Googlism.

PageRank was developed by Larry Page and Sergey Brin, the founders of Google.

It is a proprietary website ranking system (0 to 10). A website’s PageRank depends not just on the number of backlinks, but also on the quality/importance of these backlinks.
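The exact system Google runs, and the way it maps scores onto the 0 to 10 scale, is proprietary. The basic idea that a backlink counts for more when it comes from an important page can still be sketched with the simplified, published form of the calculation. In the Python sketch below, the function name `pagerank`, the four-page `web` and the damping value of 0.85 are assumptions for illustration only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Assumption: every linked page also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web: nobody links to "orphan".
web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
scores = pagerank(web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy web, "home" comes out on top because every other page links to it, while "orphan" stays at the baseline because it has no backlinks at all.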

Alexa Rank is a measure of website popularity.
The Alexa Rank ranges from 1 to the number of sites in its database (0 = unranked). Alexa is owned by Amazon.com. The Alexa Rank uses the optional Alexa toolbar to track each and every website visit. Each visit to a site counts as one visit (called ‘reach’); multiple visits to the same site during a day only count as a single visit. Alexa also tracks the number of sub page visits to each website (called ‘page views’). The Alexa Rank is calculated daily (or so) by using the values collected for ‘reach’ and ‘page views’ and then factoring in the values for the prior three months. Updates seem to occur every 3 days or so.

Cheers,

Admin


Search Engine Basics

Posted on September 4th, 2006 by Neo.
Categories: SEO, Web-Crawler.

The term “search engine” is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Crawler-Based Search Engines

Crawler-based search engines, such as Google, create their listings automatically. They “crawl” or “spider” the web, then people search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

Human-Powered Directories

A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.

Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

“Hybrid Search Engines” Or Mixed Results

In the web’s early days, it used to be that a search engine either presented crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listings over another. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it does also present crawler-based results (as provided by Inktomi), especially for more obscure queries.

The Parts Of A Crawler-Based Search Engine

Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being “spidered” or “crawled.” The spider returns to the site on a regular basis, such as every month or two, to look for changes.

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.

Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been “spidered” but not yet “indexed.” Until it is indexed — added to the index — it is not available to those searching with the search engine.

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.
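The index and the search software can be pictured with a toy example. The Python sketch below stores, for each word, the set of pages that contain it (commonly called an inverted index, a structure chosen here for illustration rather than taken from any particular engine) and then looks up the pages that contain every word of a query. The page URLs, the page text and the function names are all made up, and no ranking is attempted.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs of pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages, as if a spider had already fetched them.
pages = {
    "example.com/paris": "Travel guide to Paris hotels",
    "example.com/rome": "Travel guide to Rome restaurants",
    "example.com/cats": "Pictures of cats",
}
index = build_index(pages)
print(sorted(search(index, "travel guide")))  # ['example.com/paris', 'example.com/rome']
print(sorted(search(index, "paris travel")))  # ['example.com/paris']
```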

Search for anything using your favorite crawler-based search engine. Nearly instantly, the search engine will sort through the millions of pages it knows about and present you with ones that match your topic. The matches will even be ranked, so that the most relevant ones come first.

Of course, the search engines don’t always get it right. Non-relevant pages make it through, and sometimes it may take a little more digging to find what you are looking for. But, by and large, search engines do an amazing job.

As WebCrawler founder Brian Pinkerton puts it, “Imagine walking up to a librarian and saying, ‘travel.’ They’re going to look at you with a blank face.”

OK — a librarian’s not really going to stare at you with a vacant expression. Instead, they’re going to ask you questions to better understand what you are looking for.

Unfortunately, search engines don’t have the ability to ask a few questions to focus your search, as a librarian can. They also can’t rely on judgment and past experience to rank web pages, in the way humans can.

So, how do crawler-based search engines go about determining relevancy, when confronted with hundreds of millions of web pages to sort through? They follow a set of rules, known as an algorithm. Exactly how a particular search engine’s algorithm works is a closely-kept trade secret. However, all major search engines follow the general rules below.

Location, Location, Location…and Frequency

One of the main rules in a ranking algorithm involves the location and frequency of keywords on a web page. Call it the location/frequency method, for short.

Remember the librarian mentioned above? They need to find books to match your request of “travel,” so it makes sense that they first look at books with travel in the title. Search engines operate the same way. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic.

Search engines will also check to see if the search keywords appear near the top of a web page, such as in the headline or in the first few paragraphs of text. They assume that any page relevant to the topic will mention those words right from the beginning.

Frequency is the other major factor in how search engines determine relevancy. A search engine will analyze how often keywords appear in relation to other words in a web page. Those with a higher frequency are often deemed more relevant than other web pages.
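Put together, the location and frequency rules might be sketched like this in Python. The weights, the 50-word cutoff for "near the top" and the function name are invented for the example; real search engines keep their actual formulas secret.

```python
import re

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def location_frequency_score(title, body, keyword,
                             title_weight=3.0, top_weight=2.0, top_words=50):
    """Score a page for one keyword using location and frequency.

    All weights are hypothetical and chosen only for illustration.
    """
    keyword = keyword.lower()
    body_words = tokenize(body)
    score = 0.0
    # Location: the keyword appears in the HTML title.
    if keyword in tokenize(title):
        score += title_weight
    # Location: the keyword appears near the top of the page.
    if keyword in body_words[:top_words]:
        score += top_weight
    # Frequency: keyword occurrences relative to all words on the page.
    if body_words:
        score += body_words.count(keyword) / len(body_words)
    return score

# Two hypothetical pages competing for the search "travel".
score_a = location_frequency_score(
    "Travel tips for Europe", "Travel light and travel often. Pack less.", "travel")
score_b = location_frequency_score(
    "My weekend", "We stayed home and later talked about travel.", "travel")
print(score_a > score_b)  # True
```

The first page wins because the keyword is in its title and makes up a larger share of its text.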

Spice In The Recipe

Now it’s time to qualify the location/frequency method described above. All the major search engines follow it to some degree, in the same way cooks may follow a standard chili recipe. But cooks like to add their own secret ingredients. In the same way, search engines add spice to the location/frequency method. Nobody does it exactly the same, which is one reason why the same search on different search engines produces different results.

To begin with, some search engines index more web pages than others. Some search engines also index web pages more often than others. The result is that no search engine has the exact same collection of web pages to search through. That naturally produces differences, when comparing their results.

Search engines may also penalize pages or exclude them from the index, if they detect search engine “spamming.” An example is when a word is repeated hundreds of times on a page, to increase the frequency and propel the page higher in the listings. Search engines watch for common spamming methods in a variety of ways, including following up on complaints from their users.

Off The Page Factors

Crawler-based search engines have plenty of experience now with webmasters who constantly rewrite their web pages in an attempt to gain better rankings. Some sophisticated webmasters may even go to great lengths to “reverse engineer” the location/frequency systems used by a particular search engine. Because of this, all major search engines now also make use of “off the page” ranking criteria.

Off the page factors are those that webmasters cannot easily influence. Chief among these is link analysis. By analyzing how pages link to each other, a search engine can both determine what a page is about and whether that page is deemed to be “important” and thus deserving of a ranking boost. In addition, sophisticated techniques are used to screen out attempts by webmasters to build “artificial” links designed to boost their rankings.

Another off the page factor is clickthrough measurement. In short, this means that a search engine may watch what results someone selects for a particular search, then eventually drop high-ranking pages that aren’t attracting clicks, while promoting lower-ranking pages that do pull in visitors. As with link analysis, systems are used to compensate for artificial clicks generated by eager webmasters.
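A bare-bones picture of clickthrough measurement, in Python: count how often each result was shown and how often it was clicked, then reorder by click rate. The field names, the numbers and the function name are made up, and the sketch leaves out the safeguards against artificial activity that a real engine would need.

```python
def rerank_by_clickthrough(results):
    """Reorder results so that pages with a higher click rate come first.

    `results` is a list of dicts in their current ranking order, each with
    hypothetical "url", "impressions" and "clicks" fields.
    """
    def click_rate(result):
        shown = result["impressions"]
        return result["clicks"] / shown if shown else 0.0

    # sorted() is stable, so results with equal click rates keep their old order.
    return sorted(results, key=click_rate, reverse=True)

# Hypothetical log for one query: the top result is rarely clicked.
current = [
    {"url": "example.com/a", "impressions": 1000, "clicks": 10},
    {"url": "example.com/b", "impressions": 1000, "clicks": 150},
    {"url": "example.com/c", "impressions": 1000, "clicks": 40},
]
for result in rerank_by_clickthrough(current):
    print(result["url"])
# prints example.com/b, then example.com/c, then example.com/a
```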

–Neo
