Easy ways to determine if search engines can find your site
If search engines can’t find your site, it may as well be the Invisible Man. Search engines like Yahoo!, Google, Bing and others are often a primary source of traffic, and they decide where your pages rank in their results. Unfortunately, publishing your files to the Internet does not guarantee that search engines will find them.
Search engines crawl the Web, indexing pages and following links to find more pages. That’s their job. Pages that are newly published can appear in a search engine’s index (and in search results) within minutes, but sometimes it takes hours or even days.
This article will help you to get an idea of whether the search engines are finding your site, and what they see. In the examples below we’ll look at Yahoo!, Google and Bing, the three search engines with the highest market share in the U.S.
How to tell if your pages are being found by search engines
To determine whether your site is being indexed, type “site:” followed by your URL into the search box on any of the search engines. Specific instructions like this are called “operators,” and this one is known as the “site operator.”
Example: site:yoursite.com
Don’t leave any spaces in the query.

The results will bring back pages from that site only. If you do not see any results from a site:yoursite.com search, then the search engine is not finding your site.
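If you check several sites regularly, the query can be built in a few lines of Python. This is only a sketch: the search URL patterns below are assumptions about how each engine currently accepts a query in its address bar, not something the engines guarantee, and the function names are made up for illustration.

```python
from urllib.parse import quote_plus

# Assumed URL patterns for each engine's results page (they may change).
SEARCH_URLS = {
    "google": "https://www.google.com/search?q={query}",
    "bing": "https://www.bing.com/search?q={query}",
    "yahoo": "https://search.yahoo.com/search?p={query}",
}


def site_query(domain):
    """Build a site: operator query with no spaces in it."""
    domain = domain.strip()
    # Drop the scheme so the query reads site:yoursite.com
    for prefix in ("https://", "http://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
    return "site:" + domain.rstrip("/")


def search_url(engine, domain):
    """Return a results-page URL for the site: query on one engine."""
    return SEARCH_URLS[engine].format(query=quote_plus(site_query(domain)))


if __name__ == "__main__":
    for engine in SEARCH_URLS:
        print(engine, search_url(engine, "yoursite.com"))
```

Paste any of the printed URLs into your browser to run the same search you would type by hand.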
Google does not show duplicate pages in these results, but it does let you see what has been filtered. To see all pages, including ones Google deems duplicates, look for a link after your very last search result that says “repeat the search with the omitted results included.” Click that link to see all pages that Google considers duplicates of the ones listed in the initial results.

Or simply add &filter=0 to the end of the URL in your browser address bar and hit enter.
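If you would rather not edit the address bar by hand, the same change can be made in code. The sketch below assumes you have copied the results-page URL from your browser; the function name is hypothetical, and it only rewrites the URL, it does not run the search.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_off(url):
    """Return the results URL with filter=0 set exactly once."""
    parts = urlsplit(url)
    # Keep every existing parameter except an earlier filter value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "filter"
    ]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    print(with_filter_off("https://www.google.com/search?q=site%3Ayoursite.com"))
```

Rebuilding the query string instead of simply appending text keeps the URL valid even if it already contains a filter parameter.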

If you don’t see the “repeat the search with the omitted results included” link, or you see no change when you add &filter=0 to the URL, then no duplicate pages were being filtered. This is a good thing, because duplicate pages can split your inbound link value among many landing pages instead of concentrating it on one, potentially hurting the rankings of your canonical landing page.
Is your content being crawled by search engines?
Search engines may find your page URLs when they crawl the Web following links, but they may not have the content of your pages indexed. To determine what search engines are actually indexing, you can click on the “cache” link on listings in search results.
If you’re looking for any page on your site, use the same site: operator referenced above.
If you’re looking for a specific page on your site, search for the page by entering its exact URL in the search box.
If you’re looking for specific content on your site, enter the site: operator (with no space between site: and your URL) followed by an exact phrase in quotes.
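The three lookups above can be written out as query strings. This is a minimal sketch; the function name, the page URL and the phrase are all placeholders invented for the example.

```python
def lookup_queries(domain, page_url, phrase):
    """Return the text to type into the search box for each kind of lookup."""
    # Collapse stray whitespace and drop quotes so the phrase stays one unit.
    clean_phrase = " ".join(phrase.split()).replace('"', "")
    return {
        "any page": f"site:{domain}",
        "specific page": page_url,
        "specific content": f'site:{domain} "{clean_phrase}"',
    }


if __name__ == "__main__":
    queries = lookup_queries(
        "yoursite.com",                      # placeholder domain
        "http://yoursite.com/tables.html",   # placeholder page URL
        "hand-made oak tables",              # placeholder phrase
    )
    for label, query in queries.items():
        print(f"{label}: {query}")
```

Note the single space between the site: operator and the quoted phrase; the operator itself still contains no spaces.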

To check what each search engine has cached, click on the cache link under the result you’re interested in. Here are screenshots of the cache link on Yahoo! (Google and Bing look pretty much the same).

When you click on the cached page, you will see the content that the search engine has actually indexed. Compare that to the page you see in your browser when you visit the page itself. Do you see any content missing?
Note that the content may have changed since the last time the search engine crawled your page. Search engines also sometimes choose not to index “noise” on a page, such as advertisements. What is important to look for here is that the topical content of the page, as it was when the crawler visited, was indeed indexed. If important content is missing, there could be various reasons why.
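For long pages, comparing by eye gets tedious. One rough way to automate the check, assuming you have saved the HTML of the cached copy yourself, is to pull out its visible text and test whether the phrases you care about appear in it. The class and function names here are made up for the example, and the text extraction is deliberately simple.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return lowercased visible text with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split()).lower()


def missing_phrases(cached_html, phrases):
    """List the phrases that do not appear in the cached copy's text."""
    cached = visible_text(cached_html)
    return [p for p in phrases if " ".join(p.split()).lower() not in cached]
```

Pass it the saved cached HTML and a handful of key phrases from your live page; anything it returns is worth a closer look, keeping in mind that the page may simply have changed since the last crawl.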
Read the Search Engine Guidelines referenced below for more information, and stay tuned for the next SEO article, where we’ll discuss steps you can take to help the search engines crawl and index your content.
Search engine guidelines for webmasters
—Laura Lippay, Director of Technical Marketing
(Image courtesy ‘J’, via Flickr, CC 2.0)