Posts Tagged ‘Search Engines’

How to Eliminate 404 Pages


Defining and eliminating 404 errors

Define 404: A 404 error means “not found”. It is the page you usually get when you misspell a page name on a site, or when the page has been deleted or moved. The problem is that the standard 404 page is ugly and unhelpful.

If too many of your website’s 404 errors are floating around in the search engines, a few things may happen:

  • Your website’s popularity may be harmed.
  • Your website’s rankings may be affected.
  • Your website may be penalized.
  • Your website may be banned from the index.

Eliminate 404 pages from Google’s index:

  • Set up Google Webmaster Tools
  • Find your 404 error pages
  • Kill your 404 error pages
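Besides Google Webmaster Tools, your own server log is another place to find 404 pages. The sketch below assumes the log is in Common Log Format; the sample lines, the regular expression and the function name find_404_paths are illustrative assumptions, not part of any tool mentioned above.

```python
import re
from collections import Counter

# Picks the request path and status code out of a Common Log Format line, e.g.
# 1.2.3.4 - - [10/Oct/2012:13:55:36 +0000] "GET /old-page HTTP/1.1" 404 512
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def find_404_paths(log_lines):
    """Count how often each requested path returned a 404 status."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts


# Made-up sample data for illustration only.
sample_log = [
    '1.2.3.4 - - [10/Oct/2012:13:55:36 +0000] "GET /old-page HTTP/1.1" 404 512',
    '1.2.3.4 - - [10/Oct/2012:13:55:40 +0000] "GET / HTTP/1.1" 200 2326',
    '5.6.7.8 - - [10/Oct/2012:13:56:01 +0000] "GET /old-page HTTP/1.1" 404 512',
]

for path, hits in find_404_paths(sample_log).most_common():
    print(path, hits)
```

Paths that show up most often are usually the ones worth fixing first.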

Recommended solution: You can redirect 404 error (not found) pages to the home page using a 301 redirect. In this case, visitors and Google see your home page instead of an error page. It’s a simple approach, and it’s friendly to Google and other search engines too.
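How you set up the redirect depends on your server. As a rough sketch only, here is the idea written as a tiny Python WSGI application; the KNOWN_PATHS set and the page text are hypothetical placeholders, and a real site would more likely configure this in its web server or CMS.

```python
# Hypothetical list of pages that really exist on the site.
KNOWN_PATHS = {"/", "/about", "/contact"}


def app(environ, start_response):
    """Serve known pages; send every other request to the home page with a 301."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PATHS:
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [f"Content for {path}".encode("utf-8")]
    # Anything else would have been a 404: redirect permanently to the home page.
    start_response("301 Moved Permanently", [("Location", "/")])
    return [b""]
```

To try it locally, the app can be passed to make_server from Python’s built-in wsgiref.simple_server module.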

 

Hope this helps…@araghuwanshi6

Ways to Make Sure Search Engines Can Crawl Your Site


I’m talking about building your website so that search engines can find your products, services and all the content you have published.

Here are six ways to ensure that search engines have no problem finding and indexing your web pages:

1. Avoid Flash: Flash is not inherently bad. When used correctly, it can improve the visitor experience. But your site should not be built entirely in Flash, nor should your site navigation be done only in Flash. Search engines have said for a few years now that they are getting better at crawling Flash, but it’s still not a substitute for good, crawlable site menus and content.

2. Avoid AJAX: The same ideas mentioned above for Flash apply to AJAX. It can add to the user experience of your site, but AJAX has historically not been visible to search engines. Google offers a guide to help make AJAX-based content crawlable, but it is complicated, and the SEO “best practice” recommendation remains the same: do not put important content in AJAX.

3. Avoid complex JavaScript menus: JavaScript is another technology that search engines are increasingly able to crawl, but it is still better to avoid it as the main method of presenting site navigation. In 2007, Google said:

While we strive to understand JavaScript code, the best option for creating a site that is indexed by Google and other search engines is to provide HTML links to the content.

It remains the best practice today: make sure your site navigation is presented as simple, easy-to-crawl HTML links.
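To get a feel for the difference, here is a small sketch that reads a page the way a very simple crawler might: it collects only plain HTML links and does not run any JavaScript. The sample page and the names LinkCollector and crawlable_links are made up for illustration; real search engine crawlers are far more capable than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in the HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawlable_links(html):
    """Return the links visible to a parser that reads HTML but runs no scripts."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Made-up navigation: two plain HTML links and one JavaScript-only menu item.
page = """
<ul>
  <li><a href="/products">Products</a></li>
  <li><a href="/services">Services</a></li>
  <li><span onclick="window.location='/hidden'">Hidden</span></li>
</ul>
"""

print(crawlable_links(page))
```

The two plain links are found, while the menu item that only works through JavaScript is missed.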

4. Avoid long dynamic URLs: A “dynamic URL” is most simply defined as one that has a “?” in it, such as

http://www.yourdomain.com/page.src?ID=1987

This is a very simple dynamic URL, and search engines now index URLs like it without trouble. But as dynamic URLs get longer and more complex, search engines may be less likely to index them (for various reasons, one of which is that research shows that searchers prefer short URLs). So, if your URL looks like this, you may have crawlability problems:

http://www.yourdomain.com/page.src?ID=0897&dasda=453456565&CID=336794445&=93009asd09a9sc

Google’s Webmaster Help page reads: “… be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.”
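If you want to audit your own URLs, a quick check along these lines may help. It is only a sketch: the thresholds max_params and max_query_length are arbitrary assumptions chosen for illustration, not limits published by any search engine.

```python
from urllib.parse import parse_qsl, urlsplit


def describe_dynamic_url(url, max_params=2, max_query_length=40):
    """Report whether a URL is dynamic and whether its query string looks too complex.

    The default thresholds are illustrative assumptions, not official limits.
    """
    query = urlsplit(url).query
    params = parse_qsl(query, keep_blank_values=True)
    return {
        "is_dynamic": bool(query),
        "param_count": len(params),
        "too_complex": len(params) > max_params or len(query) > max_query_length,
    }


short_url = "http://www.yourdomain.com/page.src?ID=1987"
long_url = (
    "http://www.yourdomain.com/page.src"
    "?ID=0897&dasda=453456565&CID=336794445&=93009asd09a9sc"
)

print(describe_dynamic_url(short_url))
print(describe_dynamic_url(long_url))
```

With these assumed thresholds, the short example URL passes and the long one is flagged.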

5. Avoid session IDs in URLs: This is an offshoot of the previous point, but it deserves to be listed separately. Search engines generally avoid crawling and indexing URLs that contain a session ID. Why? Because the session ID makes the URL different each time the spider visits, even though the content of the page is the same. If search engines indexed URLs with session IDs, a ton of duplicate content would appear in search results. A session ID in a URL looks something like this:

SESSION=9875e907332atf56
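To see why these URLs count as duplicates, the sketch below strips the session parameter and shows that two visits end up at the same address. The list of parameter names in SESSION_PARAMS is an assumption; check which name your own site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of session parameters; adjust to match your own site.
SESSION_PARAMS = {"session", "sessionid", "sid", "phpsessid", "jsessionid"}


def strip_session_id(url):
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


first_visit = "http://www.yourdomain.com/page.src?ID=1987&SESSION=9875e907332atf56"
second_visit = "http://www.yourdomain.com/page.src?ID=1987&SESSION=11aa22bb33cc44dd"

# Both visits point at the same page once the session ID is gone.
print(strip_session_id(first_visit))
print(strip_session_id(first_visit) == strip_session_id(second_visit))
```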

6. Avoid robots.txt blocking: First of all, there is no need to have a robots.txt file on a website; millions of websites do very well without one. But if you use one (perhaps because you want to make sure your admin or members-only pages are not indexed), be careful not to block spiders from the entire website.

In no case should your robots.txt file look like this:

User-agent: *
Disallow: /

That code blocks all spiders from accessing your site. If you ever have any questions about using a robots.txt file, visit robotstxt.org.
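If you are not sure what your own file does, Python’s built-in urllib.robotparser module can check it for you. This is a minimal sketch; the helper name blocks_entire_site and the /admin/ path in the second sample are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocks_entire_site(robots_txt, user_agent="*"):
    """Return True if the given robots.txt text stops user_agent from fetching "/"."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


# The file to avoid: it shuts every spider out of the whole site.
blocks_everything = """\
User-agent: *
Disallow: /
"""

# A safer file that only hides a hypothetical admin area.
blocks_admin_only = """\
User-agent: *
Disallow: /admin/
"""

print(blocks_entire_site(blocks_everything))
print(blocks_entire_site(blocks_admin_only))
```

The first file is reported as blocking the whole site; the second is not, because only the admin area is off limits.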

If you take care of all the issues above, you can be sure you’ve made it as easy as possible for search engines to crawl and index your site.