XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)


    Just a heads-up to anyone using an XML site map (such as our Moogle XML Site Map module) AND including product image URLs.

    Make sure you check your robots.txt file. Many sites have something like this:

    User-agent: *
    Disallow: /mm5/

    This prevents search bots from listing your Merchant URLs when you use SEO-style links. If you have it, you need to add an Allow line for the image directory:

    User-agent: *
    Disallow: /mm5/
    Allow: /mm5/graphics/00000001/

    (Thanks to Greg Lewis at tandampac.com for bringing this up.)
    Bruce Golub
    Phosphor Media - "Your Success is our Business"

    Improve Your Customer Service | Get MORE Customers | Edit CSS/Javascript/HTML Easily | Make Your Site Faster | Get Indexed by Google | Free Modules | Follow Us on Facebook
    phosphormedia.com

    #2
    Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

    This is definitely needed, since Google moved to the Image Bot some time back to digest images for Google Shopping and other feeds. Good advice :)



      #3
      Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

      Which directory should contain the robots.txt file? PS: Found it via Google. Larry
      Last edited by wajake41; 10-07-13, 08:21 AM.
      Larry
      Luce Kanun Web Design
      www.facebook.com/wajake41
      www.plus.google.com/116415026668025242914/posts?hl=en




        #4
        Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

        Hi: How does the robots.txt file affect Google indexing of the site? Larry
        Larry


          #5
          Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

          The robots.txt file must reside in the web root directory. Crawlers only request it at /robots.txt on each host, so a copy placed in a subdirectory is ignored.

          The robots.txt file is a search engine 'helper' file. All major search engines look for it and, when they find it, read the instructions in it. They then, more or less, abide by those instructions. So how it affects indexing depends on what instructions are in there. Sites that use SEO links frequently have the instruction:
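As far as the Robots Exclusion Protocol goes, crawlers request the file only at the root of the host, so the robots.txt URL can be derived from any page or image URL. A minimal Python sketch, assuming a made-up host (www.example.com is not from this thread):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL a crawler would request for this page's host."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# Hypothetical image URL on an example host.
print(robots_url("https://www.example.com/mm5/graphics/00000001/filename.jpg"))
# https://www.example.com/robots.txt
```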

          disallow: /mm5/*

          Which prevents them from indexing merchant queries, which, when indexed along with the SEO versions of the link, can cause 'duplicate' content penalties.

          However, since images are requested as /mm5/graphics/00000001/filename.jpg even when you use SEO short links, you can't get images indexed with that Disallow statement alone. The follow-up Allow statement tells search engines they may access those images.
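To see why the extra Allow line is enough, here is a rough Python sketch of the "most specific rule wins" precedence that Google documents and that the Robots Exclusion Protocol standard describes. It is a simplification, not any crawler's actual code: it matches plain path prefixes only and ignores wildcards such as `*`. The function name and the test paths other than the image path are made up for illustration.

```python
# Rules from this thread, as (directive, path-prefix) pairs.
RULES = [
    ("disallow", "/mm5/"),
    ("allow", "/mm5/graphics/00000001/"),
]

def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if path.startswith(prefix):
            # The longer prefix wins; on a tie, allow wins.
            longer = len(prefix) > best_len
            tie_allow = len(prefix) == best_len and directive == "allow"
            if longer or tie_allow:
                best_len = len(prefix)
                allowed = directive == "allow"
    return allowed

print(is_allowed("/mm5/graphics/00000001/filename.jpg"))  # True
print(is_allowed("/mm5/merchant.mvc"))                     # False
print(is_allowed("/product-name.html"))                    # True
```

One caution: not every parser uses longest-match. Python's own urllib.robotparser, for example, applies the first rule that matches, so it would report the image as blocked when the Disallow line comes first.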
          Bruce Golub


            #6
            Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

            So the usage of:


            User-agent: *
            Allow: /Merchant2/graphics/
            Allow: /Merchant2/images/

            User-Agent: Googlebot
            Allow: /Merchant2/graphics/
            Allow: /Merchant2/images/


            would be beneficial? Is there any possibility that something like this could cause Googlebot to encounter "15 errors while attempting to connect to your site"? I think it had more to do with the site being down a number of times in the past few days, but I wanted to make sure.
            Leslie Kirk
            Miva Certified Developer
            Miva Merchant Specialist since 1997
            Previously of Webs Your Way
            (aka Leslie Nord leslienord)

            www.lesliekirk.com

            Follow me: Twitter | Facebook | FourSquare | Pinterest | Flickr



              #7
              Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

              No, that instruction would not cause this type of error. The error specifically refers to "connections". The robots.txt directives have nothing to do with connections, just where Google and other bots are allowed to go. (Hint: they wouldn't see the robots.txt file UNLESS they could connect.)

              BTW: those rules are redundant. The two groups are identical, and * already covers every bot, including Google.
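A small Python sketch of how a crawler picks its group may make the redundancy concrete. It is a simplified reading of the protocol, not a full parser: it matches user-agent names exactly (case-insensitively) instead of by product token, and it does not merge repeated groups for the same agent. "SomeOtherBot" and the helper names are made up for illustration.

```python
# The robots.txt from post #6.
ROBOTS = """\
User-agent: *
Allow: /Merchant2/graphics/
Allow: /Merchant2/images/

User-Agent: Googlebot
Allow: /Merchant2/graphics/
Allow: /Merchant2/images/
"""

def parse_groups(text):
    """Map each lowercased user-agent name to its list of (directive, path) rules."""
    groups = {}
    rules, in_rules = [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                rules, in_rules = [], False
            groups[value.lower()] = rules
        elif field in ("allow", "disallow"):
            in_rules = True
            rules.append((field, value))
    return groups

def rules_for(agent, groups):
    """Use the agent's own group if it has one, otherwise fall back to '*'."""
    return groups.get(agent.lower(), groups.get("*", []))

groups = parse_groups(ROBOTS)
print(rules_for("Googlebot", groups) == rules_for("SomeOtherBot", groups))  # True
```

The fallback order matters if the groups ever differ: a bot that has its own group follows only that group, so a Disallow added under * alone would no longer apply to Googlebot.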
              Bruce Golub


                #8
                Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

                That's what I thought. It makes me a little concerned when a host suggests that the robots.txt file could be the cause of the connection issues.
                Leslie Kirk


                  #9
                  Re: XML Site Maps, Image URLs and Robots Files (AKA Lions, Tigers and Bears...oh my)

                  Well, yeah, nothing under a host's control could affect connection issues, so it has to be something you did.
                  Bruce Golub


                    #10
                    wrong place..
                    Last edited by Datagg; 03-20-16, 10:44 AM.
                    Dan

                    Girlfriends Lingerie - "Keeping It Sexy!"
                    Sexy Lingerie - Twitter - Facebook- Pinterest - YouTube
