How to Use Google's Indexing API ► emperinter

This article was created (or last updated) on August 7, 2022, so keep in mind that it may be out of date!

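I recently set up a site and submitted its sitemap to Search Console, but Search Console kept reporting that the sitemap could not be read. I checked the sitemap several times and found nothing wrong with it; Baidu, Bing, and others indexed the site normally, and even after I wrote my own sitemap generator the result was the same. Later I found out that Google also accepts URL submissions through an API, so I gave it a try.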

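Reference: https://developers.google.com/search/apis/indexing-api/v3/quickstart?hl=zh-cn
There is not much to say about registration: just follow the tutorial above. The one thing to watch is authorization, because per the quickstart the service account has to be added as an owner of the site in Search Console.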
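The address that needs to be authorized is the service account's own email, which is stored in the downloaded JSON key under `client_email`. Below is a minimal sketch for reading it, assuming the standard service-account key layout; the helper name `service_account_email` is illustrative and not part of the original script.

```python
import json


def service_account_email(key_path):
    """Return the client_email field from a service-account JSON key file."""
    with open(key_path, "r", encoding="utf-8") as f:
        key = json.load(f)
    # This is the address to add as an owner in Search Console.
    return key["client_email"]
```

For example, `print(service_account_email("your-index-api.json"))` shows the address for the placeholder key file name used later in this post.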
requirements.txt

beautifulsoup4==4.11.1
bs4==0.0.1
cachetools==5.2.0
certifi==2022.6.15
charset-normalizer==2.1.0
google-api-core==2.8.2
google-api-python-client==2.55.0
google-auth-httplib2==0.1.0
google-auth==2.9.1
googleapis-common-protos==1.56.4
httplib2==0.20.4
idna==3.3
oauth2client==4.1.3
pip==21.3.1
protobuf==4.21.4
pyasn1-modules==0.2.8
pyasn1==0.4.8
pyparsing==3.0.9
requests==2.28.1
rsa==4.9
setuptools==60.2.0
six==1.16.0
soupsieve==2.3.2.post1
uritemplate==4.1.1
urllib3==1.26.11
wheel==0.37.1

Install the dependencies from requirements.txt:

pip install -r requirements.txt

Remember to edit the code so that the paths point to your own JSON key file and your own sitemap.
The code is as follows:

https://github.com/emperinter/GoogleIndexAPI

import json

import httplib2
import requests as req
from bs4 import BeautifulSoup
from oauth2client.service_account import ServiceAccountCredentials


def index(url):
    """Tell Google that `url` was added or updated; return the HTTP response headers."""
    SCOPES = ["https://www.googleapis.com/auth/indexing"]
    ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

    # JSON_KEY_FILE is the private key file that you created for your service account.
    JSON_KEY_FILE = "your-index-api.json"

    credentials = ServiceAccountCredentials.from_json_keyfile_name(JSON_KEY_FILE, scopes=SCOPES)

    http = credentials.authorize(httplib2.Http())

    # Build the request body as a JSON string; json.dumps escapes the URL safely.
    # This is a simple update request. Use "URL_DELETED" as the type to report a removed page.
    body = json.dumps({"url": url, "type": "URL_UPDATED"})

    response, content = http.request(ENDPOINT, method="POST", body=body)
    return response

all_link = []
origin_url = 'https://your-sitemap-url.xml'
r = req.get(origin_url)
bs = BeautifulSoup(r.content, 'html.parser')  # parse the sitemap
# Extra filters can be passed to find_all if needed (see the BeautifulSoup docs);
# so far I have only tested attrs={'alt': ''}.
hyperlink = bs.find_all(name='loc')
for h in hyperlink:
    hh = h.get_text(strip=True)  # plain string without surrounding whitespace
    all_link.append(hh)

all_link.reverse()

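sent = []

# Load the URLs that were already submitted (sent.txt may not exist on the first run)
try:
    with open("sent.txt", "r") as fo:
        print("File name: ", fo.name)
        for line in fo:  # read the file line by line
            line = line.strip()  # strip leading and trailing whitespace
            sent.append(line)  # add the line to the list
            print("Read: %s" % line)
except FileNotFoundError:
    pass  # nothing has been sent yet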

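for link in all_link:
    if link not in sent:
        print(link)
        res = index(link)
        if res.get("status") == "200":
            # Record the URL so it is skipped on the next run
            with open("sent.txt", 'a+') as f:
                f.write(str(link) + '\n')  # one URL per line
        else:
            # Stop at the first failed request and print the response headers
            print(res)
            break
    else:
        print(str(link) + ' has already been sent')
        continue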
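The script above reads the sitemap with BeautifulSoup's `html.parser`, which works because it simply looks for `loc` tags. As an alternative, here is a rough sketch that parses the sitemap as XML using only the standard library. It assumes the sitemap uses the sitemaps.org namespace; the names `extract_locs` and `SITEMAP_NS` and the sample document are illustrative and not part of the original repository.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed to be the one your sitemap uses).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_data):
    """Return the text of every <loc> element in a sitemap document (str or bytes)."""
    root = ET.fromstring(xml_data)
    return [loc.text.strip() for loc in root.iterfind(".//sm:loc", SITEMAP_NS) if loc.text]


# A tiny hand-written sitemap used only for demonstration.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc></url>
  <url><loc> https://example.com/b </loc></url>
</urlset>"""

print(extract_locs(sample))
```

In the script, `extract_locs(r.content)` could stand in for the BeautifulSoup lines. The trade-off is that it raises a parse error on malformed XML, whereas `html.parser` is more forgiving.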

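Unless otherwise stated, all works on this blog are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. When reposting, please credit the source:
https://www.emperinter.info/2022/08/07/how-to-use-googles-indexing-api/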
