Data Visualization with Pyecharts

This article was created (or last updated) on 05/1/2020; please keep its age in mind!


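Introduction

Echarts is a data visualization library open-sourced by Baidu. Its good interactivity and polished chart design have won it recognition from many developers. Python, in turn, is an expressive language that is well suited to data processing. pyecharts was born where data analysis meets data visualization.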

Official Tutorial

The introduction and examples on the official site are already very detailed, so I won't repeat them here.

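The Hard Part: Data Processing

My approach is to first pick the visual effect I want, look at the data format used by the corresponding example, and then convert my own data into that format.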
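As a minimal sketch of this kind of conversion, assume the data starts out as a plain nested dict (year, then month, then a list of post titles) and the target is the node shape used by tree charts, where every node is a dict with a "name" and an optional "children" list. The helper name to_tree_node and the sample data are made up for illustration.

```python
def to_tree_node(name, value):
    """Convert a nested dict/list into the {"name", "children"} node shape."""
    if isinstance(value, dict):
        # Each key becomes a child node; recurse into its value.
        children = [to_tree_node(key, sub) for key, sub in value.items()]
    elif isinstance(value, (list, tuple)):
        # List items become leaf nodes.
        children = [{"name": str(item)} for item in value]
    else:
        # A scalar is folded into the leaf label.
        return {"name": f"{name}: {value}"}
    return {"name": str(name), "children": children}


# Hypothetical input data.
posts = {"2019": {"01": ["post-a", "post-b"], "02": ["post-c"]}}
tree = to_tree_node("blog", posts)
print(tree)
```

The resulting dict can be wrapped in a list and handed to a tree chart in the same way the script further down passes [j].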

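Example: Visualizing Blog Posts

Visualization Idea

The goal is to visualize how many posts I have published, grouped by year and month. My post URLs happen to be built from the year, the month, and the post title, so crawling the links is all that is needed!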
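To illustrate the idea, here is a small sketch that counts posts per year and month straight from the URLs. It assumes permalinks of the form /YYYY/MM/... after the host name; the sample links and the names PERMALINK and count_by_month are invented for this example.

```python
import re
from collections import Counter

# Assumed permalink shape: scheme://host/YYYY/MM/...
PERMALINK = re.compile(r"^https?://[^/]+/(\d{4})/(\d{2})/")


def count_by_month(urls):
    """Return a Counter keyed by (year, month) strings."""
    counts = Counter()
    for url in urls:
        match = PERMALINK.match(url)
        if match:  # skip pages that are not dated posts
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical sample links.
links = [
    "https://www.example.com/2019/12/24/first-post/",
    "https://www.example.com/2020/05/01/second-post/",
    "https://www.example.com/2020/05/03/third-post/",
    "https://www.example.com/about/",
]
counts = count_by_month(links)
for (year, month), total in sorted(counts.items()):
    print(year, month, total)
```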

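How to Get the Site's Links

I wrote a PHP page and placed it in the site root; for details, see my post on actively pushing a sitemap to Baidu with Python. Looking back, I did not fully understand the sitemap format at the time. This PHP page only works for WordPress sites. If you are up to it, you can write a crawler that reads your own sitemap to get the posts instead. I have quite a few sitemaps, so I did not bother to change it!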

<?php
// Load WordPress so that get_posts() and the_permalink() are available.
require('wp-blog-header.php');
header("Content-type: text/xml");
header('HTTP/1.1 200 OK');
$posts_to_show = 2000; // maximum number of posts to list
echo '<?xml version="1.0" encoding="UTF-8"?>';
echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:mobile="http://www.baidu.com/schemas/sitemap-mobile/1/">';
?>
<?php
$myposts = get_posts( "numberposts=" . $posts_to_show );
// At file scope $post is the global that the_permalink() reads.
foreach( $myposts as $post ) { ?>
 <url>
 <loc><?php the_permalink(); ?></loc>
 </url>
<?php }?>
</urlset>
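For readers who would rather read a sitemap directly, the following is a rough sketch of pulling the <loc> values out of sitemap XML with the standard library. It assumes the document uses the standard sitemaps.org namespace; the function name extract_locs and the sample document are hypothetical, and in practice the bytes would come from an HTTP response body.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(xml_bytes):
    """Return the text of every <url><loc> element in a sitemap."""
    root = ET.fromstring(xml_bytes)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/2020/05/01/first-post/</loc></url>
  <url><loc> https://www.example.com/2020/06/15/second-post/ </loc></url>
</urlset>"""

found = extract_locs(sample)
print(found)
```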

Python Implementation

# Build a JSON tree from the links of all posts
import requests as req
import json
import time
import re
from bs4 import BeautifulSoup
from pyecharts import options as opts
from pyecharts.charts import Tree
# Collect the links of every post on the site.
# The links are generated by all.php, placed in the site root.
i = 0   # number of links
all_link = []
origin_url = 'https://www.emperinter.info/all'
r = req.get(origin_url)
bs = BeautifulSoup(r.content, 'html.parser')  # parse the page
hyperlink = bs.find_all(name = 'loc')
for h in hyperlink:
    hh = h.string
    all_link.append(hh)
    i += 1

#
# Build the JSON file from the post links, grouped by year and month
#
year = 2019  # 2018 is handled separately below
month = 1    # start from January

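# Use a regular expression to check whether a post exists for the given year
# and month; return the first matching link, or 'No!' if there is none.
def rematch(year, month):
    for m in all_link:
        if re.match(r'^((https|http|ftp|rtsp|mms)?://)www.emperinter.info/' + str(year) + '/' + str(month), m):
            return m
    # Only report a miss after every link has been checked.
    return 'No!'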

string = '{"name":"https://www.emperinter.info","children":[{"name":"2018","children":[{"name":"*"}'
while(month <= 12):
    if(month < 10):
        zero_month = '0' + str(month)
    else:
        zero_month = month
    string = string + ',{"name":"' + str(zero_month) + '","children":[{"name":"*"}'
    for m in all_link:
        if(re.match(r'^((https|http|ftp|rtsp|mms)?://)www.emperinter.info/' + str(2018) + '/' + str(zero_month),m)):
            string = string + ',{"name":"' + str(m) + '"}'
    string = string + ',{"name":"*"}]}'
    month = month + 1
string = string + ',{"name":"*"}]}'

now = int((time.strftime("%Y", time.localtime()))) # current year
while(year <= now): 
    month = 1
    string = string + ',{"name":"' + str(year) + '","children":[{"name":"*"}'
    while(month <= 12): 
        if(month < 10):
            zero_month = '0' + str(month)
        else:
            zero_month = month
        string = string + ',{"name":"' + str(zero_month) + '","children":[{"name":"*"}'
        for m in all_link:
            if(re.match(r'^((https|http|ftp|rtsp|mms)?://)www.emperinter.info/' + str(year) + '/' + str(zero_month),m)):
                string = string + ',{"name":"' + str(m) + '"}'
        string = string + ',{"name":"*"}]}'
        month = month + 1
    string = string + ',{"name":"*"}]}'
    year = year + 1
string = string + ']}'

# Write the string to a JSON file (UTF-8, to match how it is read back below)
with open('data.json', 'w', encoding='utf-8') as f:
    f.write(string)

# Generate the visualization
with open("data.json", "r", encoding="utf-8") as f:
    j = json.load(f)
c = (
    Tree(init_opts=opts.InitOpts(width="1920px",height="1080px",page_title="emperinter's-blog-visualization"))
    .add("", 
         [j], 
         collapse_interval=0,
         layout="radial",
         pos_top="18%",
         pos_bottom="14%",
         symbol="emptyCircle",
         symbol_size=7,
         initial_tree_depth=-1,
         is_roam=True
        )
    .set_global_opts(tooltip_opts=opts.TooltipOpts(trigger="item", trigger_on="mousemove"))
    .render("visual.php")
)
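The script above assembles the JSON by string concatenation. As a hedged alternative sketch (not what the script does), the same year/month grouping can be built from plain dicts and serialized with json.dumps, which takes care of quoting and brackets. Unlike the script, this version leaves out empty months and the "*" placeholder nodes; the names build_tree and PERMALINK and the sample links are assumptions for illustration.

```python
import json
import re
from collections import defaultdict

# Assumed permalink shape: scheme://host/YYYY/MM/...
PERMALINK = re.compile(r"^https?://[^/]+/(\d{4})/(\d{2})/")


def build_tree(root_name, urls):
    """Group URLs by year and month into nested {"name", "children"} nodes."""
    grouped = defaultdict(lambda: defaultdict(list))
    for url in urls:
        match = PERMALINK.match(url)
        if match:
            grouped[match.group(1)][match.group(2)].append(url)
    year_nodes = []
    for year in sorted(grouped):
        month_nodes = [
            {"name": month, "children": [{"name": u} for u in grouped[year][month]]}
            for month in sorted(grouped[year])
        ]
        year_nodes.append({"name": year, "children": month_nodes})
    return {"name": root_name, "children": year_nodes}


# Hypothetical sample links.
links = [
    "https://www.example.com/2020/05/01/second-post/",
    "https://www.example.com/2019/12/24/first-post/",
    "https://www.example.com/2020/05/03/third-post/",
]
tree = build_tree("https://www.example.com", links)
text = json.dumps(tree, ensure_ascii=False, indent=2)
print(text)
```

The text can be written to data.json, or the tree dict can be passed to the chart directly without the round trip through a file.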

Adding It to crontab for Scheduled Runs

crontab -e
0 3 * * * python3 WebsiteDir/visual.py


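Unless otherwise stated, all works on this blog are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. When reposting, please credit the source:
https://www.emperinter.info/2020/05/01/pyecharts/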
