用Python做桑基图

作者

Simonzhou

发布于

2025年3月14日

1 桑基图

桑基图(Sankey diagram),即桑基能量分流图,也叫桑基能量平衡图。它是一种特定类型的流程图,概述图中延伸的分支的宽度对应数据流量的大小,通常应用于能源、材料成分、金融等数据的可视化分析。因1898年Matthew Henry Phineas Riall Sankey绘制的“蒸汽机的能源效率图”而闻名,此后便以其名字命名为“桑基图”。

Sankey diagrams are a data visualisation technique or flow diagram that emphasizes flow/movement/change from one state to another or one time to another, in which the width of the arrows is proportional to the flow rate of the depicted extensive property. The arrows being connected are called nodes and the connections are called links.

Sankey diagrams can also visualize the energy accounts, material flow accounts on a regional or national level, and cost breakdowns.The diagrams are often used in the visualization of material flow analysis.

Sankey diagrams emphasize the major transfers or flows within a system. They help locate the most important contributions to a flow. They often show conserved quantities within defined system boundaries.(Wikipedia contributors 2025)

1.1 桑基图的主要应用场景

桑基图常用于可持续能源、物流、人口流动、资源分配等领域的数据可视化。它可以帮助用户直观地理解和分析复杂的流动和关系,从而支持决策和策划过程。

1.2 桑基图的特点:

  1. 节点:桑基图由一系列节点组成,每个节点代表一个特定的实体或类别。例如,节点可以代表不同的时间、地点和部门等。

  2. 箭头:箭头表示流动的路径,从一个节点流向另一个节点。箭头的宽度通常表示流量或数量的大小。

  3. 流量量级:桑基图可以显示不同节点之间的流量量级,通过箭头的宽度来表示。宽度越大,表示流量或数量越大。

  4. 路径:桑基图可以显示多个节点之间的复杂路径,通过连接不同的节点和箭头来表示。

  5. 颜色编码:桑基图可以使用颜色来编码不同的节点或流动路径,以帮助用户更好地理解和区分不同的实体或类别。

1.3 设计的注意事项

在设计桑基图图表时,以下是一些需要注意的事项:

  1. 数据准备:确保数据准备充分,包括节点和流量的数据。节点应该清晰明确,流量数据应该准确可靠。

  2. 简洁明了:桑基图应该保持简洁明了,避免过多的节点和复杂的路径。过多的节点和路径可能会导致图表混乱不清晰,难以理解。

  3. 良好的布局:选择合适的布局方式,使得节点和箭头的排列有一定的逻辑性。可以按照流动的方向或重要性进行布局。

  4. 色彩选择:选择合适的色彩来区分不同的节点和流动路径。颜色应该鲜明对比,以便用户能够清晰地区分不同的实体或类别。

  5. 箭头宽度控制:根据流量的大小,合理调整箭头的宽度。宽度应该能够直观地反映流量的差异,但也不能过于夸张。

2 桑基图的Python实现

2.1 本地Python

安装相关的包和库:

pip install dash
pip install numpy

2.1.1 本地运行示例

Python 终端或编辑器运行后,可以在浏览器中输入 http://127.0.0.1:8051/ 进行查看。

# -*- coding: utf-8 -*-
"""
Created on Fri Mar 14 17:07:53 2025

@author: asus
"""

import dash
from dash import html, dcc, Input, Output
import plotly.graph_objects as go
import dash_bootstrap_components as dbc
import numpy as np
import plotly.express as px
 
 
def create_complex_sankey():
    # 示例数据
    labels = ["能源", "电力", "运输", "工业", "住宅", "商业", "损失", "可再生", "化石燃料", "核能"]
    sources = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
    targets = [1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 7, 8, 8, 9, 9, 7]
    values = [8, 4, 2, 8, 4, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1]
 
    # 创建桑基图
    sankey_fig = go.Figure(data=[go.Sankey(
        node=dict(
            pad=15,
            thickness=20,
            line=dict(color="black", width=0.5),
            label=labels,
            color=["#FF9999", "#66B3FF", "#99FF99", "#FFCC99", "#FF6666", "#66FF66", "#6666FF", "#FF66FF", "#66FFFF", "#FFFF66"]
        ),
        link=dict(
            source=sources,
            target=targets,
            value=values,
            color=["rgba(255, 153, 153, 0.6)", "rgba(102, 179, 255, 0.6)", "rgba(153, 255, 153, 0.6)", "rgba(255, 204, 153, 0.6)",
                   "rgba(255, 102, 102, 0.6)", "rgba(102, 255, 102, 0.6)", "rgba(102, 102, 255, 0.6)", "rgba(255, 102, 255, 0.6)",
                   "rgba(102, 255, 255, 0.6)", "rgba(255, 255, 102, 0.6)", "rgba(255, 153, 153, 0.6)", "rgba(102, 179, 255, 0.6)",
                   "rgba(153, 255, 153, 0.6)", "rgba(255, 204, 153, 0.6)", "rgba(255, 102, 102, 0.6)", "rgba(102, 255, 102, 0.6)"]
        )
    )])
 
    # 更新布局
    sankey_fig.update_layout(
        title='复杂桑基图示例',
        font_size=10,
        template='plotly_white'
    )
 
    return sankey_fig
 
app = dash.Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])
 
app.layout = html.Div([
    html.H3("桑基图展示", className="text-center mt-4 mb-3"),
    dcc.Graph(figure=create_complex_sankey())
])
 
if __name__ == "__main__":
    app.run_server(debug=True, port=8051)

2.2 Jupyter 实现

为了能在本网页中显示桑基图,需要在 jupyter 中运行该程序,那么除了相关的 jupyter 包需要被安装外,需要更换 dash 包为 jupyter-dash

2.2.1 安装 jupyter-dash

pip install jupyter-dash

2.2.2 运行程序

# -*- coding: utf-8 -*-
"""
Created on Fri Mar 14 17:07:53 2025

@author: asus
"""

from jupyter_dash import JupyterDash  # 替换 dash
from dash import html, dcc, Input, Output
import plotly.graph_objects as go
import dash_bootstrap_components as dbc
import numpy as np
import plotly.express as px

def create_complex_sankey():
    labels = ["能源", "电力", "运输", "工业", "住宅", "商业", "损失", "可再生", "化石燃料", "核能"]
    sources = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
    targets = [1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 7, 8, 8, 9, 9, 7]
    values = [8, 4, 2, 8, 4, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1]

    sankey_fig = go.Figure(data=[go.Sankey(
        node=dict(
            pad=15,
            thickness=20,
            line=dict(color="black", width=0.5),
            label=labels,
            color=["#FF9999", "#66B3FF", "#99FF99", "#FFCC99", "#FF6666", "#66FF66", "#6666FF", "#FF66FF", "#66FFFF", "#FFFF66"]
        ),
        link=dict(
            source=sources,
            target=targets,
            value=values,
            color=["rgba(255, 153, 153, 0.6)", "rgba(102, 179, 255, 0.6)", "rgba(153, 255, 153, 0.6)", "rgba(255, 204, 153, 0.6)",
                   "rgba(255, 102, 102, 0.6)", "rgba(102, 255, 102, 0.6)", "rgba(102, 102, 255, 0.6)", "rgba(255, 102, 255, 0.6)",
                   "rgba(102, 255, 255, 0.6)", "rgba(255, 255, 102, 0.6)", "rgba(255, 153, 153, 0.6)", "rgba(102, 179, 255, 0.6)",
                   "rgba(153, 255, 153, 0.6)", "rgba(255, 204, 153, 0.6)", "rgba(255, 102, 102, 0.6)", "rgba(102, 255, 102, 0.6)"]
        )
    )])

    sankey_fig.update_layout(
        title='复杂桑基图示例',
        font_size=10,
        template='plotly_white'
    )

    return sankey_fig

# 使用 JupyterDash 替代 Dash
app = JupyterDash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

app.layout = html.Div([
    html.H3("桑基图展示", className="text-center mt-4 mb-3"),
    dcc.Graph(figure=create_complex_sankey())
])

# 在 Jupyter 中运行,mode='inline' 将图形嵌入笔记本
app.run_server(mode='inline')

2.2.3 网页无法正确显示

因为 GitHub 只支持静态网页,但是 桑基图 是一个通过 Flask 生成的动态 Web 应用,因为当本地生成后托管到 GitHub 后,是无法正确显示该图形。

解决办法:

  1. 将网页托管到支持动态图像的服务器上,如 Render 等

  2. 改用静态图形

import plotly.graph_objects as go

def create_complex_sankey():
    labels = ["能源", "电力", "运输", "工业", "住宅", "商业", "损失", "可再生", "化石燃料", "核能"]
    sources = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
    targets = [1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 7, 8, 8, 9, 9, 7]
    values = [8, 4, 2, 8, 4, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1]
    sankey_fig = go.Figure(data=[go.Sankey(
        node=dict(pad=15, thickness=20, line=dict(color="black", width=0.5), label=labels),
        link=dict(source=sources, target=targets, value=values)
    )])
    sankey_fig.update_layout(title='复杂桑基图示例', font_size=10, template='plotly_white')
    return sankey_fig

# 保存为静态 HTML 文件
fig = create_complex_sankey()
fig.write_html("sankey.html", include_plotlyjs='cdn')

3 桑基图的不足

  1. 难以精确表达数据(大小)
  2. 一般实在宏观层面上进行分析,难以照顾到微观部分
  3. 由于是从宏观上进行分析,因此容易误导用户

end.

参考文献

Wikipedia contributors. 2025. 《Sankey diagram — Wikipedia, The Free Encyclopedia》. https://en.wikipedia.org/w/index.php?title=Sankey_diagram&oldid=1279167445.