python – 3G coverage map – visualising lat, long, ping data

印度阿三17 2019-06-30


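Suppose I've been driving a set route with my laptop, a 3G modem and a GPS receiver, while my computer at home records the ping delay. I've correlated each ping with the GPS lat/long, and now I'd like to visualise the data.

I have about 80,000 data points per day, and I'd like to display several months' worth. I'm especially interested in showing areas where the ping consistently times out (i.e. ping == 1000).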

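Scatter plot

My first attempt was a scatter plot with one point per data entry. I made a point 5x larger if it was a timeout, so it was obvious where those areas were. I also dropped the alpha to 0.1 as a crude way of seeing overlapping points.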

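import matplotlib.pyplot as plt

# Colour each marker by its ping value
c = pings
# Marker size: timeouts (ping == 1000) are drawn five times larger
s = [2 if ping < 1000 else 10 for ping in pings]
# Scatter plot; the low alpha gives a rough impression of overlapping points
plt.scatter(longs, lats, s=s, marker='o', c=c, cmap='jet', edgecolors='none', alpha=0.1)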

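The obvious problem is that it draws one marker per data point, which is a very poor way to display large amounts of data. If I drive past the same area twice, the second pass is simply drawn on top of the first.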
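One low-cost mitigation for the draw-order part of this problem, offered as a sketch rather than as part of the original post, is to sort the samples by ping before plotting, so that timeouts are drawn last and are not hidden by good samples from another pass. The helper name `sort_for_plotting` and the sample values are illustrative assumptions.

```python
import numpy as np

def sort_for_plotting(longs, lats, pings):
    """Return the three arrays reordered so the highest pings come last."""
    longs = np.asarray(longs, dtype=float)
    lats = np.asarray(lats, dtype=float)
    pings = np.asarray(pings, dtype=float)
    # A stable sort keeps the recording order among samples with equal ping.
    order = np.argsort(pings, kind="stable")
    return longs[order], lats[order], pings[order]

# Three made-up samples; the middle one is a timeout.
demo_longs, demo_lats, demo_pings = sort_for_plotting(
    [10.0, 10.1, 10.2], [50.0, 50.1, 50.2], [120, 1000, 80]
)
```

The sorted arrays can be passed to scatter() unchanged. This only changes which marker ends up on top; it does not reduce the number of markers drawn.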

Interpolating over an even grid

I then tried using numpy and scipy to interpolate over an even grid.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

# Convert Python lists to NumPy arrays
x = np.array(longs, dtype=float)
y = np.array(lats, dtype=float)
z = np.array(pings, dtype=float)

# Make an even grid (200 rows/cols)
xi = np.linspace(x.min(), x.max(), 200)
yi = np.linspace(y.min(), y.max(), 200)

# Interpolate the data points onto the grid; cells outside the convex hull get 0
zi = griddata((x, y), z, (xi[None, :], yi[:, None]), method='linear', fill_value=0)

# Plot contour lines, then filled contours
plt.contour(xi, yi, zi, 15, linewidths=0.5, colors='k')
plt.contourf(xi, yi, zi, 15, cmap='jet')

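From this example.

This looks interesting (lots of colours and shapes), but it extrapolates too far into areas I haven't explored. You can't see the routes I've travelled, only red/blue blotches.

If I drive along a large curve, it also interpolates across the area in between.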

Interpolating over an uneven grid

I then tried using meshgrid (xi, yi = np.meshgrid(lats, longs)) instead of a fixed grid, but I was told my array was too big.
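The size error is easy to reproduce on paper. As a rough sketch, assuming float64 coordinates and the default (non-sparse) meshgrid output, calling meshgrid on two length-n arrays builds two n x n arrays. The function name `meshgrid_bytes` is made up for this illustration.

```python
def meshgrid_bytes(n_points, itemsize=8):
    """Bytes needed by the two n x n arrays that np.meshgrid would return."""
    cells_per_array = n_points * n_points
    return 2 * cells_per_array * itemsize

# One day of data (80,000 samples) used directly as grid coordinates.
one_day = meshgrid_bytes(80_000)
print(f"{one_day / 1e9:.1f} GB")  # prints "102.4 GB"
```

A grid built from the raw sample coordinates therefore cannot scale to this data set, whereas a fixed grid of a few hundred cells per axis stays small regardless of the number of samples.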

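Is there an easy way to create a grid from my points?

My requirements:

>Handle large data sets (80,000 x 60 = ~5m points)
>Display duplicate data for each point either by averaging (I assume interpolation will do this) or by taking the minimum value for each point.
>Don't extrapolate too far from the data points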

I'm happy with the scatter plot (top), but I need some way of averaging the data before I display it.
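For the "minimum value for each point" option in the requirements, one possible sketch is to assign every sample to a cell of a fixed grid and keep the smallest ping per cell with `np.minimum.at`. The function name `cell_minimum`, its parameters and the sample values are assumptions made for this illustration; the original post does not include this code.

```python
import numpy as np

def cell_minimum(longs, lats, pings, bins, extent):
    """Minimum ping per grid cell; cells without samples are NaN.

    bins   = (n_long, n_lat)
    extent = ((long_min, long_max), (lat_min, lat_max))
    Points outside the extent are clipped into the nearest edge cell.
    """
    longs = np.asarray(longs, dtype=float)
    lats = np.asarray(lats, dtype=float)
    pings = np.asarray(pings, dtype=float)
    (lo_x, hi_x), (lo_y, hi_y) = extent
    n_x, n_y = bins
    # Map each coordinate to an integer cell index; clipping puts points
    # that sit exactly on the upper edge into the last cell.
    ix = np.clip(((longs - lo_x) / (hi_x - lo_x) * n_x).astype(int), 0, n_x - 1)
    iy = np.clip(((lats - lo_y) / (hi_y - lo_y) * n_y).astype(int), 0, n_y - 1)
    grid = np.full((n_x, n_y), np.inf)
    # Unbuffered update: every repeated index is applied, unlike grid[ix, iy] = ...
    np.minimum.at(grid, (ix, iy), pings)
    grid[np.isinf(grid)] = np.nan
    return grid

# Two passes over the same cell (one of them a timeout) and one sample elsewhere.
demo = cell_minimum(
    longs=[0.1, 0.2, 0.9], lats=[0.1, 0.1, 0.9], pings=[1000, 40, 70],
    bins=(2, 2), extent=((0.0, 1.0), (0.0, 1.0)),
)
```

Because empty cells stay NaN, nothing is extrapolated into areas that were never visited.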

(Apologies for the dodgy mspaint drawings; I can't upload the actual data.)

Solution:

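# Sum of ping values per cell; (a, b) and (c, d) are the longitude and latitude limits
hsum, long_range, lat_range = np.histogram2d(longs, lats, bins=(res_long, res_lat), range=((a, b), (c, d)), weights=pings)
# Number of samples per cell
hcount, _, _ = np.histogram2d(longs, lats, bins=(res_long, res_lat), range=((a, b), (c, d)))
# Keep only cells that contain samples, so empty cells are never divided by zero
x, y = np.where(hcount > 0)
average = hsum[x, y] / hcount[x, y]
# Scatter plot of the per-cell average (markers sit at the lower bin edges)
scatterplot = ax.scatter(long_range[x], lat_range[y], s=3, c=average, linewidths=0, cmap="jet", vmin=0, vmax=1000)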
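The snippet above relies on names defined elsewhere (a, b, c, d, res_long, res_lat, ax). A self-contained sketch of the same sum-divided-by-count idea, with made-up sample values and a hypothetical helper name `cell_average`, might look like this:

```python
import numpy as np

def cell_average(longs, lats, pings, bins, extent):
    """Average ping per grid cell plus the per-cell sample count."""
    # Weighted histogram: sum of pings in each cell.
    hsum, long_edges, lat_edges = np.histogram2d(
        longs, lats, bins=bins, range=extent, weights=pings
    )
    # Unweighted histogram: number of samples in each cell.
    hcount, _, _ = np.histogram2d(longs, lats, bins=bins, range=extent)
    average = np.full(hsum.shape, np.nan)
    filled = hcount > 0
    average[filled] = hsum[filled] / hcount[filled]
    return average, hcount, long_edges, lat_edges

# Synthetic stand-in for real data: two passes over one spot, one sample elsewhere.
average, hcount, long_edges, lat_edges = cell_average(
    longs=[0.1, 0.2, 0.9], lats=[0.1, 0.1, 0.9], pings=[1000.0, 40.0, 70.0],
    bins=(2, 2), extent=((0.0, 1.0), (0.0, 1.0)),
)
# Cell centres can be used as marker positions instead of the lower bin edges.
long_centres = (long_edges[:-1] + long_edges[1:]) / 2
lat_centres = (lat_edges[:-1] + lat_edges[1:]) / 2
```

Empty cells are left as NaN rather than 0, so they can be skipped when plotting and are not confused with a genuine 0 ms average.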

Answer:

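To simplify your question, you have two sets of points: one with ping < 1000 and one with ping >= 1000.
Because the number of points is very large, you can't plot them directly with scatter(). I created some sample data: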

longs = (np.random.rand(60, 1) + np.linspace(-np.pi, np.pi, 80000)).reshape(-1)
lats = np.sin(longs) + np.random.rand(len(longs)) * 0.1

bad_index = (longs>0) & (longs<1)
bad_longs = longs[bad_index]
bad_lats = lats[bad_index]

(longs, lats) are the points with ping < 1000, and (bad_longs, bad_lats) are the points with ping >= 1000. You can use numpy.histogram2d() to count the points:

ranges = [[np.min(lats), np.max(lats)], [np.min(longs), np.max(longs)]]
h, lat_range, long_range = np.histogram2d(lats, longs, bins=(400,400), range=ranges)
bad_h, lat_range2, long_range2 = np.histogram2d(bad_lats, bad_longs, bins=(400,400), range=ranges)

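h and bad_h are the number of points in each small grid cell.

You can then choose from many ways to visualise them. For example, you can plot them with scatter():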

y, x = np.where(h)
count = h[y, x]
pl.scatter(long_range[x], lat_range[y], s=count/20, c=count, linewidths=0, cmap="Blues")

count = bad_h[y, x]
pl.scatter(long_range2[x], lat_range2[y], s=count/20, c=count, linewidths=0, cmap="Reds")

pl.show()
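Since the question asks where the ping times out consistently, the two count arrays can also be combined into a per-cell timeout fraction. This is a sketch under one assumption: h counts every sample, as it does in the sample data above, where bad_longs is a subset of longs. The name `timeout_fraction` and the `min_count` parameter are hypothetical.

```python
import numpy as np

def timeout_fraction(h, bad_h, min_count=1):
    """Fraction of samples per cell that timed out; NaN where data is too sparse.

    h     : count of all samples per cell
    bad_h : count of timed-out samples per cell (same shape, same bins)
    """
    h = np.asarray(h, dtype=float)
    bad_h = np.asarray(bad_h, dtype=float)
    fraction = np.full(h.shape, np.nan)
    # Only divide where a cell holds at least min_count samples.
    enough = h >= min_count
    fraction[enough] = bad_h[enough] / h[enough]
    return fraction

# Made-up 2 x 2 counts: one mixed cell, one empty, one all timeouts, one all good.
demo = timeout_fraction(h=[[10, 0], [4, 2]], bad_h=[[5, 0], [4, 0]])
```

Raising `min_count` hides cells that were visited only a handful of times, where a single timeout would otherwise look like a dead zone.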

Here is the full code:

import numpy as np
import pylab as pl

longs = (np.random.rand(60, 1) + np.linspace(-np.pi, np.pi, 80000)).reshape(-1)
lats = np.sin(longs) + np.random.rand(len(longs)) * 0.1

bad_index = (longs>0) & (longs<1)
bad_longs = longs[bad_index]
bad_lats = lats[bad_index]

ranges = [[np.min(lats), np.max(lats)], [np.min(longs), np.max(longs)]]
h, lat_range, long_range = np.histogram2d(lats, longs, bins=(300,300), range=ranges)
bad_h, lat_range2, long_range2 = np.histogram2d(bad_lats, bad_longs, bins=(300,300), range=ranges)

y, x = np.where(h)
count = h[y, x]
pl.scatter(long_range[x], lat_range[y], s=count/20, c=count, linewidths=0, cmap="Blues")

count = bad_h[y, x]
pl.scatter(long_range2[x], lat_range2[y], s=count/20, c=count, linewidths=0, cmap="Reds")

pl.show()

Running it displays the output figure.

Source: https://www./content-1-283101.html