Why use the SQL Server 2008 geography data type?

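I am redesigning a customer database, and one of the new pieces of information I would like to store alongside the standard address fields (street, city, etc.) is the geographic location of the address. The only use case I have in mind is to let users plot the coordinates on Google Maps when the address cannot otherwise be found, which often happens when the area is newly developed or is in a remote/rural location.

My first inclination was to store latitude and longitude as decimal values, but then I remembered that SQL Server 2008 R2 has a geography data type. I have absolutely no experience using geography, and from my initial research it looks like overkill for my scenario.

For example, to work with latitude and longitude stored as decimal(7,4), I can do this: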

insert into Geotest(Latitude, Longitude) values (47.6475, -122.1393)
select Latitude, Longitude from Geotest

But with geography, I would do this:

insert into Geotest(Geolocation) values (geography::Point(47.6475, -122.1393, 4326))
select Geolocation.Lat, Geolocation.Long from Geotest

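Although it is not that much more complicated, why add complexity if I don't have to?

Before I abandon the idea of using geography, is there anything I should consider? Would it be faster to search for a location using a spatial index than by indexing the latitude and longitude fields? Are there advantages to using geography that I am not aware of? Or, on the flip side, are there caveats I should know about that would discourage me from using geography?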


Update

@Erik Philips brought up the ability to do proximity searches with geography, which is very cool.
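
For readers who have not seen one, here is a minimal sketch of such a proximity search. The table variable @Customers, its rows, the search origin, and the 20 km radius are all made up for illustration; only geography::Point and STDistance are built-in.

```sql
-- Hypothetical data, for illustration only.
DECLARE @Customers TABLE
(
    CustomerId int PRIMARY KEY,
    Location   geography NOT NULL
);

INSERT INTO @Customers (CustomerId, Location)
VALUES (1, geography::Point(47.6475, -122.1393, 4326)),
       (2, geography::Point(47.6062, -122.3321, 4326)),
       (3, geography::Point(51.5220, -0.0718, 4326));

-- Hypothetical search origin and radius.
-- For SRID 4326, STDistance returns its result in meters.
DECLARE @origin geography = geography::Point(47.6400, -122.1300, 4326);
DECLARE @radiusMeters float = 20000;

SELECT CustomerId,
       Location.STDistance(@origin) AS DistanceMeters
FROM @Customers
WHERE Location.STDistance(@origin) <= @radiusMeters
ORDER BY DistanceMeters;
```

With plain decimal columns, the same filter would need a hand-written distance formula in the WHERE clause.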

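On the other hand, a quick test shows that a simple select to get the latitude and longitude is significantly slower when using geography (details below). A comment on another SO question about geography also makes me wary:

@SaphuA You're welcome. Be very careful about using a spatial index on a nullable GEOGRAPHY column. There are serious performance issues, so make that GEOGRAPHY column non-nullable even if you have to remodel your schema. - Thomas, Jun 18 at 11:18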
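
A sketch of what that advice could look like in practice, assuming a hypothetical table dbo.CustomerLocation (the table, constraint, and index names are invented here). The geography column is declared NOT NULL, and the table has the clustered primary key that a spatial index requires.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE dbo.CustomerLocation
(
    CustomerId int NOT NULL,
    Location   geography NOT NULL,  -- non-nullable, as the quoted comment recommends
    CONSTRAINT PK_CustomerLocation PRIMARY KEY CLUSTERED (CustomerId)
);

-- Spatial index with the server's default tessellation settings.
CREATE SPATIAL INDEX SIX_CustomerLocation_Location
    ON dbo.CustomerLocation (Location);
```

Whether the index actually helps depends on the data and the query, so it is worth checking the execution plan before relying on it.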

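All in all, weighing the likelihood of doing proximity searches against the cost in performance and complexity, I have decided to forgo geography in this case.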


Details of the test I ran:

I created two tables, one using geography and the other using decimal(9,6) for latitude and longitude:

CREATE TABLE [dbo].[GeographyTest]
(
[RowId] [int] IDENTITY(1,1) NOT NULL,
[Location] [geography] NOT NULL,
CONSTRAINT [PK_GeographyTest] PRIMARY KEY CLUSTERED ( [RowId] ASC )
)


CREATE TABLE [dbo].[LatLongTest]
(
[RowId] [int] IDENTITY(1,1) NOT NULL,
[Latitude] [decimal](9, 6) NULL,
[Longitude] [decimal](9, 6) NULL,
CONSTRAINT [PK_LatLongTest] PRIMARY KEY CLUSTERED ([RowId] ASC)
)

and inserted a single row into each table, using the same latitude and longitude values:

insert into GeographyTest(Location) values (geography::Point(47.6475, -122.1393, 4326))
insert into LatLongTest(Latitude, Longitude) values (47.6475, -122.1393)

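Finally, running the following code shows that, on my machine, selecting the latitude and longitude is approximately 5 times slower when using geography.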

declare @lat float, @long float,
        @d datetime2, @repCount int, @trialCount int,
        @geographyDuration int, @latlongDuration int,
        @trials int = 3, @reps int = 100000

create table #results
(
    GeographyDuration int,
    LatLongDuration int
)

set @trialCount = 0

while @trialCount < @trials
begin

    set @repCount = 0
    set @d = sysdatetime()

    while @repCount < @reps
    begin
        select @lat = Location.Lat, @long = Location.Long from GeographyTest where RowId = 1
        set @repCount = @repCount + 1
    end

    set @geographyDuration = datediff(ms, @d, sysdatetime())

    set @repCount = 0
    set @d = sysdatetime()

    while @repCount < @reps
    begin
        select @lat = Latitude, @long = Longitude from LatLongTest where RowId = 1
        set @repCount = @repCount + 1
    end

    set @latlongDuration = datediff(ms, @d, sysdatetime())

    insert into #results values(@geographyDuration, @latlongDuration)

    set @trialCount = @trialCount + 1

end

select *
from #results

select avg(GeographyDuration) as AvgGeographyDuration, avg(LatLongDuration) as AvgLatLongDuration
from #results

drop table #results

Results (durations in milliseconds):

GeographyDuration LatLongDuration
----------------- ---------------
5146              1020
5143              1016
5169              1030


AvgGeographyDuration AvgLatLongDuration
-------------------- ------------------
5152                 1022

Even more surprising, geography is still slower when no rows are selected at all, for example when selecting where RowId = 2, which does not exist:

GeographyDuration LatLongDuration
----------------- ---------------
1607              948
1610              946
1607              947


AvgGeographyDuration AvgLatLongDuration
-------------------- ------------------
1608                 947

If you plan on doing any spatial computation, EF 5.0 allows LINQ Expressions like:

private Facility GetNearestFacilityToJobsite(DbGeography jobsite)
{
    var q1 = from f in context.Facilities
             let distance = f.Geocode.Distance(jobsite)
             // 500 miles expressed in meters (1609.344 m per mile)
             where distance < 500 * 1609.344
             orderby distance
             select f;
    return q1.FirstOrDefault();
}

Then there is a very good reason to use Geography.

Explanation of spatial within Entity Framework.

Updated with Creating High Performance Spatial Databases

As I noted on Noel Abrahams' answer:

A note on space: each coordinate is stored as a double-precision floating-point number that is 64 bits (8 bytes) long, and an 8-byte binary value is roughly equivalent to 15 digits of decimal precision, so comparing against a decimal(9,6), which is only 5 bytes, isn't exactly a fair comparison. Decimal would have to be at least decimal(15,12) (9 bytes) for each of latitude and longitude (18 bytes in total) for a real comparison.
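
One way to check per-value sizes directly is DATALENGTH. This is only a sketch with made-up sample values; it measures the bytes of each value, not row or page overhead, so it will not match sp_spaceused figures exactly.

```sql
-- Sample values are arbitrary; only the data types matter here.
DECLARE @d9  decimal(9, 6)   = 12.345678;
DECLARE @d15 decimal(15, 12) = 12.345678901234;
DECLARE @p   geography       = geography::Point(12.345678901234, 12.345678901234, 4326);

SELECT DATALENGTH(@d9)  AS Decimal9_6_Bytes,
       DATALENGTH(@d15) AS Decimal15_12_Bytes,
       DATALENGTH(@p)   AS GeographyPointBytes;
```

Remember that a decimal pair needs two such values per location, while a single geography point carries both coordinates plus its SRID.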

So comparing storage types:

CREATE TABLE dbo.Geo
(
    geo geography
)
GO

CREATE TABLE dbo.LatLng
(
    lat decimal(15, 12),
    lng decimal(15, 12)
)
GO

INSERT dbo.Geo
SELECT geography::Point(12.3456789012345, 12.3456789012345, 4326)
UNION ALL
SELECT geography::Point(87.6543210987654, 87.6543210987654, 4326)
GO 10000

INSERT dbo.LatLng
SELECT 12.3456789012345, 12.3456789012345
UNION
SELECT 87.6543210987654, 87.6543210987654
GO 10000

EXEC sp_spaceused 'dbo.Geo'
EXEC sp_spaceused 'dbo.LatLng'

Result:

name    rows    data
Geo     20000   728 KB
LatLng  20000   560 KB

The geography data-type takes up 30% more space.

Additionally the geography datatype is not limited to only storing a Point, you can also store LineString, CircularString, CompoundCurve, Polygon, CurvePolygon, GeometryCollection, MultiPoint, MultiLineString, and MultiPolygon and more. Any attempt to store even the simplest of Geography types (as Lat/Long) beyond a Point (for example LINESTRING(1 1, 2 2) instance) will incur additional rows for each point, a column for sequencing for the order of each point and another column for grouping of lines. SQL Server also has methods for the Geography data types which include calculating Area, Boundary, Length, Distances, and more.
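
As a small illustration of a non-point value, the sketch below builds a two-point LineString from well-known text and calls a few of the built-in methods. The coordinates are arbitrary sample values; note that well-known text lists longitude first, then latitude.

```sql
-- Arbitrary sample line; WKT coordinate order is "longitude latitude".
DECLARE @line geography =
    geography::STGeomFromText('LINESTRING(-122.360 47.656, -122.343 47.656)', 4326);

SELECT @line.STNumPoints() AS PointCount,
       @line.STLength()    AS LengthMeters,
       @line.STAsText()    AS WellKnownText;
```

Storing the same line in decimal columns would take one row per vertex plus the sequencing and grouping columns described above.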

It seems unwise to store latitude and longitude as decimal in SQL Server.

Update 2

If you plan on doing any calculations such as distance or area, properly calculating these over the surface of the earth is difficult. Each geography value stored in SQL Server is also stored with a Spatial Reference ID (SRID). These IDs can refer to different models of the earth (4326 is WGS 84, the one used by GPS). This means that SQL Server computes its results over the surface of the earth, rather than along a straight line that could pass through the earth.
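
To see what a given Spatial Reference ID stands for, you can look it up in the sys.spatial_reference_systems catalog view. A short sketch for 4326:

```sql
-- Look up the definition and unit of measure for SRID 4326.
SELECT spatial_reference_id,
       authority_name,
       authorized_spatial_reference_id,
       unit_of_measure,
       unit_conversion_factor,
       well_known_text
FROM sys.spatial_reference_systems
WHERE spatial_reference_id = 4326;
```

The unit_of_measure column tells you the unit in which methods such as STDistance and STLength report results for that SRID.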


Another thing to consider is the storage space taken up by each method. The geography type is stored as a VARBINARY(MAX). Try running this script:

CREATE TABLE dbo.Geo
(
    geo geography
)
GO

CREATE TABLE dbo.LatLon
(
    lat decimal(9, 6),
    lon decimal(9, 6)
)
GO

INSERT dbo.Geo
SELECT geography::Point(36.204824, 138.252924, 4326) UNION ALL
SELECT geography::Point(51.5220066, -0.0717512, 4326)
GO 10000

INSERT dbo.LatLon
SELECT 36.204824, 138.252924 UNION
SELECT 51.5220066, -0.0717512
GO 10000

EXEC sp_spaceused 'dbo.Geo'
EXEC sp_spaceused 'dbo.LatLon'

Result:

name    rows    data
Geo     20000   728 KB
LatLon  20000   400 KB

The geography data-type takes up almost twice as much space.

CREATE FUNCTION [dbo].[fn_GreatCircleDistance]
(
    @Latitude1 As Decimal(38, 19), @Longitude1 As Decimal(38, 19),
    @Latitude2 As Decimal(38, 19), @Longitude2 As Decimal(38, 19),
    @ValuesAsDecimalDegrees As bit = 1,
    @ResultAsMiles As bit = 0
)
RETURNS decimal(38,19)
AS
BEGIN
    -- Declare the return variable here
    DECLARE @ResultVar decimal(38,19)

    -- Add the T-SQL statements to compute the return value here
    /*
    Credit for conversion algorithm to Chip Pearson
    Web Page: www.cpearson.com/excel/latlong.aspx
    Email: chip@cpearson.com
    Phone: (816) 214-6957 USA Central Time (-6:00 UTC)
    Between 9:00 AM and 7:00 PM

    Ported to Transact SQL by Paul Burrows BCIS
    */
    DECLARE @C_RADIUS_EARTH_KM As Decimal(38, 19)
    SET @C_RADIUS_EARTH_KM = 6370.97327862
    DECLARE @C_RADIUS_EARTH_MI As Decimal(38, 19)
    SET @C_RADIUS_EARTH_MI = 3958.73926185
    DECLARE @C_PI As Decimal(38, 19)
    SET @C_PI = pi()

    DECLARE @Lat1 As Decimal(38, 19)
    DECLARE @Lat2 As Decimal(38, 19)
    DECLARE @Long1 As Decimal(38, 19)
    DECLARE @Long2 As Decimal(38, 19)
    DECLARE @X As bigint
    DECLARE @Delta As Decimal(38, 19)

    If @ValuesAsDecimalDegrees = 1
    Begin
        set @X = 1
    End
    Else
    Begin
        set @X = 24
    End

    -- convert to decimal degrees
    set @Lat1 = @Latitude1 * @X
    set @Long1 = @Longitude1 * @X
    set @Lat2 = @Latitude2 * @X
    set @Long2 = @Longitude2 * @X

    -- convert to radians: radians = (degrees/180) * PI
    set @Lat1 = (@Lat1 / 180) * @C_PI
    set @Lat2 = (@Lat2 / 180) * @C_PI
    set @Long1 = (@Long1 / 180) * @C_PI
    set @Long2 = (@Long2 / 180) * @C_PI

    -- get the central spherical angle
    set @Delta = ((2 * ASin(Sqrt((power(Sin((@Lat1 - @Lat2) / 2), 2)) +
        Cos(@Lat1) * Cos(@Lat2) * (power(Sin((@Long1 - @Long2) / 2), 2))))))

    If @ResultAsMiles = 1
    Begin
        set @ResultVar = @Delta * @C_RADIUS_EARTH_MI
    End
    Else
    Begin
        set @ResultVar = @Delta * @C_RADIUS_EARTH_KM
    End

    -- Return the result of the function
    RETURN @ResultVar
END
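
A possible way to try the function above, assuming it has been created in the current database. The two coordinate pairs are sample values borrowed from elsewhere in this thread, and the comparison against the built-in STDistance is added here purely for illustration.

```sql
-- Sample coordinates in decimal degrees.
DECLARE @lat1 decimal(38, 19) = 47.6475,    @long1 decimal(38, 19) = -122.1393;
DECLARE @lat2 decimal(38, 19) = 51.5220066, @long2 decimal(38, 19) = -0.0717512;

-- The same two locations as geography points.
DECLARE @a geography = geography::Point(@lat1, @long1, 4326);
DECLARE @b geography = geography::Point(@lat2, @long2, 4326);

SELECT dbo.fn_GreatCircleDistance(@lat1, @long1, @lat2, @long2, 1, 0) AS HaversineKm,
       @a.STDistance(@b) / 1000.0                                     AS GeographyKm;
```

Expect the two figures to be close but not identical: the function treats the earth as a sphere of fixed radius, while geography with SRID 4326 works on an ellipsoid.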