如何在 Python 中构建 URL

我需要知道如何在 python 中构建 URL,比如:

http://subdomain.domain.com?arg1=someargument&arg2=someotherargument

你推荐使用哪个库? 为什么? 对于这种类型的库,有没有“最好”的选择?

还有,你能提供一些示例代码让我从使用这个库开始吗?

152388 次浏览

urlparse in the python standard library is all about building valid urls. Check the documentation of urlparse

Example:

from collections import namedtuple
from urllib.parse import urljoin, urlencode, urlparse, urlunparse


# namedtuple to match the internal signature of urlunparse
Components = namedtuple(
typename='Components',
field_names=['scheme', 'netloc', 'url', 'path', 'query', 'fragment']
)


query_params = {
'param1': 'some data',
'param2': 42
}


url = urlunparse(
Components(
scheme='https',
netloc='example.com',
query=urlencode(query_params),
path='',
url='/',
fragment='anchor'
)
)


print(url)

Output:

https://example.com/?param1=some+data&param2=42#anchor

I would go for Python's urllib, it's a built-in library.

Python 2

import urllib
url = 'https://example.com/somepage/?'
params = {'var1': 'some data', 'var2': 1337}
print(url + urllib.urlencode(params))

Python 3

import urllib.parse
url = 'https://example.com/somepage/?'
params = {'var1': 'some data', 'var2': 1337}
print(url + urllib.parse.urlencode(params))

Output:

https://example.com/somepage/?var1=some+data&var2=1337
import urllib


def make_url(base_url , *res, **params):
url = base_url
for r in res:
url = '{}/{}'.format(url, r)
if params:
url = '{}?{}'.format(url, urllib.urlencode(params))
return url


print make_url('http://example.com', 'user', 'ivan', alcoholic='true', age=18)

Output:

http://example.com/user/ivan?age=18&alcoholic=true

Here is an example of using urlparse to generate URLs. This provides the convenience of adding path to the URL without worrying about checking slashes.

import urllib


def build_url(base_url, path, args_dict):
# Returns a list in the structure of urlparse.ParseResult
url_parts = list(urllib.parse.urlparse(base_url))
url_parts[2] = path
url_parts[4] = urllib.parse.urlencode(args_dict)
return urllib.parse.urlunparse(url_parts)


>>> args = {'arg1': 'value1', 'arg2': 'value2'}
>>> # works with double slash scenario
>>> build_url('http://www.example.com/', '/somepage/index.html', args)


http://www.example.com/somepage/index.html?arg1=value1&arg2=value2


# works without slash
>>> build_url('http://www.example.com', 'somepage/index.html', args)


http://www.example.com/somepage/index.html?arg1=value1&arg2=value2