Correct way to try/except using Python requests module?

try:
    r = requests.get(url, params={'s': thing})
except requests.ConnectionError, e:
    print(e)

Is this correct? Is there a better way to structure this? Will this cover all my bases?


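Have a look at the Requests exceptions docs. In short:

In the event of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a ConnectionError exception.

In the rare event of an invalid HTTP response, Requests will raise an HTTPError exception.

If a request times out, a Timeout exception is raised.

If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised.

All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.

To answer your question, what you show will not cover all of your bases. You'll only catch connection-related errors, not ones that time out.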
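The hierarchy described above can be checked without making any network call. This is only a minimal sketch, assuming the requests package is installed; the alias `rex` is just a local name chosen for the example. It also shows one nuance: a connect timeout counts as a connection error, while a read timeout does not.

```python
from requests import exceptions as rex

# Every exception named above derives from RequestException.
for exc_class in (rex.ConnectionError, rex.HTTPError,
                  rex.Timeout, rex.TooManyRedirects):
    assert issubclass(exc_class, rex.RequestException)

# Timeout comes in two concrete flavours. ConnectTimeout is also a
# ConnectionError, so catching only ConnectionError handles it,
# but a ReadTimeout slips through.
print(issubclass(rex.ConnectTimeout, rex.ConnectionError))  # True
print(issubclass(rex.ReadTimeout, rex.ConnectionError))     # False
print(issubclass(rex.ReadTimeout, rex.Timeout))             # True
```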

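What to do when you catch the exception is really up to the design of your script/program. Is it acceptable to exit? Can you go on and try again? If the error is catastrophic and you can't go on, then yes, you may abort your program by raising SystemExit (a nice way to both print an error and call sys.exit).

You can either catch the base-class exception, which will handle all cases: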
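To see what raising SystemExit with a message actually does, here is a small sketch that runs a one-line child script and inspects the result. The child script and its message are made up for the illustration; only the standard library is used.

```python
import subprocess
import sys

# Run a tiny child script that raises SystemExit with a message.
child_code = "raise SystemExit('could not reach server')"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)

# A non-integer argument is printed to stderr and the exit status is 1.
print(result.returncode)      # 1
print(result.stderr.strip())  # could not reach server
```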

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.RequestException as e:  # This is the correct syntax
    raise SystemExit(e)

Or you can catch them separately and do different things.

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass
except requests.exceptions.TooManyRedirects:
    # Tell the user their URL was bad and try a different one
    pass
except requests.exceptions.RequestException as e:
    # Catastrophic error. Bail.
    raise SystemExit(e)
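The "retry loop" mentioned in the Timeout branch could look roughly like the sketch below. The helper name `call_with_retries` and its parameters are hypothetical, not part of requests, and the flaky function merely stands in for a network call so that the example runs offline.

```python
import time


def call_with_retries(func, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on an exception listed in retry_on, wait and try again.

    The delay doubles after each failed attempt. The last failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for a flaky network call: fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky, (TimeoutError,), sleep=delays.append)
print(result, calls["count"], delays)  # ok 3 [0.5, 1.0]
```

With requests, one would presumably pass something like `lambda: requests.get(url, timeout=3)` as `func` and `requests.exceptions.Timeout` as `retry_on`.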

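As Christian pointed out:

If you want HTTP errors (e.g. 401 Unauthorized) to raise exceptions, you can call Response.raise_for_status. That will raise an HTTPError if the response was an HTTP error.

An example: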

import requests

try:
    r = requests.get('http://www.google.com/nothere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    raise SystemExit(err)

Which will print:

404 Client Error: Not Found for url: http://www.google.com/nothere

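One additional suggestion: be explicit. It seems best to go from specific to general down the stack of errors, so that the error you want is caught by its own handler and the specific ones don't get masked by the general one.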

import requests

url = 'http://www.google.com/blahblah'

try:
    r = requests.get(url, timeout=3)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("Http Error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Error Connecting:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout Error:", errt)
except requests.exceptions.RequestException as err:
    print("OOps: Something Else", err)


Http Error: 404 Client Error: Not Found for url: http://www.google.com/blahblah

vs

import requests

url = 'http://www.google.com/blahblah'

try:
    r = requests.get(url, timeout=3)
    r.raise_for_status()
except requests.exceptions.RequestException as err:
    print("OOps: Something Else", err)
except requests.exceptions.HTTPError as errh:
    print("Http Error:", errh)
except requests.exceptions.ConnectionError as errc:
    print("Error Connecting:", errc)
except requests.exceptions.Timeout as errt:
    print("Timeout Error:", errt)


OOps: Something Else 404 Client Error: Not Found for url: http://www.google.com/blahblah
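The effect of handler order can also be reproduced offline by raising the exception classes directly. A small sketch, assuming requests is installed; `which_handler` is a hypothetical helper written for this illustration. It uses the same specific-to-general order as the first variant above.

```python
from requests import exceptions as rex


def which_handler(exc):
    """Raise exc and report which except clause catches it."""
    try:
        raise exc
    except rex.HTTPError:
        return "http"
    except rex.ConnectionError:
        return "connection"
    except rex.Timeout:
        return "timeout"
    except rex.RequestException:
        return "other"


print(which_handler(rex.HTTPError("404")))                # http
print(which_handler(rex.ReadTimeout("slow read")))        # timeout
print(which_handler(rex.ConnectTimeout("slow connect")))  # connection
print(which_handler(rex.TooManyRedirects("loop")))        # other
```

Note that ConnectTimeout lands in the ConnectionError branch because it inherits from both ConnectionError and Timeout, and ConnectionError is listed first. If connect timeouts should be treated as timeouts, the Timeout clause would need to come before ConnectionError.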

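The exception object also contains the original response, e.response, which can be useful if you need to see the error body in the response from the server. For example: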

import requests

try:
    r = requests.post('http://somerestapi.com/post-here', data={'birthday': '9/9/3999'})
    r.raise_for_status()
except requests.exceptions.HTTPError as e:
    print(e.response.text)
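The link between the exception and the response can be tried without a server by filling in a Response object by hand. This is only a sketch: the status, reason and URL below are invented for the example, and building a Response manually is something one would normally do only in tests.

```python
import requests

# Build a Response by hand instead of contacting a server.
fake = requests.Response()
fake.status_code = 422
fake.reason = "Unprocessable Entity"
fake.url = "http://example.invalid/post-here"

caught = None
try:
    fake.raise_for_status()
except requests.exceptions.HTTPError as e:
    caught = e

print(caught.response is fake)      # True
print(caught.response.status_code)  # 422
print(caught)
```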

Here's a generic way to do things which at least means that you don't have to surround each and every requests call with try ... except:

Basic version

import logging
import traceback

import requests

logger = logging.getLogger(__name__)

# see the docs: if you set no timeout the call never times out! A tuple means "max
# connect time" and "max read time"
DEFAULT_REQUESTS_TIMEOUT = (5, 15)  # for example


def log_exception(e, verb, url, kwargs):
    # the reason for making this a separate function will become apparent
    raw_tb = traceback.extract_stack()
    if 'data' in kwargs and len(kwargs['data']) > 500:  # anticipate giant data string
        kwargs['data'] = f'{kwargs["data"][:500]}...'
    msg = f'BaseException raised: {e.__class__.__module__}.{e.__class__.__qualname__}: {e}\n' \
        + f'verb {verb}, url {url}, kwargs {kwargs}\n\n' \
        + 'Stack trace:\n' + ''.join(traceback.format_list(raw_tb[:-2]))
    logger.error(msg)


def requests_call(verb, url, **kwargs):
    response = None
    exception = None
    try:
        if 'timeout' not in kwargs:
            kwargs['timeout'] = DEFAULT_REQUESTS_TIMEOUT
        response = requests.request(verb, url, **kwargs)
    except BaseException as e:
        log_exception(e, verb, url, kwargs)
        exception = e
    return (response, exception)

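  1. Be aware of ConnectionError, which is a builtin and has nothing to do with the class requests.ConnectionError*. I assume the latter is more common in this context but have no real idea...
  2. When examining a non-None returned exception: according to the docs, requests.RequestException, the superclass of all the requests exceptions (including requests.ConnectionError), is not "requests.exceptions.RequestException". Maybe it has changed since the accepted answer.**
  3. Obviously this assumes a logger has been configured. Calling logger.exception in the except block might seem a good idea, but that would only give the stack within this method! Instead, get the trace leading up to the call to this method. Then log (with details of the exception, and of the call which caused the problem).

*I looked at the source code: requests.ConnectionError subclasses the single class requests.RequestException, which subclasses the single class IOError (builtin).

**However, at the bottom of this page you find "requests.exceptions.RequestException" at the time of writing (2022-02)... but it links to the above page: confusing. (At runtime the two names refer to the same class, since the requests package re-exports it from requests.exceptions.)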
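The point in note 3, that the exception's own traceback only covers the wrapper while traceback.extract_stack() reaches back to the caller, can be seen in a small standard-library sketch. The function names `wrapper`, `business_logic` and `describe_call_stack` are invented for the example; `wrapper` plays the role of requests_call.

```python
import traceback


def describe_call_stack():
    """Return the names of the functions on the stack, outermost first,
    leaving out this helper itself."""
    frames = traceback.extract_stack()[:-1]  # drop describe_call_stack
    return [frame.name for frame in frames]


def wrapper():
    # Plays the role of requests_call: it catches and reports.
    try:
        raise ValueError("boom")
    except ValueError as e:
        # The exception's traceback only covers the frames between the
        # place it was caught and the place it was raised: just wrapper.
        in_traceback = [f.name for f in traceback.extract_tb(e.__traceback__)]
        return in_traceback, describe_call_stack()


def business_logic():
    return wrapper()


in_traceback, full_stack = business_logic()
print(in_traceback)     # ['wrapper']
print(full_stack[-2:])  # ['business_logic', 'wrapper']
```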


Usage is very simple:

search_response, exception = utilities.requests_call('get',
    f'http://localhost:9200/my_index/_search?q={search_string}')

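First check the response: if it is None, something funny has happened and you will have an exception which has to be acted on in some way, depending on context (and on the exception). In GUI applications (PyQt5) I usually implement a "visual log" to give some output to the user (and also log simultaneously to the log file), but messages added there should be non-technical. So something like this might typically follow: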

if search_response is None:
    # you might check here for (e.g.) a requests.Timeout, tailoring the message
    # accordingly, as the kind of error anyone might be expected to understand
    msg = f'No response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    return
response_json = search_response.json()
if search_response.status_code != 200:  # NB 201 ("created") may be acceptable sometimes...
    msg = f'Bad response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    # usually response_json will give full details about the problem
    log_msg = f'search on |{search_string}| bad response\n{json.dumps(response_json, indent=4)}'
    logger.error(log_msg)
    return

# now examine the keys and values in response_json: these may of course
# indicate an error of some kind even though the response returned OK (status 200)...

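Given that the stack trace is logged automatically, you often need no more than that...

Advanced version when a json object is returned

(... potentially sparing a great deal of boilerplate!)

To cross the Ts, when a json object is expected to be returned:

If, as above, an exception gives your non-technical user a message "No response", and a non-200 status a message "Bad response", I suggest that

  • a missing expected key in the response's JSON structure should give rise to a message "Anomalous response"
  • an out-of-range or strange value should give rise to a message "Unexpected response"
  • and the presence of a key such as "error" or "errors", with value True or whatever, should give rise to a message "Error response"

These may or may not prevent the code from continuing.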
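The message categories above can be summarised in one small function. This is a hedged sketch rather than the code used later in this answer: `classify_response` and its parameters are hypothetical, it only looks at top-level keys, and it leaves out "Unexpected response" because what counts as an out-of-range value depends on the application.

```python
def classify_response(status_code, body, required_keys=(), acceptable_statuses=(200,)):
    """Return a short, non-technical label for a decoded JSON response.

    body is the already-decoded JSON (a dict), or None if nothing came back.
    """
    if body is None:
        return "No response"
    if status_code not in acceptable_statuses:
        return "Bad response"
    if any(key not in body for key in required_keys):
        return "Anomalous response"
    if body.get("error") or body.get("errors"):
        return "Error response"
    return "OK"


print(classify_response(None, None))                                # No response
print(classify_response(500, {}))                                   # Bad response
print(classify_response(200, {}, required_keys=["hits"]))           # Anomalous response
print(classify_response(200, {"error": True}))                      # Error response
print(classify_response(200, {"hits": []}, required_keys=["hits"])) # OK
```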


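... and in fact, to my mind, it is worth making the process even more generic. For me, these next functions typically cut down 20 lines of code using the above requests_call to about 3, and standardise most of your handling and your log messages. With more than a handful of requests calls in your project, the code gets a lot nicer and less bloated: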

import json


def log_response_error(response_type, call_name, deliverable, verb, url, **kwargs):
    # NB this function can also be used independently
    if response_type == 'No':  # exception was raised (and logged)
        if isinstance(deliverable, requests.Timeout):
            MainWindow.the().visual_log(f'Time out of {call_name} before response received!', logging.ERROR)
            return
    else:
        if isinstance(deliverable, BaseException):
            # NB if response.json() raises an exception we end up here
            log_exception(deliverable, verb, url, kwargs)
        else:
            # if we get here no exception has been raised, so no stack trace has yet been logged.
            # a response has been returned, but is either "Bad" or "Anomalous"
            response_json = deliverable.json()

            raw_tb = traceback.extract_stack()
            if 'data' in kwargs and len(kwargs['data']) > 500:  # anticipate giant data string
                kwargs['data'] = f'{kwargs["data"][:500]}...'
            added_message = ''
            if hasattr(deliverable, 'added_message'):
                added_message = deliverable.added_message + '\n'
                del deliverable.added_message
            call_and_response_details = f'{response_type} response\n{added_message}' \
                + f'verb {verb}, url {url}, kwargs {kwargs}\nresponse:\n{json.dumps(response_json, indent=4)}'
            logger.error(f'{call_and_response_details}\nStack trace: {"".join(traceback.format_list(raw_tb[:-1]))}')
    MainWindow.the().visual_log(f'{response_type} response {call_name}. See log.', logging.ERROR)
    

def check_keys(req_dict_structure, response_dict_structure, response):
    # so this function is about checking the keys in the returned json object...
    # NB both structures MUST be dicts
    if not isinstance(req_dict_structure, dict):
        response.added_message = f'req_dict_structure not dict: {type(req_dict_structure)}\n'
        return False
    if not isinstance(response_dict_structure, dict):
        response.added_message = f'response_dict_structure not dict: {type(response_dict_structure)}\n'
        return False
    for dict_key in req_dict_structure.keys():
        if dict_key not in response_dict_structure:
            response.added_message = f'key |{dict_key}| missing\n'
            return False
        req_value = req_dict_structure[dict_key]
        response_value = response_dict_structure[dict_key]
        if isinstance(req_value, dict):
            # if the response at this point is a list apply the req_value dict to each element:
            # failure in just one such element leads to "Anomalous response"...
            if isinstance(response_value, list):
                for resp_list_element in response_value:
                    if not check_keys(req_value, resp_list_element, response):
                        return False
            # any other response value must be a dict (tested in next level of recursion)
            elif not check_keys(req_value, response_value, response):
                return False
        elif isinstance(req_value, list):
            if not isinstance(response_value, list):  # if the req_value is a list the response must be one
                response.added_message = f'key |{dict_key}| not list: {type(response_value)}\n'
                return False
            # it is OK for the value to be a list, but these must be strings (keys) or dicts
            for req_list_element, resp_list_element in zip(req_value, response_value):
                if isinstance(req_list_element, dict):
                    if not check_keys(req_list_element, resp_list_element, response):
                        return False
                elif not isinstance(req_list_element, str):
                    response.added_message = f'req_list_element not string: {type(req_list_element)}\n'
                    return False
                elif req_list_element not in response_value:
                    response.added_message = f'key |{req_list_element}| missing from response list\n'
                    return False
        # put None as a dummy value (otherwise something like {'my_key'} will be seen as a set, not a dict)
        elif req_value is not None:
            response.added_message = f'required value of key |{dict_key}| must be None (dummy), dict or list: {type(req_value)}\n'
            return False
    return True


def process_json_requests_call(verb, url, **kwargs):
    # "call_name" is a mandatory kwarg
    if 'call_name' not in kwargs:
        raise Exception('kwarg "call_name" not supplied!')
    call_name = kwargs.pop('call_name')

    # optional kwargs, removed so that they are not passed on to requests
    required_keys = kwargs.pop('required_keys', {})
    acceptable_statuses = kwargs.pop('acceptable_statuses', [200])
    exception_handler = kwargs.pop('exception_handler', log_response_error)

    response, exception = requests_call(verb, url, **kwargs)

    if response is None:
        exception_handler('No', call_name, exception, verb, url, **kwargs)
        return (False, exception)
    try:
        response_json = response.json()
    except BaseException as e:
        logger.error(f'response.status_code {response.status_code} but calling json() raised exception')
        # an exception raised at this point can't truthfully lead to a "No response" message... so say "bad"
        exception_handler('Bad', call_name, e, verb, url, **kwargs)
        return (False, response)

    status_ok = response.status_code in acceptable_statuses
    if not status_ok:
        response.added_message = f'status code was {response.status_code}'
        log_response_error('Bad', call_name, response, verb, url, **kwargs)
        return (False, response)
    check_result = check_keys(required_keys, response_json, response)
    if not check_result:
        log_response_error('Anomalous', call_name, response, verb, url, **kwargs)
    return (check_result, response)

Example call (NB with this version, the "deliverable" is either an exception or a response which delivers a json structure):

success, deliverable = utilities.process_json_requests_call('get',
    f'{ES_URL}{INDEX_NAME}/_doc/1',
    call_name=f'checking index {INDEX_NAME}',
    required_keys={'_source': {'status_text': None}})
if not success:
    return False
# here, we know the deliverable is a response, not an exception
# we also don't need to check for the keys being present:
# the generic code has checked that all expected keys are present
index_status = deliverable.json()['_source']['status_text']
if index_status != 'successfully completed':
    # ... i.e. an example of a 200 response, but an error nonetheless
    msg = f'Error response: ES index {INDEX_NAME} does not seem to have been built OK: cannot search'
    MainWindow.the().visual_log(msg)
    logger.error(f'index |{INDEX_NAME}|: deliverable.json() {json.dumps(deliverable.json(), indent=4)}')
    return False

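So the "visual log" message seen by the user in the case of a missing key "status_text", for example, would be "Anomalous response checking index XYZ. See log." (and the log would give a more detailed technical message, constructed automatically, including the stack trace but also details of the missing key in question).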

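  • mandatory kwarg: call_name; optional kwargs: required_keys, acceptable_statuses, exception_handler
  • the required_keys dict can be nested to any depth
  • finer-grained exception handling can be accomplished by including a function exception_handler in kwargs (though don't forget that requests_call will have logged the call details, the exception type and __str__, and the stack trace)
  • in the above I also implement a check on the key "data" in any kwargs which may be logged. This is because a bulk operation (e.g. to populate an index in the case of Elasticsearch) can consist of enormous strings. So curtail to the first 500 characters, for example.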
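The 500-character cut-off on "data" can also be done on a copy, so that the dict used for the request itself is never altered. This is a sketch of one possible variation, not the code used above: `truncated_for_log` is a hypothetical helper, and unlike log_exception it does not modify the kwargs it is given.

```python
def truncated_for_log(kwargs, key="data", limit=500):
    """Return a copy of kwargs whose `key` value is cut to `limit` characters.

    The caller's dict is left untouched, so the full payload is still
    available if the request has to be retried.
    """
    safe = dict(kwargs)
    value = safe.get(key)
    if isinstance(value, (str, bytes)) and len(value) > limit:
        suffix = "..." if isinstance(value, str) else b"..."
        safe[key] = value[:limit] + suffix
    return safe


kwargs = {"data": "x" * 2000, "timeout": 5}
logged = truncated_for_log(kwargs)
print(len(logged["data"]))  # 503 (500 characters plus "...")
print(len(kwargs["data"]))  # 2000, the original is unchanged
```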

PS: Yes, I do know about the elasticsearch Python module (a "thin wrapper" around requests). All the above is for illustration purposes.