内容简介:使用python requests上传文件时, 报OverflowError: string longer than 2147483647 bytes 错误.问题代码
问题
简述
使用python requests上传文件时, 报
OverflowError: string longer than 2147483647 bytes 错误.
detail
问题代码
data = {} with open("bigfile", "rb") as f: r = requests.post(PUBLISH_URL, data=data, files={"xxx": f})
traceback
Traceback (most recent call last): File "test.py", line 52, in <module> main() File "test.py", line 49, in main publish() File "test.py", line 41, in publish r = requests.post(PUBLISH_URL, data=cfg, files={file_key: ("./test.apk", f)}) File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 116, in post return request('post', url, data=data, json=json, **kwargs) File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 60, in request return session.request(method=method, url=url, **kwargs) File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 533, in request resp = self.send(prep, **send_kwargs) File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 646, in send r = adapter.send(request, **kwargs) File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 449, in send timeout=timeout File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 600, in urlopen chunked=chunked) File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 354, in _make_request conn.request(method, url, **httplib_request_kw) File "/usr/lib/python2.7/httplib.py", line 1057, in request self._send_request(method, url, body, headers) File "/usr/lib/python2.7/httplib.py", line 1097, in _send_request self.endheaders(body) File "/usr/lib/python2.7/httplib.py", line 1053, in endheaders self._send_output(message_body) File "/usr/lib/python2.7/httplib.py", line 897, in _send_output self.send(msg) File "/usr/lib/python2.7/httplib.py", line 873, in send self.sock.sendall(data) File "/usr/lib/python2.7/ssl.py", line 743, in sendall v = self.send(data[count:]) File "/usr/lib/python2.7/ssl.py", line 709, in send v = self._sslobj.write(data) OverflowError: string longer than 2147483647 bytes
分析过程
requests 将file obj 全部读入内存了
没想到requests的实现这么粗暴, 直接file.read(), 实际也是如此, 发送大文件时, 内存快速上涨. 代码如下:
requests/models.py
@staticmethod def _encode_files(files, data): """Build the body for a multipart/form-data request. Will successfully encode files when passed as a dict or a list of tuples. Order is retained if data is a list of tuples but arbitrary if parameters are supplied as a dict. The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype) or 4-tuples (filename, fileobj, contentype, custom_headers). """ if (not files): raise ValueError("Files must be provided.") elif isinstance(data, basestring): raise ValueError("Data must not be a string.") new_fields = [] fields = to_key_val_list(data or {}) files = to_key_val_list(files or {}) for field, val in fields: if isinstance(val, basestring) or not hasattr(val, '__iter__'): val = [val] for v in val: if v is not None: # Don't call str() on bytestrings: in Py3 it all goes wrong. if not isinstance(v, bytes): v = str(v) new_fields.append( (field.decode('utf-8') if isinstance(field, bytes) else field, v.encode('utf-8') if isinstance(v, str) else v)) for (k, v) in files: # support for explicit filename ft = None fh = None if isinstance(v, (tuple, list)): if len(v) == 2: fn, fp = v elif len(v) == 3: fn, fp, ft = v else: fn, fp, ft, fh = v else: fn = guess_filename(v) or k fp = v if isinstance(fp, (str, bytes, bytearray)): fdata = fp elif hasattr(fp, 'read'): fdata = fp.read() # 这里将所有文件都读入内存 elif fp is None: continue else: fdata = fp rf = RequestField(name=k, data=fdata, filename=fn, headers=fh) rf.make_multipart(content_type=ft) new_fields.append(rf) body, content_type = encode_multipart_formdata(new_fields) return body, content_type
官方文档推荐使用requests-toolbelt
https://2.python-requests.org...
In the event you are posting a very large file as a multipart/form-data request, you may want to stream the request. By default, requests does not support this, but there is a separate package which does - requests-toolbelt. You should read the toolbelt’s documentation for more details about how to use it.
使用requests-toolbelt的写法
from requests_toolbelt import MultipartEncoder data = {} with open("bigfile", "rb") as f: data["xxx"] = ("filename", f) m = MultipartEncoder(fields=data) r = requests.post(PUBLISH_URL, data=m, headers={'Content-Type': m.content_type})
总结
requests的发送文件的实现十分粗暴, 会直接读全部文件内容到内存再sign, ssl sign大于2GB会报错, 官方文档推荐使用requests-toolbelt上传大文件.
分块上传当然也是一个方案(如果服务器支持).
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
猜你喜欢:- acme.sh 续期问题(路径问题)
- 缓存的一些问题和一些加密算法【缓存问题】
- 如何把设计问题转化为数学问题(方法论)
- 推荐系统中的冷启动问题和探索利用问题
- GraphQL 教程(六)—— N+1问题和缓存等问题
- Golang 并发问题(四)之单核上的并发问题
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
Don't Make Me Think
Steve Krug / New Riders Press / 18 August, 2005 / $35.00
Five years and more than 100,000 copies after it was first published, it's hard to imagine anyone working in Web design who hasn't read Steve Krug's "instant classic" on Web usability, but people are ......一起来看看 《Don't Make Me Think》 这本书的介绍吧!