delta/python-packages/kafka-python.git - github.com: mumrah/kafka-python.git

	Commit message	Author	Age	Files	Lines
*	LZ4 support in kafka 0.8/0.9 does not accept a ContentSize header	Dana Powers	2017-03-14	1	-6/+14
\|
*	Prefer python-lz4 over lz4f if available	Dana Powers	2017-03-14	1	-7/+32
\|
*	Free lz4 decompression context to avoid leak	Dana Powers	2017-03-14	1	-0/+1
\|
*	Vendor six 1.10.0 [six]	Dana Powers	2016-08-01	1	-2/+4
\|
*	Use standard LZ4 framing for v1 messages / kafka 0.10 (#695)	Dana Powers	2016-05-22	1	-7/+23
	* LZ4 framing fixed in 0.10 / message v1 -- retain broken lz4 code for compatibility
	* lz4f does not support easy incremental decompression - raise RuntimeError
	* Update lz4 codec tests
*	Handle broken LZ4 framing; switch to lz4tools + xxhash [lz4_fixup]	Dana Powers	2016-01-26	1	-7/+51
\|
*	Prefer module imports (io.BytesIO)	Dana Powers	2016-01-25	1	-5/+5
\|
*	python-snappy does not like buffer-slices on pypy...	Dana Powers	2016-01-25	1	-2/+12
\|
*	Ignore pylint errors on buffer/memoryview	Dana Powers	2016-01-25	1	-0/+2
\|
*	Python3 does not support buffer -- use memoryview in snappy_decode	Dana Powers	2016-01-25	1	-2/+8
\|
*	Don't need context manager for BytesIO	Dana Powers	2016-01-25	1	-22/+18
\|
*	Write xerial-formatted snappy by default; use buffers to reduce copies	Dana Powers	2016-01-25	1	-22/+16
\|
*	Add support for LZ4 compressed messages using python-lz4 module	Dana Powers	2016-01-25	1	-0/+13
\|
*	Docstring updates	Dana Powers	2016-01-07	1	-13/+19
\|
*	allow to specify compression level for codecs which support this	trbs	2015-09-12	1	-2/+5
\|
*	Take the linter to kafka/codec.py	Dana Powers	2015-03-09	1	-11/+10
\|
*	Gzip context manager not supported in py2.6, so use try/finally instead	Dana Powers	2015-03-09	1	-2/+17
\|
*	Use context managers in gzip_encode / gzip_decode	Dana Powers	2015-03-08	1	-12/+7
\|
*	Make all unit tests pass on py3.3/3.4	Bruno Renié	2014-09-03	1	-8/+11
\|
*	Make it possible to read and write xerial snappy	Greg Bowyer	2014-02-19	1	-3/+95
	Fixes mumrah/kafka-python#126

	TL;DR
	=====

	This makes it possible to read and write snappy compressed streams that are compatible with the Java and Scala Kafka clients (the xerial blocking format).

	Xerial Details
	==============

	Kafka supports transparent compression of messages, both in transit and at rest. One of the allowable compression algorithms is Google's snappy, an algorithm which has excellent performance at the cost of compression efficiency. The specific implementation of snappy used in Kafka is xerial-snappy, a readily available Java library for snappy. As part of this implementation, there is a specialised blocking format that is somewhat non-standard in the snappy world.

	Xerial Format
	-------------

	The blocking mode of the xerial snappy library is fairly simple, using a magic header to identify itself and then a size + block scheme. Unless otherwise noted, all items in xerial's blocking format are assumed to be big-endian.

	A block size (```xerial_blocksize``` in implementation) controls how frequently the blocking occurs; 32k is the default in the xerial library. This blocking controls the size of the uncompressed chunks that will be fed to snappy to be compressed.

	The format winds up being

	| Header   | Block1 len | Block1 data  | Blockn len | Blockn data  |
	| -------- | ---------- | ------------ | ---------- | ------------ |
	| 16 bytes | BE int32   | snappy bytes | BE int32   | snappy bytes |

	It is important to note that the blocksize is the amount of uncompressed data presented to snappy at each block, whereas the blocklen is the number of bytes that will be present in the stream, that is, the length will always be <= blocksize.

	Xerial blocking header
	----------------------

	| Marker | Magic String | Null / Pad | Version  | Compat   |
	| ------ | ------------ | ---------- | -------- | -------- |
	| byte   | c-string     | byte       | int32    | int32    |
	| -126   | 'SNAPPY'     | \0         | variable | variable |

	The pad appears to be there to ensure that SNAPPY is a valid cstring, and to align the header on a word boundary.

	The version is the version of this format as written by xerial. In the wild this is currently 1, so we only support v1.

	Compat is there to claim the minimum supported version that can read a xerial block stream. Presently in the wild this is 1.

	Implementation specific details
	===============================

	The implementation presented here follows the Xerial implementation as of its v1 blocking format; no attempts are made to check for future versions. Since non-xerial-aware clients might have persisted snappy compressed messages to Kafka brokers, we allow clients to turn on xerial compatibility for message sending, and perform header sniffing to detect xerial vs plain snappy payloads.
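The framing described in the commit message above can be sketched in a few lines of Python. This is a minimal illustration of the header layout and the size + block scheme, not the code from this commit: the function names are hypothetical, the compressor is passed in as a callable, and the demo at the end uses `zlib` purely as a stand-in so the sketch runs without python-snappy installed.

```python
import struct
import zlib  # stand-in compressor for the demo only; real payloads use snappy

# Header layout from the commit message: marker byte (-126), 'SNAPPY',
# NUL pad, then big-endian int32 version and compat.
# 1 + 6 + 1 + 4 + 4 = 16 bytes. All names here are hypothetical.
_HEADER = struct.Struct("!b6sbii")
_MAGIC = (-126, b"SNAPPY", 0)


def build_header(version=1, compat=1):
    """Pack the 16-byte xerial blocking header."""
    return _HEADER.pack(-126, b"SNAPPY", 0, version, compat)


def looks_like_xerial(payload):
    """Header sniffing: True if payload starts with the xerial magic."""
    if len(payload) < _HEADER.size:
        return False
    marker, magic, pad, _version, _compat = _HEADER.unpack_from(payload)
    return (marker, magic, pad) == _MAGIC


def frame_blocks(data, compress, blocksize=32 * 1024):
    """Split data into blocksize chunks; emit header, then len + block pairs."""
    out = [build_header()]
    for start in range(0, len(data), blocksize):
        block = compress(data[start:start + blocksize])
        out.append(struct.pack("!i", len(block)))  # BE int32 block length
        out.append(block)
    return b"".join(out)


def unframe_blocks(payload, decompress):
    """Reverse of frame_blocks: walk the len + block pairs after the header."""
    if not looks_like_xerial(payload):
        raise ValueError("payload does not start with a xerial header")
    pos = _HEADER.size
    chunks = []
    while pos < len(payload):
        (length,) = struct.unpack_from("!i", payload, pos)
        pos += 4
        chunks.append(decompress(payload[pos:pos + length]))
        pos += length
    return b"".join(chunks)


if __name__ == "__main__":
    original = b"kafka" * 20000
    framed = frame_blocks(original, zlib.compress)
    assert unframe_blocks(framed, zlib.decompress) == original
```

Passing the compressor as an argument keeps the framing logic separate from the codec, so the same walk over length-prefixed blocks applies whichever snappy binding supplies the actual compress and decompress functions.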
*	Split fixtures out to a separate file	Ivan Pouzyrevsky	2013-06-07	1	-3/+3
\|
*	Beautify codec.py	Ivan Pouzyrevsky	2013-06-07	1	-24/+21
\|
*	Refactor and update integration tests	Ivan Pouzyrevsky	2013-06-07	1	-0/+7
\|
*	PEP8-ify most of the files	Mahendra M	2013-05-29	1	-2/+6
	consumer.py and conn.py will be done later after pending merges
*	Add Snappy support [0.1-alpha]	David Arthur	2012-11-16	1	-0/+17
	Fixes #2
*	Moved codec stuff into its own module	David Arthur	2012-10-02	1	-0/+23
	Snappy will go there when I get around to it