summaryrefslogtreecommitdiff
path: root/kafka
Commit message (Collapse)AuthorAgeFilesLines
...
* | Merge pull request #166 from patricklucas/teach_producers_about_compressionDana Powers2014-05-193-22/+59
|\ \ | | | | | | Add 'codec' parameter to Producer
| * | Improve error handling and tests w.r.t. codecsPatrick Lucas2014-05-073-21/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add function kafka.protocol.create_message_set() that takes a list of payloads and a codec and returns a message set with the desired encoding. Introduce kafka.common.UnsupportedCodecError, raised if an unknown codec is specified. Include a test for the new function.
| * | Merge branch 'teach_producers_about_compression' into producer_compressionMark Roberts2014-05-072-20/+45
| |\ \ | | | | | | | | | | | | | | | | | | | | Conflicts: servers/0.8.0/kafka-src test/test_unit.py
| | * | Add 'codec' parameter to ProducerPatrick Lucas2014-05-032-20/+45
| | | | | | | | | | | | | | | | | | | | Adds a codec parameter to Producer.__init__ that lets the user choose a compression codec to use for all messages sent by it.
* | | | Support IPv6 hosts and networksAlexey Borzenkov2014-05-091-3/+1
|/ / /
* | | Merge branch 'master' into add_testsMark Roberts2014-05-061-1/+15
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kafka/client.py contained duplicate copies of same refactor, merged. Move test/test_integration.py changes into test/test_producer_integration. Conflicts: kafka/client.py servers/0.8.0/kafka-src test/test_integration.py
| * \ \ Merge pull request #139 from alexcb/masterDana Powers2014-05-061-1/+15
| |\ \ \ | | | | | | | | | | SimpleProducer randomization of initial round robin ordering
| | * | | added random_start param to SimpleProducer to enable/disable randomization ↵Alex Couture-Beil2014-04-011-4/+11
| | | | | | | | | | | | | | | | | | | | of the initial partition messages are published to
| | * | | Changed randomization to simply randomize the initial starting partition of ↵Alex Couture-Beil2014-04-011-3/+7
| | | | | | | | | | | | | | | | | | | | the sorted list of partition rather than completely randomizing the initial ordering before round-robin cycling the partitions
| | * | | Modified SimpleProducer to randomize the initial round robin orderingAlex Couture-Beil2014-03-111-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | of partitions to prevent the first message from always being published to partition 0.
| * | | | Fix lack of timeout support in KafkaClient and KafkaConnectionmaciejkula2014-04-162-2/+2
| | |/ / | |/| |
* | | | Attempt to fix travis build. Decrease complexity of service.py in favor of ↵Mark Roberts2014-05-064-8/+12
| | | | | | | | | | | | | | | | in memory logging. Address code review concerns
* | | | Make commit() check for errors instead of simply assert no errorMark Roberts2014-04-301-1/+1
| | | |
* | | | Make BrokerRequestError a base class, make subclasses for each broker errorMark Roberts2014-04-303-53/+113
| | | |
* | | | Various fixesMark Roberts2014-04-252-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bump version number to 0.9.1 Update readme to show supported Kafka/Python versions Validate arguments in consumer.py, add initial consumer unit test Make service kill() child processes when startup fails Add tests for util.py, fix Python 2.6 specific bug.
* | | | Fix last remaining test by making autocommit more intuitiveMark Roberts2014-04-241-1/+1
| | | |
* | | | Split out kafka version environments, default tox no longer runs any ↵Mark Roberts2014-04-231-11/+11
| | | | | | | | | | | | | | | | integration tests, make skipped integration also skip setupClass, implement rudimentary offset support in consumer.py
* | | | Fix bug in socket timeout per PR #161 by maciejkula, add testMark Roberts2014-04-191-1/+1
| | | |
* | | | Split up and speed up producer based integration testsMark Roberts2014-04-172-1/+2
| | | |
* | | | Refactor away _get_conn_for_broker. Fix bug in _get_connMark Roberts2014-04-091-13/+6
| | | |
* | | | Merge branch 'master' into add_testsMark Roberts2014-04-081-0/+4
|\ \ \ \ | |/ / /
| * | | Commit in seek if autocommitMark Roberts2014-03-271-1/+4
| | | |
| * | | Make seek(); commit(); work without commit discarding the seek changeMark Roberts2014-03-251-0/+1
| | | |
* | | | Reinstate test_integrate, make test_protocol more explicit, create testutilMark Roberts2014-04-081-1/+1
| | | |
* | | | Explicit testing of protocol errors. Make tests more explicit, and start ↵Mark Roberts2014-04-082-4/+7
|/ / / | | | | | | | | | working on intermittent failures in test_encode_fetch_request and test_encode_produc_request
* | | Merge pull request #134 from wizzat/conn_refactorv0.9.0Dana Powers2014-03-213-26/+35
|\ \ \ | | | | | | | | conn.py performance improvements, make examples work, add another example
| * \ \ Merge branch 'master' into conn_refactorMark Roberts2014-03-182-5/+5
| |\ \ \
| * \ \ \ Merge branch 'master' into conn_refactorMark Roberts2014-02-263-9/+46
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: example.py
| * | | | | Fix grammar in error stringMark Roberts2014-02-251-1/+1
| | | | | |
| * | | | | Minor refactor in conn.py, update version in __init__.py, add ErrorStringMark Roberts2014-02-253-26/+35
| | | | | |
* | | | | | Merge pull request #109 from mrtheb/developDana Powers2014-03-212-10/+29
|\ \ \ \ \ \ | |_|_|/ / / |/| | | | | TopicAndPartition fix when partition has no leader = -1
| * | | | | Merge branch 'master' into developmrtheb2014-03-175-15/+144
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: test/test_unit.py
| * | | | | | Changes based on comments by @rdiomar, plus added LeaderUnavailableError for ↵mrtheb2014-02-152-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | clarity
| * | | | | | check for broker None in send_broker_aware_request (added test for it)mrtheb2014-01-311-5/+14
| | | | | | |
| * | | | | | Merge branch 'master' into developmrtheb2014-01-312-74/+114
| |\ \ \ \ \ \ | | | |_|_|_|/ | | |/| | | |
| * | | | | | Handle cases for partition with leader=-1 (not defined)Marc Labbe2014-01-312-10/+12
| | | | | | |
| * | | | | | added mockmrtheb2014-01-181-3/+4
| | | | | | |
* | | | | | | Check against basestring instead of str in collect.hosts.Saulius Zemaitaitis2014-03-171-1/+1
| |_|/ / / / |/| | | | |
* | | | | | If a broker refuses the connection, try the nextstephenarmstrong2014-03-131-3/+3
| |_|_|_|/ |/| | | |
* | | | | nit: fixed misspellingZack Dever2014-03-031-1/+1
| |_|_|/ |/| | |
* | | | Merge pull request #122 from mrtheb/multihostsOmar2014-02-263-9/+46
|\ \ \ \ | |_|_|/ |/| | | Support for multiple hosts on KafkaClient boostrap (improves on #70)
| * | | clean up after comments from @rdiomarmrtheb2014-02-151-3/+5
| | | |
| * | | Support list (or comma-separated) of hosts (replaces host and port arguments)mrtheb2014-02-092-7/+11
| | | |
| * | | Merge branch 'master' into multihostsmrtheb2014-01-317-341/+471
| |\ \ \ | | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: kafka/client.py kafka/conn.py setup.py test/test_integration.py test/test_unit.py
| * | | Allow KafkaClient to take in a list of brokers for bootstrappingMarc Labbe2013-11-143-22/+48
| | | |
* | | | Fix version in __init__.py to match setup.pyDavid Arthur2014-02-251-1/+1
| | | |
* | | | Make it possible to read and write xerial snappyGreg Bowyer2014-02-191-3/+95
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes mumrah/kafka-python#126 TL;DR ===== This makes it possible to read and write snappy compressed streams that are compatible with the java and scala kafka clients (the xerial blocking format)) Xerial Details ============== Kafka supports transparent compression of data (both in transit and at rest) of messages, one of the allowable compression algorithms is Google's snappy, an algorithm which has excellent performance at the cost of efficiency. The specific implementation of snappy used in kafka is the xerial-snappy implementation, this is a readily available java library for snappy. As part of this implementation, there is a specialised blocking format that is somewhat none standard in the snappy world. Xerial Format ------------- The blocking mode of the xerial snappy library is fairly simple, using a magic header to identify itself and then a size + block scheme, unless otherwise noted all items in xerials blocking format are assumed to be big-endian. A block size (```xerial_blocksize``` in implementation) controls how frequent the blocking occurs 32k is the default in the xerial library, this blocking controls the size of the uncompressed chunks that will be fed to snappy to be compressed. The format winds up being | Header | Block1 len | Block1 data | Blockn len | Blockn data | | ----------- | ---------- | ------------ | ---------- | ------------ | | 16 bytes | BE int32 | snappy bytes | BE int32 | snappy bytes | It is important to not that the blocksize is the amount of uncompressed data presented to snappy at each block, whereas the blocklen is the number of bytes that will be present in the stream, that is the length will always be <= blocksize. Xerial blocking header ---------------------- Marker | Magic String | Null / Pad | Version | Compat ------ | ------------ | ---------- | -------- | -------- byte | c-string | byte | int32 | int32 ------ | ------------ | ---------- | -------- | -------- -126 | 'SNAPPY' | \0 | variable | variable The pad appears to be to ensure that SNAPPY is a valid cstring, and to align the header on a word boundary. The version is the version of this format as written by xerial, in the wild this is currently 1 as such we only support v1. Compat is there to claim the minimum supported version that can read a xerial block stream, presently in the wild this is 1. Implementation specific details =============================== The implementation presented here follows the Xerial implementation as of its v1 blocking format, no attempts are made to check for future versions. Since none-xerial aware clients might have persisted snappy compressed messages to kafka brokers we allow clients to turn on xerial compatibility for message sending, and perform header sniffing to detect xerial vs plain snappy payloads.
* | | Merge pull request #111 from rdiomar/multitopic_producersDana Powers2014-01-301-35/+44
|\ \ \ | | | | | | | | Make producers take a topic argument at send rather than init time -- fixes Issue #110, but breaks backwards compatibility with previous Producer interface.
| * | | Use TopicAndPartition when producing async messagesOmar Ghishan2014-01-271-8/+11
| | | |
| * | | Make producers take a topic argument at send rather than init timeOmar Ghishan2014-01-231-34/+40
| | |/ | |/| | | | | | | This allows a single producer to be used to send to multiple topics. See https://github.com/mumrah/kafka-python/issues/110