path: root/kafka
Commit message  (Author, Date, Files changed, Lines -/+)
...
* Resolve conflicts for #106  (Omar Ghishan, 2014-01-28, 1 file, -39/+70)
  * Add doc string for SimpleConsumer._get_message()  (Omar Ghishan, 2014-01-20, 1 file, -0/+6)
  * Make get_messages() update and commit offsets just before returning  (Omar Ghishan, 2014-01-15, 1 file, -16/+35)
  * Only use timeout if it's not None  (Omar Ghishan, 2014-01-15, 1 file, -4/+5)
  * Store fetched offsets separately  (Omar Ghishan, 2014-01-15, 1 file, -10/+14)
    Fetch requests can be repeated if we get a ConsumerFetchSizeTooSmall
    or if _fetch() is called multiple times for some reason. We don't
    want to re-fetch messages that are already in our queue, so store
    the offsets of the last enqueued messages from each partition.
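    A minimal sketch of the idea, with hypothetical names rather than the
    actual kafka-python internals: the offset used for the next fetch
    advances as messages are enqueued, while the offset used for commits
    advances only as messages are handed to the caller.

      from Queue import Queue  # stdlib "queue" in Python 3

      class ConsumerSketch(object):
          def __init__(self, partitions):
              self.queue = Queue()
              # Offsets eligible for commit: advance on delivery to caller.
              self.offsets = dict((p, 0) for p in partitions)
              # Where the next fetch should start: advance on enqueue, so a
              # repeated fetch never re-reads messages already queued.
              self.fetch_offsets = dict((p, 0) for p in partitions)

          def _enqueue(self, partition, message, next_offset):
              self.queue.put((partition, message))
              self.fetch_offsets[partition] = next_offset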
  * Fix offset increments:  (Omar Ghishan, 2014-01-15, 1 file, -16/+17)
    * Increment the offset before returning a message rather than when
      putting it in the internal queue. This prevents committing the
      wrong offsets.
    * In MultiProcessConsumer, store the offset of the next message.
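    Continuing the sketch above (illustrative names, not the real code):
    the committed offset moves only when a message is returned, never
    when it is merely queued internally.

      def get_message(self):
          partition, message = self.queue.get()
          # Increment just before returning, so a commit never covers
          # messages still sitting unread in the internal queue.
          self.offsets[partition] = message.offset + 1
          return message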
* Merge pull request #107 from rdiomar/fix_default_timeouts  (Marc Labbé, 2014-01-16, 2 files, -3/+13)
  Increase default connection timeout.
  * Change default socket timeout to 120 seconds in both the client and connection  (Omar Ghishan, 2014-01-16, 2 files, -5/+10)
  * Make the default connection timeout None  (Omar Ghishan, 2014-01-16, 1 file, -1/+6)
    This fixes the default behavior, which used to cause a socket
    timeout when waiting 10 seconds for a message to be produced.
* Merge pull request #98 from waliaashish85/dev  (Omar, 2014-01-16, 1 file, -4/+2)
  Changes for aligning code with the offset fetch and commit APIs (Kafka 0.8.1).
  * Deleting client_id from offset commit and fetch response as per Kafka trunk code  (Ashish Walia, 2014-01-13, 1 file, -2/+0)
  * Syncing offset commit and fetch api keys with Kafka trunk code  (Ashish Walia, 2014-01-13, 1 file, -2/+2)
* Merge branch 'repr' of https://github.com/mahendra/kafka-python into mahendra-repr  (mrtheb, 2014-01-14, 4 files, -1/+18)
  Conflicts:
      kafka/client.py
      kafka/consumer.py
  * Add proper string representations for each class  (Mahendra M, 2013-10-08, 4 files, -2/+19)
* Merge pull request #100 from cosbynator/no_infinite_loops_real  (Omar, 2014-01-14, 4 files, -95/+125)
  Branch fix: no infinite loops during metadata requests, invalidate
  metadata more aggressively, and introduce an exception hierarchy.
  * Throw KafkaUnavailableError when no brokers available  (Thomas Dimson, 2014-01-13, 2 files, -2/+6)
  * Exception hierarchy, invalidate more metadata on errors  (Thomas Dimson, 2014-01-13, 4 files, -95/+121)
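    A common shape for such a hierarchy (a sketch: only
    KafkaUnavailableError is named in these commits, the other class
    names are illustrative). A single base class lets callers catch
    every client error at once while still raising specific failures.

      class KafkaError(Exception):
          pass

      class KafkaUnavailableError(KafkaError):
          """No broker responded to a metadata request."""

      class FetchError(KafkaError):
          """A fetch failed and metadata should be invalidated."""

      try:
          raise KafkaUnavailableError("all brokers down")
      except KafkaError as e:
          print("kafka client error: %s" % e)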
* Remove zero-length field name in format string to work in Python 2.6  (Vadim Graboys, 2014-01-13, 1 file, -1/+1)
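  The underlying incompatibility: Python 2.6 requires explicit field
  indices in str.format(); auto-numbered {} fields arrived in 2.7.

    host, port = "localhost", 9092
    # "{}:{}".format(host, port)  # ValueError on Python 2.6:
    #                             # zero length field name in format
    print("{0}:{1}".format(host, port))  # works on 2.6 and later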
* Merge pull request #88 from rdiomar/rdiomar_changes  (Omar, 2014-01-13, 5 files, -190/+216)
  Various changes/fixes, including:
  * Allow customizing socket timeouts
  * Read the correct number of bytes from kafka
  * Guarantee reading the expected number of bytes from the socket every time
  * Remove bufsize from client and conn
  * SimpleConsumer flow changes
  * Fix some error handling
  * Add optional upper limit to consumer fetch buffer size
  * Add and fix unit and integration tests
  * Change log.error() back to log.exception()  (Omar Ghishan, 2014-01-08, 2 files, -8/+8)
  * Change BufferUnderflowError to ConnectionError in conn._read_bytes()  (Omar Ghishan, 2014-01-08, 2 files, -6/+4)
    Both errors are handled the same way when raised and caught, so this
    makes sense.
  * Remove unnecessary method  (Omar Ghishan, 2014-01-07, 1 file, -17/+8)
  * Handle dirty flag in conn.recv()  (Omar Ghishan, 2014-01-07, 1 file, -1/+3)
    * If the connection is dirty, reinit
    * If we get a BufferUnderflowError, the server could have gone away,
      so mark it dirty
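    A sketch of the dirty-connection pattern with illustrative names:
    reconnect lazily before use whenever a previous error marked the
    socket as unusable.

      import socket

      class ConnectionSketch(object):
          def __init__(self, host, port, timeout=None):
              self.host, self.port, self.timeout = host, port, timeout
              self._sock = None
              self._dirty = True  # force a connect on first use

          def _reinit(self):
              if self._sock is not None:
                  self._sock.close()
              # Reuse the same timeout as the original connection.
              self._sock = socket.create_connection(
                  (self.host, self.port), self.timeout)
              self._dirty = False

          def recv(self, size):
              if self._dirty:
                  self._reinit()
              data = self._sock.recv(size)
              if not data:
                  # The server may have gone away: mark dirty so the
                  # next call reinitializes the connection.
                  self._dirty = True
                  raise IOError("empty read from %s:%d"
                                % (self.host, self.port))
              return data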
  * Use the same timeout when reinitializing a connection  (Omar Ghishan, 2014-01-07, 1 file, -2/+3)
  * Remove unnecessary brackets  (Omar Ghishan, 2014-01-06, 1 file, -2/+2)
  * Add a limit to fetch buffer size, and actually retry requests when fetch size is too small  (Omar Ghishan, 2014-01-06, 1 file, -37/+58)
    Note: This can cause fetching a message to exceed a given timeout,
    but timeouts are not guaranteed anyway, and in this case it's the
    client's fault for not sending a big enough buffer size rather than
    the Kafka server's. This can be bad if max_fetch_size is None (no
    limit) and some message in Kafka is enormous, but that is why we
    should have some max_fetch_size.
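    A sketch of such a retry loop (the callback name is illustrative;
    ConsumerFetchSizeTooSmall is the error named in these commits):
    double the buffer on each too-small response, capped by an optional
    max_fetch_size.

      class ConsumerFetchSizeTooSmall(Exception):
          """The fetch response held only part of a message."""

      def fetch_with_retry(send_fetch_request, buffer_size,
                           max_fetch_size=None):
          while True:
              try:
                  return send_fetch_request(buffer_size)
              except ConsumerFetchSizeTooSmall:
                  if (max_fetch_size is not None
                          and buffer_size >= max_fetch_size):
                      raise  # even the maximum buffer is too small
                  buffer_size *= 2
                  if max_fetch_size is not None:
                      buffer_size = min(buffer_size, max_fetch_size)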
  * Fix client error handling  (Omar Ghishan, 2014-01-06, 1 file, -5/+17)
    This differentiates between errors that occur when sending the
    request and when receiving the response, and adds
    BufferUnderflowError handling.
  * Raise a ConnectionError when a socket.error is raised while receiving data  (Omar Ghishan, 2014-01-06, 2 files, -10/+14)
    Also, log.exception() is unhelpfully noisy. Use log.error() with
    some error details in the message instead.
  * Fix seek offset deltas  (Omar Ghishan, 2014-01-06, 1 file, -6/+0)
    We always store the offset of the next available message, so we
    shouldn't decrement the offset deltas by an extra 1 when seeking.
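    Concretely: because the stored offset already points at the next
    unread message, a relative seek applies its delta directly.

      offset = 10      # next message to be returned is #10
      offset += -1     # 9: re-read the message returned last
      # The buggy version applied an extra decrement (offset += -1 - 1),
      # landing on 8 and stepping back one message too many.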
  * Add note about questionable error handling while decoding messages  (Omar Ghishan, 2014-01-06, 1 file, -0/+8)
    Will remove once any error handling issues are resolved.
  * Add and fix comments in protocol.py  (Omar Ghishan, 2014-01-06, 1 file, -6/+10)
  * Add comments and maintain the 80-character line limit  (Omar Ghishan, 2014-01-06, 1 file, -7/+23)
  * Add iter_timeout option to SimpleConsumer: if not None, it causes the iterator to exit when reached  (Omar Ghishan, 2014-01-06, 1 file, -6/+22)
    Also put constant timeout values in pre-defined constants.
  * Add buffer_size param description to docstring  (Omar Ghishan, 2014-01-06, 1 file, -1/+2)
  * Remove SimpleConsumer queue size limit, since it can cause the iterator to block forever if it's reached  (Omar Ghishan, 2014-01-06, 1 file, -1/+1)
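    The failure mode is ordinary bounded-queue behavior: put() blocks
    once maxsize is reached, so a fetcher filling a full queue that the
    iterator never drains would hang forever.

      import Queue  # stdlib "queue" in Python 3

      q = Queue.Queue(maxsize=1)
      q.put("a")
      try:
          # With block=True and no timeout, this put() would never return.
          q.put("b", timeout=0.1)
      except Queue.Full:
          print("queue full: an untimed put() would block forever")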
  * SimpleConsumer flow changes:  (Omar Ghishan, 2014-01-06, 1 file, -112/+70)
    * Combine partition fetch requests into a single request
    * Put the messages received in a queue and update offsets
    * Grab as many messages from the queue as requested
    * When the queue is empty, request more
    * timeout param for get_messages() is the actual timeout for getting
      those messages
    * Based on https://github.com/mumrah/kafka-python/pull/74 -
      don't increase min_bytes if the consumer fetch buffer size is too
      small

    Notes:

    Change MultiProcessConsumer and _mp_consume() accordingly.

    Previously, when querying each partition separately, it was possible
    to block waiting for messages on partition 0 even if there are new
    ones in partition 1. These changes allow us to block while waiting
    for messages on all partitions, and reduce the total number of kafka
    requests.

    Use Queue.Queue for the single-process Queue instead of the already
    imported multiprocessing.Queue, because the latter doesn't seem to
    guarantee immediate availability of items after a put:

    >>> from multiprocessing import Queue
    >>> q = Queue()
    >>> q.put(1); q.get_nowait()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 152, in get_nowait
        return self.get(False)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py", line 134, in get
        raise Empty
    Queue.Empty
  * Reset consumer fields to original values rather than defaults in FetchContext  (Omar Ghishan, 2014-01-06, 1 file, -3/+5)
  * Allow None timeout in FetchContext even if block is False  (Omar Ghishan, 2014-01-06, 1 file, -4/+4)
  * Guarantee reading the expected number of bytes from the socket every time  (Omar Ghishan, 2014-01-06, 3 files, -32/+30)
    * Remove bufsize from client and conn, since it's not actually
      enforced

    Notes: This commit changes behavior a bit by raising a
    BufferUnderflowError when no data is received for the message size,
    rather than a ConnectionError. Since bufsize in the socket is not
    actually enforced, but is used by the consumer when creating
    requests, move it there until a better solution is implemented.
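    A sketch of the guarantee (illustrative function name): loop on
    recv() until exactly the requested number of bytes has arrived,
    since a single recv() may legally return fewer.

      class BufferUnderflowError(Exception):
          pass

      def read_exact(sock, size):
          data = b''
          while len(data) < size:
              chunk = sock.recv(size - len(data))
              if not chunk:
                  raise BufferUnderflowError(
                      "expected %d bytes, got %d before the socket closed"
                      % (size, len(data)))
              data += chunk
          return data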
  * Read the correct number of bytes from kafka  (Omar Ghishan, 2014-01-06, 1 file, -3/+2)
    According to the protocol documentation, the 4-byte integer at the
    beginning of a response represents the size of the payload only, not
    including those bytes. See http://goo.gl/rg5uom
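    In code form, under the same assumptions as the read_exact() sketch
    above: read the 4-byte big-endian size prefix, then exactly that
    many more bytes.

      import struct

      def read_response(sock):
          (size,) = struct.unpack('>i', read_exact(sock, 4))
          return read_exact(sock, size)  # size excludes its own 4 bytes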
  * Allow customizing socket timeouts  (Omar Ghishan, 2014-01-06, 2 files, -5/+6)
    Previously, if you tried to consume a message with a timeout greater
    than 10 seconds but didn't receive data within those 10 seconds, a
    socket.timeout exception was raised. This allows a higher socket
    timeout to be set, or even None for no timeout.
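    A minimal sketch of the socket-level knob involved (not the actual
    client API): settimeout() accepts a float or None, and None means
    recv() may block indefinitely without raising socket.timeout.

      import socket

      def connect(host, port, timeout=None):
          sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          sock.settimeout(timeout)  # None = block forever, never time out
          sock.connect((host, port))
          return sock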
* Merge pull request #66 from jcrobak/fix-import-collision  (David Arthur, 2014-01-08, 3 files, -0/+6)
  Enable absolute imports for modules using Queue.
  * Enable absolute imports for modules using Queue  (Joe Crobak, 2013-10-21, 3 files, -0/+6)
    When running on Linux with code on a case-insensitive file system,
    imports of the `Queue` module fail because Python resolves the wrong
    file (it tries a relative import of `queue.py` in the kafka
    directory). This change forces absolute imports via PEP 328.
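    The PEP 328 mechanism referenced here, as it would look at the top
    of an affected Python 2 module (Python 3 renamed the module to
    `queue` and always imports absolutely):

      from __future__ import absolute_import

      # Now guaranteed to be the stdlib module, not a sibling queue.py
      # matched case-insensitively on the file system.
      from Queue import Empty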
* Merge pull request #84 from nieksand/simpler_timeouts  (David Arthur, 2013-12-28, 1 file, -3/+3)
  Replace _send_upstream datetime logic with simpler time().
  * Replaced _send_upstream datetime logic with simpler time()  (Niek Sanders, 2013-12-25, 1 file, -3/+3)
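    A sketch of the simplification: plain float seconds from time.time()
    replace datetime/timedelta arithmetic when computing how long
    remains before a deadline.

      import time

      deadline = time.time() + 5.0  # five seconds from now
      remaining = max(0.0, deadline - time.time())
      print("%.3f seconds left" % remaining)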
* Reduce memory copies when consuming kafka responses  (Evan Klitzke, 2013-12-25, 1 file, -5/+2)
* Allow timeout to be None in SimpleConsumer.get_messages  (Zack Dever, 2013-12-12, 1 file, -1/+2)
* Ensure that the multiprocess consumer works on Windows  (Mahendra M, 2013-10-08, 1 file, -53/+63)
* Merge branch 'master' into prod-windows  (Mahendra M, 2013-10-08, 4 files, -19/+49)
  Conflicts:
      kafka/producer.py
  * Make changes to be more fault tolerant: clean up connections, brokers, failed_messages  (Jim Lim, 2013-10-04, 4 files, -19/+49)
    - add integration tests for sync producer
    - add integration tests for async producer with leadership election
    - use log.exception