delta/python-packages/kafka-python.git - github.com: mumrah/kafka-python.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Improve SSLWantReadError handling when ssl not availableold_ssl_exception	Dana Powers	2019-03-21	1	-3/+4
\|
*	Wrap SSL sockets after connecting (#1754)	Dana Powers	2019-03-21	1	-19/+11
\|
*	Fix race condition in protocol.send_bytes (#1752)	Filip Stefanak	2019-03-21	1	-1/+2
\|
*	Bump version for development	Dana Powers	2019-03-14	1	-1/+1
\|
*	Release 1.4.51.4.5	Dana Powers	2019-03-14	1	-1/+1
\|
*	Error if connections_max_idle_ms not larger than request_timeout_ms (#1688)	Jeff Widman	2019-03-14	1	-3/+7
\|
*	Retry bootstrapping after backoff when necessary (#1736)	Dana Powers	2019-03-14	2	-84/+89
\| \| \| \| \| \| \|	The current client attempts to bootstrap once during initialization, but if it fails there is no second attempt and the client will be inoperable. This can happen, for example, if an entire cluster is down at the time a long-running client starts execution. This commit attempts to fix this by removing the synchronous bootstrapping from `KafkaClient` init, and instead merges bootstrap metadata with the cluster metadata. The Java client uses a similar approach. This allows us to continue falling back to bootstrap data when necessary throughout the life of a long-running consumer or producer. Fix #1670
*	Fix default protocol parser version	Dana Powers	2019-03-13	1	-0/+3
\|
*	Minor updates to client_async.py	Dana Powers	2019-03-13	1	-4/+4
\|
*	Recheck connecting nodes sooner when refreshing metadata (#1737)	Dana Powers	2019-03-13	1	-3/+1
\|
*	Don't recheck version if api_versions data is already cached (#1738)	Dana Powers	2019-03-13	1	-0/+3
\| \| \|	I noticed during local testing that version probing was happening twice when connecting to newer broker versions. This was because we call check_version() once explicitly, and then again implicitly within get_api_versions(). But once we have _api_versions data cached, we can just return it and avoid probing versions a second time.
*	Attempt to join heartbeat thread during close() (#1735)	Dana Powers	2019-03-13	1	-3/+6
\| \| \| \| \| \| \|	Underlying issue here is a race on consumer.close() between the client, the connections/sockets, and the heartbeat thread. Although the heartbeat thread is signaled to close, we do not block for it. So when we go on to close the client and its underlying connections, if the heartbeat is still doing work it can cause errors/crashes if it attempts to access the now closed objects (selectors and/or sockets, primarily). So this commit adds a blocking thread join to the heartbeat close. This may cause some additional blocking time on consumer.close() while the heartbeat thread finishes. But it should be small in average case and in the worst case will be no longer than the heartbeat_timeout_ms (though if we timeout the join, race errors may still occur). Fix #1666
*	1701 use last offset from fetch v4 if available (#1724)	Keith So	2019-03-13	3	-0/+28
\|
*	Catch thrown OSError by python 3.7 when creating a connection (#1694)	Daniel Johansson	2019-03-12	1	-0/+3
\|
*	Ignore lookup_coordinator result in commit_offsets_async (#1712)	Faqa	2019-03-12	1	-1/+2
\|
*	Synchronize puts to KafkaConsumer protocol buffer during async sends	Dana Powers	2019-03-12	1	-21/+36
\|
*	Do network connections and writes in KafkaClient.poll() (#1729)	Dana Powers	2019-03-08	4	-49/+76
\| \| \| \| \| \|	* Add BrokerConnection.send_pending_requests to support async network sends * Send network requests during KafkaClient.poll() rather than in KafkaClient.send() * Dont acquire lock during KafkaClient.send if node is connected / ready * Move all network connection IO into KafkaClient.poll()
*	Do not require client lock for read-only operations (#1730)	Dana Powers	2019-03-06	1	-50/+50
\| \| \|	In an effort to reduce the surface area of lock coordination, and thereby hopefully reduce lock contention, I think we can remove locking from the read-only KafkaClient methods: connected, is_disconnected, in_flight_request_count, and least_loaded_node . Given that the read data could change after the lock is released but before the caller uses it, the value of acquiring a lock here does not seem high to me.
*	Make NotEnoughReplicasError/NotEnoughReplicasAfterAppendError retriable (#1722)	le-linh	2019-03-03	1	-0/+2
\|
*	Remove unused import	Jeff Widman	2019-01-28	1	-1/+0
\|
*	Improve KafkaConsumer join group / only enable Heartbeat Thread during ↵	Dana Powers	2019-01-15	1	-11/+23
\| \| \| \|	stable group (#1695)
*	Remove unused `skip_double_compressed_messages`	Jeff Widman	2019-01-13	2	-16/+0
\| \| \| \| \| \| \| \| \| \|	This `skip_double_compressed_messages` flag was added in https://github.com/dpkp/kafka-python/pull/755 in order to fix https://github.com/dpkp/kafka-python/issues/718. However, grep'ing through the code, it looks like it this is no longer used anywhere and doesn't do anything. So removing it.
*	Timeout all unconnected conns (incl SSL) after request_timeout_ms	Dana Powers	2019-01-13	1	-6/+8
\|
*	Fix `AttributeError` caused by `getattr()`	Jeff Widman	2019-01-07	1	-1/+2
\| \| \| \| \| \| \|	`getattr(object, 'x', object.y)` will evaluate the default argument `object.y` regardless of whether `'x'` exists. For details see: https://stackoverflow.com/q/31443989/770425
*	Fix SSL connection testing in Python 3.7	Ben Weir	2019-01-03	1	-0/+7
\|
*	Fix response error checking in KafkaAdminClient send_to_controller	Dana Powers	2019-01-03	2	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Previously we weren't accounting for when the response tuple also has a `error_message` value. Note that in Java, the error fieldname is inconsistent: - `CreateTopicsResponse` / `CreatePartitionsResponse` uses `topic_errors` - `DeleteTopicsResponse` uses `topic_error_codes` So this updates the `CreateTopicsResponse` classes to match. The fix is a little brittle, but should suffice for now.
*	#1681 add copy() in metrics() to avoid thread safety issues (#1682)	Tosi Émeric	2018-12-27	2	-4/+4
\|
*	Bugfix: Types need identity comparison	Jeff Widman	2018-12-13	1	-1/+1
\| \| \|	`isinstance()` won't work here, as the types require identity comparison.
*	Bump version for development	Dana Powers	2018-11-20	1	-1/+1
\|
*	Release 1.4.41.4.4	Dana Powers	2018-11-20	1	-1/+1
\|
*	Rename KafkaAdmin to KafkaAdminClient	Jeff Widman	2018-11-20	3	-20/+20
\|
*	Break KafkaClient poll if closed	Dana Powers	2018-11-20	1	-0/+2
\|
*	Add protocols for {Describe,Create,Delete} Acls	Ulrik Johansson	2018-11-19	1	-0/+185
\|
*	Bugfix: Always set this_groups_coordinator_id	Jeff Widman	2018-11-19	1	-1/+3
\|
*	Various docstring / pep8 / code hygiene cleanups	Jeff Widman	2018-11-18	1	-71/+86
\|
*	Fix describe_groups	Jeff Widman	2018-11-18	1	-13/+50
\| \| \| \| \| \| \| \| \| \| \| \| \|	This was completely broken previously because it didn't lookup the group coordinator of the consumer group. Also added basic error handling/raising. Note: I added the `group_coordinator_id` as an optional kwarg. As best I can tell, the Java client doesn't include this and instead looks it up every time. However, if we add this, it allows the caller the flexibility to bypass the network round trip of the lookup if for some reason they already know the `group_coordinator_id`.
*	Set a clear default value for `validate_only`/`include_synonyms`	Jeff Widman	2018-11-18	1	-8/+8
\| \| \| \| \| \| \|	Set a clear default value for `validate_only` / `include_synonyms` Previously the kwarg defaulted to `None`, but then sent a `False` so this makes it more explicit and reduces ambiguity.
*	Fix list_consumer_groups() to query all brokers	Jeff Widman	2018-11-18	1	-5/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, this only queried the controller. In actuality, the Kafka protocol requires that the client query all brokers in order to get the full list of consumer groups. Note: The Java code (as best I can tell) doesn't allow limiting this to specific brokers. And on the surface, this makes sense... you typically don't care about specific brokers. However, the inverse is true... consumer groups care about knowing their group coordinator so they don't have to repeatedly query to find it. In fact, a Kafka broker will only return the groups that it's a coordinator for. While this is an implementation detail that is not guaranteed by the upstream broker code, and technically should not be relied upon, I think it very unlikely to change. So monitoring scripts that fetch the offsets or describe the consumers groups of all groups in the cluster can simply issue one call per broker to identify all the coordinators, rather than having to issue one call per consumer group. For an ad-hoc script this doesn't matter, but for a monitoring script that runs every couple of minutes, this can be a big deal. I know in the situations where I will use this, this matters more to me than the risk of the interface unexpectedly breaking.
*	Add list_consumer_group_offsets()	Jeff Widman	2018-11-18	2	-1/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support fetching the offsets of a consumer group. Note: As far as I can tell (the Java code is a little inscrutable), the Java AdminClient doesn't allow specifying the `coordinator_id` or the `partitions`. But I decided to include them because they provide a lot of additional flexibility: 1. allowing users to specify the partitions allows this method to be used even for older brokers that don't support the OffsetFetchRequest_v2 2. allowing users to specify the coordinator ID gives them a way to bypass a network round trip. This method will frequently be used for monitoring, and if you've got 1,000 consumer groups that are being monitored once a minute, that's ~1.5M requests a day that are unnecessarily duplicated as the coordinator doesn't change unless there's an error.
*	Fix send to controller	Jeff Widman	2018-11-18	1	-44/+89
\| \| \| \| \| \| \| \| \|	The controller send error handling was completely broken. It also pinned the metadata version unnecessarily. Additionally, several of the methods were sending to the controller but either that was unnecessary, or just plain wrong. So updated following the pattern of the Java Admin client.
*	Add group coordinator lookup	Jeff Widman	2018-11-18	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need a way to send a request to the group coordinator. I spent a day and a half trying to implement a `_send_request_to_group_coordinator()` that included: 1. caching the value of the group coordinator so that it wouldn't have to be repeatedly looked up on every call. This is particularly important because the `list_consumer_groups()`, `list_consumer_group_offsets()`, and `describe_consumer_groups()` will frequently be used by monitoring scripts. I know across the production clusters that I support, using a cached value will save ~1M calls per day. 2. clean and consistent error handling. This is difficult because the responses are inconsistent about error codes. Some have a top-level error code, some bury it within the description of the actual item. 3. Avoiding tight coupling between this method and the request/response classes... the custom parsing logic for errors etc, given that it's non-standard, should live in the callers, not here. So finally I gave up and just went with this simpler solution and made it so the callers can optionally bypass this if they somehow already know the group coordinator.
*	Remove support for api versions as strings from KafkaAdmin	Jeff Widman	2018-11-18	1	-10/+0
\| \| \| \| \|	This is a new class, so let's not support the old version strings and saddle ourselves with tech debt right from the get-go.
*	Be explicit with tuples for %s formatting	Jeff Widman	2018-11-18	22	-38/+38
\| \| \| \|	Fix #1633
*	Stop using broker-errors for client-side problems	Jeff Widman	2018-11-18	3	-38/+44
\| \| \| \| \| \| \| \| \| \| \| \| \|	`UnsupportedVersionError` is intended to indicate a server-side error: https://github.com/dpkp/kafka-python/blob/ba7372e44ffa1ee49fb4d5efbd67534393e944db/kafka/errors.py#L375-L378 So we should not be raising it for client-side errors. I realize that semantically this seems like the appropriate error to raise. However, this is confusing when debugging... for a real-life example, see https://github.com/Parsely/pykafka/issues/697. So I strongly feel that server-side errors should be kept separate from client-side errors, even if all the client is doing is proactively protecting against hitting a situation where the broker would return this error.
*	Use TypeError for invalid type	Jeff Widman	2018-11-17	1	-1/+1
\|
*	raising logging level on messages signalling data loss (#1553)	Alexander Sibiryakov	2018-11-10	1	-2/+3
\|
*	set socket timeout for the wake_w (#1577)	flaneur	2018-11-10	2	-0/+6
\|
*	(Attempt to) Fix deadlock between consumer and heartbeat (#1628)	Dana Powers	2018-11-10	2	-4/+2
\|
*	Fix typo	Jeff Widman	2018-11-07	1	-1/+1
\|
*	Pre-compile pack/unpack function calls	billyevans	2018-10-29	1	-13/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed that pack/unpack functions from https://github.com/dpkp/kafka-python/blob/master/kafka/protocol/types.py might be slightly improved. I made pre-compilation for them. It gives about 10% better performance compared to the current implementation. Consumption of 100msg: ``` 239884 0.187 0.000 0.287 0.000 types.py:18(_unpack) # new version 239884 0.192 0.000 0.323 0.000 types.py:17(_unpack) ``` I also made some profiling for producers/consumers. It gives about 1-1.5% time savings.