| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
|
|
|
| |
Cleanup the formatting, remove parens, extraneous spaces, etc.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was completely broken previously because it didn't lookup the group
coordinator of the consumer group. Also added basic error
handling/raising.
Note:
I added the `group_coordinator_id` as an optional kwarg. As best I
can tell, the Java client doesn't include this and instead looks it up
every time. However, if we add this, it allows the caller the
flexibility to bypass the network round trip of the lookup if for some
reason they already know the `group_coordinator_id`.
|
|
|
|
|
|
|
| |
Set a clear default value for `validate_only` / `include_synonyms`
Previously the kwarg defaulted to `None`, but then sent a `False` so this
makes it more explicit and reduces ambiguity.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, this only queried the controller. In actuality, the Kafka
protocol requires that the client query all brokers in order to get the
full list of consumer groups.
Note: The Java code (as best I can tell) doesn't allow limiting this to
specific brokers. And on the surface, this makes sense... you typically
don't care about specific brokers.
However, the inverse is true... consumer groups care about knowing their
group coordinator so they don't have to repeatedly query to find it.
In fact, a Kafka broker will only return the groups that it's a
coordinator for. While this is an implementation detail that is not
guaranteed by the upstream broker code, and technically should not be
relied upon, I think it very unlikely to change.
So monitoring scripts that fetch the offsets or describe the consumers
groups of all groups in the cluster can simply issue one call per broker
to identify all the coordinators, rather than having to issue one call
per consumer group. For an ad-hoc script this doesn't matter, but for
a monitoring script that runs every couple of minutes, this can be a big
deal. I know in the situations where I will use this, this matters more
to me than the risk of the interface unexpectedly breaking.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support fetching the offsets of a consumer group.
Note: As far as I can tell (the Java code is a little inscrutable), the
Java AdminClient doesn't allow specifying the `coordinator_id` or the
`partitions`.
But I decided to include them because they provide a lot of additional
flexibility:
1. allowing users to specify the partitions allows this method to be used even for
older brokers that don't support the OffsetFetchRequest_v2
2. allowing users to specify the coordinator ID gives them a way to
bypass a network round trip. This method will frequently be used for
monitoring, and if you've got 1,000 consumer groups that are being
monitored once a minute, that's ~1.5M requests a day that are
unnecessarily duplicated as the coordinator doesn't change unless
there's an error.
|
|
|
|
|
|
|
|
|
| |
The controller send error handling was completely broken.
It also pinned the metadata version unnecessarily.
Additionally, several of the methods were sending to the controller
but either that was unnecessary, or just plain wrong. So updated
following the pattern of the Java Admin client.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need a way to send a request to the group coordinator.
I spent a day and a half trying to implement a `_send_request_to_group_coordinator()`
that included:
1. caching the value of the group coordinator so that it wouldn't have
to be repeatedly looked up on every call. This is particularly important
because the `list_consumer_groups()`, `list_consumer_group_offsets()`,
and `describe_consumer_groups()` will frequently be used by monitoring
scripts. I know across the production clusters that I support, using a
cached value will save ~1M calls per day.
2. clean and consistent error handling. This is difficult because the
responses are inconsistent about error codes. Some have a top-level
error code, some bury it within the description of the actual item.
3. Avoiding tight coupling between this method and the request/response
classes... the custom parsing logic for errors etc, given that it's
non-standard, should live in the callers, not here.
So finally I gave up and just went with this simpler solution and made
it so the callers can optionally bypass this if they somehow already
know the group coordinator.
|
|
|
|
|
| |
This is a new class, so let's not support the old version strings and
saddle ourselves with tech debt right from the get-go.
|
|
|
|
| |
Fix #1633
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`UnsupportedVersionError` is intended to indicate a server-side error:
https://github.com/dpkp/kafka-python/blob/ba7372e44ffa1ee49fb4d5efbd67534393e944db/kafka/errors.py#L375-L378
So we should not be raising it for client-side errors. I realize that
semantically this seems like the appropriate error to raise. However,
this is confusing when debugging... for a real-life example, see
https://github.com/Parsely/pykafka/issues/697. So I strongly feel that
server-side errors should be kept separate from client-side errors,
even if all the client is doing is proactively protecting against
hitting a situation where the broker would return this error.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed that pack/unpack functions from
https://github.com/dpkp/kafka-python/blob/master/kafka/protocol/types.py
might be slightly improved. I made pre-compilation for them. It gives
about 10% better performance compared to the current implementation.
Consumption of 100msg:
```
239884 0.187 0.000 0.287 0.000 types.py:18(_unpack) # new version
239884 0.192 0.000 0.323 0.000 types.py:17(_unpack)
```
I also made some profiling for producers/consumers. It gives about
1-1.5% time savings.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`random_string` now comes from `test.fixtures` and was being
transparently imported via `test.testutil` so this bypasses the
pointless indirect import.
Similarly, `kafka_version` was transparently imported by `test.testutil`
from `test.fixtures`.
Also removed `random_port()` in `test.testutil` because its unused as its been replaced
by the one in `test.fixtures`.
This is part of the pytest migration that was started back in
a1869c4be5f47b4f6433610249aaf29af4ec95e5.
|
| |
|
|
|
|
| |
This is no longer used anywhere in the codebase
|
|
|
|
|
|
|
| |
Removed some of the hardcoded values as they are now outdated, and just
pointed to where to find the current value in the code.
Also some minor wordsmithing.
|
|
|
|
| |
I missed this in my previous cleanup back in 9221fcf83528b5c3657e43636cb84c1d18025acd.
|
|
|
|
|
|
|
|
|
| |
We have many deprecation warnings in the travis logs for things that are
fixed in newer versions of `pylint` or `pylint`'s dependencies.
Note that `pylint` >= 2.0 does not support python 2, so this will result
in different versions of pylint running for python 2 vs python 3.
Personally, I am just fine with this.
|
|
|
|
| |
Temporarily workaround https://github.com/PyCQA/pylint/issues/2571 so that we can stop pinning `pylint`.
|
|
|
|
|
|
| |
Requires cluster version > 0.10.0.0, and uses new wire protocol
classes to do many things via broker connection that previously
needed to be done directly in zookeeper.
|
|
|
|
|
|
|
| |
When I was fixing urls the other day, I noticed that sphinx hadn't added
https but there was an open ticket: https://github.com/sphinx-doc/sphinx/issues/5522
Now that that is resolved, I'm updating it here.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`six.moves` is a dynamically-created namespace that doesn't actually
exist and therefore `pylint` can't statically analyze it.
By default, `pylint` is smart enough to realize that and ignore the
import errors.
However, because we vendor it, the location changes to
`kafka.vendor.six.moves` so `pylint` doesn't realize it should be
ignored.
So this explicitly ignores it.
`pylint` documentation of this feature:
http://pylint.pycqa.org/en/1.9/technical_reference/features.html?highlight=ignored-modules#id34
More background:
* https://github.com/PyCQA/pylint/issues/1640
* https://github.com/PyCQA/pylint/issues/223
|
|
|
|
|
|
|
|
|
| |
This is needed for https://github.com/dpkp/kafka-python/pull/1540
While the usage there is trivial and could probably be worked around, I'd
rather vendor it so that future code can use enums... since `enum` is
already available in the python 3 stdlib, this will be easy enough to
eventually stop vendoring whenever we finally drop python 2 support.
|
|
|
|
| |
Use vendored `six`, and also `six.moves.range` rather than `xrange`
|
|
|
|
|
| |
Snappy URL was outdated. Similarly, many of these sites now support
https.
|
|
|
|
|
|
|
|
|
| |
Bump `six` to `1.11.0`. Most changes do not affect us, but it's good to
stay up to date. Also, we will likely start vendoring `enum34` in which
case https://github.com/benjaminp/six/pull/178 is needed.
Note that this preserves the `kafka-python` customization from https://github.com/dpkp/kafka-python/pull/979
which has been submitted upstream as https://github.com/benjaminp/six/pull/176 but not yet merged.
|
| |
|
| |
|
| |
|
| |
|