summaryrefslogtreecommitdiff
path: root/src/encoding/json/decode_test.go
Commit message (Collapse)AuthorAgeFilesLines
* Revert "encoding/json: don't reuse slice elements when decoding"Daniel Martí2020-07-021-14/+1
| | | | | | | | | | | | | | | | | | | This reverts https://golang.org/cl/191783. Reason for revert: Broke too many programs which depended on the previous behavior, even when it was the opposite of what the documentation said. We can attempt to fix the original issue again for 1.16, while keeping those programs in mind. Fixes #39427. Change-Id: I7a7f24b2a594c597ef625aeff04fff29aaa88fc6 Reviewed-on: https://go-review.googlesource.com/c/go/+/240657 Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: revert "avoid work when unquoting strings, take 2"Daniel Martí2020-06-151-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts golang.org/cl/190659 and golang.org/cl/226218, minus the regression tests in the latter. The original work happened in golang.org/cl/151157, which was reverted in golang.org/cl/190909 due to a crash found by fuzzing. We tried a second time in golang.org/cl/190659, which shipped with Go 1.14. A bug was found, where strings would be mangled in certain edge cases. The fix for that was golang.org/cl/226218, which was backported into Go 1.14.4. Unfortunately, a second regression was just reported in #39555, which is a similar case of strings getting mangled when decoding under certain conditions. It would be possible to come up with another small patch to fix that edge case, but instead, let's just revert the entire optimization, as it has proved to do more harm than good. Moreover, it's hard to argue or prove that there will be no more such regressions. However, all the work wasn't for nothing. First, we learned that the way the decoder unquotes tokenized strings isn't simple; initially, we had wrongly assumed that each string was unquoted exactly once and in order. Second, we have gained a number of regression tests which will be useful to prevent the same mistakes in the future, including the test cases we add in this CL. Fixes #39555. Change-Id: I66a6919c2dd6d9789232482ba6cf3814eaa70f61 Reviewed-on: https://go-review.googlesource.com/c/go/+/237838 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Bonventre <andybons@golang.org>
* Revert "encoding/json: reuse values when decoding map elements"Daniel Martí2020-05-281-54/+0
| | | | | | | | | | | | | | | | | This reverts golang.org/cl/179337. Reason for revert: broke a few too many reasonably valid Go programs. The previous behavior was perhaps less consistent, but the docs were never very clear about when the decoder merges with existing values, versus replacing existing values altogether. Fixes #39149. Change-Id: I1c1d857709b8398969fe421aa962f6b62f91763a Reviewed-on: https://go-review.googlesource.com/c/go/+/234559 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Bonventre <andybons@golang.org>
* encoding/json: reuse values when decoding map elementsDaniel Martí2020-05-081-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we decode into a struct, each input key-value may be decoded into one of the struct's fields. Particularly, existing data isn't dropped, so that some sub-fields can be decoded into without zeroing all other data. However, decoding into a map behaved in the opposite way. Whenever a key-value was decoded, it completely replaced the previous map element. If the map contained any non-zero data in that key, it's dropped. Instead, try to reuse the existing element value if possible. If the map element type is a pointer, and the value is non-nil, we can decode directly into it. If it's not a pointer, make a copy and decode into that copy, as map element values aren't addressable. This means we have to parse and convert the map element key before the value, to be able to obtain the existing element value. This is fine, though. Moreover, reporting errors on the key before the value follows the input order more closely. Finally, add a test to explore the four combinations, involving pointer and non-pointer, and non-zero and zero values. A table-driven test wasn't used, as each case required different checks, such as checking that the non-nil pointer case doesn't end up with a different pointer. Fixes #31924. Change-Id: I5ca40c9963a98aaf92f26f0b35843c021028dfca Reviewed-on: https://go-review.googlesource.com/c/go/+/179337 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: don't mangle strings in an edge case when decodingDaniel Martí2020-05-081-2/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The added comment contains some context. The original optimization assumed that each call to unquoteBytes (or unquote) followed its corresponding call to rescanLiteral. Otherwise, unquoting a literal might use d.safeUnquote from another re-scanned literal. Unfortunately, this assumption is wrong. When decoding {"foo": "bar"} into a map[T]string where T implements TextUnmarshaler, the sequence of calls would be as follows: 1) rescanLiteral "foo" 2) unquoteBytes "foo" 3) rescanLiteral "bar" 4) unquoteBytes "foo" (for UnmarshalText) 5) unquoteBytes "bar" Note that the call to UnmarshalText happens in literalStore, which repeats the work to unquote the input string literal. But, since that happens after we've re-scanned "bar", we're using the wrong safeUnquote field value. In the added test case, the second string had a non-zero number of safe bytes, and the first string had none since it was all non-ASCII. Thus, "safely" unquoting a number of the first string's bytes could cut a rune in half, and thus mangle the runes. A rather simple fix, without a full revert, is to only allow one use of safeUnquote per call to unquoteBytes. Each call to rescanLiteral when we have a string is soon followed by a call to unquoteBytes, so it's no longer possible for us to use the wrong index. Also add a test case from #38126, which is the same underlying bug, but affecting the ",string" option. Before the fix, the test would fail, just like in the original two issues: --- FAIL: TestUnmarshalRescanLiteralMangledUnquote (0.00s) decode_test.go:2443: Key "开源" does not exist in map: map[开���:12345开源] decode_test.go:2458: Unmarshal unexpected error: json: invalid use of ,string struct tag, trying to unmarshal "\"aaa\tbbb\"" into string Fixes #38105. For #38126. Change-Id: I761e54924e9a971a4f9eaa70bbf72014bb1476e6 Reviewed-on: https://go-review.googlesource.com/c/go/+/226218 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
* encoding/json: don't reuse slice elements when decodingDaniel Martí2020-05-071-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous behavior directly contradicted the docs that have been in place for years: To unmarshal a JSON array into a slice, Unmarshal resets the slice length to zero and then appends each element to the slice. We could use reflect.New to create a new element and reflect.Append to then append it to the destination slice, but benchmarks have shown that reflect.Append is very slow compared to the code that manually grows a slice in this file. Instead, if we're decoding into an element that came from the original backing array, zero it before decoding into it. We're going to be using the CodeDecoder benchmark, as it has a slice of struct pointers that's decoded very often. Note that we still reuse existing values from arrays being decoded into, as the documentation agrees with the existing implementation in that case: To unmarshal a JSON array into a Go array, Unmarshal decodes JSON array elements into corresponding Go array elements. The numbers with the benchmark as-is might seem catastrophic, but that's only because the benchmark is decoding into the same variable over and over again. Since the old decoder was happy to reuse slice elements, it would save a lot of allocations by not having to zero and re-allocate said elements: name old time/op new time/op delta CodeDecoder-8 10.4ms ± 1% 10.9ms ± 1% +4.41% (p=0.000 n=10+10) name old speed new speed delta CodeDecoder-8 186MB/s ± 1% 178MB/s ± 1% -4.23% (p=0.000 n=10+10) name old alloc/op new alloc/op delta CodeDecoder-8 2.19MB ± 0% 3.59MB ± 0% +64.09% (p=0.000 n=10+10) name old allocs/op new allocs/op delta CodeDecoder-8 76.8k ± 0% 92.7k ± 0% +20.71% (p=0.000 n=10+10) We can prove this by moving 'var r codeResponse' into the loop, so that the benchmark no longer reuses the destination pointer. And sure enough, we no longer see the slow-down caused by the extra allocations: name old time/op new time/op delta CodeDecoder-8 10.9ms ± 0% 10.9ms ± 1% -0.37% (p=0.043 n=10+10) name old speed new speed delta CodeDecoder-8 177MB/s ± 0% 178MB/s ± 1% +0.37% (p=0.041 n=10+10) name old alloc/op new alloc/op delta CodeDecoder-8 3.59MB ± 0% 3.59MB ± 0% ~ (p=0.780 n=10+10) name old allocs/op new allocs/op delta CodeDecoder-8 92.7k ± 0% 92.7k ± 0% ~ (all equal) I believe that it's useful to leave the benchmarks as they are now, because the decoder does reuse memory in some cases. For example, existing map elements are reused. However, subtle changes like this one need to be benchmarked carefully. Finally, add a couple of tests involving both a slice and an array of structs. Fixes #21092. Change-Id: I8b1194f25e723a31abd146fbfe9428ac10c1389d Reviewed-on: https://go-review.googlesource.com/c/go/+/191783 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: limit max nesting depthJordan Liggitt2020-02-241-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | Limit the maximum nesting depth when parsing to protect against stack overflow, permitted by https://tools.ietf.org/html/rfc7159#section-9 A nesting depth limit of 10,000 was chosen to be a conservative balance between avoiding stack overflow and avoiding impacting legitimate JSON documents. 10,000 is less than 1% of the experimental stack depth limit with the default stack size: * On 64-bit systems, the default stack limit is 1GB, which allows ~2,800,000 frames of recursive parsing * On 32-bit systems, the default stack limit is 250MB, which allows ~1,100,000 frames of recursive parsing Fixes #31789 Change-Id: I4f5a90e89dcb4ab1a957ad9d02e1fa0efafaccf6 Reviewed-on: https://go-review.googlesource.com/c/go/+/199837 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
* encoding/json: support TextUnmarshaler for map keys with string underlying typesCuong Manh Le2019-10-101-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When unmarshaling to a map, the map's key type must either be a string, an integer, or implement encoding.TextUnmarshaler. But for a user defined type, reflect.Kind will not distinguish between the static type and the underlying type. In: var x MyString = "x" t := reflect.TypeOf(x) println(t.Kind() == reflect.String) the Kind of x is still reflect.String, even though the static type of x is MyString. Moreover, checking for the map's key type is a string occurs first, so even if the map key type MyString implements encoding.TextUnmarshaler, it will be ignored. To fix the bug, check for encoding.TextUnmarshaler first. Fixes #34437 Change-Id: I780e0b084575e1dddfbb433fe03857adf71d05fb Reviewed-on: https://go-review.googlesource.com/c/go/+/200237 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
* encoding/json: validate strings when decoding into NumberLucas Bremgartner2019-09-161-0/+31
| | | | | | | | | | | | | | | Unmarshaling a string into a json.Number should first check that the string is a valid Number. If not, we should fail without decoding it. Fixes #14702 Change-Id: I286178e93df74ad63c0a852c3f3489577072cf47 GitHub-Last-Rev: fe69bb68eed06d056639f440d2daf4bb7c99013b GitHub-Pull-Request: golang/go#34272 Reviewed-on: https://go-review.googlesource.com/c/go/+/195045 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: don't indirect pointers when decoding nullRomain Baugue2019-09-101-0/+10
| | | | | | | | | | | | | | The indirect method checked the type of the child when indirecting a pointer. If the current value is a pointer and we are decoding null, we can skip this entirely and return early, avoiding the whole descent. Fixes #31776 Change-Id: Ib8b2a2357572c41f56fceac59b5a858980f3f65e Reviewed-on: https://go-review.googlesource.com/c/go/+/174699 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
* std: remove unused bits of code all over the placeDaniel Martí2019-09-021-35/+0
| | | | | | | | | | | | | | | | Some were never used, and some haven't been used for years. One exception is net/http's readerAndCloser, which was only used in a test. Move it to a test file. While at it, remove a check in regexp that could never fire; the field is an uint32, so it can never be negative. Change-Id: Ia2200f6afa106bae4034045ea8233b452f38747b Reviewed-on: https://go-review.googlesource.com/c/go/+/192621 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: fix scanner byte offset on scanEnditchyny2019-09-021-0/+35
| | | | | | | | | | | | | | | | scanEnd is delayed one byte so we decrement the scanner bytes count by 1 to ensure that this value is correct in the next call of Decode. Fixes #32399 Change-Id: I8c8698e7f95bbcf0373aceaa05319819eae9d86f GitHub-Last-Rev: 0ac25d8de23d38c7ac577faddc6983571023f561 GitHub-Pull-Request: golang/go#32598 Reviewed-on: https://go-review.googlesource.com/c/go/+/182117 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: remove unnecessary isValidNumber callDaniel Martí2019-08-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | The decoder called this function to check numbers being decoded into a json.Number. However, these can't be quoted as strings, so the tokenizer has already verified they are valid JSON numbers. Verified this by adding a test with such an input. As expected, it produces a syntax error, not the fmt.Errorf - that line could never execute. Since the only remaining non-test caller of isvalidnumber is in encode.go, move the function there. This change should slightly reduce the amount of work when decoding into json.Number, though that isn't very common nor part of any current benchmarks. Change-Id: I67a1723deb3d18d5b542d6dd35f3ae56a43f23eb Reviewed-on: https://go-review.googlesource.com/c/go/+/184817 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: fix the broken "overwriting of data" testsDaniel Martí2019-08-271-44/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because TestUnmarshal actually allocates a new value to decode into using ptr's pointer type, any existing data is thrown away. This was harmless in alomst all of the test cases, minus the "overwriting of data" ones added in 2015 in CL 12209. I spotted that nothing covered decoding a JSON array with few elements into a slice which already had many elements. I initially assumed that the code was buggy or that some code could be removed, when in fact there simply wasn't any code covering the edge case. Move those two tests to TestPrefilled, which already served a very similar purpose. Remove the map case, as TestPrefilled already has plenty of prefilled map cases. Moreover, we no longer reset an entire map when decoding, as per the godoc: To unmarshal a JSON object into a map, Unmarshal first establishes a map to use. If the map is nil, Unmarshal allocates a new map. Otherwise Unmarshal reuses the existing map, keeping existing entries. Finally, to ensure that ptr is used correctly in the future, make TestUnmarshal error if it's anything other than a pointer to a zero value. That is, the only correct use should be new(type). Don't rename the ptr field, as that would be extremely noisy and cause unwanted merge conflicts. Change-Id: I41e3ecfeae42d877ac5443a6bd622ac3d6c8120c Reviewed-on: https://go-review.googlesource.com/c/go/+/185738 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
* Revert "encoding/json: avoid work when unquoting strings"Filippo Valsorda2019-08-211-0/+2
| | | | | | | | | | | | | | | | | | This reverts CL 151157. CL 151157 introduced a crash when decoding into ",string" fields. It came with a moderate speedup, so at this stage of the release cycle let's just revert it, and reapply it in Go 1.14 with the fix in CL 190659. Also applied the test cases from CL 190659. Updates #33728 Change-Id: Ie46e2bc15224b251888580daf6b79d5865f3878e Reviewed-on: https://go-review.googlesource.com/c/go/+/190909 Run-TryBot: Andrew Bonventre <andybons@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Bonventre <andybons@golang.org>
* encoding/json: fix Unmarshal hang on recursive pointersLE Manh Cuong2019-04-301-0/+12
| | | | | | | | | | | | | | | | | | | | indirect walks down v until it gets to a non-pointer. But it does not handle the case when v is a pointer to itself, like in: var v interface{} v = &v Unmarshal(b, v) So just stop immediately if we see v is a pointer to itself. Fixes #31740 Change-Id: Ie396264119e24d70284cd9bf76dcb2050babb069 Reviewed-on: https://go-review.googlesource.com/c/go/+/174337 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: fix performance regression in the decoderDaniel Martí2019-03-181-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In golang.org/cl/145218, a feature was added where the JSON decoder would keep track of the entire path to a field when reporting an UnmarshalTypeError. However, we all failed to check if this affected the benchmarks - myself included, as a reviewer. Below are the numbers comparing the CL's parent with itself, once it was merged: name old time/op new time/op delta CodeDecoder-8 12.9ms ± 1% 28.2ms ± 2% +119.33% (p=0.002 n=6+6) name old speed new speed delta CodeDecoder-8 151MB/s ± 1% 69MB/s ± 3% -54.40% (p=0.002 n=6+6) name old alloc/op new alloc/op delta CodeDecoder-8 2.74MB ± 0% 109.39MB ± 0% +3891.83% (p=0.002 n=6+6) name old allocs/op new allocs/op delta CodeDecoder-8 77.5k ± 0% 168.5k ± 0% +117.30% (p=0.004 n=6+5) The reason why the decoder got twice as slow is because it now allocated ~40x as many objects, which puts a lot of pressure on the garbage collector. The reason is that the CL concatenated strings every time a nested field was decoded. In other words, practically every field generated garbage when decoded. This is hugely wasteful, especially considering that the vast majority of JSON decoding inputs won't return UnmarshalTypeError. Instead, use a stack of fields, and make sure to always use the same backing array, to ensure we only need to grow the slice to the maximum depth once. The original CL also introduced a bug. The field stack string wasn't reset to its original state when reaching "d.opcode == scanEndObject", so the last field in a decoded struct could leak. For example, an added test decodes a list of structs, and encoding/json before this CL would fail: got: cannot unmarshal string into Go struct field T.Ts.Y.Y.Y of type int want: cannot unmarshal string into Go struct field T.Ts.Y of type int To fix that, simply reset the stack after decoding every field, even if it's the last. Below is the original performance versus this CL. There's a tiny performance hit, probably due to the append for every decoded field, but at least we're back to the usual ~150MB/s. name old time/op new time/op delta CodeDecoder-8 12.9ms ± 1% 13.0ms ± 1% +1.25% (p=0.009 n=6+6) name old speed new speed delta CodeDecoder-8 151MB/s ± 1% 149MB/s ± 1% -1.24% (p=0.009 n=6+6) name old alloc/op new alloc/op delta CodeDecoder-8 2.74MB ± 0% 2.74MB ± 0% +0.00% (p=0.002 n=6+6) name old allocs/op new allocs/op delta CodeDecoder-8 77.5k ± 0% 77.5k ± 0% +0.00% (p=0.002 n=6+6) Finally, make all of these benchmarks report allocs by default. The decoder ones are pretty sensitive to generated garbage, so ReportAllocs would have made the performance regression more obvious. Change-Id: I67b50f86b2e72f55539429450c67bfb1a9464b67 Reviewed-on: https://go-review.googlesource.com/c/go/+/167978 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: add Path to UnmarshalTypeErrorLE Manh Cuong2019-03-051-2/+22
| | | | | | | | | | | | | | | | When parsing nested object, UnmarshalTypeError does not contain actual path to nested field in original JSON. This commit change Field to contain the full path to that field. One can get the Field name by stripping all the leading path elements. Fixes #22369 Change-Id: I6969cc08abe8387a351e3fb2944adfaa0dccad2a Reviewed-on: https://go-review.googlesource.com/c/go/+/145218 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: remove use of DeepEqual for testing errorsMarcel van Lohuizen2019-02-271-3/+13
| | | | | | | | | | | | | | Comparing errors using DeepEqual breaks if frame information is added as proposed in Issue #29934. Updates #29934. Change-Id: Ib430c9ddbe588dd1dd51314c408c74c07285e1ff Reviewed-on: https://go-review.googlesource.com/c/162179 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Damien Neil <dneil@google.com>
* encoding/json: always verify we can get a field's valueDaniel Martí2018-10-161-0/+15
| | | | | | | | | | | | | | | | | | Calling .Interface on a struct field's reflect.Value isn't always safe. For example, if that field is an unexported anonymous struct. We only descended into this branch if the struct type had any methods, so this bug had gone unnoticed for a few release cycles. Add the check, and add a simple test case. Fixes #28145. Change-Id: I02f7e0ab9a4a0c18a5e2164211922fe9c3d30f64 Reviewed-on: https://go-review.googlesource.com/c/141537 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: fix "data changed underfoot?" panicDaniel Martí2018-10-161-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given a program as follows: data := []byte(`{"F": { "a": 2, "3": 4 }}`) json.Unmarshal(data, &map[string]map[int]int{}) The JSON package should error, as "a" is not a valid integer. However, we'd encounter a panic: panic: JSON decoder out of sync - data changing underfoot? The reason was that decodeState.object would return a nil error on encountering the invalid map key string, while saving the key type error for later. This broke if we were inside another object, as we would abruptly end parsing the nested object, leaving the decoder in an unexpected state. To fix this, simply avoid storing the map element and continue decoding the object, to leave the decoder state exactly as if we hadn't seen an invalid key type. This affected both signed and unsigned integer keys, so fix both and add two test cases. Updates #28189. Change-Id: I8a6204cc3ff9fb04ed769df7a20a824c8b94faff Reviewed-on: https://go-review.googlesource.com/c/142518 Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: more tests to cover decoding edge casesDaniel Martí2018-09-121-1/+18
| | | | | | | | | | | | | The overall coverage of the json package goes up from 90.8% to 91.3%. While at it, apply two minor code simplifications found while inspecting the HTML coverage report. Change-Id: I0fba968afeedc813b1385e4bde72d93b878854d7 Reviewed-on: https://go-review.googlesource.com/134735 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: recover saved error context when unmarshallingIan Davis2018-09-061-0/+11
| | | | | | | | | | Fixes: #27464 Change-Id: I270c56fd0d5ae8787a1293029aff3072f4f52f33 Reviewed-on: https://go-review.googlesource.com/132955 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: fix UnmarshalTypeError without field and struct valuesTaesu Pyo2018-08-281-0/+17
| | | | | | | | | | | | | Fixes #26444 Fixes #27275 Change-Id: I9e8cbff79f7643ca8964c572c1a98172b6831730 GitHub-Last-Rev: 7eea2158b67ccab34b45a21e8f4289c36de02d93 GitHub-Pull-Request: golang/go#26719 Reviewed-on: https://go-review.googlesource.com/126897 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: encode struct field names ahead of timeDaniel Martí2018-08-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Struct field names are static, so we can run HTMLEscape on them when building each struct type encoder. Then, when running the struct encoder, we can select either the original or the escaped field name to write directly. When the encoder is not escaping HTML, using the original string works because neither Go struct field names nor JSON tags allow any characters that would need to be escaped, like '"', '\\', or '\n'. When the encoder is escaping HTML, the only difference is that '<', '>', and '&' are allowed via JSON struct field tags, hence why we use HTMLEscape to properly escape them. All of the above lets us encode field names with a simple if/else and WriteString calls, which are considerably simpler and faster than encoding an arbitrary string. While at it, also include the quotes and colon in these strings, to avoid three WriteByte calls in the loop hot path. Also added a few tests, to ensure that the behavior in these edge cases is not broken. The output of the tests is the same if this optimization is reverted. name old time/op new time/op delta CodeEncoder-4 7.12ms ± 0% 6.14ms ± 0% -13.85% (p=0.004 n=6+5) name old speed new speed delta CodeEncoder-4 272MB/s ± 0% 316MB/s ± 0% +16.08% (p=0.004 n=6+5) name old alloc/op new alloc/op delta CodeEncoder-4 91.9kB ± 0% 93.2kB ± 0% +1.43% (p=0.002 n=6+6) name old allocs/op new allocs/op delta CodeEncoder-4 0.00 0.00 ~ (all equal) Updates #5683. Change-Id: I6f6a340d0de4670799ce38cf95b2092822d2e3ef Reviewed-on: https://go-review.googlesource.com/122460 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* go/printer, gofmt: tuned table alignment for better resultsRobert Griesemer2018-04-041-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The go/printer (and thus gofmt) uses a heuristic to determine whether to break alignment between elements of an expression list which is spread across multiple lines. The heuristic only kicked in if the entry sizes (character length) was above a certain threshold (20) and the ratio between the previous and current entry size was above a certain value (4). This heuristic worked reasonably most of the time, but also led to unfortunate breaks in many cases where a single entry was suddenly much smaller (or larger) then the previous one. The behavior of gofmt was sufficiently mysterious in some of these situations that many issues were filed against it. The simplest solution to address this problem is to remove the heuristic altogether and have a programmer introduce empty lines to force different alignments if it improves readability. The problem with that approach is that the places where it really matters, very long tables with many (hundreds, or more) entries, may be machine-generated and not "post-processed" by a human (e.g., unicode/utf8/tables.go). If a single one of those entries is overlong, the result would be that the alignment would force all comments or values in key:value pairs to be adjusted to that overlong value, making the table hard to read (e.g., that entry may not even be visible on screen and all other entries seem spaced out too wide). Instead, we opted for a slightly improved heuristic that behaves much better for "normal", human-written code. 1) The threshold is increased from 20 to 40. This disables the heuristic for many common cases yet even if the alignment is not "ideal", 40 is not that many characters per line with todays screens, making it very likely that the entire line remains "visible" in an editor. 2) Changed the heuristic to not simply look at the size ratio between current and previous line, but instead considering the geometric mean of the sizes of the previous (aligned) lines. This emphasizes the "overall picture" of the previous lines, rather than a single one (which might be an outlier). 3) Changed the ratio from 4 to 2.5. Now that we ignore sizes below 40, a ratio of 4 would mean that a new entry would have to be 4 times bigger (160) or smaller (10) before alignment would be broken. A ratio of 2.5 seems more sensible. Applied updated gofmt to all of src and misc. Also tested against several former issues that complained about this and verified that the output for the given examples is satisfactory (added respective test cases). Some of the files changed because they were not gofmt-ed in the first place. For #644. For #7335. For #10392. (and probably more related issues) Fixes #22852. Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e
* encoding/json: avoid assuming side-effect free reflect.Value.Addr().Elem()Joe Tsai2018-03-011-4/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the following: type child struct{ Field string } type parent struct{ child } p := new(parent) v := reflect.ValueOf(p).Elem().Field(0) v.Field(0).SetString("hello") // v.Field = "hello" v = v.Addr().Elem() // v = *(&v) v.Field(0).SetString("goodbye") // v.Field = "goodbye" It would appear that v.Addr().Elem() should have the same value, and that it would be safe to set "goodbye". However, after CL 66331, any interspersed calls between Field calls causes the RO flag to be set. Thus, setting to "goodbye" actually causes a panic. That CL affects decodeState.indirect which assumes that back-to-back Value.Addr().Elem() is side-effect free. We fix that logic to keep track of the Addr() and Elem() calls and set v back to the original after a full round-trip has occured. Fixes #24152 Updates #24153 Change-Id: Ie50f8fe963f00cef8515d89d1d5cbc43b76d9f9c Reviewed-on: https://go-review.googlesource.com/97796 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: make error capture logic in recover more type safeJoe Tsai2018-02-141-0/+14
| | | | | | | | | | | | | | | | Rather than only ignoring runtime.Error panics, which are a very narrow set of possible panic values, switch it such that the json package only captures panic values that have been properly wrapped in a jsonError struct. This ensures that only intentional panics originating from the json package are captured. Fixes #23012 Change-Id: I5e85200259edd2abb1b0512ce6cc288849151a6d Reviewed-on: https://go-review.googlesource.com/94019 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: error when trying to set an embedded pointer to unexported ↵Joe Tsai2017-12-061-19/+78
| | | | | | | | | | | | | | | | | | | | | | | | | struct types This CL reverts CL 76851 and takes a different approach to #21357. The changes in encode.go and encode_test.go are reverts that rolls back the changed behavior in CL 76851 where embedded pointers to unexported struct types were unilaterally ignored in both marshal and unmarshal. Instead, these fields are handled as before with the exception that it returns an error when Unmarshal is unable to set an unexported field. The behavior of Marshal is now unchanged with regards to #21357. This policy maintains the greatest degree of backwards compatibility and avoids silently discarding data the user may have expected to be present. Fixes #21357 Change-Id: I7dc753280c99f786ac51acf7e6c0246618c8b2b1 Reviewed-on: https://go-review.googlesource.com/82135 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: always ignore embedded pointers to unexported struct typesJoe Tsai2017-11-131-0/+19
| | | | | | | | | | | | | | | | | | | CL 60410 fixes a bug in reflect that allows assignments to an embedded field of a pointer to an unexported struct type. This breaks the json package because unmarshal is now unable to assign a newly allocated struct to such fields. In order to be consistent in the behavior for marshal and unmarshal, this CL changes both marshal and unmarshal to always ignore embedded pointers to unexported structs. Fixes #21357 Change-Id: If62ea11155555e61115ebb9cfa5305caf101bde5 Reviewed-on: https://go-review.googlesource.com/76851 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* encoding/json: permit encoding uintptr as a stringIan Lance Taylor2017-11-081-19/+23
| | | | | | | | | | | Fixes #22629 Change-Id: I31e85f9faa125ee0dfd6d3c5fa89334b00d61e6e Reviewed-on: https://go-review.googlesource.com/76530 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Joe Tsai <joetsai@google.com>
* encoding/json: disallow unknown fields in DecoderIvan Bertona2017-10-311-6/+82
| | | | | | | | | | | | | | | | | | | | | Add a DisallowUnknownFields flag to Decoder. DisallowUnknownFields causes the Decoder to return an error when the the decoding destination is a struct and the input contains object keys which do not match any non-ignored, public field the destination, including keys whose value is set to null. Note: this fix has already been worked on in 27231, which seems to be abandoned. This version is a slightly simpler implementation and is up to date with the master branch. Fixes #15314 Change-Id: I987a5857c52018df334f4d1a2360649c44a7175d Reviewed-on: https://go-review.googlesource.com/74830 Reviewed-by: Joe Tsai <joetsai@google.com> Run-TryBot: Joe Tsai <joetsai@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
* all: prefer bytes.IndexByte over bytes.IndexMarvin Stenger2017-09-271-1/+1
| | | | | | | | | | | | | bytes.IndexByte can be used wherever the second argument to strings.Index is exactly one byte long, so we do that with this change. This avoids generating unnecessary string symbols/converison and saves a few calls to bytes.Index. Change-Id: If31c775790e01edfece1169e398ad6a754fb4428 Reviewed-on: https://go-review.googlesource.com/66373 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* all: spell "marshal" and "unmarshal" consistentlyDmitri Shuralyov2016-11-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tree is inconsistent about single l vs double l in those words in documentation, test messages, and one error value text. $ git grep -E '[Mm]arshall(|s|er|ers|ed|ing)' | wc -l 42 $ git grep -E '[Mm]arshal(|s|er|ers|ed|ing)' | wc -l 1694 Make it consistently a single l, per earlier decisions. This means contributors won't be confused by misleading precedence, and it helps consistency. Change the spelling in one error value text in newRawAttributes of crypto/x509 package to be consistent. This change was generated with: perl -i -npe 's,([Mm]arshal)l(|s|er|ers|ed|ing),$1$2,' $(git grep -l -E '[Mm]arshall' | grep -v AUTHORS | grep -v CONTRIBUTORS) Updates #12431. Follows https://golang.org/cl/14150. Change-Id: I85d28a2d7692862ccb02d6a09f5d18538b6049a2 Reviewed-on: https://go-review.googlesource.com/33017 Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: fix decoding of null into Unmarshaler, TextUnmarshalerRuss Cox2016-10-171-32/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Define behavior for Unmarshal of JSON null into Unmarshaler and TextUnmarshaler. Specifically, an Unmarshaler will be given the literal null and can decide what to do (because otherwise json.RawMessage is impossible to implement), and a TextUnmarshaler will be skipped over (because there is no text to unmarshal), like most other inappropriate types. Document this in Unmarshal, with a reminder in UnmarshalJSON about handling null. 2. Test all this. 3. Fix the TextUnmarshaler case, which was returning an unmarshalling error, to match the definition. 4. Fix the error that had been used for the TextUnmarshaler, since it was claiming that there was a JSON string when in fact the problem was NOT having a string. 5. Adjust time.Time and big.Int's UnmarshalJSON to ignore null, as is conventional. Fixes #9037. Change-Id: If78350414eb8dda712867dc8f4ca35a9db041b0c Reviewed-on: https://go-review.googlesource.com/30944 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: handle misspelled JSON literals in ,stringRuss Cox2016-10-131-0/+14
| | | | | | | | | | Fixes #15146. Change-Id: I229611b9cc995a1391681c492c4d742195c787ea Reviewed-on: https://go-review.googlesource.com/30943 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: add struct and field name to UnmarshalTypeError messageJirka Daněk2016-10-051-7/+35
| | | | | | | | | | | | | | The UnmarshalTypeError has two new fields Struct and Field, used when constructing the error message. Fixes #6716. Change-Id: I67da171480a9491960b3ae81893770644180f848 Reviewed-on: https://go-review.googlesource.com/18692 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: use standard ES6 formatting for numbers during marshalRuss Cox2016-10-051-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Change float32/float64 formatting to use non-exponential form for a slightly wider range, to more closely match ES6 JSON.stringify and other JSON generators. Most notably: 1e20 now formats as 100000000000000000000 (previously 1e+20) 1e-6 now formats as 0.000001 (previously 1e-06) 1e-7 now formats as 1e-7 (previously 1e-07) This also brings the int64 and float64 formatting in line with each other, for all shared representable values. For example both int64(1234567) and float64(1234567) now format as "1234567", where before the float64 formatted as "1.234567e+06". The only variation now compared to ES6 JSON.stringify is that Go continues to encode negative zero as "-0", not "0", so that the value continues to be preserved during JSON round trips. Fixes #6384. Fixes #14135. Change-Id: Ib0e0e009cd9181d75edc0424a28fe776bcc5bbf8 Reviewed-on: https://go-review.googlesource.com/30371 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: additional tests and fixes for []typedByte encoding/decodingRuss Cox2016-05-241-9/+203
| | | | | | | | | | | | | | | | | | | | | | CL 19725 changed the encoding of []typedByte to look for typedByte.MarshalJSON and typedByte.MarshalText. Previously it was handled like []byte, producing a base64 encoding of the underlying byte data. CL 19725 forgot to look for (*typedByte).MarshalJSON and (*typedByte).MarshalText, as the marshaling of other slices would. Add test and fix for those. This CL also adds tests that the decoder can handle both the old and new encodings. (This was true even in Go 1.6, which is the only reason we can consider this not an incompatible change.) For #13783. Change-Id: I7cab8b6c0154a7f2d09335b7fa23173bcf856c37 Reviewed-on: https://go-review.googlesource.com/23294 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Gerrand <adg@golang.org>
* encoding/json: support maps with integer keysCaleb Spare2016-05-101-1/+90
| | | | | | | | | | | | | | | | | | | | This change makes encoding and decoding support integer types in map keys, converting to/from JSON string keys. JSON object keys are still sorted lexically, even though the keys may be integer strings. For backwards-compatibility, the existing Text(Un)Marshaler support for map keys (added in CL 20356) does not take precedence over the default encoding for string types. There is no such concern for integer types, so integer map key encoding is only used as a fallback if the map key type is not a Text(Un)Marshaler. Fixes #12529. Change-Id: I7e68c34f9cd19704b1d233a9862da15fabf0908a Reviewed-on: https://go-review.googlesource.com/22060 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json: allow non-string type keys for (un-)marshalAugusto Roman2016-04-051-18/+44
| | | | | | | | | | | | | | | | This CL allows JSON-encoding & -decoding maps whose keys are types that implement encoding.TextMarshaler / TextUnmarshaler. During encode, the map keys are marshaled upfront so that they can be sorted. Fixes #12146 Change-Id: I43809750a7ad82a3603662f095c7baf75fd172da Reviewed-on: https://go-review.googlesource.com/20356 Run-TryBot: Caleb Spare <cespare@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* encoding/json, internal/testenv: use FatalfDominik Honnef2016-03-221-1/+1
| | | | | | Change-Id: I64dd09e76d811000a914776fdad47808e3895690 Reviewed-on: https://go-review.googlesource.com/20989 Reviewed-by: Dave Cheney <dave@cheney.net>
* encoding/json: use reflect.SetBytes when decoding bytesHåvard Haugen2015-11-141-0/+21
| | | | | | | | | | | | | This allows slices of custom types with byte as underlying type to be decoded, fixing a regression introduced in CL 9371. Fixes #12921. Change-Id: I62a715eaeaaa912b6bc599e94f9981a9ba5cb242 Reviewed-on: https://go-review.googlesource.com/16303 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: check for exported fields in embedded structsMarcel van Lohuizen2015-10-261-2/+14
| | | | | | | | | | Addresses issue #12367. Must be checked in before CL 14010. Change-Id: I7233c3a62d4f55d0ac7e8a87df5fc4ee7beb7207 Reviewed-on: https://go-review.googlesource.com/14011 Reviewed-by: Russ Cox <rsc@golang.org>
* encoding/json: spell "marshaling" and "unmarshaling" consistentlyAndrew Gerrand2015-09-231-2/+2
| | | | | | | | | Fixes #12431 Change-Id: I67c42bf2cd9285f471387248fd9c22a16b158349 Reviewed-on: https://go-review.googlesource.com/14150 Reviewed-by: Dmitri Shuralyov <shurcool@gmail.com> Reviewed-by: Rob Pike <r@golang.org>
* encoding/json: scanner: use byte, more consistentMarvin Stenger2015-09-211-1/+1
| | | | | | | | | | | | | The fields step and redoState of struct scanner are now defined as `func(s *scanner, c byte) int` instead of `func(s *scanner, c int) int`, since bytes are sufficient. Further changes improve the consistency in the scanner.go file. Change-Id: Ifb85f2130d728d2b936d79914d87a1f0b5c6ee7d Reviewed-on: https://go-review.googlesource.com/14801 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Andrew Gerrand <adg@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* encoding/json: revert "fix decoding of JSON null values"Russ Cox2015-07-301-75/+1
| | | | | | | | | | | | Fixes #11912. Fixes #11937. This reverts commit 1a99ba55df902a2657d1ccfc52a60024c22dba98. Change-Id: I32b76053fdabc59f28ca5bedf1b15c0baa8afae1 Reviewed-on: https://go-review.googlesource.com/12893 Reviewed-by: Didier Spezia <didier.06@gmail.com> Reviewed-by: David Crawshaw <crawshaw@golang.org>
* encoding/json: fix decoding of JSON null valuesDidier Spezia2015-07-221-1/+75
| | | | | | | | | | | | | | | | | | | | | JSON decoding currently fails for null values bound to any type which does implement the JSON Unmarshaler interface without checking for null values (such as time.Time). It also fails for types implementing the TextUnmarshaler interface. The expected behavior of the JSON decoding engine in such case is to process null by keeping the value unchanged without producing any error. Make sure null values are handled by the decoding engine itself, and never passed to the UnmarshalText or UnmarshalJSON methods. Fixes #9037 Change-Id: I261d85587ba543ef6f1815555b2af9311034d5bb Reviewed-on: https://go-review.googlesource.com/9376 Reviewed-by: Russ Cox <rsc@golang.org>
* encoding/json: document and test overwrite of slice, map during UnmarshalRuss Cox2015-07-151-0/+9
| | | | | | | | Fixes #8837. Change-Id: Iaaecbb0b324004cb74b16b764126b01315e6a16e Reviewed-on: https://go-review.googlesource.com/12209 Reviewed-by: Andrew Gerrand <adg@golang.org>
* encoding/json: fix out of phase error unmarshaling non-string into ↵Russ Cox2015-07-151-0/+25
| | | | | | | | | | TextUnmarshaler Fixes #9650. Change-Id: I45b879124691e485b86c1e99a3227032283850d2 Reviewed-on: https://go-review.googlesource.com/12208 Reviewed-by: Andrew Gerrand <adg@golang.org>