Performance Problems #158
…ization-branches. * FIX 2 forgotten test-case on resolver-URIs from split_scopes. x 1.8 faster in big referenced model.
…s_stack empty when iteration breaks (no detectable performance penalty). * Replace non-python-2.6 DefragResult with named-tuple. * Add test-case checking scopes_stack empty.
…e with hand-made stats-funcs.
Hi, I have observed the same behavior with my schema: too many calls into urllib. To have a common reference, I created the benchmarks that Julian requested, using the META_SCHEMAs along with a moderately large sample schema I'm using for my project. The new test case in my pull request, which checks that the scopes stack is left in a well-defined state, passes for all Python versions except the 'pypi' environment. It seems that the … If that is the case, then this new test case (slightly modified) should also fail with the old unoptimized code in …
Forgot the most important point: for complicated schemas, the combined optimizations improve performance by almost a factor of 2.
…_unroll_scopes' Although break_loop was not faster by itself, combined with the previous optimizations dropped even further the time: MAX FASTER x2!
…_unroll_scopes' * Mark BreakLoopException as private. Although break_loop was not faster by itself, combined with the previous optimizations dropped even further the time: MAX FASTER x2!
Is there any update on this? Currently jsonschema is a major performance bottleneck for a project I am working on. What's the status of those PRs? Is there anything I can do to help? I hope to look into the library in detail myself next week.
Unfortunately, I just haven't had the chance to really sit down and look at them :( -- at least one of them has changes that break public APIs, so those at least have to change. IIRC they also lack automated benchmarks. Beyond that, I haven't gotten a chance to vet the changes there. Certainly, including profile output if you're having performance issues would also help.
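For anyone who wants to attach profile output as suggested above, here is a minimal sketch using only the standard library (Python 3). The `workload` function is a hypothetical stand-in for a real validation run, and `profile_report` is an illustrative helper name, not part of jsonschema; swap in your own validation call.

```python
import cProfile
import io
import pstats
from urllib.parse import urljoin


def workload(n=2000):
    """Hypothetical stand-in for a validation run: many repeated urljoin calls."""
    base = "http://example.com/schemas/root.json"
    return [urljoin(base, "#/definitions/item%d" % (i % 10)) for i in range(n)]


def profile_report(func, limit=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        # Always stop profiling, even if the workload raises.
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


report = profile_report(workload)
print(report)
```

Pasting the printed table into the issue makes it easy to see whether URL handling really dominates the run.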
@Julian Alrighty. Sadly the project is closed, as it's a work thing. I suspect the other people looking into this have the right idea. If I figure anything out, I'll make a PR or comment here. Perhaps I can generate a call graph using a similar but different schema.
* Improve statistics print-outs. * Engrave timing results in benchmark docstrings.
…ing into a list when not null (instead of using a context-manager each time) Roughly x 1.5 faster
…g by keeping fragments separated from URL (and avoid redundant frag/defrag).
…g by keeping fragments separated from URL (and avoid redunant frag/defrag). Conflicts: jsonschema/tests/test_benchmarks.py issue python-jsonschema#158: Use try-finally to ensure resolver scopes_stack empty when iteration breaks (no detectable performance penalty). * Replace non-python-2.6 DefragResult with named-tuple. * Add test-case checking scopes_stack empty. Conflicts: jsonschema/tests/test_validators.py jsonschema/validators.py
* dnephin/perf_cache_resolving: Use lru_cache Remove DefragResult. Remove context manager from ref() validation. Perf improvements by using a cache. Add benchmark script. Fix test failures issue #158: TRY to speed-up scope & $ref url-handling by keeping fragments separated from URL (and avoid redunant frag/defrag). Conflicts: jsonschema/tests/test_benchmarks.py
* perf_cache_resolving: Squashed 'json/' changes from 9208016..0b657e8 Need to preserve backwards compat for RefResolvers without the new methods. Pass in caches instead of arguments. I give up. Not deprecating these for now, just not used internally. Fix base_uri backwards compatibility. Er, green doesn't work on 2.6, and make running right out of a checkout easier. Wrong docstring. Add back assertions for backwards compat. Wait wat. Remove insanity. Probably should combine these at some point, but for now move them. Really run on the installed package. Begone py.test. Remove 3.3, use pip for installs, use green here too. lxml-cffi is giving obscure errors again. Fix a non-type in the docs. Switch to vcversioner, use repoze.lru only on 2.6, and add extras_require for format. Run tests on the installed package. Newer tox is slightly saner. It's hard to be enthusiastic about tox anymore. Use lru_cache Remove DefragResult. Remove context manager from ref() validation. Perf improvements by using a cache. Add benchmark script. Fix test failures issue #158: TRY to speed-up scope & $ref url-handling by keeping fragments separated from URL (and avoid redunant frag/defrag). Conflicts: jsonschema/tests/test_benchmarks.py
73a0593 Merge pull request python-jsonschema#169 from epoberezkin/draft6-tests-ajv e2e06d7 update ajv version, test bd14545 Merge branch 'master' into draft6-tests-ajv 8758156 Merge pull request python-jsonschema#172 from epoberezkin/bignum 19a0b46 Merge pull request python-jsonschema#171 from epoberezkin/id b8165b7 draft-06: option/bignum tests exclusiveMaximum/Minimum updated f04ed0e Merge branch 'master' into draft6-tests-ajv 0931c60 test: run draft-04/06 tests with ajv da8b14e Merge pull request python-jsonschema#170 from epoberezkin/boolean 20d706c Merge pull request python-jsonschema#163 from epoberezkin/boundary-point ae72865 Merge pull request python-jsonschema#168 from korzio/patch-1 f809e51 draft-06: $id keyword 04d7e06 Add djv validator to readme e86adb2 draft-06: $ref to boolean schemas d62b754 draft-06: allOf, anyOf, oneOf keywords with boolean schemas b714a18 draft-06: not keyword with boolean schemas b6b00f9 draft-06: contains keyword with boolean schemas 9c05d18 draft-06: propertyNames keyword with boolean schemas fa99135 draft-06: patternProperties keyword with boolean schemas 5a888a6 draft-06: dependencies keyword with boolean subschemas 53858ff draft-06: items keyword with boolean schemas afd6fab Merge pull request python-jsonschema#167 from json-schema-org/revert-123-relative-ref-id 3b9b688 Revert "Test relative reference resolution when ID is not present" 77e0411 draft-06: properties keyword with boolean schemas cff24c1 Merge pull request python-jsonschema#123 from yuloh/relative-ref-id e7c1f4e Merge pull request python-jsonschema#161 from epoberezkin/items 43bfc6b Merge pull request python-jsonschema#165 from epoberezkin/test-schema 6956f20 draft-06: boolean root schema e1139a3 update test-schema to draft-04 042fae9 Merge pull request python-jsonschema#164 from epoberezkin/zero-terminated 78de3a6 draft-06: zero-terminated float is a valid integer 9d998bb draft-04: added maximum/minimum tests for boundary point b8f51ab Merge pull request 
python-jsonschema#151 from epoberezkin/exclusive-limits 951bd41 Merge pull request python-jsonschema#154 from epoberezkin/property-names b974907 Merge pull request python-jsonschema#159 from epoberezkin/empty-property-list 9b1364e Merge pull request python-jsonschema#158 from epoberezkin/contains 95b4648 Merge pull request python-jsonschema#157 from epoberezkin/const 85efafb draft-06: required and dependencies with empty property arrays 7e40f2e draft-06: exclusiveMaximum and exclusiveMinimum validation 0058644 draft-06: propertyNames validation ba6c582 draft-06: const validation 9867b96 draft-06: contains keyword validation 96fed74 draft-04/06: updated descriptions for items/additionalItems test cases 0799212 Make the URI tests consistent with themselves. 65c18d5 draft-04/06: items/additionalItems tests 12ab007 Update to the new organization. a599e1e Flip protocol-relative URI Reference (not a URI) ed3391c Merge pull request python-jsonschema#150 from handrews/draft6 555dfd3 Initialize draft6 tests from draft4. 
5363098 Merge remote-tracking branch 'pboettch/master' 2ba2657 Merge remote-tracking branch 'agebhar1/feature/java+everit-org' f7cec5a Merge remote-tracking branch 'agebhar1/feature/java+networknt' 16a8c1b Merge pull request python-jsonschema#145 from agebhar1/feature/java+json-schema-validator 6d29eb2 add `networknt/json-schema-validator` to test suite users b38690a add `everit-org/json-schema` to test suite users 5637345 update Java's `json-schema-validator` repository 5a2c708 Added 'Modern C++ JSON schema validator' 371977c Merge branch 'develop' e2618a0 Merge pull request python-jsonschema#142 from pipobscure/refOnly b24fae2 When $ref is present other keywords should be ignored cb25a4e Merge pull request python-jsonschema#134 from zloster/develop 1e9db5e Add optional check for non-ECMA 262 regular expressions efb3c89 Merge pull request python-jsonschema#127 from iainbeeston/extra-items-tests 27692ac Merge pull request python-jsonschema#133 from gavinwahl/develop fc7584e add postgres-json-schema 654cc26 Added tests for items when data contains more or less types than the schema 5fb3d9f Merge pull request python-jsonschema#125 from yuloh/ref-property 833a70c Merge pull request python-jsonschema#126 from gregsdennis/develop 302da1e Added Manatee.Json as .Net implementor. dc001c2 Add test for properties named $ref 6d366dc Test relative reference resolution 9355965 Merge pull request python-jsonschema#120 from seagreen/extract-test-schema 42ba0f9 Extract the schema for tests into a JSON file. 
2f0d6e3 Merge pull request python-jsonschema#118 from yuloh/ignores-non-objects-required cc52997 Test required ignores non objects 27ed651 Merge pull request python-jsonschema#117 from tatut/develop c9f5dcd Add Clojure json-schema to users 8549899 Merge pull request python-jsonschema#116 from yuloh/patch-1 de70a75 Add league/json-guard implementation for PHP 9723c8f Merge pull request python-jsonschema#111 from Relequestual/patch-1 9d5909a Also check string "1" is not a number in draft 4 e4ab5bd Check that a string of "1" is not a number 5dc71b8 Merge pull request python-jsonschema#108 from duksis/patch-1 6a054cd Adds ex_json_schema validator for Elixir f3d5aeb Merge branch 'develop' 5bcf11a Port to draft4. 3f7ed81 Merge pull request python-jsonschema#103 from Relequestual/patch-1 928d3b0 Fixed incorrect negative description of a sub test 62414e4 Minor shuffling. a7305a6 Merge remote-tracking branch 'legoktm/tox' into develop 5cc622c Merge pull request python-jsonschema#100 from atomiqio/develop 2f51b2e updated node.js info in README.md 8f29757 Add the README note on develop. 338eef3 Merge pull request python-jsonschema#97 from atomiqio/develop 146e291 JavaScript should written in upper camel case 7511038 Merge pull request python-jsonschema#95 from epoberezkin/develop ca28bbb added ajv validator to readme 504f776 Merge pull request python-jsonschema#93 from gelraen/readme 59207fd Add another Go implementation d14cf96 Use tox to run tests git-subtree-dir: json git-subtree-split: 73a05935f5418f4b59fb8084b4ffa6edf8a4eea0
27f8c84 Merge pull request #184 from remexre/master 8fc497e Changes draft06 tests to use draft06 metaschema. 583ecf9 Add name.json to the CLI tool's remotes. 67c7b4d Merge pull request #180 from bismark/patch-1 784198b Update erlang URL 05fdba4 Merge pull request #173 from epoberezkin/format e8ef6fd draft-06: additional format tests d545553 Merge branch 'master' into format bc4de6c Merge pull request #174 from epoberezkin/zero-term f455ecc Merge branch 'master' into zero-term 5f6abf7 draft-06: zero-terminated float test description c1b12bf Merge pull request #160 from epoberezkin/ref-tests 25836f7 update draft-06 tests: "id" -> "$id" 671bad6 fix: JavaScript test 44f6447 Merge branch 'master' into ref-tests 73a0593 Merge pull request #169 from epoberezkin/draft6-tests-ajv e2e06d7 update ajv version, test bd14545 Merge branch 'master' into draft6-tests-ajv cbe0e5b Merge branch 'master' into ref-tests 8758156 Merge pull request #172 from epoberezkin/bignum 19a0b46 Merge pull request #171 from epoberezkin/id e1e1eec draft-06: format "json-pointer" tests 1e2834c draft-06: format "uri-template" tests 21b776e draft-06: format "uri-reference" tests b8165b7 draft-06: option/bignum tests exclusiveMaximum/Minimum updated f04ed0e Merge branch 'master' into draft6-tests-ajv 0931c60 test: run draft-04/06 tests with ajv da8b14e Merge pull request #170 from epoberezkin/boolean 20d706c Merge pull request #163 from epoberezkin/boundary-point ae72865 Merge pull request #168 from korzio/patch-1 f809e51 draft-06: $id keyword 04d7e06 Add djv validator to readme e86adb2 draft-06: $ref to boolean schemas d62b754 draft-06: allOf, anyOf, oneOf keywords with boolean schemas b714a18 draft-06: not keyword with boolean schemas b6b00f9 draft-06: contains keyword with boolean schemas 9c05d18 draft-06: propertyNames keyword with boolean schemas fa99135 draft-06: patternProperties keyword with boolean schemas 5a888a6 draft-06: dependencies keyword with boolean subschemas 53858ff draft-06: items 
keyword with boolean schemas afd6fab Merge pull request #167 from json-schema-org/revert-123-relative-ref-id 3b9b688 Revert "Test relative reference resolution when ID is not present" 77e0411 draft-06: properties keyword with boolean schemas cff24c1 Merge pull request #123 from yuloh/relative-ref-id e7c1f4e Merge pull request #161 from epoberezkin/items 43bfc6b Merge pull request #165 from epoberezkin/test-schema 6956f20 draft-06: boolean root schema e1139a3 update test-schema to draft-04 042fae9 Merge pull request #164 from epoberezkin/zero-terminated 78de3a6 draft-06: zero-terminated float is a valid integer 9d998bb draft-04: added maximum/minimum tests for boundary point b8f51ab Merge pull request #151 from epoberezkin/exclusive-limits 951bd41 Merge pull request #154 from epoberezkin/property-names b974907 Merge pull request #159 from epoberezkin/empty-property-list 9b1364e Merge pull request #158 from epoberezkin/contains 95b4648 Merge pull request #157 from epoberezkin/const 85efafb draft-06: required and dependencies with empty property arrays 7e40f2e draft-06: exclusiveMaximum and exclusiveMinimum validation 0058644 draft-06: propertyNames validation ba6c582 draft-06: const validation 9867b96 draft-06: contains keyword validation 96fed74 draft-04/06: updated descriptions for items/additionalItems test cases 0799212 Make the URI tests consistent with themselves. 65c18d5 draft-04/06: items/additionalItems tests 3186761 draft-04/06: root ref "#" in remote ref 0178c74 draft-04/06: recursive refs test a884acc draft-04/06: base URI change tests 12ab007 Update to the new organization. a599e1e Flip protocol-relative URI Reference (not a URI) ed3391c Merge pull request #150 from handrews/draft6 555dfd3 Initialize draft6 tests from draft4. 
5363098 Merge remote-tracking branch 'pboettch/master' 2ba2657 Merge remote-tracking branch 'agebhar1/feature/java+everit-org' f7cec5a Merge remote-tracking branch 'agebhar1/feature/java+networknt' 16a8c1b Merge pull request #145 from agebhar1/feature/java+json-schema-validator 6d29eb2 add `networknt/json-schema-validator` to test suite users b38690a add `everit-org/json-schema` to test suite users 5637345 update Java's `json-schema-validator` repository 5a2c708 Added 'Modern C++ JSON schema validator' 371977c Merge branch 'develop' e2618a0 Merge pull request #142 from pipobscure/refOnly b24fae2 When $ref is present other keywords should be ignored cb25a4e Merge pull request #134 from zloster/develop 1e9db5e Add optional check for non-ECMA 262 regular expressions efb3c89 Merge pull request #127 from iainbeeston/extra-items-tests 27692ac Merge pull request #133 from gavinwahl/develop fc7584e add postgres-json-schema 654cc26 Added tests for items when data contains more or less types than the schema 5fb3d9f Merge pull request #125 from yuloh/ref-property 833a70c Merge pull request #126 from gregsdennis/develop 302da1e Added Manatee.Json as .Net implementor. dc001c2 Add test for properties named $ref 6d366dc Test relative reference resolution 9355965 Merge pull request #120 from seagreen/extract-test-schema 42ba0f9 Extract the schema for tests into a JSON file. 
2f0d6e3 Merge pull request #118 from yuloh/ignores-non-objects-required cc52997 Test required ignores non objects 27ed651 Merge pull request #117 from tatut/develop c9f5dcd Add Clojure json-schema to users 8549899 Merge pull request #116 from yuloh/patch-1 de70a75 Add league/json-guard implementation for PHP 9723c8f Merge pull request #111 from Relequestual/patch-1 9d5909a Also check string "1" is not a number in draft 4 e4ab5bd Check that a string of "1" is not a number 5dc71b8 Merge pull request #108 from duksis/patch-1 6a054cd Adds ex_json_schema validator for Elixir f3d5aeb Merge branch 'develop' 5bcf11a Port to draft4. 3f7ed81 Merge pull request #103 from Relequestual/patch-1 928d3b0 Fixed incorrect negative description of a sub test 62414e4 Minor shuffling. a7305a6 Merge remote-tracking branch 'legoktm/tox' into develop 5cc622c Merge pull request #100 from atomiqio/develop 2f51b2e updated node.js info in README.md 8f29757 Add the README note on develop. 338eef3 Merge pull request #97 from atomiqio/develop 146e291 JavaScript should written in upper camel case 7511038 Merge pull request #95 from epoberezkin/develop ca28bbb added ajv validator to readme 504f776 Merge pull request #93 from gelraen/readme 59207fd Add another Go implementation d14cf96 Use tox to run tests git-subtree-dir: json git-subtree-split: 27f8c840acdb3700de7173981e95ed90ef28de2e
Hey,
So I've been using jsonschema for basic JSON validation of events before processing. However, one thing I've noticed is that urlparse and urljoin are consistently among the hot spots in my code base. After tracing it down, I've found that a lot of it comes from RefResolver.resolving and RefResolver.in_scope.
It seems like a couple of optimizations could be really handy:
I'm not entirely sure how to go about handling in_scope, but in my application it seems like resolution scope always becomes the raw scope.
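One way to cut the repeated urljoin/urldefrag work is memoization, which is also the direction the referenced commits take with `lru_cache`. The following is only a sketch under assumptions: `cached_urljoin` and `cached_urldefrag` are hypothetical helper names, not jsonschema's API, and the imports are the Python 3 spellings.

```python
from functools import lru_cache
from urllib.parse import urldefrag, urljoin


@lru_cache(maxsize=1024)
def cached_urljoin(base, ref):
    """Hypothetical helper: memoize urljoin for repeated (base, ref) pairs."""
    return urljoin(base, ref)


@lru_cache(maxsize=1024)
def cached_urldefrag(url):
    """Hypothetical helper: memoize the split of a URL into (url, fragment)."""
    return urldefrag(url)


base = "http://example.com/schemas/root.json"
for _ in range(1000):
    # The same reference is resolved over and over, as happens when one
    # schema is applied to many instances.
    full = cached_urljoin(base, "definitions.json#/item")
    url, fragment = cached_urldefrag(full)

info = cached_urljoin.cache_info()
print(full, fragment, info)
```

Only the first call pays for parsing; later calls with the same arguments are dictionary lookups. The trade-off is memory for the cache, bounded here by `maxsize`.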
The next spot that seems to take up a bunch of time is iter_errors. My data produces no validation errors, so this should just be a matter of applying the schema to the object and detecting any errors. However, in several places the result seems to be wrapped in a list, which means you lose the ability to grab the first error and bail out.
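To illustrate the cost of wrapping a lazy error iterator in a list, here is a toy sketch. The `iter_errors` generator below is a stand-in written for this example, not the library's implementation, and the `CHECKS` table is made up; it only records which checks actually ran.

```python
# Made-up checks for illustration: (name, predicate) pairs.
CHECKS = [
    ("is_int", lambda v: isinstance(v, int)),
    ("positive", lambda v: v > 0),
    ("even", lambda v: v % 2 == 0),
]


def iter_errors(instance, checks, calls):
    """Toy stand-in for a lazy error iterator; logs each check it runs."""
    for name, check in checks:
        calls.append(name)
        if not check(instance):
            yield "failed: " + name


# Lazy: stop at the first error, so later checks never run.
lazy_calls = []
first_error = next(iter_errors(-3, CHECKS, lazy_calls), None)

# Eager: list() drains the generator, so every check runs.
eager_calls = []
all_errors = list(iter_errors(-3, CHECKS, eager_calls))

print(first_error, lazy_calls)
print(all_errors, eager_calls)
```

With a real validator the analogous call would be `next(validator.iter_errors(instance), None)`, which only helps if nothing inside the iterator has already materialized the errors into a list.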