# Why are identifier encoding tests failing?

I am deploying a member node, and the identifier encoding tests in the web tester are failing for some identifiers, but not others. For example:

AssertionError: http://127.0.0.1/mn/v1 Failed 1 or more identifier encoding tests
Node Test Summary for node: http://127.0.0.1/mn/v1
Test 0: OK : common-unicode-ascii-safe-ABCDEFGHIJKLMNOPQRSTUVWXYZ
Test 1: OK : common-unicode-ascii-safe-abcdefghijklmnopqrstuvwxyz
Test 2: OK : common-unicode-ascii-safe-0123456789
Test 3: OK : common-unicode-ascii-safe-:@$-_.!*()',~
Test 4: OK : common-unicode-ascii-safe-unreserved-._~
Test 5: OK : common-unicode-ascii-safe-sub-delims-$!*()',
Test 6: OK : common-unicode-ascii-safe-gen-delims-:@
Test 7: OK : common-unicode-ascii-escaped-"#<>[]^{}|
Test 8: OK : common-unicode-ascii-escaped-tomcatBlocked-\
Test 9: OK : common-unicode-ascii-escaped-tomcatBlocked-%5C
Test 10: OK : common-unicode-ascii-semi-colon-test-%3B
Test 11: OK : common-unicode-ascii-escaped-%
Test 12: OK : common-unicode-ascii-escape-anyway-+
Test 13: OK : path-unicode-ascii-safe-&=&=
Test 14: OK : path-unicode-ascii-escaped-;
Test 15: OK : path-unicode-ascii-escaped-?
Test 16: Error:: ServiceFailure: 0000: NON-D1-EXCEPTION: status: 404 response headers:
Vary = Accept-Encoding
Date = Tue, 02 Apr 2013 18:49:5...: path-unicode-ascii-escaped-/
Test 17: OK : path-unicode-ascii-escaped-%3F
Test 18: OK : path-unicode-ascii-escaped-%2F
Test 19: Error:: ServiceFailure: 0000: NON-D1-EXCEPTION: status: 404 response headers:
Vary = Accept-Encoding
Date = Tue, 02 Apr 2013 18:49:5...: path-unicode-ascii-escaped-double-//case
Test 20: Error:: ServiceFailure: 0000: NON-D1-EXCEPTION: status: 404 response headers:
Vary = Accept-Encoding
Date = Tue, 02 Apr 2013 18:49:5...: path-unicode-ascii-escaped-double-trailing//
Test 21: OK : path-unicode-ascii-escaped-double-%2F%2Fcase
Test 22: OK : path-unicode-ascii-escaped-double-trailing%2F%2F
Test 23: OK : common-unicode-bmp-1byte-escaped-¡¢£
Test 24: OK : common-unicode-bmp-2byte-escaped-䦹䦺
Test 25: OK : common-ascii-doc-example-urn:lsid:ubio.org:namebank:11815
Test 26: Error:: ServiceFailure: 0000: NON-D1-EXCEPTION: status: 404 response headers:
Vary = Accept-Encoding
Date = Tue, 02 Apr 2013 18:49:5...: path-ascii-doc-example-10.1000/182
Test 27: Error:: ServiceFailure: 0000: NON-D1-EXCEPTION: status: 404 response headers:
Vary = Accept-Encoding
Date = Tue, 02 Apr 2013 18:49:5...: path-ascii-doc-example-http://example.com/data/mydata?row=24
Test 28: OK : common-bmp-doc-example-ฉันกินกระจกได้
Test 29: OK : common-bmp-doc-example-Is_féidir_liom_ithe_gloine


I can't make sense of all of the test output. What's going on?


In general, the usual culprits are either how your code handles the identifier (for example, whether it handles Unicode characters properly) or how the web server handles the URL.

Looking at the tests above, every identifier that fails has a "/" (forward slash) in it, so the problem is in how these identifiers are interpreted by the web service. For security reasons, many web servers alter or reject URLs such as foo/../../../somewhere/outside/web/context to keep the request within its context, so that is my leading guess as to where the problem lies.
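To make the pattern concrete, here is a minimal Python sketch of how an identifier ends up in the request URL. It assumes the identifier is sent as a single percent-encoded path segment under an `/object/` endpoint; the `object_url` helper and the `safe=""` choice are illustrative assumptions, not the tester's actual encoding code.

```python
from urllib.parse import quote, unquote

# Base URL taken from the test output above.
BASE_URL = "http://127.0.0.1/mn/v1"


def object_url(pid: str) -> str:
    """Build a URL with the identifier as one path segment (hypothetical helper)."""
    # safe="" forces "/" to be escaped as %2F instead of acting as a path separator.
    return f"{BASE_URL}/object/{quote(pid, safe='')}"


for pid in ["10.1000/182", "path-unicode-ascii-escaped-%2F"]:
    print(object_url(pid))
# http://127.0.0.1/mn/v1/object/10.1000%2F182
# http://127.0.0.1/mn/v1/object/path-unicode-ascii-escaped-%252F
```

An identifier containing a literal "/" reaches the server as `%2F`, which is the sequence many servers block by default. An identifier containing the literal text `%2F` is sent as `%252F`, which contains no encoded slash, and that would explain why tests 18, 21 and 22 pass while 16, 19, 20, 26 and 27 fail.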

Checking the exception text:

ServiceFailure: 0000: NON-D1-EXCEPTION: status: 404 response headers:
Vary = Accept-Encoding
Date = Tue, 02 Apr 2013 18:49:5...: path-ascii-doc-example-10.1000/182


it seems an unexpected response was received: either the service returns non-DataONE exceptions in some cases, or the request never reached the DataONE service and the web server itself returned the 404.

Putting it all together, my hunch is that it is the latter: the web server itself is failing to pass the request on to the DataONE handler and is returning its own standard error response. I would look at how your web server is handling security for these types of requests.
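One way to check that hunch is to look at the body of the 404 response. The sketch below assumes that a DataONE exception is serialized as XML with an `<error>` root element carrying `name`, `errorCode` and `detailCode` attributes; the function name and the sample bodies are made up for illustration.

```python
import xml.etree.ElementTree as ET


def looks_like_dataone_error(body: str) -> bool:
    """Return True if body parses as XML with a DataONE-style <error> root."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML at all, e.g. a plain-text or loose HTML error page.
        return False
    required = {"name", "errorCode", "detailCode"}
    return root.tag == "error" and required <= set(root.attrib)


# Illustrative bodies only; the attribute values are assumptions.
service_body = (
    '<error name="NotFound" errorCode="404" detailCode="1020">'
    "<description>No such object</description></error>"
)
server_body = "<html><body><h1>404 Not Found</h1></body></html>"

print(looks_like_dataone_error(service_body))  # True
print(looks_like_dataone_error(server_body))   # False
```

If the 404 body for a failing identifier looks like the second case, the request most likely never reached the DataONE handler.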

If you are using Apache/Tomcat, try adding the following line to catalina.properties:

org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true


We're using mod_wsgi/Apache. We updated the Apache config to handle the forward slashes with:

AllowEncodedSlashes

Should be good.


Hope you don't mind, I changed the correct answer to the original explanation, since it seemed to address the question asked (which led you to post the fix for your apache server). I think technically, your post should have been a comment to my solution, offering extra information.

( 2013-06-12 18:06:48 -0600 )