Why does the URL in the request passed to decidePolicyForNavigationAction come back as http://localhost:8888/%E2%80%9Chttp://example.com%E2%80%9D?


Good morning SO,

I have a Mac WebView (not an iOS UIWebView) that loads a page from http://localhost:8888. The loaded HTML contains links and iframes, but when an iframe loads or a link is clicked, the main page attempts to load http://localhost:8888/%E2%80%9Chttp://example.com%E2%80%9D and then reloads the original page, http://localhost:8888. What's even stranger is that the opened link or iframe itself also tries to load http://localhost:8888/%E2%80%9Chttp://example.com%E2%80%9D.

I tried to handle this in decidePolicyForNavigationAction, but the [request URL] value already arrives in that form, and parsing it to extract the embedded URL seems far too hackish.

Any ideas as to why the WebView builds URLs this way, and how to get iframes and links to load correctly?

- (void)webView:(WebView *)webView
decidePolicyForNavigationAction:(NSDictionary *)actionInformation
        request:(NSURLRequest *)request
          frame:(WebFrame *)frame
decisionListener:(id <WebPolicyDecisionListener>)listener
{
    NSLog(@"Navigating to %@", [request URL]); // logs http://localhost:8888/%E2%80%9Chttp://example.com%E2%80%9D
    [listener use];
}

- (void)webView:(WebView *)webView
decidePolicyForNewWindowAction:(NSDictionary *)actionInformation
        request:(NSURLRequest *)request
   newFrameName:(NSString *)frameName
decisionListener:(id < WebPolicyDecisionListener >)listener {
    if ([actionInformation objectForKey:WebActionElementKey]) {
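        // Happens here also :(
        NSLog(@"Opening in browser %@", [request URL]);
        [[NSWorkspace sharedWorkspace] openURL:[request URL]];
        // The link was handed to the default browser, so tell WebKit not to open it here as well.
        [listener ignore];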
    }
}

Answers:


Note that your URLs have the escape sequences %E2%80%9C and %E2%80%9D wrapping them. These are the percent-encoded UTF-8 bytes of “ (U+201C) and ” (U+201D) respectively. Those look like ", but they are distinct characters: the "opening" and "closing" typographic double quotes.

When the webView parses the HTML and reads the attribute value after href=, it does not find the double or single quote it expects around the value. It finds this other character instead. Since a curly quote is not an HTML attribute quoting character, the webView treats it as part of the URL. Since the resulting URL begins with neither a known scheme (http://, https://, etc.) nor a root slash (/), the webView resolves it relative to your base URL (http://localhost:8888/). Finally, because a URL may only contain a limited set of ASCII characters, it percent-encodes the two non-ASCII rich quote characters. That is why they show up in your URL as %E2%80%9C and %E2%80%9D instead of “ and ”. More succinctly:

  • href=“http://example.com” is not the same as href="http://example.com"
  • the pulled URL is “http://example.com”, not http://example.com
  • the URL is relative, so it becomes http://localhost:8888/“http://example.com”
  • the URL has to be ASCII-safe, so the non-ASCII characters get encoded, resulting in http://localhost:8888/%E2%80%9Chttp://example.com%E2%80%9D
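
The chain above can be reproduced outside the WebView. The following is a minimal Python sketch, not WebKit's actual resolution algorithm: the helper names has_scheme and resolve_and_encode are hypothetical, and the "resolution" is deliberately simplified to appending the value to a base URL that points at the site root.

```python
import re
from urllib.parse import quote, unquote

BASE_URL = "http://localhost:8888/"              # page the WebView loaded
HREF_VALUE = "\u201chttp://example.com\u201d"    # what the parser reads after href=


def has_scheme(value: str) -> bool:
    """True if the value starts with a URL scheme such as http: or https:."""
    return re.match(r"[A-Za-z][A-Za-z0-9+.-]*:", value) is not None


def resolve_and_encode(base: str, href: str) -> str:
    """Simplified stand-in for URL resolution (assumes base is the site root).

    Absolute URLs pass through unchanged. Anything else is appended to the
    base and percent-encoded, leaving ':' and '/' as they are.
    """
    if has_scheme(href):
        return href
    return base + quote(href.lstrip("/"), safe=":/")


# The leading curly quote is not a valid scheme character, so no scheme is found.
print(has_scheme("http://example.com"))   # True
print(has_scheme(HREF_VALUE))             # False

# Same URL the delegate method logs in the question.
print(resolve_and_encode(BASE_URL, HREF_VALUE))

# Decoding the escape sequence gives back the opening curly quote.
print(unquote("%E2%80%9C") == "\u201c")   # True
```

The point of the sketch is only that one stray character in front of http: is enough to turn an absolute URL into a relative one.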

Most likely, someone copied and pasted the HTML from something like a Word document or another rich text editor, which often turns plain double quotes into "richer" typographic ones. Replace the rich quotes with proper double or single quotes and your links will begin working as expected.
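
If the HTML is large, the replacement can be scripted. Below is a rough Python sketch under a few assumptions: the function name straighten_attribute_quotes is made up for illustration, only curly double quotes directly after name= are rewritten, and curly quotes in visible text are left alone. It is a regex pass rather than a real HTML parser, so review the result before relying on it.

```python
import re

RICH_QUOTES = "\u201c\u201d"  # opening and closing typographic double quotes

# name= followed by a value wrapped in curly quotes, e.g. href=“http://example.com”
ATTR_WITH_RICH_QUOTES = re.compile(
    r'([\w-]+\s*=\s*)[' + RICH_QUOTES + r']([^' + RICH_QUOTES + r'<>"]*)[' + RICH_QUOTES + r']'
)


def straighten_attribute_quotes(html: str) -> str:
    """Replace curly double quotes around attribute values with plain ones."""
    return ATTR_WITH_RICH_QUOTES.sub(r'\1"\2"', html)


sample = "<a href=\u201chttp://example.com\u201d>She said \u201chi\u201d</a>"
fixed = straighten_attribute_quotes(sample)
print(fixed)
```

Here the href value ends up wrapped in plain double quotes, while the curly quotes inside the link text stay as they were.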