"Real" file links in Google search results?

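I often use Google to search for documents (mostly PDFs). But when I right-click a link, or just hover the mouse cursor over it, what I get is not the real link but something long and confusing, like this: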

http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCUQFjAA&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf&ei=Fai1TZq-Acugtgenw6DqDg&usg=AFQjCNFzYOTqpf68rQnuwW9K7wp39WL6Rg&sig2=z4RqvOLEEJsPohBqr1ghxQ

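I don't know what this is, but I know this nonsense is not what I want. I want the real link (for the one above: http://www.marxists.org/reference/archive/einstein/works/1910s/relative/relativity.pdf), without Google's intervention.

How do I get the "real" file link in Google search results?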

The URL is right here:

&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf

Just unescape it with some language, like Python:

>>> from urllib.parse import unquote
>>> print(unquote('http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf'))
http://www.marxists.org/reference/archive/einstein/works/1910s/relative/relativity.pdf

To extract the real URL from a Google link, here's a script:

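from urllib.parse import urlparse, parse_qs

# Ask for the Google link copied from the search results.
google_url = input('What is the Google url? ')

# parse_qs splits the query string and percent-decodes each value,
# so this also works when url= is the last parameter.
params = parse_qs(urlparse(google_url).query)
print(params['url'][0])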
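If you want to reuse the extraction rather than run it interactively, it could be wrapped in a function along these lines. This is only a sketch: the name `real_url` is made up for illustration, and the check for a `/url` path is an assumption based on the single example link in the question, not on any documented Google format.

```python
from urllib.parse import urlparse, parse_qs


def real_url(link):
    """Return the target of a Google redirect link, or the link unchanged."""
    parts = urlparse(link)
    # Assumption: redirect links use the path /url, as in the question's example.
    if parts.path != "/url":
        return link
    # parse_qs splits the query string and percent-decodes every value.
    targets = parse_qs(parts.query).get("url")
    return targets[0] if targets else link


google_link = (
    "http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCUQFjAA"
    "&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein"
    "%2Fworks%2F1910s%2Frelative%2Frelativity.pdf"
    "&ei=Fai1TZq-Acugtgenw6DqDg"
)
print(real_url(google_link))
```

Links that are not redirect links are returned untouched, so the function can be applied to anything you copy from the results page.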

When I run this search in Internet Explorer, I do indeed get this link.

But when I use Chrome, I get what you want. So it seems to be an IE feature, or at least to depend on the browser you are using. If you are in a position to change browsers, I would consider using Chrome (tested, gives the normal URL) or Opera (tested, normal URL) but not Firefox (tested, gives the funky URL).

It's a long link because Google wants to keep track of who found what and who actually clicked on a search result.

If you want the real link (the above is also a real link!), type this at your Linux prompt:

php -r "print urldecode('http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCUQFjAA&url=http%3A%2F%2Fwww.marxists.org%2Freference%2Farchive%2Feinstein%2Fworks%2F1910s%2Frelative%2Frelativity.pdf&ei=Fai1TZq-Acugtgenw6DqDg&usg=AFQjCNFzYOTqpf68rQnuwW9K7wp39WL6Rg&sig2=z4RqvOLEEJsPohBqr1ghxQ');" | awk -F'&' '/url=/{ sub(/^url=/, "", $5); print $5 }'
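One caveat with decoding the whole link before splitting on `&`: if the target URL has its own query string, its encoded ampersands become real ones and the target gets cut short. The sketch below illustrates this in Python with a made-up target (`example.com` and both function names are hypothetical, not taken from the answers here).

```python
from urllib.parse import quote, unquote

# Hypothetical target whose own query string contains an ampersand.
target = "http://example.com/doc.pdf?lang=en&page=2"
redirect = "http://www.google.com/url?sa=t&url=" + quote(target, safe="") + "&ei=abc"


def decode_then_split(link):
    # Same order of operations as the one-liner: decode first, then split on '&'.
    for field in unquote(link).split("&"):
        if field.startswith("url="):
            return field[len("url="):]
    return None


def split_then_decode(link):
    # Split the still-encoded query first, then decode only the url= value.
    query = link.partition("?")[2]
    for field in query.split("&"):
        if field.startswith("url="):
            return unquote(field[len("url="):])
    return None


print(decode_then_split(redirect))  # loses "&page=2"
print(split_then_decode(redirect))  # full target
```

For the PDF link in the question both orders give the same result, since that target has no query string of its own.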

See this tool

http://www.duvidasdeinformatica.com/blog/limpar-links-paginas-resultados-google/

It's in Portuguese, but at the bottom there is a box where you can paste the URL, and it gets "converted" to the real one.

I think I read once, while having the same frustration, that it masks the actual URLs ONLY when you're logged into your Google account and your account settings are configured for web history tracking.

If my memory serves me correctly, you could try:

  • performing the search in a separate browser window using your browser's native "private" or "incognito" browsing feature
  • simply logging out of your Google account, getting your results and logging back in
  • going to google.com/history and clicking "Pause", which prevents future web activity from being saved, then returning to the same page after grabbing your results and clicking "Resume" (if you intend to use Web History)

If you routinely want to grab multiple URLs from the results and the above technique doesn't work as I recall, you can try a Firefox add-on such as Copy Link URL. It lets you copy the URLs of the links you select, which you can then paste into a text editor and clean up by replacing the encoded elements with Find & Replace.
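As an alternative to Find & Replace in a text editor, a batch of pasted links could be decoded with a few lines of Python. This is a rough sketch, assuming one link per line and that the target always sits in a `url=` parameter as in the question's example; `clean_lines` and the sample links are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def clean_lines(text):
    """Decode every Google redirect link in a block of pasted text."""
    cleaned = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        targets = parse_qs(urlparse(line).query).get("url")
        # Lines without a url= parameter are kept as they are.
        cleaned.append(targets[0] if targets else line)
    return cleaned


# Made-up sample input: one redirect link and one plain link.
pasted = """
http://www.google.com/url?sa=t&url=http%3A%2F%2Fexample.com%2Fa.pdf&ei=x
http://example.org/b.pdf
"""
for link in clean_lines(pasted):
    print(link)
```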

Or, you could perhaps do a little research to find a website that will decode the URL for you. I found URL Deobfuscator on webtoolhub.com that does a good job of making the main / desired URL available for copy/paste by decoding the encoded characters, removing query strings, etc.

Cheers.

After a little Google searching, I ran across the Firefox add-on called LinkWalker.

Simple context menu utility for links which decodes embedded and cloaked URLs, strips off query-string parameters and converts text selections to clickable link.

Sounds like that could do the trick.

Maybe this is not the best solution, but here's one way for Chrome and Firefox that doesn't require coding or add-ons. I assume there are similar ways to do this in IE and other browsers, though IE at least will usually open PDFs in the browser with the link in the URL bar at the top, which is easy enough to copy.

  1. Click on the search result, which should download the PDF.

  2. Now open the list of recent downloads in your browser:

  • Chrome: Ctrl+J
  • Firefox on Linux(?): Ctrl+Shift+Y

  3. Now copy the link:

  • Chrome: Right-click on the URL listed beneath the name of the file and select "Copy Link Address"
  • Firefox: Right-click on the file and select "Copy Download Link"

EDIT: As of December 2020, and probably earlier, Chrome shows you a clean, copyable URL in the search results.

I'm using a Firefox extension named Google/Yandex search link fix. It works just great and allows direct copying of the link target.

From a comment on @Blender's answer, I've learned how to install a User Script in Firefox and Chrome.

Now, when right clicking and copying a URL in Google search results, I get the real link instead of that rubbish (sorry, Google, I know you love us, but we don't need no stinky tracking URLs).

At first, I used googlePrivacy as suggested by @naxa, but it is buggy nowadays. The script provided in the Web Applications SE post "Turning off Google search results indirection" does the job. It comes in User Script and Extension flavors.

Below is how to proceed with the User Script.

Installing the UserScript

In Chrome, I installed it using Tampermonkey.

And Greasemonkey in Firefox.

Results

Before the UserScript: [screenshot "ugly google"]

After: [screenshot "cool google"]


I've created a simple web site that cleans Google search result URLs:

URL Clean

URLs copied from Google search results (such as links to PDFs) are more complicated than they need to be. This tool removes the unnecessary parts, leaving the page's original URL.