Web Clipper 2.0

Hello everybody, we have something new for you to test: a new version of Bear’s web clipper!

TL;DR

The new Web Clipper doesn’t use a remote server like the old version. Instead, it extracts content right in your browser, making it faster, more private, and more reliable with password-protected or local sites. You’ll find instructions for testing in the last section.

(The old) Web Clipper 1.0

The web clipper 1.0 relies on Postlight Parser, a Node.js library that extracts meaningful content from web pages (no navigation, footers, menus, ads…). The Parser runs as an AWS Lambda function, so right now, when you click the browser extension, this is what happens:

  • The URL is fetched from the browser and transferred to Bear.
  • Bear passes the URL to the Parser and waits for the cleaned-up version of the page.
  • Bear receives the content, downloads the images, converts everything to Markdown, and generates a note.

The web clipper 1.0 has some downsides:

  • Websites protected by passwords or on local networks can’t be reached by the Parser.
  • Some websites use redirects or load parts of the page after the initial load, so the Lambda cannot reach the desired content.
  • Other websites use JavaScript libraries to load images asynchronously, so the Parser cannot discover their URLs.
  • The entire process feels less safe because a remote server is involved.

(The new) Web Clipper 2.0

With the new web clipper, the meaningful content extraction happens in the browser, using Mozilla’s Readability library on the HTML currently displayed. The extracted HTML is then transferred to Bear, which fetches the images and converts everything to Markdown. This means:

  • Websites protected by passwords or on local networks can be transferred to Bear.
  • Redirects have already happened at the moment the web clipper is invoked.
  • Images and text loaded asynchronously are already there (with some exceptions, see below).
  • Last but not least, everything is done on the device.
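
As a rough illustration of the hand-off described above, here is a minimal sketch of how an extension could package already-extracted HTML for Bear. It is not Bear's actual code: `ExtractedArticle` only mirrors two of the fields Readability's `parse()` returns (`title` and `content`), and `ClipPayload`, `collectImageUrls`, and `buildClipPayload` are hypothetical names invented for this example. Resolving image URLs against the page URL is also an assumption about what the receiving side needs in order to fetch the images.

```typescript
// Hypothetical shape: a subset of what Readability's parse() returns.
interface ExtractedArticle {
  title: string;
  content: string; // cleaned article HTML
}

// Hypothetical payload format; the post does not describe the real one.
interface ClipPayload {
  url: string;
  title: string;
  html: string;
  imageUrls: string[];
}

// Collect the src of every <img> and resolve it against the page URL.
// A regex is enough for a sketch; real code would walk the DOM instead.
function collectImageUrls(html: string, pageUrl: string): string[] {
  const urls: string[] = [];
  // \ssrc (not \bsrc) so that attributes like data-src are skipped.
  const imgSrc = /<img\b[^>]*?\ssrc\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = imgSrc.exec(html)) !== null) {
    const absolute = new URL(match[1], pageUrl).href;
    if (urls.indexOf(absolute) === -1) urls.push(absolute); // drop duplicates
  }
  return urls;
}

// Bundle everything the receiving app would need to build the note.
function buildClipPayload(pageUrl: string, article: ExtractedArticle): ClipPayload {
  return {
    url: pageUrl,
    title: article.title,
    html: article.content,
    imageUrls: collectImageUrls(article.content, pageUrl),
  };
}
```

Because the HTML comes from the page as the browser currently shows it, anything the site loaded after login or after a redirect is already part of `article.content` by the time a function like this runs.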

Limitations

  • Some websites load images and text on scroll. Those remain a problem unless the page is scrolled to the bottom before invoking the extension.
  • Both the old and the new clipper have some custom behavior for some specific URLs. Adding/modifying those behaviors now requires an extension update.
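
To picture why changing a site-specific behavior now needs an extension update, here is a small sketch of a rule table bundled with the extension. Everything in it is an assumption made for illustration: the `SiteRule` shape, the `BUNDLED_RULES` entries, and `findSiteRule` are hypothetical, and the post does not say how Bear actually stores or applies its custom behaviors.

```typescript
// Hypothetical rule shape, invented for this sketch.
interface SiteRule {
  hostSuffix: string;      // matches the host itself and any subdomain
  contentSelector: string; // CSS selector of the element to clip
}

// Rules ship inside the extension bundle, so editing this list
// means publishing a new extension version.
const BUNDLED_RULES: SiteRule[] = [
  { hostSuffix: "example.com", contentSelector: "article.post" },
  { hostSuffix: "docs.example.org", contentSelector: "#main-content" },
];

// Return the first rule whose host suffix matches the page URL, if any.
function findSiteRule(
  pageUrl: string,
  rules: SiteRule[] = BUNDLED_RULES
): SiteRule | undefined {
  const host = new URL(pageUrl).hostname.toLowerCase();
  for (const rule of rules) {
    if (host === rule.hostSuffix || host.endsWith("." + rule.hostSuffix)) {
      return rule;
    }
  }
  return undefined; // no custom behavior: use the generic extraction
}
```

With a server-side parser, a table like this could be changed in one place; once it lives in the extension, every change has to go through a new release.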

How to test

We have prepared a beta version for you to test. If you have version 1.x installed, we suggest you disable or uninstall it first.

Safari

Chrome

Firefox

  • Download this archive
  • Open about:debugging in Firefox
  • Click This Firefox
  • Click Load Temporary Add-on... and select the archive

Give it a try and let us know how it works for you. Feel free to share your feedback here with us!
The Bear Team

The Chrome download link is not working

Thanks. Links Fixed.

FYI: The “Configure Safari in macOS to run unsigned extensions” link also appears to be broken.

I tried the Chrome version, which seems to be working pretty well. It’s way faster than the previous version.

The new web clipper still doesn’t capture the byline and date of the webpage, whereas the MarkDownload extension does. Then again, almost no other web clipper does this.

Thank you for your efforts!

Here is an example of MarkDownload and Bear 2.0 downloading a website:

Archive.zip (237.0 KB)

Please delete the last paragraph and the zip file after you download it. Thanks.

Do you mean this?

Exactly!! 20 characters.

I would not want that “feature” by default unless it was at the very end or in YAML. I prefer web clipped stuff to look like the article.

If possible I would appreciate that feature

+1.

I recently saved an article. A few days later I wanted to look up the author, but it’s not saved in Bear. And the article itself went behind a paywall and doesn’t even show the author anymore. Information lost 😔

So far I haven’t had enough time to test the new web clipper. One thing is going to make testing tedious: clicking the icon in the Safari toolbar sometimes saves the whole page and sometimes just the title and URL. It depends on the website. I am forced to use the context menu.

Very thankful you guys are working on this one. I’ll give it a spin.

Yes! I would love to have this too!

Outstanding. The new one is MUCH better!