Thursday, August 12, 2010

Xmarks for Firefox, now with History Sync

Today, we're happy to announce that the newest version of our Firefox extension adds support for opt-in synchronization of browser history, rounding out a lineup that includes synchronization of bookmarks, passwords, and open tabs. This latest version also includes beta support for Firefox 4. We'll be rolling this out to the entire userbase in the coming weeks, but if you're interested in a sneak preview, you can download an early version here:



To enable History Sync, navigate to the "Sync" pane of the Xmarks Settings dialog (Tools → Xmarks Settings → Sync) and click the checkbox next to History:





Do the same thing on two or more computers running Xmarks for Firefox, and those computers will automatically and silently start exchanging history information. Easy!



If you're interested in understanding more deeply how it works and some of the design trade-offs we faced to make this happen, read on below the fold. And feel free to let us know how it works for you in the comments.





History Sync vs. Bookmark Sync





When we first started thinking about history sync, we naturally viewed it as being very similar to bookmark sync, and contemplated applying our existing Bookmark Sync Server to the task. Bookmarks and History being so similar in their structure, this seemed like an easy job. And our Bookmark Sync Server is battle-hardened and extremely reliable. Have you ever noticed that when you have a hammer every problem looks like a nail?



The problem crops up when you consider the volume of data generated by millions of users surfing and syncing their history. How many pages do you visit for each bookmark that you create? 100? 1000? Our Bookmark Sync Server is scalable, but as a small company, scaling a free service by a factor of 1000 is really out of reach for us. So we needed another approach.



Trying Not To Boil the Ocean





While there are lots of similarities between syncing bookmarks and syncing history, there are plenty of differences too. We started to explore those differences, looking for a more economical approach to the problem. The first observation is that users manage bookmarks, but the browser manages history. The browser takes all kinds of liberties with your history (e.g., creating new entries as you browse, purging old entries as they expire) that would not be acceptable when dealing with bookmarks. That yielded the first insight: since users don't really "own" their history in the same way that they own their bookmarks, absolute fidelity is not critical for history. So we can probably deliver an acceptable experience without syncing everything in your history.



Next, the advent of the Awesome Bar in Firefox has really changed the way that users interact with browser history data. Power Users (who tend to be our biggest fans) learned soon after its introduction in Firefox 3 that the Awesome Bar allows them to get to their most frequently-visited sites just by typing a few letters into the address bar. The folks at Mozilla who designed the Awesome Bar spent time developing a clever algorithm to rank items in your history so that the most useful sites appear at the top of the drop-down list when you start typing.



The algorithm is driven by "Frecency" -- a combination of the frequency with which you visit a site and the recency with which you've visited it. It tends to bubble up to the top of the drop-down list those sites that you have either visited a lot or have visited recently.
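To make the idea concrete, here is a toy score that rewards both frequency and recency. It is only an illustration and is not the formula Firefox uses: the function name `toy_frecency`, the exponential decay, and the 30-day half-life are all assumptions made for this sketch.

```python
import math

DAY = 86400.0  # seconds in a day


def toy_frecency(visit_times, now, half_life_days=30.0):
    """Hypothetical score: each visit adds a weight that halves every half_life_days."""
    decay = math.log(2) / (half_life_days * DAY)
    return sum(math.exp(-decay * (now - t)) for t in visit_times)


now = 1_000_000_000.0
daily_site = [now - i * DAY for i in range(1, 11)]  # ten visits over the last ten days
recent_once = [now - 3600.0]                        # a single visit an hour ago
old_site = [now - 300 * DAY] * 10                   # ten visits, all long ago

scores = {
    "daily_site": toy_frecency(daily_site, now),
    "recent_once": toy_frecency(recent_once, now),
    "old_site": toy_frecency(old_site, now),
}
```

With these sample inputs, the site visited daily scores highest, the site visited once an hour ago comes next, and the site with many stale visits comes last, which is the kind of ordering the drop-down list aims for.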



That leads neatly to the second insight: if you're going to sync only a subset of the entire browser history, make sure that you pick the sites that are most frecent, because those are the sites that are most likely to appear in the Awesome Bar. A corollary here is that good history sync must be consistent with frecency. That means that sync needs to accumulate visits to a site from all clients being synced so that frecency incorporates the sum total of all visits everywhere.



The final insight here is that, inasmuch as the browser purges expired items from your history as part of its regular operation, deletes don't really need to be synced. That is, I don't have to sync the deletion of old history items from one browser to another, because the other browser will probably purge those expired items on its own. That does neglect one use case that we decided wasn't so important: syncing the manual deletion of an item from history. We figured that was a rarity, and in any event, users with foresight can gain the same protection by switching the browser into private browsing mode.



So what's the result of this exploration? It's all about the Awesome Bar, and feeding it the right data so that it provides good results when users type. It's not about making sure that every change to history is synced to every other browser. We want to make sure that users can go back and forth between two computers they use regularly and get at sites that they visited recently on either computer, and also be able to set up a brand new computer and have the history get populated with the user's most frequently visited sites.



The Solution



Given these requirements, our Bookmark Server looks like overkill: it provides complete data fidelity, it versions every change that a user makes so that they can do backup & restore, etc. For this problem, we need something lighter.



Happily, we've got another tool in our shed: to deploy tab sync, we developed a fast, lightweight server that is well-matched to the problem of syncing open tabs. It turns out that it's rather well tuned for syncing history too. Here's how it works:




  • Periodically, the Xmarks addon queries Firefox's local history database, looking for the most frecent urls, a set that changes constantly as you browse. Having determined the set of frecent urls, it then finds actual visits made on this browser to those urls. It builds a compact representation of this data and sends it to the server.


  • Then it queries the server for history data from other browsers. If there is any, the addon downloads it and adds any as-yet unseen visit data to the local history database, making it appear as if you had made those visits on this browser.
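The two steps above can be sketched roughly as follows. This is a simplified model and not the addon's actual code: plain dictionaries stand in for Firefox's history database and for the server, and the names `build_upload`, `merge_download`, and `max_urls` are made up for the example.

```python
def build_upload(visits_by_url, frecency_by_url, max_urls):
    """Pick the most frecent URLs and attach this browser's visits to them."""
    top = sorted(frecency_by_url, key=frecency_by_url.get, reverse=True)[:max_urls]
    return {url: sorted(visits_by_url.get(url, ())) for url in top}


def merge_download(visits_by_url, remote):
    """Add only as-yet unseen visits to local history; return how many were added."""
    added = 0
    for url, timestamps in remote.items():
        seen = visits_by_url.setdefault(url, set())
        for ts in timestamps:
            if ts not in seen:
                seen.add(ts)
                added += 1
    return added


# Hypothetical data: URL -> set of visit timestamps made on each browser.
laptop = {"http://a.example/": {100, 200}, "http://b.example/": {150}}
laptop_frecency = {"http://a.example/": 9.0, "http://b.example/": 1.0}
desktop = {"http://a.example/": {100}}

payload = build_upload(laptop, laptop_frecency, max_urls=1)
added_first = merge_download(desktop, payload)
added_again = merge_download(desktop, payload)  # a repeat sync adds nothing new
```

Note that nothing here removes a visit: in keeping with the design described earlier, the sketch only ever adds visits and leaves expiry to each browser.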



That's it. In actual deployment, the key variable is the size of the data being pushed from each browser to the server: the larger the data, the more history can be exported from one browser to another. But more data means more bandwidth and computation (for us and you). We've currently set the limit at 32KB, which typically allows for 200 or so URLs. So far, that looks like a decent compromise, but we welcome feedback. Do you tend to find the things you're looking for in the Awesome Bar?
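As a back-of-the-envelope check, 32KB spread over 200 URLs leaves roughly 160 bytes per URL, visits included. The sketch below shows one way a client could trim its upload to fit such a budget. It assumes the entries arrive sorted most frecent first, and it uses JSON purely as a stand-in encoding; the real wire format is not described here, and `trim_to_budget` is a hypothetical helper.

```python
import json

LIMIT_BYTES = 32 * 1024            # the 32KB limit mentioned above
BYTES_PER_URL = LIMIT_BYTES // 200  # 163 bytes per URL at 200 URLs


def encoded_size(entries):
    """Size in bytes of the stand-in JSON encoding."""
    return len(json.dumps(entries, separators=(",", ":")).encode("utf-8"))


def trim_to_budget(entries, limit=LIMIT_BYTES):
    """Keep entries in order (most frecent first) for as long as the payload fits."""
    kept = []
    for entry in entries:
        if encoded_size(kept + [entry]) > limit:
            break
        kept.append(entry)
    return kept


# Hypothetical history entries, already sorted by frecency.
entries = [{"url": "http://example.com/%d" % i, "visits": [i]} for i in range(1000)]
kept = trim_to_budget(entries, limit=1024)
```

Re-encoding the whole list on every step is wasteful, but it keeps the sketch short. Raising the limit lets more URLs through at the cost of more bandwidth, which is the trade-off described above.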



In the coming weeks, look for history sync to make its way onto other platforms that we support.

18 comments:

  1. Thank you for the Firefox 4 support!

  2. Nice. That's a good addition.

    We still need something for addon's. You don't have to auto install addon's, but on new computers, I'd like to install xmarks and then get at least a list of addon's that I've installed on my other machines. It can even be a bookmarks list that is auto updated when I install or uninstall addon's. That would make for a nice fast setup of new computers.

    Thanks!

  3. Firefox 4 Beta at the end of the release. what intensity, what majesty. Thaks for the nice tips.

  4. Xmarks works like a champ on FF4!

  5. Been running History sync for a day and it still hasn't finished.

    Really looking forward to this working.

  6. @beachbum: If it takes more than a few seconds, it probably isn't working. Please contact us through support so we can get more information from you and diagnose the problem.

  7. thanks for the download, was a great help.

    justin

  8. Wow, this is big!!! sync-ing history is one of the most welcome new feature of X-Marks... Last time I have to use MozBackup whenever I want to reinstall my machine, but now I can just use X-Marks... just hope u can raise the limit to, let's say about 500 urls. Would it be an overkill for the bandwidth? thank you.

  9. Will there be a BYOS version for FF4.0b* too?

  10. History backup is great news!!! Thanks for adding this feature.

    I hope that, eventually, you will allow for larger (complete) history file synching as I rely on my history as a record of sites I have visited, almost as a secondary set of bookmarks. I read many websites and can't know if I'll ever need a site again in the future.

    The sites I think or know I'll need again, I bookmark. The rest, I rely on history.

  11. @verbatim, you might check out Read It Later for that instead of having a huge history file. It's easy enough to mark something to "read later" & then mark it read when you're done. That moves it to your Read Archive so it's there for the look-up if you need it again later...

  12. The correct German translation for "History" is "Chronik" in Firefox, not "Vergangenheit" which means "Past". Maybe you can correct it in the next release.

  13. Wow. This makes me think that at some point Xmarks will be able to sync all the auto fill fields -like in Outlook. I have to use Outlook at work and my tech like to delete my profile every few months (SIGH!) so I keep losing all my shortcuts. At least I can get back what Xmarks saves!!!

  14. Great update!

    Have you considered adding "search engines" synchronization? You know, those search engines that you can select using the drop-down box in the top right corner of Firefox...

  15. This is really great!

    I am also wondering if you will use safari extension api to build a new extension for safari that will run in every OS instead of the current OS based xmarks for safari?

    I would be nice if you could do this and add history sync to safari and chrome.

    Thanks!

  16. You guys rock. Keep it up.

  17. I enabled history sync, but couldn't see any data in my.xmarks.com. Am i doing something wrong?
