You have probably read Google’s white-paper-redefining 38-page comic describing their new browser, Google Chrome. The item that most captured my attention (aside from the myriad digs against what I do at my day job) came on page 9:
Admirable. They test each new build against an impressive corpus of web pages. I wondered exactly how they validated the results. The answer shows up on page 11: the rendering engine produces a schematic of what it thinks the page ought to look like, rather than rendering a bitmap and performing some kind of checksum on the result. I don’t know what the schematic looks like, but I wouldn’t be surprised to see some kind of XML at work. It’s still awesome to know that the browser is so aggressively tested against such a broad corpus. It has been suggested many times that FATE should try to test all of the samples in the MPlayer samples archive.
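Purely as illustration (the comic doesn’t show the actual format, and every element name here is my own invention), such a schematic might be a serialized layout tree, something like:

```xml
<!-- hypothetical layout schematic; not Google's real format -->
<page url="http://example.com/">
  <block tag="body" x="0" y="0" w="800" h="600">
    <block tag="h1" x="8" y="8" w="784" h="32">
      <text>Example Domain</text>
    </block>
  </block>
</page>
```

Diffing two of these is cheap and pinpoints exactly which element moved, whereas a bitmap checksum only tells you that something, somewhere, changed.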
It also makes me wonder about the possibility of FFmpeg outputting its syntax parsing and validating that rather than a final bitmap (or audio waveform). That would be one way around the bit-inexactness problems for certain perceptual codecs. Ironically, though, such syntax parsings would be far bulkier than just raw video frames or audio waveforms.
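To make the idea concrete, here is a minimal sketch of what such a trace hook could look like. None of these names exist in FFmpeg; this is only an illustration of dumping parsed syntax elements as text that a test harness could diff against a stored reference:

```c
#include <stdio.h>

/* Hypothetical per-frame syntax summary (invented for this sketch). */
typedef struct SyntaxTrace {
    int frame_num;
    char frame_type;   /* 'I', 'P' or 'B' */
    int qp;            /* quantizer used for the frame */
    int mb_count;      /* macroblocks parsed in the frame */
} SyntaxTrace;

/* One line of text per frame: cheap to store, trivial to diff, and
 * immune to the rounding differences that make decoded output of
 * perceptual codecs bit-inexact across platforms. */
static void trace_frame(FILE *out, const SyntaxTrace *t)
{
    fprintf(out, "frame=%d type=%c qp=%d mbs=%d\n",
            t->frame_num, t->frame_type, t->qp, t->mb_count);
}
```

A one-line summary per frame like this is tiny; it’s the full parse (every coefficient, every motion vector) that balloons past the size of the raw frames themselves.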
Obviously, Google’s most brazen paradigm shift with this project concerns the process vs. thread concept. I remember once being tasked with a simple cleaning job. The supervisor told me to use one particular tool for the chore. After working at it for a bit, I thought of a much better tool to use. But I struggled with whether I should use it because, I thought: “It seems too obvious a solution. There must be a reason that the supervisor did not propose it.” I used the better tool anyway, and when the supervisor returned, they were impressed by a solution they hadn’t considered.
The purpose of that vague anecdote: I wonder if people have been stuck in the threaded browser model for so long that they have gotten to thinking, “There must be a good reason that all the web browsers use a threaded model rather than a process-driven model. I’m sure the browser authors have thought about this a lot more than I have and must have good reasons not to put each page in a separate process.”
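For the flavor of the process-driven model, here is a minimal sketch using plain fork() — nothing like Chrome’s real architecture, just the core idea that each “tab” lives in its own child process, so a crash in one cannot take down the others:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* Toy "browser": one child process per tab. Entirely illustrative. */
static pid_t open_tab(const char *url)
{
    pid_t pid = fork();
    if (pid == 0) {                    /* child: the "renderer" */
        printf("rendering %s in pid %d\n", url, getpid());
        if (url[0] == 'x')             /* pretend this page crashes */
            abort();
        _exit(0);
    }
    return pid;                        /* parent: the "browser UI" */
}

int main(void)
{
    open_tab("http://example.com/");
    open_tab("xcrashy://page");
    /* Reap the children: one aborted, one exited cleanly, and the
     * browser process itself never noticed a thing. */
    while (wait(NULL) > 0)
        ;
    return 0;
}
```

In a threaded browser, that abort() would have taken every tab down with it.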
I’m eager to see what Google Chrome does for the web. And I’m eager to see what their expository style does for technology white papers.
Apparently the Internet Explorer 8 beta uses the process-per-tab approach as well. When I initially spammed the links in the FFmpeg developer IRC channel on the night of the leak, Måns’ response was that a process per tab is the obvious approach, but it’s good that someone is finally implementing the idea.
My issues with it so far:
– The Flash plugin is a bit flaky and has died on me a couple of times, whereas it is quite stable in Firefox 3.
– I had it randomly hang for a few seconds and then continue.
– Sometimes it seems to take an age to resolve a page, but I didn’t compare against Firefox at the same time, and my connection/DNS can be a bit crappy sometimes, so this point can be ignored. ;)
Someone benchmarked the JavaScript performance and posted their results on the Doom9 forums. The improvement was impressive: some 2 seconds in Opera and Firefox versus 0.6 seconds in Chrome. I could probably dig out the link, but I suspect there are better benchmarks all over the web by now.
By the way, I randomly felt like sending this message from Chrome on my Windows box. :)
It is great to do regression tests with lots of data, but first you need to find a way to get reference data: use a reference decoder, check by hand, …
For example, FFmpeg does some testing by encoding then decoding and checking that the result is nearly the same. But if you introduce a bug in the part shared by both sides, the test still passes thanks to the symmetry.
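A toy example of that failure mode (the constant and function names are invented for illustration): the encoder and decoder share one value, a bug changes it, and the round-trip test is none the wiser:

```c
#include <assert.h>
#include <stdio.h>

/* Toy codec: suppose the spec says the bias must be 4, but a bug in
 * this constant, shared by encoder and decoder, changed it to 7. */
static const int bias = 7;

static int encode(int sample) { return sample + bias; }
static int decode(int coded)  { return coded - bias; }

int main(void)
{
    /* The round-trip test passes for every input, because the bug is
     * symmetric: it corrupts the bitstream relative to the spec,
     * not the encode->decode identity. */
    for (int s = 0; s < 100; s++)
        assert(decode(encode(s)) == s);
    printf("round-trip OK -- but the output no longer matches the spec\n");
    return 0;
}
```

Only an independent check — a reference decoder, or known-good reference data — catches this class of bug.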
Also, for me the process per tab is not a clear win. It has drawbacks (performance; http://weblogs.mozillazine.org/roc/archives/2008/09/chrome.html) and doesn’t solve all the problems: a corrupt process can attach to another process with something like ptrace; there is still a monitor managing IPC between the processes, which can itself crash and make all the processes hang; there are still shared resources that can hang one process because another one doesn’t release them; shared memory can be corrupted; …
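To illustrate the ptrace point, here is a minimal Linux sketch, assuming the processes run as the same user and nothing (kernel hardening, a sandbox) restricts same-user ptrace:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid_t target = (pid_t)atoi(argv[1]);  /* pid of a sibling process */

    /* Attach and stop the target; from here an attacker could read
     * its registers and memory via PTRACE_GETREGS/PTRACE_PEEKDATA. */
    if (ptrace(PTRACE_ATTACH, target, NULL, NULL) == -1) {
        perror("ptrace attach");
        return 1;
    }
    waitpid(target, NULL, 0);
    ptrace(PTRACE_DETACH, target, NULL, NULL);
    return 0;
}
```

So process isolation only buys as much as the OS-level sandbox wrapped around it.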
PS: some browsers aren’t even thread per tab; otherwise a single tab wouldn’t be able to freeze the whole browser…
I think that this per-website thread model will enable bloat. Call it a gut feeling.
Chrome/Chromium loads a 10 MB DLL file each time it starts a new tab. I wonder if it would be better to load each individual renderer/plugin/whatnot per site, or to keep the DLL in memory and read it from there?