Exploring the DOM through the WebKit source -- getting elements with getElementById

wusiqi111 2019-09-30

Getting an element by ID -- `getElementById`

Standards

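DOM 1 defines the method on the HTMLDocument interface with the prototype Element getElementById(in DOMString elementId). It returns null when no element has the given ID, and it never throws an exception.
DOM 2 moves it to Document (the parent interface of HTMLDocument); the prototype is unchanged.
DOM 3 explicitly states that implementations should use Attr.isId to decide whether an Attr is an ID, and adds the sentence "Attributes with the name "ID" or "id" are not of type ID unless so defined." This reads as a clarification aimed at the faulty implementation in IE7 and earlier, which also returns elements whose name attribute matches the requested id.
WHATWG places getElementById in NonElementParentNode, so DocumentFragment, which implements NonElementParentNode, has the method as well (in the W3C specifications DocumentFragment only inherits from Node and should not have it).
DOM 4 currently matches WHATWG.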

Notes

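Note that the method name is spelled with Id (capital I, lowercase d), not an all-uppercase ID.
In current browsers getElementById is defined only on Document and DocumentFragment. The WHATWG document mentions that it was not added to Element because doing so would break the unit tests of jQuery versions from before Sizzle was adopted (<=1.2.6); old jQuery used elem.getElementById to decide whether an element was a Document. See the discussion on the mailing list.
Elements that have not been inserted into the DOM (for example with appendChild) cannot be found with this method. Because of the difference between the WHATWG and W3C specifications mentioned above, a DocumentFragment in real browsers can also be searched with it.
Some browsers expose elements that have an id as global variables (an element with id="foo" shows up as window.foo in the JavaScript runtime) and have kept this behavior for backward compatibility. It started out as a non-standard feature (the HTML standard has since specified it as named access on the Window object), and such globals are easily overwritten, so it is best avoided.
The W3C DOM specifications state that browser behavior is undefined when several elements share the requested id, but most browsers return the first element that has it. What "first" means depends on how the browser traverses its DOM tree. WHATWG describes it as tree order, that is, a pre-order depth-first search (DFS).
To check, load this page: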
```
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Document</title>
</head>
<body>
    <div>
        <div id="foo" class="a"></div>
    </div>
    <div id="foo" class="b"></div>
</body>
</html>
```
Then run this in the browser console:
```
document.getElementById('foo')
```
Chrome, Firefox and IE all return the div with class="a", which is the pre-order DFS result that WHATWG specifies.
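To make "tree order" concrete outside a browser, here is a minimal sketch that models the test markup as plain objects and walks it in pre-order. The object shape and the helper names treeOrder and firstById are made up for illustration; they are not DOM or WebKit APIs.

```javascript
// Plain-object stand-in for the <body> of the test markup (not real DOM nodes).
const body = {
  tag: 'body', id: '', className: '', children: [
    { tag: 'div', id: '', className: '', children: [
      { tag: 'div', id: 'foo', className: 'a', children: [] },
    ] },
    { tag: 'div', id: 'foo', className: 'b', children: [] },
  ],
};

// Yield a node, then each of its subtrees from left to right: pre-order DFS.
function* treeOrder(node) {
  yield node;
  for (const child of node.children) {
    yield* treeOrder(child);
  }
}

// Hypothetical helper: the first node in tree order whose id matches.
function firstById(root, id) {
  for (const node of treeOrder(root)) {
    if (node.id === id) return node;
  }
  return null; // like getElementById, report a miss with null
}

console.log(firstById(body, 'foo').className); // prints "a"
```

The nested div wins even though it sits deeper in the tree, because pre-order visits a whole subtree before moving on to the next sibling.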

Compatibility

IE7 and earlier also match elements whose name equals the requested id, so code that must support those versions may need an extra elem.id === id check.
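One way to apply that check is sketched below. It assumes that a full scan of the document is an acceptable fallback. The wrapper name getByIdStrict and the fake document object are hypothetical, and the code sticks to var and function expressions so that it would still parse in old IE.

```javascript
// Hypothetical wrapper. `doc` is anything that exposes getElementById and
// getElementsByTagName, such as a browser's document object.
function getByIdStrict(doc, id) {
  var hit = doc.getElementById(id);
  // Normal case: nothing found, or the hit really carries this id.
  if (!hit || hit.id === id) {
    return hit || null;
  }
  // Old-IE case: the hit matched on name, so scan every element instead.
  var all = doc.getElementsByTagName('*');
  for (var i = 0; i < all.length; i++) {
    if (all[i].id === id) {
      return all[i];
    }
  }
  return null;
}

// Fake document imitating the buggy lookup, so the sketch runs outside IE.
var input = { tagName: 'INPUT', name: 'foo', id: '' };
var target = { tagName: 'DIV', name: '', id: 'foo' };
var fakeOldIeDoc = {
  getElementById: function () { return input; }, // wrong hit, matched on name
  getElementsByTagName: function () { return [input, target]; }
};

console.log(getByIdStrict(fakeOldIeDoc, 'foo') === target); // prints true
```

In a browser without the bug the first branch always returns, so the extra scan is only paid for when the lookup actually went wrong.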

Other

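In DOM 2, id is defined on HTMLElement (there is no DOM HTML specification at Level 3), while WHATWG places id on Element. Both define it as attribute DOMString id, and WHATWG notes that id is a global attribute. In short, any element in HTML can have an id, and it can be read directly as elem.id.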

Analysis of the relevant WebKit code

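Document inherits from TreeScope, which corresponds to the NonElementParentNode mentioned by WHATWG (see WebCore/dom/Document.h). getElementById is actually implemented in TreeScope, which calls getElementById on its private member m_elementsById, a pointer to a DocumentOrderedMap (WebCore/dom/TreeScope.h, WebCore/dom/TreeScope.cpp).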

DocumentOrderedMap's getElementById is in turn a wrapper around get; for the final implementation, see WebCore/dom/DocumentOrderedMap.cpp.
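The body of get is not quoted here, so the following is only a simplified model of how such a map could work, not WebKit's code. It assumes the map caches one element per id together with a count, and falls back to a tree-order walk whenever duplicates make the cached element unreliable. The class name OrderedIdMap and the plain { id, children } tree are invented for the sketch.

```javascript
// Simplified model of an id -> element map that tolerates duplicate ids.
// Class and method names are illustrative, not WebKit's.
class OrderedIdMap {
  constructor() {
    this.entries = new Map(); // id -> { element, count }
  }

  add(id, element) {
    const entry = this.entries.get(id);
    if (!entry) {
      this.entries.set(id, { element, count: 1 });
      return;
    }
    // Duplicate id: the cached element may no longer be first in tree order.
    entry.element = null;
    entry.count += 1;
  }

  remove(id, element) {
    const entry = this.entries.get(id);
    if (!entry) return;
    if (entry.count === 1) {
      this.entries.delete(id);
      return;
    }
    if (entry.element === element) entry.element = null;
    entry.count -= 1;
  }

  // `root` is a plain { id, children } tree standing in for the scope's root.
  get(id, root) {
    const entry = this.entries.get(id);
    if (!entry) return null;                 // no element has this id
    if (entry.element) return entry.element; // fast path: cached answer
    // Slow path: walk in tree order and cache the first match.
    const stack = [root];
    while (stack.length > 0) {
      const node = stack.pop();
      if (node.id === id) {
        entry.element = node;
        return node;
      }
      // Push children in reverse so the leftmost child is visited first.
      for (let i = node.children.length - 1; i >= 0; i--) {
        stack.push(node.children[i]);
      }
    }
    return null;
  }
}

const a = { id: 'foo', className: 'a', children: [] };
const b = { id: 'foo', className: 'b', children: [] };
const root = { id: '', children: [{ id: '', children: [a] }, b] };

const idMap = new OrderedIdMap();
idMap.add('foo', b);
idMap.add('foo', a); // registration order differs from tree order
console.log(idMap.get('foo', root).className); // prints "a"
```

The point of the design is that the common case (unique ids) is a single hash lookup, and the tree walk is only paid for when ids are duplicated.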

The iterator chain is descendantsOfType -> TypedElementDescendantIteratorAdapter -> TypedElementDescendantIterator. TypedElementDescendantIterator<ElementType>::operator++ calls ElementIterator<ElementType>::traverseNext; see TypedElementDescendantIterator.h.

ElementIterator<ElementType>::traverseNext then calls Traversal<ElementType>::next (WebCore/dom/ElementIterator.h). Here the polymorphic ElementTraversal (which inherits from Traversal) in turn calls NodeTraversal, whose next overloads are implemented with traverseNextTemplate. Finally, WebCore/dom/NodeTraversal.h shows that the visiting order is firstChild -> nextSibling -> nextAncestorSibling, which is a pre-order DFS and matches WHATWG's description.
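The firstChild -> nextSibling -> nextAncestorSibling order can be sketched in a few lines. This is an illustrative JavaScript model, not a port of NodeTraversal.h: the helper makeNode and the stayWithin boundary parameter are assumptions made so the example is self-contained.

```javascript
// Minimal node with the three pointers the walk relies on.
function makeNode(name, children = []) {
  const node = { name, parentNode: null, firstChild: null, nextSibling: null };
  children.forEach((child, i) => {
    child.parentNode = node;
    if (i === 0) node.firstChild = child;
    else children[i - 1].nextSibling = child;
  });
  return node;
}

// Climb until some ancestor has a next sibling, never leaving `stayWithin`.
function nextAncestorSibling(current, stayWithin) {
  for (let p = current.parentNode; p !== null; p = p.parentNode) {
    if (p === stayWithin) return null;
    if (p.nextSibling) return p.nextSibling;
  }
  return null;
}

// Pre-order successor: first child, else next sibling, else an ancestor's sibling.
function next(current, stayWithin) {
  if (current.firstChild) return current.firstChild;
  if (current === stayWithin) return null;
  if (current.nextSibling) return current.nextSibling;
  return nextAncestorSibling(current, stayWithin);
}

// Same shape as the test page: an outer div holding "a", followed by "b".
const tree = makeNode('body', [
  makeNode('outer', [makeNode('a')]),
  makeNode('b'),
]);

const order = [];
for (let n = tree; n !== null; n = next(n, tree)) order.push(n.name);
console.log(order.join(' ')); // prints "body outer a b"
```

Because the walk needs only parent, child and sibling pointers, it runs without recursion or an explicit stack, which suits an iterator that is advanced one step at a time.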