Frequently Misunderstood JavaScript Concepts

icecity1306 2015-02-14

by Michael Bolin, October 28, 2013

This is a complete "reprint" of Appendix B from my book, Closure: The Definitive Guide. Even though my book was designed to focus on Closure rather than JavaScript in general, there were a number of pain points in the language that I did not think were covered well in other popular JavaScript books, such as JavaScript: The Good Parts or JavaScript: The Definitive Guide. So that the book did not lose its focus, I relegated this and my essay, "Inheritance Patterns in JavaScript," to the back of the book among the appendices. However, many people have told me anecdotally that Appendix B was the most valuable part of the book for them, so it seemed like this was worth sharing more broadly in hopes that it helps demystify a language that I have enjoyed so much.

This book is not designed to teach you JavaScript, but it does recognize that you are likely to have taught yourself JavaScript and that there are some key concepts that you may have missed along the way. This section is particularly important if your primary language is Java as the syntactic similarities between Java and JavaScript belie the differences in their respective designs.

JavaScript Objects are Associative Arrays whose Keys are Always Strings

Every object in JavaScript is an associative array whose keys are strings. This is an important difference from other programming languages, such as Java, where a type such as java.util.Map is an associative array whose keys can be of any type. When an object other than a string is used as a key in JavaScript, no error occurs: JavaScript silently converts it to a string and uses that value as the key instead. This can have surprising results:

var foo = new Object();
var bar = new Object();
var map = new Object();

map[foo] = "foo";
map[bar] = "bar";

// Alerts "bar", not "foo".
alert(map[foo]);

In the above example, map does not map foo to "foo" and bar to "bar". When foo and bar are used as keys for map, they are converted into strings using their respective toString() methods. This results in mapping the toString() of foo to "foo" and the toString() of bar to "bar". Because both foo.toString() and bar.toString() are "[object Object]", the above is equivalent to:

var map = new Object();
map["[object Object]"] = "foo";
map["[object Object]"] = "bar";

alert(map["[object Object]"]);

Therefore, map[bar] = "bar" replaces the mapping of map[foo] = "foo" on the previous line.
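One way to sidestep this collision is to key the map on a string that is unique to each object. The following is only a minimal sketch: the helper uidOf and the property name uid_ are hypothetical names chosen for illustration, and Object.keys() assumes an ES5 environment.

```javascript
var foo = {};
var bar = {};

// Both objects stringify to the same key, so only one entry survives.
var map = {};
map[foo] = 'foo';
map[bar] = 'bar';
var collidingKeys = Object.keys(map);

// Hypothetical helper: lazily tags each object with its own string id.
var nextUid = 0;
var uidOf = function(obj) {
  if (!Object.prototype.hasOwnProperty.call(obj, 'uid_')) {
    obj.uid_ = 'uid_' + (nextUid++);
  }
  return obj.uid_;
};

// Keying on the id keeps the two entries separate.
var safeMap = {};
safeMap[uidOf(foo)] = 'foo';
safeMap[uidOf(bar)] = 'bar';

console.log(collidingKeys.length);  // 1
console.log(collidingKeys[0]);      // [object Object]
console.log(safeMap[uidOf(foo)]);   // foo
console.log(safeMap[uidOf(bar)]);   // bar
```

The trade-off of this approach is that it adds a property to every object used as a key, which may or may not be acceptable in a given codebase.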

There are Several Ways to Look Up a Value in an Object

There are several ways to look up a value in an object, so if you learned JavaScript by copy-and-pasting code from other web sites, it may not be clear that the following code snippets are equivalent:

// (1) Look up value by name:
map.meaning_of_life;

// (2) Look up value by passing the key as a string:
map["meaning_of_life"];

// (3) Look up value by passing an object whose toString() method returns a
// string equivalent to the key:
var machine = new Object();
machine.toString = function() { return "meaning_of_life"; };
map[machine];

Note that the first approach, "Look up value by name," can only be used when the name is a valid JavaScript identifier. Consider the example from the previous section where the key was "[object Object]":

alert(map.[object Object]); // throws a syntax error

This may lead you to believe that it is safer to always look up a value by passing a key as a string rather than by name. In Closure, this turns out not to be the case because of how variable renaming works in the Compiler. This will be explained in more detail in Chapter 13.
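As a quick check of the equivalence described in this section, the sketch below compares the three lookups on a sample object. The contents of map, including the value 42, are assumptions made for illustration.

```javascript
var map = { meaning_of_life: 42 };
map['[object Object]'] = 'bar';

var machine = {};
machine.toString = function() { return 'meaning_of_life'; };

var byName = map.meaning_of_life;
var byString = map['meaning_of_life'];
var byObject = map[machine];

console.log(byName === byString && byString === byObject);  // true

// A key that is not a valid identifier is only reachable with brackets.
console.log(map['[object Object]']);  // bar
```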

Single-quoted Strings and Double-quoted Strings are Equivalent

In some programming languages, such as Perl and PHP, double-quoted strings and single-quoted strings are interpreted differently. In JavaScript, both types of strings are interpreted in the same way; however, the convention in the Closure Library is to use single-quoted strings. (By comparison, Closure Templates mandate the use of single-quoted strings.) Using one kind of quote consistently makes it easier to perform searches over the codebase, but the choice makes no difference to the JavaScript interpreter or the Closure Compiler.

The one caveat is that the JSON specification requires that strings be double-quoted, so data that is passed to a strict JSON parser (rather than the JavaScript eval() method) must use double-quoted strings.
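The difference is easy to observe with the built-in JSON.parse() function, assuming an ES5 environment where it is available. The sample strings below are made up for illustration.

```javascript
// Double-quoted keys and values are valid JSON.
var parsed = JSON.parse('{"one": "uno"}');
console.log(parsed.one);  // uno

// The same data with single quotes is valid JavaScript but not valid JSON.
var rejected = false;
try {
  JSON.parse("{'one': 'uno'}");
} catch (e) {
  rejected = (e.name === 'SyntaxError');
}
console.log(rejected);  // true
```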

There are Several Ways to Define an Object Literal

In JavaScript, the following statements are equivalent methods for creating a new, empty object:

// This syntax is equivalent to the syntax used in Java (and other C-style
// languages) for creating a new object.
var obj1 = new Object();

// Parentheses are technically optional if no arguments are passed to a
// function used with the 'new' operator, though this is generally avoided.
var obj2 = new Object;

// This syntax is the most succinct and is used exclusively in Closure.
var obj3 = {};

The third syntax is called an "object literal" because the properties of the object can be declared when the object is created:

// obj4 is a new object with three properties.
var obj4 = {
  'one': 'uno',
  'two': 'dos',
  'three': 'tres'
};

// Alternatively, each property could be added in its own statement:
var obj5 = {};
obj5['one'] = 'uno';
obj5['two'] = 'dos';
obj5['three'] = 'tres';

// Or some combination could be used:
var obj6 = { 'two': 'dos' };
obj6['one'] = 'uno';
obj6['three'] = 'tres';

Note that when using the object literal syntax, each property is followed by a comma except for the last one. Care must be taken to keep track of commas, as one is often forgotten when later editing code to add a new property:

// Suppose the declaration of obj4 were changed to include a fourth property.
var obj4 = {
  'one': 'uno',
  'two': 'dos',
  'three': 'tres'   // Programmer forgot to add a comma to this line...
  'four': 'cuatro'  // ...when this line was added.
};

The above will result in an error from the JavaScript interpreter because it cannot parse the object literal due to the missing comma. Currently, all browsers other than Internet Explorer allow a trailing comma in object literals to eliminate this issue (support for the trailing comma is mandated in ES5, so it should appear in IE soon):

var obj4 = {
  'one': 'uno',
  'two': 'dos',
  'three': 'tres', // This extra comma is allowed on Firefox, Chrome, and Safari.
};

Unfortunately, the trailing comma produces a syntax error in Internet Explorer, so the Closure Compiler will issue an error when it encounters the trailing comma.

Because of the popularity of JSON, it is common to see the keys of object literals written as double-quoted strings. The quotes are required in order to be valid JSON, but they are not required in order to be valid JavaScript. Keys in object literals can be expressed in any of the following three ways:

var obj7 = {
  one: 'uno',      // No quotes at all
  'two': 'dos',    // Single-quoted string
  "three": 'tres'  // Double-quoted string
};

Using no quotes at all may seem odd at first, particularly if there is a variable in scope with the same name. Try to predict what happens in the following case:

var one = 'ONE';
var obj8 = { one: one };

The above creates a new object, obj8, with one property whose name is one and whose value is 'ONE'. When one is used on the left of the colon, it is simply a name, but when it is used on the right of the colon, it is a variable. This is perhaps more obvious if obj8 were defined in the following way:

var obj8 = {};
obj8.one = one;

Here it is clearer that obj8.one identifies the property on obj8 named one which is distinct from the variable one to the right of the equals sign.

The only time that quotes must be used with a key in an object literal is when the key is a JavaScript keyword (note this is no longer a restriction in ES5):

var countryCodeMap = {
  fr: 'France',
  in: 'India',  // Throws a syntax error because 'in' is a JavaScript keyword
  ru: 'Russia'
};
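On an engine that implements ES5, where the restriction is lifted, the same literal is accepted, and the key can be read with either lookup style. A small sketch, assuming an ES5-or-later engine:

```javascript
var countryCodeMap = {
  fr: 'France',
  in: 'India',  // Accepted in ES5; a syntax error on ES3 engines.
  ru: 'Russia'
};

console.log(countryCodeMap.in);     // India
console.log(countryCodeMap['in']);  // India
```

Quoting the key as 'in' remains the portable choice when older engines must be supported.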

Despite this edge case, keys in object literals are rarely quoted in Closure. This has to do with variable renaming, which is explained in more detail in Chapter 13 on the Compiler. As a rule of thumb, only quote keys that would sacrifice the correctness of the code if they were renamed. For example, if the code were:

var translations = {
  one: 'uno',
  two: 'dos',
  three: 'tres'
};

var englishToSpanish = function(englishWord) {
  return translations[englishWord];
};

englishToSpanish('two'); // should return 'dos'

Then the Compiler might rewrite this code as:

var a = {
  a: 'uno',
  b: 'dos',
  c: 'tres'
};

var d = function(e) {
  return a[e];
};

d('two'); // should return 'dos' but now returns undefined

In this case, the behavior of the compiled code is different from that of the original code, which is a problem. This is because the keys of translations do not represent properties that can be renamed, but strings whose values are significant. Because the Compiler cannot reduce string literals, defining translations as follows would result in the compiled code having the correct behavior:

var translations = {
  'one': 'uno',
  'two': 'dos',
  'three': 'tres'
};

The "prototype" Property is Not the Prototype You are Looking For

For all the praise for its support of prototype-based programming, manipulating an object's prototype is not straightforward in JavaScript.

Recall that every object in JavaScript has a link to another object called its prototype. Cycles are not allowed in a chain of prototype links, so a collection of JavaScript objects and prototype relationships can be represented as a rooted tree where nodes are objects and edges are prototype relationships. Many modern browsers (though not all) expose an object's prototype via its __proto__ property. (This causes a great deal of confusion because an object's __proto__ and prototype properties rarely refer to the same object.) The root of such a tree will be the object referenced by Object.prototype in JavaScript. Consider the following JavaScript code:

// Rectangle is an ordinary function.
var Rectangle = function() {};

// Every function has a property named 'prototype' whose value is an object
// with a property named 'constructor' that points back to the original
// function. It is possible to add more properties to this object.
Rectangle.prototype.width = 3;
Rectangle.prototype.height = 4;

// Creates an instance of a Rectangle, which is an object whose
// __proto__ property points to Rectangle.prototype. This is discussed
// in more detail in Chapter 5 on Classes and Inheritance.
var rect = new Rectangle();

Figure B-1 contains the corresponding object model:

In the diagram, each box represents a JavaScript object and each circle represents a JavaScript primitive. Recall that JavaScript objects are associative arrays whose keys are always strings, so each arrow exiting a box represents a property of that object, the target being the property's value. For simplicity, the closed, shaded arrows represent a __proto__ property while closed, white arrows represent a prototype property. Open arrows have their own label indicating the name of the property.

The prototype chain for an object can be found by following the __proto__ arrows until the root object is reached. Note that even though Object.prototype is the root of the graph when only __proto__ edges are considered, Object.prototype also has its own values, such as the built-in function mapped to hasOwnProperty.

When resolving the value associated with a key on a JavaScript object, each object in the prototype chain is examined until one is found with a property whose name matches the specified key. If no such property exists, the value returned is undefined. This is effectively equivalent to the following:

var lookupProperty = function(obj, key) {
  while (obj) {
    if (obj.hasOwnProperty(key)) {
      return obj[key];
    }
    obj = obj.__proto__;
  }
  return undefined;
};

For example, to evaluate the expression rect.width, the first step is to check whether a property named width is defined on rect. From the diagram, it is clear that rect has no properties of its own because it has no outbound arrows besides __proto__. The next step is to follow the __proto__ property to Rectangle.prototype which does have an outbound width arrow. Following that arrow leads to the primitive value 3, which is what rect.width evaluates to.
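Because __proto__ is not available in every environment, the same walk can be sketched with the ES5 function Object.getPrototypeOf(). This is an illustrative variant, not the engine's actual lookup routine, and it ignores details such as getters. The name lookupPropertyEs5 is made up for this example.

```javascript
var lookupPropertyEs5 = function(obj, key) {
  while (obj !== null && obj !== undefined) {
    // Call the original hasOwnProperty in case obj shadows it.
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      return obj[key];
    }
    // Object.getPrototypeOf(Object.prototype) is null, which ends the loop.
    obj = Object.getPrototypeOf(obj);
  }
  return undefined;
};

var Rectangle = function() {};
Rectangle.prototype.width = 3;
Rectangle.prototype.height = 4;
var rect = new Rectangle();

console.log(Object.getPrototypeOf(rect) === Rectangle.prototype);  // true
console.log(lookupPropertyEs5(rect, 'width'));                     // 3
console.log(lookupPropertyEs5(rect, 'depth'));                     // undefined
```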

Because the prototype chain always leads to Object.prototype, any value that is declared as a property on Object.prototype will be available to all objects, by default. For example, every object has a property named hasOwnProperty that points to a native function. That is, unless hasOwnProperty is reassigned to some other value on an object, or some object in its prototype chain. For example, if Rectangle.prototype.hasOwnProperty were assigned to alert, then rect.hasOwnProperty would refer to alert because Rectangle.prototype appears earlier in rect's prototype chain than Object.prototype. Although this makes it possible to grant additional functionality to all objects by modifying Object.prototype, this practice is discouraged and error-prone as explained in Chapter 4.
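The shadowing described above can be observed directly. In this sketch a small function stands in for alert, and the last two lines show one defensive idiom, calling the original method through Object.prototype, which is an addition for illustration rather than something the text prescribes.

```javascript
var Rectangle = function() {};
Rectangle.prototype.width = 3;

// Stand-in for assigning alert: any value placed here is found first.
Rectangle.prototype.hasOwnProperty = function() { return 'shadowed'; };

var rect = new Rectangle();
var plain = {};

console.log(rect.hasOwnProperty('width'));  // shadowed

// Objects outside Rectangle's chain still see the native function.
console.log(plain.hasOwnProperty === Object.prototype.hasOwnProperty);  // true

// Calling the native function explicitly bypasses the shadowing property.
console.log(Object.prototype.hasOwnProperty.call(rect, 'width'));  // false
```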

Understanding the prototype chain is also important when considering the effect of removing properties from an object. JavaScript provides the delete keyword for removing a property from an object: using delete can only affect the object itself, but not any of the objects in its prototype chain. This may sometimes yield surprising results:

rect.width = 13;
alert(rect.width); // alerts 13
delete rect.width;
alert(rect.width); // alerts 3 even though delete was used
delete rect.width;
alert(rect.width); // still alerts 3

When rect.width = 13 is evaluated, it creates a new binding on rect with the key width and the value 13. When alert(rect.width) is called on the following line, rect now has its own property named width, so it displays its associated value, 13. When delete rect.width is called, the width property defined on rect is removed, but the width property on Rectangle.prototype still exists. This is why the second call to alert yields 3 rather than undefined. To remove the width property from every instance of Rectangle, delete must be applied to Rectangle.prototype:

delete Rectangle.prototype.width;
alert(rect.width); // now this alerts undefined

It is possible to modify rect so that it behaves as if it did not have a width property without modifying Rectangle.prototype by setting rect.width to undefined. It can be determined whether the property was overridden or deleted by using the built-in hasOwnProperty method:

var obj = {};
rect.width = undefined;

// Now both rect.width and obj.width evaluate to undefined even though obj
// never had a width property defined on it or on any object in its prototype
// chain.

rect.hasOwnProperty('width'); // evaluates to true
obj.hasOwnProperty('width');  // evaluates to false

Note that the results would be different if Rectangle were implemented as follows:

var Rectangle2 = function() {
  // This adds bindings to each new instance of Rectangle2 rather than adding
  // them once to Rectangle2.prototype.
  this.width = 3;
  this.height = 4;
};

var rect1 = new Rectangle();
var rect2 = new Rectangle2();

rect1.hasOwnProperty('width'); // evaluates to false
rect2.hasOwnProperty('width'); // evaluates to true

delete rect1.width;
delete rect2.width;

rect1.width; // evaluates to 3
rect2.width; // evaluates to undefined

Finally, note that the __proto__ properties in the diagram are not set explicitly in the sample code. These relationships are managed behind the scenes by the JavaScript runtime.

The Syntax for Defining a Function is Significant

There are two common ways to define a function in JavaScript:

// Function Statement
function FUNCTION_NAME() {
  /* FUNCTION_BODY */
}

// Function Expression
var FUNCTION_NAME = function() {
  /* FUNCTION_BODY */
};

Although the function statement is less to type and is commonly used by those new to JavaScript, the behavior of the function expression is more straightforward. (Despite this, the Google style guide advocates using the function , so Closure uses it in almost all cases.) The behavior of the two types of function definitions is not the same, as illustrated in the following examples:

function hereOrThere() {
  return 'here';
}

alert(hereOrThere()); // alerts 'there'

function hereOrThere() {
  return 'there';
}

It may be surprising that the second version of hereOrThere is used before it is defined. This is due to a special behavior of function statements called hoisting which allows a function to be used before it is defined. In this case, the last definition of hereOrThere() wins, so it is hoisted and used in the call to alert().

By comparison, a function expression associates a value with a variable, just like any other assignment statement. Because of this, calling a function defined in this manner uses the function value most recently assigned to the variable:

var hereOrThere = function() {
  return 'here';
};

alert(hereOrThere()); // alerts 'here'

hereOrThere = function() {
  return 'there';
};

For a more complete argument of why function expressions should be favored over function statements, see Appendix B of Douglas Crockford's JavaScript: The Good Parts (O'Reilly).
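The two behaviors can be compared side by side within a single function. The sketch below is illustrative; the names hoistingDemo, declared, and expressed are hypothetical.

```javascript
var hoistingDemo = function() {
  var observed = [];

  // The function statement further down is already usable here.
  observed.push(declared());

  // The variable exists, but nothing has been assigned to it yet.
  observed.push(typeof expressed);

  function declared() {
    return 'statement';
  }

  var expressed = function() {
    return 'expression';
  };

  // After the assignment, the function expression can be called.
  observed.push(expressed());
  return observed;
};

console.log(hoistingDemo().join(', '));  // statement, undefined, expression
```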

What "this" Refers to When a Function is Called

When calling a function of the form foo.bar.baz(), the object foo.bar is referred to as the receiver. When the function is called, it is the receiver that is used as the value for this:

var obj = {};

obj.value = 10;

/** @param {...number} additionalValues */
obj.addValues = function(additionalValues) {
  for (var i = 0; i < arguments.length; i++) {
    this.value += arguments[i];
  }
  return this.value;
};

// Evaluates to 30 because obj is used as the value for 'this' when
// obj.addValues() is called, so obj.value becomes 10 + 20.
obj.addValues(20);

If there is no explicit receiver when a function is called, then the global object becomes the receiver. As explained in "goog.global", window is the global object when JavaScript is executed in a web browser. This leads to some surprising behavior:

var f = obj.addValues;

// Evaluates to NaN because window is used as the value for 'this' when
// f() is called. Because window.value is undefined, adding a number to
// it results in NaN.
f(20);

// This also has the unintentional side-effect of adding a value to window:
alert(window.value); // Alerts NaN

Even though obj.addValues and f refer to the same function, they behave differently when called because the value of the receiver is different in each call. For this reason, when calling a function that refers to this, it is important to ensure that this will have the correct value when it is called. To be clear, if this were not referenced in the function body, then the behavior of f(20) and obj.addValues(20) would be the same.

Because functions are first-class objects in JavaScript, they can have their own methods. All functions have the methods call() and apply() which make it possible to redefine the receiver (i.e., the object that this refers to) when calling the function. The method signatures are as follows:

/**
 * @param {*=} receiver to substitute for 'this'
 * @param {...} parameters to use as arguments to the function
 */
Function.prototype.call;

/**
 * @param {*=} receiver to substitute for 'this'
 * @param {Array} parameters to use as arguments to the function
 */
Function.prototype.apply;

Note that the only difference between call() and apply() is that call() receives the function parameters as individual arguments whereas apply() receives them as a single array:

// When f is called with obj as its receiver, it behaves the same as calling
// obj.addValues(). Both of the following increase obj.value by 60:
f.call(obj, 10, 20, 30);
f.apply(obj, [10, 20, 30]);

The following calls are equivalent, as f and obj.addValues refer to the same function:

obj.addValues.call(obj, 10, 20, 30);
obj.addValues.apply(obj, [10, 20, 30]);

However, the following will not work: neither call() nor apply() uses the value of its own receiver to substitute for the receiver argument when it is unspecified.

// Both statements evaluate to NaN
obj.addValues.call(undefined, 10, 20, 30);
obj.addValues.apply(undefined, [10, 20, 30]);

In non-strict code, the value of this can never be null or undefined when a function is called. When null or undefined is supplied as the receiver to call() or apply(), the global object is used as the value for the receiver instead. Therefore, the above has the same undesirable side-effect of adding a property named value to the global object.
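The substitution of the global object applies to non-strict functions; under ES5 strict mode, the receiver is passed through unchanged. The sketch below is an addition for illustration. The names sloppyThis and strictThis are hypothetical, and the Function constructor is used only to obtain a non-strict function regardless of how the surrounding file is loaded.

```javascript
// Functions built with the Function constructor are non-strict unless their
// body opts in, so this one always substitutes the global object.
var sloppyThis = new Function('return this;');

var strictThis = function() {
  'use strict';
  return this;
};

var globalObject = sloppyThis.call(undefined);

console.log(typeof globalObject);                       // object
console.log(sloppyThis.call(null) === globalObject);    // true
console.log(strictThis.call(undefined) === undefined);  // true
console.log(strictThis.call(null) === null);            // true
```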

It may be helpful to think of a function as having no knowledge of the variable to which it is assigned. This helps reinforce the idea that the value of this will be bound when the function is called rather than when it is defined.
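One common way to keep the receiver attached is to wrap the function so that the receiver is supplied on every call. The sketch below shows a hand-written wrapper built on apply() and, assuming an ES5 environment, the built-in Function.prototype.bind(), which has the same effect. The names boundAddValues and es5Bound are hypothetical.

```javascript
var obj = { value: 10 };

obj.addValues = function() {
  for (var i = 0; i < arguments.length; i++) {
    this.value += arguments[i];
  }
  return this.value;
};

// Wrapper that always uses obj as the receiver, however it is called.
var boundAddValues = function() {
  return obj.addValues.apply(obj, arguments);
};

var first = boundAddValues(20);  // obj.value goes from 10 to 30

// ES5 equivalent: bind() returns a new function with a fixed receiver.
var es5Bound = obj.addValues.bind(obj);
var second = es5Bound(1, 2);     // obj.value goes from 30 to 33

console.log(first);      // 30
console.log(second);     // 33
console.log(obj.value);  // 33
```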

The "var" Keyword is Significant

Many self-taught JavaScript programmers believe that the var keyword is optional because they do not observe different behavior when they omit it. On the contrary, omitting the var keyword can lead to some very subtle bugs.

The var keyword is significant because it introduces a new variable in local scope. When a variable is referenced without the var keyword, it uses the variable by that name in the closest scope. If no such variable is defined, a new binding for that variable is declared on the global object. Consider the following example:

var foo = 0;

var f = function() {
  // This defines a new variable foo in the scope of f.
  // This is said to "shadow" the global variable foo, whose value is 0.
  // The global value of foo could be referenced via window.foo, if desired.
  var foo = 42;

  var g = function() {
    // This defines a new variable bar in the scope of g.
    // It uses the closest declaration of foo, which is in f.
    var bar = foo + 100;
    return bar;
  };

  // There is no variable bar declared in the current scope, f, so this
  // introduces a new variable, bar, on the global object. Code in g has
  // access to f's scope, but code in f does not have access to g's scope.
  bar = 'DO NOT DO THIS!';

  // Returns a function that adds 100 to the local variable foo.
  return g;
};

// This alerts 'undefined' because bar has not been added to the global scope yet.
alert(typeof bar);

// Calling f() has the side-effect of introducing the global variable bar.
var h = f();
alert(bar);  // Alerts 'DO NOT DO THIS!'

// Even though h() is called outside of f(), it still has access to the scope
// of f and g, so h() returns (foo + 100), or 142.
alert(h());  // Alerts 142

This gets even trickier when var is omitted from loop variables. Consider the following function, f(), which uses a loop to call g() three times. Calling g() uses a loop to call alert() three times, so you may expect nine alert boxes to appear when f() is called:

var f = function() {
  for (i = 0; i < 3; i++) {
    g(i);
  } 
};

var g = function() {
  for (i = 0; i < 3; i++) {
    alert(i);
  }
};

// This alerts 0, 1, 2, and then stops.
f();

Instead, alert() only appears three times because both f() and g() fail to declare the loop variable i with the var keyword. When g() is called for the first time, it uses the global variable i which has been initialized to 0 by f(). When g() exits, it has increased the value of i to 3. On the next iteration of the loop in f(), i is now 3, so the test for the conditional i < 3 fails, and f() terminates. This problem is easily solved by appropriately using the var keyword to declare each loop variable:

for (var i = 0; i < 3; i++)
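Applied to both functions, the fix gives each loop its own counter. In the sketch below an array named alerts (a stand-in chosen for this example) records the values instead of calling alert(), so that the result can be inspected.

```javascript
var alerts = [];  // Stands in for alert() so the result can be inspected.

var g = function() {
  // This i belongs to g and is independent of the i in f.
  for (var i = 0; i < 3; i++) {
    alerts.push(i);
  }
};

var f = function() {
  // This i belongs to f, so g cannot change it.
  for (var i = 0; i < 3; i++) {
    g(i);
  }
};

f();
console.log(alerts.length);    // 9
console.log(alerts.join(''));  // 012012012
```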

Understanding var is important in avoiding subtle bugs related to variable scope. Enabling Verbose warnings from the Closure Compiler will help catch these issues.

Block Scope is Meaningless

Unlike in most C-style languages, variables in JavaScript functions are accessible throughout the entire function rather than only in the block (delimited by curly braces) in which the variable is declared. This can lead to the following programming error:

/**
 * Recursively traverses map and returns an array of all keys that are found.
 * @param {Object} map
 * @return {Array.<string>}
 */
var getAllKeys = function(map) {
  var keys = [];
  for (var key in map) {
    keys.push(key);
    var value = map[key];
    if (typeof value == 'object') {
      // Here, "var map" does not introduce a new local variable named map
      // because such a variable already exists in function scope.
      var map = value;
      keys = keys.concat(getAllKeys(map));
    }
  }
  return keys;
};

var mappings = {
  'derivatives': { 'sin x': 'cos x', 'cos x': '-sin x'},
  'integrals': { '2x': 'x^2', '3x^2': 'x^3'}
};

// Evaluates to: ['derivatives', 'sin x', 'cos x', 'integrals']
getAllKeys(mappings);

The array returned by getAllKeys() is missing the keys '2x' and '3x^2'. This is because of a subtle error where map is reused as a variable inside the if block. In languages that support block scoping, this would introduce a new variable named map that would be assigned to value for the duration of the if block, and upon exiting the if block, the recent binding for map would be discarded and the previous binding for map would be restored. In JavaScript, there is already a variable named map in scope because the parameter of getAllKeys() is named map. Even though declaring var map within getAllKeys() is likely a signal that the programmer is trying to introduce a new variable, the var is silently ignored by the JavaScript interpreter and execution proceeds without interruption. As a result, once the first iteration assigns the 'derivatives' object to map, the later lookup map['integrals'] is made against that object and yields undefined, so the recursive call for 'integrals' never happens.

When the Verbose warning level is used, the Closure Compiler issues a warning when it encounters code such as this. To appease the Compiler, either the var must be dropped (to indicate the existing variable is meant to be reused) or a new variable name must be introduced (to indicate that a separate variable is meant to be used). The getAllKeys() example falls into the latter case, so the if block should be rewritten as:

if (typeof value == 'object') {
  var newMap = value;
  keys = keys.concat(getAllKeys(newMap));
}
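For reference, here is the whole function with that change applied, run against the same mappings object. This is simply the earlier listing with the variable renamed; the name allKeys is introduced only to hold the result.

```javascript
var getAllKeys = function(map) {
  var keys = [];
  for (var key in map) {
    keys.push(key);
    var value = map[key];
    if (typeof value == 'object') {
      // A distinct name leaves the parameter map untouched.
      var newMap = value;
      keys = keys.concat(getAllKeys(newMap));
    }
  }
  return keys;
};

var mappings = {
  'derivatives': { 'sin x': 'cos x', 'cos x': '-sin x'},
  'integrals': { '2x': 'x^2', '3x^2': 'x^3'}
};

var allKeys = getAllKeys(mappings);
console.log(allKeys.join(', '));
// derivatives, sin x, cos x, integrals, 2x, 3x^2
```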

Interestingly, because the scope of a variable includes the entire function, the declaration of the variable can occur anywhere in the function, even after its first "use":

var strangeLoop = function(someArray) {
  // Will alert 'undefined' because i is in scope, but no value has been
  // assigned to it at this point.
  alert(i);

  // Assign 0 to i and use it as a loop counter.
  for (i = 0; i < someArray.length; i++) {
    alert('Element ' + i + ' is: ' + someArray[i]);
  }

  // Declaration of i which puts it in function scope.
  // The value 42 is never used.
  var i = 42;
};

Like the case where redeclaring a variable goes unnoticed by the interpreter but is flagged by the Compiler, the Compiler will also issue a warning (again, with the Verbose warnings enabled) if a variable declaration appears after its first use within the function.
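The effect is easiest to see when an outer variable has the same name. In the sketch below, which uses hypothetical names, the read on the first line of the function refers to the local variable declared later in the function, not to the outer one.

```javascript
var label = 'outer';

var shadowDemo = function() {
  // The local label declared below is already in scope here, so this reads
  // the (still unassigned) local variable rather than the outer one.
  var seenBefore = label;

  var label = 'inner';
  return [seenBefore, label];
};

var result = shadowDemo();
console.log(result[0]);  // undefined
console.log(result[1]);  // inner
console.log(label);      // outer
```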

It should be noted that even though blocks do not introduce new scopes, functions can be used in place of blocks for that purpose. The if block in getAllKeys() could be rewritten as follows:

if (typeof value == 'object') {
  var functionWithNewScope = function() {
    // This is a new function, and therefore a new scope, so within this
    // function, map is a new variable because it is prefixed with 'var'.
    var map = value;

    // Because keys is not prefixed with 'var', the existing value of keys
    // from the enclosing function is used.
    keys = keys.concat(getAllKeys(map));
  };
  // Calling this function will have the desired effect of updating keys.
  functionWithNewScope();
}

Although the above will work, it is less efficient than replacing var map with var newMap as described earlier.
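The same idea is often written more compactly by calling the function as soon as it is defined, without giving it a name. This is a sketch of that variation rather than something the text above prescribes; the variable names are illustrative.

```javascript
var counter = 'outer';
var seenInside;

(function() {
  // This counter is local to the anonymous function.
  var counter = 'inner';

  // seenInside is not redeclared here, so the enclosing variable is used.
  seenInside = counter;
})();

console.log(seenInside);  // inner
console.log(counter);     // outer
```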