Many mistakes in section 2. The author seems to fundamentally misunderstand block scoping vs lexical scoping, and interactions when deferring execution to the next run of the event loop.
In the first example:
for (let i = 0; i < 3; i++) {
  setTimeout(() => console.log(i));
}
// prints "0 1 2" — as expected
let i = 0;
for (i = 0; i < 3; i++) {
  setTimeout(() => console.log(i));
}
// prints "3 3 3" — what?
i's scope is outside the for loop in the second example, and the setTimeout callbacks execute on a later turn of the event loop, after i has finished incrementing.
Consider that you'd have the same issue with the older `var` keyword, which is lexically scoped:
for (var i = 0; i < 3; i++) {
  setTimeout(() => console.log(i));
}
// prints "3 3 3" because i is not block-scoped
If for some reason you really need some work to run on a later turn of the event loop, and you need the value of a variable that is scoped outside the loop and modified inside it, you can define a function (or use an IIFE) to pass in the value of i at the current iteration, rather than capturing a reference to i that will be read later.
let i = 0;
for (i = 0; i < 3; i++) {
  ((x) => setTimeout(() => console.log(x)))(i);
}
// prints "0 1 2"
This sort of stuff is very explicit and unsurprising in C++ (and to a lesser extent Rust), but it's always confusing in languages that leave the capturing details implicit. Even Go got bitten by this, and it doesn't even have JavaScript's broken `var`.
I don't think it's fair to call Go and Javascript's behavior "implicit", they just always capture variables by reference.
Rust variable capture is implicit though, but it can't cause the problems described in the article, since mutable references are required to be unique.
No, that's a mistake in the article. The variable is still captured by reference, but `let` is causing it to be re-declared on every iteration of the loop, not mutated.
The following code prints 1, 2, 3. It wouldn't do that if the variable was captured by value.
for (let i = 0; i < 3;) {
  setTimeout(() => console.log(i));
  i++;
}
The behavior of "let" with for loops, where the variable is declared more times than it is initialized (despite the source code having one declaration that is also the only initialization), is not very explicit.
for (let i = 0; i < 3; i++) {
  i += 10;
  setTimeout(_ => console.log(i), 30);
  i -= 10;
}
Capture by value would print 10, 11, 12: that's the value at the moment it was captured.
Capture by reference would print 0, 1, 2.
It's much easier to conceptualise it as
for (const i = 0; i < 3; i++) {
  setTimeout(_ => console.log(i), 30);
}
which is fine because i never changes. It is a different i each time.
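(Taken literally, `const` in a classic for initializer would throw once the `i++` runs, so treat the above as a mental model. A runnable version of the same idea, my rewrite rather than something from the thread, is `for...of`, where each iteration really does get a fresh, never-mutated binding:)

```javascript
// Same mental model, but actually runnable: for...of gives each
// iteration a genuinely fresh const binding that is never mutated.
for (const i of [0, 1, 2]) {
  setTimeout(_ => console.log(i), 30);
}
// prints "0 1 2"
```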
A fancier example:
for (let x = 0, y = 0; y < 2; x = x++ < 3 ? x : y++, 0) {
  x += 10;
  y += 10;
  console.log("inline:", x, y);
  setTimeout(_ => console.log("timeout", x, y), 30);
  x -= 10;
  y -= 10;
}
The argument is about things that are weird; any effect in a language that makes you stop and think through scoping rules to figure out why it behaves that way is obviously "weird" under my understanding of the word.
In short, I'm not sure they have misunderstood the scoping; they have probably understood it fine and are remarking on the weirdness that different aspects of JavaScript enable.
Certainly, with perfect understanding and knowledge of a language, remembered so well that you never have to think about it, nothing would ever be weird. It's the incidental behaviors where you have to stop and think "hey, why is that? oh yeah, scoping rules and a timeout on a later turn of the event loop, damn!"
Yes, it's about block scoping — but that doesn't make it less weird. In most languages this doesn't really make sense — a variable is a piece of memory, and a reference refers to it. JavaScript doesn't work like that, and that's weird to many.
What's the mistake that I made there? I just didn't explain why it happens. I briefly mentioned this in the later paragraphs — it makes sense to some people, but not to most.
JavaScript does work like that, but `for` creates a new block scope for each iteration, so variables declared with `let` in its initializer are redeclared each time. Some other languages ([1]) just make accessing mutable locals from a closure into a compiler error, which I think is also reasonable. Old-school JavaScript (`var`s) chose the worst-of-both-worlds option.
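A rough desugaring of that per-iteration `let` binding (a sketch of the observable behavior, not the exact spec steps):

```javascript
// for (let i = 0; i < 3; i++) { setTimeout(() => console.log(i)); }
// behaves roughly like this: each iteration gets its own block-scoped
// copy, and the updated value flows back out for the next test.
let _i = 0;
while (_i < 3) {
  let i = _i;                       // fresh binding per iteration
  setTimeout(() => console.log(i)); // captures this iteration's i
  _i = i + 1;
}
// prints "0 1 2"
```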
OK, so for one, the title of that section is off:
> JS loops pretend their variables are captured by value
This has to do with how for loops work with iterators, but also with what `let` means in a variable declaration. You talk about 'unrolling a for loop', but what you're doing is 'attempting to express the same loop with while'. Unrolling would look like this:
// original:
for (let i = 0; i < 3; i ++) { setTimeout(()=>console.log(i)) }
// unrolled:
{ let i = 0; setTimeout(()=>console.log(i)) };
{ let i = 1; setTimeout(()=>console.log(i)) };
{ let i = 2; setTimeout(()=>console.log(i)) };
// original:
let i = 0;
for (i = 0; i < 3; i++) { setTimeout(()=>console.log(i)) };
// unrolled:
let i = 0;
{ i = 0; setTimeout(() => console.log(i)); };
{ i = 1; setTimeout(() => console.log(i)); };
{ i = 2; setTimeout(() => console.log(i)); };
i++; // the final increment runs after the last iteration, so all three callbacks see i === 3
Now you can begin to explain what's going wrong in the second example: 'i' is declared with 'let' outside the block, which means the callback passed to setTimeout runs on a later turn of the event loop but references the 'i' from the outer scope, which has been modified by the time it runs.
In the original example, a different 'i' is declared inside each block, and the callback passed to setTimeout references the 'i' from its own scope, which isn't modified in adjacent blocks. It's confusing that you're making this about how loops work when understanding what the loop is doing is only one part of it; understanding scoping and the event loop are two other important pieces here.
And then if you're going to compare a while loop to a for loop, I think a critical piece is that 'while' loops (as well as 'do .. while') take only expressions in their condition, and loop until the expression is false.
'for' loops take three-part statements, the first part of which is an initialization assignment (for which 'var' and 'let' work differently), and the second of which is an expression used as the condition. So you can declare a variable with 'let' in the initialization and modify it in the 'afterthought' (the third part of the statement), but it will be treated as if each iteration of the loop is declaring it within the block created for that iteration.
So yes, there are some 'for' loop semantics that are specific to 'for' loops, but rather than explain that, you appear to be trying to make a point about loops in general that I'm not following.
I'm not saying the examples won't help people avoid pitfalls with for and while loops, but I do think they'll be unable to generalize any lessons they take away to other situations in JS, since you're not actually explaining the principles of JS at play.
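To make the var/let initializer difference described above concrete (a minimal sketch, not from the thread):

```javascript
// var in the initializer leaks out of the loop:
for (var a = 0; a < 3; a++) {}
console.log(a); // prints 3

// let in the initializer is scoped to the loop:
for (let b = 0; b < 3; b++) {}
try {
  console.log(b);
} catch (e) {
  console.log(e.name); // prints "ReferenceError"
}
```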
I mentioned that the title makes no sense in the sentence right after it:
> Yes, the title makes no sense, but you'll see what I mean in just a second.
And yes, I didn't explain the exact mechanics of the ES spec which make it happen — but I would argue that "variables can be modified until they're out-of-scope" is even more unintuitive than just remembering this edge case. And I'm not trying to be an ECMAScript lawyer with the post, rather I'd just show a bunch of "probably unexpected" behaviors of JavaScript.
FWIW, I think the parent meant "function scoping vs lexical scoping" rather than "block scoping vs lexical scoping". You're correct that function scoping is technically a form of lexical scoping (where the scope is the function), but if you want to be _really_ pedantic, the ecma262 considers let/const to be "lexical binding"[0] as opposed to var being a "variable declaration"[1], where the former declares in the "Lexical Environment" while the latter declares in the "Variable Environment". These happen to be the same environment record on function entry.
However, there's no notion of the first example operating on a single scope and the latter on three different, individual scopes. Which is why scope ranges and where you declare a variable with `let` matters.
Complaining about the for loop behaviour seems odd. Variables declared in the expression section of the loop are scoped to the loop body - this is generally reasonable and the least likely to produce errors.
Notably, Go initially decided to stick with the C-style approach and have the scope be outside the loop, and has since decided that the tiny performance improvement this provides isn't worth the many many times it's tripped people up, and has changed the semantics in recent versions: https://go.dev/blog/loopvar-preview
Go's behaviour never made much sense because, unlike JS/C#/Java, Go has pointers. So if you really want to capture the variable itself and not its current value at the specific iteration, you can (and should) just capture the pointer to it.
But even in C#/Java/JS, it never made much sense either, since a) you almost always want to capture the current value, not the variable itself; b) if you really do want the variable itself, you can put it into a 1-element array and capture the array (which is how people evade Java's requirement that captured variables be "final" anyway) and use arr[0] everywhere instead.
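The 1-element-array trick described above, in JS terms (variable names are made up for illustration):

```javascript
// Boxing a value in a one-element array so a closure can observe
// later mutations: this captures "the variable", not its value.
const cell = [0];
const read = () => cell[0]; // the closure shares the cell
cell[0] = 42;
console.log(read()); // prints 42
```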
I remember properly learning JS from “The Good Parts” book, which makes it known from the start that JS is a nasty language but if you ignore many sharp edges it can be nice and elegant. I think this is especially true with (a subset of) TS. All you need is a very, very strict linter, and then you get to pretend you’re working in a mostly solid language.
I didn't read JS The Good Parts until it was well outdated, and I was glad to see that a lot of the sharp edges that Crockford lists have largely been eliminated. The book was written circa ES3, so some problematic features were removed in strict mode, we have replacements for some (let/const, for-of loops, etc), and we can sweep prototypes under the rug with ES6 classes, sometimes arrow functions even avoid awkward this-binding issues. The rest, like you said, TypeScript + a linter takes care of.
I find these oddities far more realistic than the Wat video from a long time ago. Many of the things in that video had me asking, "sure but...what programmer would blindly try these things and then be shocked when they didn't work?" The examples in this article are actual "gotchas" that could silently bite someone.
The eval thing is listed as a potential performance cost, but it's actually super important for performance, because it allows the parser to statically know that sloppy eval is never called inside a function, and that variables can therefore be optimized away.
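A quick illustration of the direct vs. indirect eval distinction this optimization hinges on (function names are mine):

```javascript
// Direct eval runs in the enclosing scope, so `secret` must be kept alive:
function direct() {
  const secret = 1;
  return eval("secret");
}
// Indirect eval runs in the global scope, so locals stay optimizable:
function indirect() {
  const secret = 1;
  const ev = eval;
  try { return ev("secret"); } catch (e) { return "not visible"; }
}
console.log(direct());   // prints 1
console.log(indirect()); // prints "not visible"
```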
The list of "other weirdness" at the end mentions:
> +0 vs. -0
Which feels kind of odd, since that's mostly floating point weirdness, not JS weirdness. Unless they mean the fact that you can force V8 to initialize a zero-value as a double using -0 as a literal (it tends to optimize 0 to integers). But that has no effect on real-world code unless you use Math.sign, or divide by minus zero.
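For reference, the handful of places where -0 is observable at all:

```javascript
// -0 and +0 compare equal almost everywhere...
console.log(0 === -0);         // prints true
// ...except via Object.is, division, and Math.sign:
console.log(Object.is(0, -0)); // prints false
console.log(1 / -0);           // prints -Infinity
console.log(Object.is(Math.sign(-0), -0)); // prints true
```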
Fun fact: JavaScript 1.2 did feature a -0 distinct from 0 (= +0), which has quite a number of rather confusing aspects and effects if you run a script in this context.
While interesting and possibly helpful to new coders, are these quirks of the language still relevant when most development in Javascript is done using a framework (React, Vue, etc) these days? How often do these "gotchas" factor into "modern" Javascript development, especially in production? These type of articles seem to critique mechanics of the language that don't come up as often in practice.
The issue with variables and loops that OP described is worse with React, since you create closure inside the render function for event handlers, and if that closure captures any state variables (rather than a getter function for example), then you'll end up referencing stale state. React relies on linters to protect against this, but that only goes so far and the API design makes it easy to screw up, so you have to be on your toes.
Edit: To be clear, this is specifically a React hooks problem, not the old React with classes.
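A framework-free sketch of the stale-closure pattern being described (the names are hypothetical, not React API):

```javascript
// The handler closes over the value of `count` at creation time,
// like a render capturing a state variable.
function makeHandler(count) {
  return () => `count is ${count}`;
}
let count = 0;
const handler = makeHandler(count); // created during a "render"
count = 5;                          // state updates afterwards
console.log(handler());             // prints "count is 0" (stale)
```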
That's a fair point. Adding that to the original post would help provide context about why some of these quirks are still relevant to consider even when using a framework. I believe the assumption is often that frameworks abstract the Javascript "weirdness" away.
Quirks in this article and others like it are not something that would be encountered under normal circumstances unless the programmer is doing something silly (like skipping semicolons, ever touching eval, or relying on type conversions). Now, the second point about loops (actually about scopes and variable declaration) is a core part of the language and needs to be learned, framework or no framework.
You are right, these quirks are not something you struggle with very often. The only one that has been troublesome at some point during my now 8 years as a professional, mainly Javascript with Vue, web developer is the automatic semicolon insertion example.
The simplest fixes for it are to just insert semicolons yourself, always use const, and not start any lines with [ or (.
Variable declared outside the loop construct lives outside the loop.
Variable declared inside the loop construct lives inside the loop.
Seems intuitive to me.
The complex thing here (and what seems to have confused the author) is the distinction between when the reference to the variable is captured vs. when the setTimeout() callback occurs.
Actually, this article shows what good shape Javascript is in. As it says, commonly deployed linters catch the really bad stuff, effectively deprecating those things.
Pretty much the rest of it has to do with specific domains outside of JavaScript, like what a character actually is, or IEEE floating point; or with rather out-of-the-way things like document.all and sparse arrays (not that people don't use sparse arrays, but they're entirely optional, and if you're going to voluntarily go into that cave, I guess you must be happy to tangle with the bears living there).
I'd forgive a few of those. Unicode is Unicode. The for loop capture behaviour makes sense to me. Missing semis should also be in your linter. Sparse arrays are the sort of feature you'd read up on before using, not rely on intuition for. It makes sense that if you loop over a sparse thing, the looping is sparse too.
Since I started using Prettier, I've moved permanently into the no-semicolons camp. Prettier catches ASI hazards and inserts semicolons where needed, and I've never seen it fail. Whichever camp you're in though, put it in your linter, don't leave it to chance. React code is full of array destructuring, that particular hazard is prone to bite you if you ignore it (tho it's still a little contrived if your usual style is avoiding mutable variables).
I can't think of any typical case where you're destructuring arrays in React without const/let.
The only time you start a line with a delimiter in JS that I can think of is a rare case like `;[1,2,3].forEach(...)` which also isn't something you do in React.
While I still hate semis, these days my approach to formatting is just `echo {} > .prettierrc` and using the defaults. It's a nice balance where I never write a semicolon myself, and I never have to dick around with config.
I don't need to memorize anything; the 2 rules are inferable from how JS works, and you'll get TypeScript (not even a linter) flagging lines starting with those brackets anyway.
And yet, per the spec, new syntax features are allowed to break ASI:
> As new syntactic features are added to ECMAScript, additional grammar productions could be added that cause lines relying on automatic semicolon insertion preceding them to change grammar productions when parsed.
So really, the rules are “there are currently 2 exceptions and an infinite number allowed to be added at any time”. To me, that’s worth letting prettier auto-insert semicolons when I hit save.
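The bracket hazard being debated, in its minimal (and admittedly contrived) form:

```javascript
// Without semicolons, the `[` continues the previous line:
const arr = [10, 20, 30]
const n = arr
[1].toString() // parsed as: const n = arr[1].toString()
console.log(n) // prints "20", not the array
```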
I'd like to understand why `document.all` is slower than `getElementById`. Couldn't any even somewhat decent optimizing compiler trivially compile the first to the latter? Like, I don't mean in weird cases like `const all = document.all; return all[v]`, or iterating over it, just the general one where someone directly does `document.all.foo` or `document.all[v]`, ie the 99.99% case. When faced with the choice to compile those accesses to getElementById calls, or patch the ecmascript standard to put IE-compat workarounds in there, it seems kinda nuts to me that they would choose the latter, so I bet there's a good reason that I'm missing.
There was a time where there weren't optimizing compilers in JS engines, at least not anywhere near the level of sophistication they are at today.
In V8, not too long ago, any function over 400 characters, including comments, would bail out of optimization. We had lint rules to disallow these "long" functions.
Regarding that choice: Given that this is really a different library (the DOM and its individual browser implementation), it's probably quite sane to just define a certain object to evaluate as falsy, as compared to any attempts to check for a certain implementation in this external library for any call.
(Even more so, since any access using `document.all` retrieves an object from a live collection, while the other access method is a function call, which is a totally different thing.)
This was like 2004. Chrome, Safari, and Firefox all had getElementById in their first versions in like 2003, Opera had it in version 7, Internet Explorer was the odd one out.
This was IE6 days, the real bad old days. Remember that we were still mostly constrained to XMLHTTPRequests for calls to APIs
Anything actually important to be done in a web browser didn't use javascript, it used an ActiveX Component/extension, a java applet, or Flash or Shockwave (by Macromedia at the time!)
I don't mind '0' == 0 when it's used for scripts and dumb stuff. That's literally how shellscript works, and I love shellscript, so I can't complain about that.
But I would never use shellscript to build an entire business's user interface with a giant shellscript framework. That would be insane. A language that was designed as a throwaway scripting thing for doing some miscellaneous tasks, and never designed for full application purposes? No sane person would use that for a business's core product.
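For context, the `'0' == 0` coercion being discussed above, including its famous non-transitivity:

```javascript
console.log('0' == 0);  // prints true  (string coerced to number)
console.log('' == 0);   // prints true  (empty string coerces to 0)
console.log('0' == ''); // prints false (both strings, compared as strings)
console.log('0' === 0); // prints false (no coercion with ===)
```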
These are the well known bad parts every JS learner is taught about from numerous sources. I guess its OK to repeat the basics, but the linked article looks more like a blog spam rehash.
I've a few opinions on the content, but I'm most interested in which unicode analyzer tool generated that nifty text tree diagram.
I can deal with JS's warts because the tooling is so much better. The other language I make my living with is PHP, and while it's much improved from before, the global function namespace is still a garbage fire, and you need extra linters like phpstan-safe-rule to force you to use sane wrappers.
For backend work, I'd recommend giving C# a look. Syntactically similar to TypeScript[0] but for any serious work, having runtime types and some of the facilities of .NET (e.g. Roslyn, expression trees, Entity Framework Core, threads, etc.) is really nice.
I recently moved from a C# backend startup to a TS backend startup and the lack of runtime types is a real friction point I feel every day with how the code has to be written (e.g. end up writing a lot of Zod; other projects have different and varying types of drudgery with regards to meta-typing for runtime)
I prefer Java because I know it better, but Kotlin is nice. I'd prefer it over .NET, which is still kind of messy unless you're doing some quite specific things where the multi platform efforts have actually succeeded. F# is fine for small tools and some CLI stuff, but big frameworky things tend to be a mess or MICROS~1 specific.
Some people are likely to claim it's not the case anymore and so on, but it's my recent experience on Debian. The JVM environment has its warts and Maven sometimes feels like a huge dumb golem but at least it doesn't come across as whiny or make you feel like a second class citizen.
Yes, plenty of experience in the first two, but the last two are better contenders. At any rate I work primarily with TS when I do front-end code, it's very rare for me to write raw JS, so a lot of footguns automatically go away.
These weird cases are not because JavaScript is bad, but because it has to be backwards compatible. So where other languages can just delete old quirks from the language, JavaScript has to keep them in.
“JavaScript sucks because '0' == 0!”
- literally everyone ever
I never really understood the hate for this, given that everything is a string in HTTP, and that SQL does the same damn thing. There are far more annoying things about JS (both the language and the ecosystem).
Moreover, it was kind of a standard for any scripting language at that time. In other words, this was generally expected behavior. (E.g., compare AWK, Perl, etc.)
The article didn't even mention it. If you actually read it, it's more about things that are not necessarily too annoying from a programmer's perspective, but very much are for anyone working on the platform.
> In any programming language, when you capture values with a lambda/arrow function
It seems like just a few years ago that few programmers knew what these concepts are: mostly just the few that were exposed to Lisp or Scheme in college.
Now it's in "any language" and we have to be exposed to incorrect mansplaining about it from a C++ point of view.
> there are two ways to pass variables: By value (copy) or by reference (passing a pointer). Some languages, like C++, let you pick:
Lexical capture isn't "pass".
What this person doesn't understand is that C++ lambdas are fake lambdas, which do not implement environment capture. C++ lambdas will not furnish you with a correct understanding of lambdas.
(Furthermore, no language should ever imitate what C++ has done to lambdas.)
Capture isn't the passage of arguments to parameters, which can be call by value or reference, etc.
Capture is a retention of the environment of bindings, itself.
The issue here is simply that
1. The Javascript lambda is correctly implementing lexical capture.
2. The Javascript loop is not creating a fresh binding for the loop variable in each iteration. It binds one variable, and mutates its value.
Mutating the value is the correct thing to do for a loop construct which lets the program separately express initialization, guard testing and increment. The semantics of such a loop requires that the next iteration's guard have access to the previous iteration's value. We can still have hacks under the hood so that a lexical closure will capture a different variable in each iteration, but it's not worth it, and the program can do that itself. Javascript is doing the right thing here, and it cannot just be fixed. In any case, vast numbers of programs depend on the variable i being a single instance that is created and initialized once and then survives from one iteration to the next.
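"The program can do that itself" can look like this: copy the shared loop variable into a fresh per-iteration binding before capturing it (a minimal sketch):

```javascript
let i;
for (i = 0; i < 3; i++) {
  const j = i;                      // fresh binding each iteration
  setTimeout(() => console.log(j)); // captures j, not the shared i
}
// prints "0 1 2"
```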
Now lambdas can in fact be implemented by copy. Compilers for languages with lambda can take various strategies for representing the environment and how it is handled under capture. One possible mechanism is conversion to a flattened environment vector, whereby every new lambda gets a new copy of such a vector.
The entire nested lexical scope group becomes one object in which every variable has a fixed offset that the compiled code can refer to. You then have to treat individual variables in that flat environment according to whether any given variable is shared, mutated or both.
The worst case is when multiple closures capture the same variable (it is shared) and the variable is mutated such that one closure changes it and another one must see the change. This is the situation with the loop index i variable in the JS loop. This means that under a flat, copied environment strategy, the variable will have to be implemented as a reference cell in the environment vector.
Variables which are not mutated can just be values in the vector. Variables which are mutated, but not shared among closures, likewise.
This is all under the hood though; there are no programmer-visible annotations for indicating how to treat each captured variable. It always looks as if the entire environment at the point of capture is being taken by reference. The compiler generates reference semantics for those variables which need it.
At the implementation level, with particular strategies for handling environments under lambda, we can think about capturing references or value copies. C++ lambdas imitate this sort of implementation-level thinking and define the language construct around it, in a way that avoids the master concept of capturing the environment.
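A sketch of the reference-cell idea in JS itself (makeCounter is a made-up example, not from the comment): the shared, mutated variable lives in a box, and both closures' environments hold the same box.

```javascript
// Two closures share one mutated variable; under a flat copied
// environment, that variable must live in a shared box (cell).
function makeCounter() {
  const cell = { value: 0 };             // the "reference cell"
  const increment = () => { cell.value += 1; };
  const read = () => cell.value;         // sees increment's changes
  return { increment, read };
}
const counter = makeCounter();
counter.increment();
counter.increment();
console.log(counter.read()); // prints 2
```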
I can pretty confidently say that every article I have seen in the past 5 years that complain about weirdness of JavaScript is about things that you would never do in a modern, production-level codebase. Many of these issues are about pre-ES6, outdated practice (e.g. eval, using == operator) that are almost certainly going to be flagged by a linter. Very occasionally, you do get hit by some weirdness, but likely doesn't take more than a few minutes to figure out. And if you write in TypeScript, like almost every serious new project created these days, most of these questions don't exist at all.
Which is why I don't bother reading these posts any more.
If you actually opened the article, I think you would find it interesting. I agree that a well-configured eslint catches 99% of the weirdness, but this article tries to be about the remaining 1%.
> If the language needs a linter to keep otherwise-competent developers from introducing potentially maddening bugs into the codebase, it's weird.
This is an odd way to phrase things. A better way to look at this is that there are programming tasks which can be easily automated and tasks which cannot be easily automated. The ones which can be easily automated should be, so that humans can focus on the ones which cannot be easily automated. Do you not run automated tests, since those same tests could be done over and over by hand?
Javascript could not have been released perfect for all current and future uses. Thus it either needed to change or fall out of use.
It changed.
The linting is simply part of a migration mechanism that allows for the change to happen. Compatibility is maintained for existing code, while on-going and future development can avoid the bad/obsolete parts. It's not random that eslint has such fine-grained configurability -- it allows each code base to migrate at whatever rate makes sense for it.
So all this is simply a product of javascript's longevity, which itself is a testament to its utility and flexibility. You call it "weird" but it would be a damn shame if it wasn't as useful and flexible as it is.
If weird means out of norm, that's not the case. Every language either has a linter or would benefit from a linter because all languages have warts or idiosyncratic behavior.
Some languages are so error-prone and hard to use they even ship with static analysis and type-checking built in, like Rust and C! (And they still have linters on top of that!)
Perhaps post a language you think is exceptional to this?
Given how accessible and widespread it is, it's hard not to make mistakes in JS without a linter.
Can you shoot your foot off with C or Rust? Sure. But they're systems programming languages for the most part, and aren't realistically proposed as tools in the "move fast and break things" world of web development where JS rules supreme. Ruby and Python are also used in web dev, albeit on the back-end, and they're not as idiosyncratic as JS.
Yes, `== null` is quite convenient for the null-or-undefined check.
In general, the whole `==` versus `===` is a silly argument in a typescript codebase, because if you know the types of the arguments, `==` behaves predictably.
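For reference, `== null` is true for exactly null and undefined and nothing else, which is why it's a safe idiom once the types are known:

```javascript
console.log(null == null);      // prints true
console.log(undefined == null); // prints true
console.log(0 == null);         // prints false
console.log('' == null);        // prints false
console.log(false == null);     // prints false
```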
In JavaScript, a 'let' inside the initializer of a for loop is captured by value; all the others are captured by reference.
I think it's fair to call that semantics "implicit".
Fair, but that's a property of for loops. Variables in closures are still always captured by reference.
> which is fine because i never changes. It is a different i each time.
This is clearly a super weird hack to make closure capture behave more like you'd want. There's a reason this level of weirdness isn't needed in C++.
[1]: https://stackoverflow.com/q/54340101
OK so for one the title of that section is off:
> JS loops pretend their variables are captured by value
This has to do with how for loops work with iterators, but also with what `let` means in a variable declaration. You talk about 'unrolling a for loop', but what you're doing is attempting to express the same loop with 'while'. Unrolling would look like this:
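The unrolled version would be three separate blocks, each with its own binding (a sketch using an array of callbacks in place of setTimeout, so the results can be inspected):

```javascript
const callbacks = [];
{ let i = 0; callbacks.push(() => i); } // each block declares its own i
{ let i = 1; callbacks.push(() => i); }
{ let i = 2; callbacks.push(() => i); }
// Run "later", e.g. on a subsequent event-loop turn:
console.log(callbacks.map(f => f())); // [ 0, 1, 2 ]
```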
Now you can begin to explain what's going wrong in the second example: 'i' is declared with 'let' outside of the block, and the callback passed to setTimeout is scheduled to run on a later turn of the event loop, but it references the 'i' from the outer scope, which has been modified by the time it runs. In the original example, a different 'i' is declared inside each block, and the callback passed to setTimeout references the 'i' from its own scope, which isn't modified in adjacent blocks. It's confusing that you're making this about how loops work when understanding what the loop is doing is only one part of it; understanding scoping and the event loop are two other important pieces here.
And then if you're going to compare a while loop to a for loop, I think a critical piece is that 'while' loops (as well as 'do .. while') take only expressions in their condition, and loop until the expression is false.
'for' loops take three-part statements, the first part of which is an initialization assignment (for which 'var' and 'let' work differently), and the second of which is an expression used as the condition. So you can declare a variable with 'let' in the initialization and modify it in the 'afterthought' (the third part of the statement), but it will be treated as if each iteration of the loop is declaring it within the block created for that iteration.
So yes, there are some 'for' loop semantics that are specific to 'for' loops, but rather than explain that, you appear to be trying to make a point about loops in general that I'm not following.
I'm not saying the examples won't help people avoid pitfalls with for and while loops, but I do think they'll be unable to generalize any lessons they take away to other situations in JS, since you're not actually explaining the principles of JS at play.
I mentioned that the title makes no sense in the sentence right after it:
> Yes, the title makes no sense, but you'll see what I mean in just a second.
And yes, I didn't explain the exact mechanics of the ES spec that make it happen — but I would argue that "variables can be modified until they're out of scope" is even more unintuitive than just remembering this edge case. And I'm not trying to be an ECMAScript lawyer with the post; rather, I just want to show a bunch of "probably unexpected" behaviors of JavaScript.
For anyone who wants to see some more explanations/experimentations around for loop semantics, this Chrome Developers video is great:
https://www.youtube.com/watch?v=Nzokr6Boeaw
The author's explanation seems perfectly correct to me. Where does he "misunderstand block scoping vs lexical scoping"? By the Wikipedia definition:
> lexical scope is "the portion of source code in which a binding of a name with an entity applies".
...both `let` and `var` are lexically scoped, the scopes are just different.
FWIW, I think the parent meant "function scoping vs lexical scoping" rather than "block scoping vs lexical scoping". You're correct that function scoping is technically a form of lexical scoping (where the scope is the function), but if you want to be _really_ pedantic, the ecma262 considers let/const to be "lexical binding"[0] as opposed to var being a "variable declaration"[1], where the former declares in the "Lexical Environment" while the latter declares in the "Variable Environment". These happen to be the same environment record on function entry.
[0] https://tc39.es/ecma262/#sec-let-and-const-declarations [1] https://tc39.es/ecma262/#sec-variable-statement
Thanks for the links, that adds a lot of important context.
However, nothing in the syntax signals that the first example operates on a single scope and the latter on three different, individual scopes. Which is why scope ranges, and where you declare a variable with `let`, matter.
Complaining about the for loop behaviour seems odd. Variables declared in the expression section of the loop are scoped to the loop body - this is generally reasonable and the least likely to produce errors.
Notably, Go initially decided to stick with the C-style approach and have the scope be outside the loop, and has since decided that the tiny performance improvement this provides isn't worth the many many times it's tripped people up, and has changed the semantics in recent versions: https://go.dev/blog/loopvar-preview
Go's behaviour never made much sense because, unlike JS/C#/Java, Go has pointers. So if you really want to capture the variable itself and not its current value at the specific iteration, you can (and should) just capture the pointer to it.
But even in C#/Java/JS, it never made much sense either, since a) you almost always want to capture the current value, not the variable itself; and b) if you really want the variable itself, you can put it into a 1-element array and capture the array (which is how people evade Java's requirement that captured variables be "final" anyway) and use arr[0] everywhere instead.
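The same 1-element-container trick expressed in JavaScript, for illustration; capturing the array rather than the binding lets every closure observe later mutations:

```javascript
const box = [0];          // 1-element "box" standing in for the variable
const fns = [];
for (let i = 0; i < 3; i++) {
  fns.push(() => box[0]); // every closure shares the same box
}
box[0] = 42;              // a later mutation...
console.log(fns.map(f => f())); // [ 42, 42, 42 ] ...is visible to all of them
```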
I remember properly learning JS from “The Good Parts” book, which makes it known from the start that JS is a nasty language but if you ignore many sharp edges it can be nice and elegant. I think this is especially true with (a subset of) TS. All you need is a very, very strict linter, and then you get to pretend you’re working in a mostly solid language.
I didn't read JS The Good Parts until it was well outdated, and I was glad to see that a lot of the sharp edges that Crockford lists have largely been eliminated. The book was written circa ES3, so some problematic features were removed in strict mode, we have replacements for some (let/const, for-of loops, etc), and we can sweep prototypes under the rug with ES6 classes, sometimes arrow functions even avoid awkward this-binding issues. The rest, like you said, TypeScript + a linter takes care of.
I find these oddities far more realistic than the Wat video from a long time ago. Many of the things in that video had me asking, "sure but...what programmer would blindly try these things and then be shocked when they didn't work?" The examples in this article are actual "gotchas" that could silently bite someone.
The Wat video was intended to be funny and tongue-in-cheek.
The eval thing is listed as a potential performance cost, but it's actually super important for performance, because it allows the parser to statically know that sloppy eval is never called inside a function, and that variables can therefore be optimized away.
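A sketch of why direct eval specifically is the problem; the example is wrapped in `new Function` so the body is guaranteed non-strict, since strict mode sandboxes direct eval's declarations:

```javascript
// In sloppy mode, a direct eval can declare new variables in the
// enclosing function's scope, so an engine that can't rule out a
// direct eval must keep every local materialized instead of
// optimizing it into a register.
const f = new Function(`
  eval("var hidden = 41;"); // injects 'hidden' into this function's scope
  return hidden + 1;
`);
console.log(f()); // 42
```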
The list of "other weirdness" at the end mentions:
> +0 vs. -0
Which feels kind of odd, since that's mostly floating point weirdness, not JS weirdness. Unless they mean the fact that you can force V8 to initialize a zero-value as a double using -0 as a literal (it tends to optimize 0 to integers). But that has no effect on real-world code unless you use Math.sign, or divide by minus zero.
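The observable differences really are IEEE 754 behavior surfaced through a few JS APIs, for instance:

```javascript
console.log(0 === -0);         // true: strict equality can't tell them apart
console.log(Object.is(0, -0)); // false: SameValue can
console.log(1 / -0);           // -Infinity
console.log(Math.sign(-0));    // -0
```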
Fun fact: JavaScript 1.2 did feature a distinct -0 (with 0 = +0), which has quite a number of rather confusing aspects and effects if you run a script in that context.
While interesting and possibly helpful to new coders, are these quirks of the language still relevant when most development in Javascript is done using a framework (React, Vue, etc) these days? How often do these "gotchas" factor into "modern" Javascript development, especially in production? These type of articles seem to critique mechanics of the language that don't come up as often in practice.
The issue with variables and loops that OP described is worse with React, since you create closures inside the render function for event handlers, and if a closure captures any state variables (rather than, say, a getter function), you'll end up referencing stale state. React relies on linters to protect against this, but that only goes so far, and the API design makes it easy to screw up, so you have to be on your toes.
Edit: To be clear, this is specifically a React hooks problem, not the old React with classes.
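A rough, React-free simulation of the stale-closure mechanic (the structure here is invented for illustration; it is not a real React API). Each "render" closes over that render's state snapshot, so handlers created during earlier renders go stale:

```javascript
// Minimal model of React's render-snapshot behavior (not real React):
let state = 0;        // stands in for the stored hook state
const handlers = [];

function render() {
  const count = state;        // like: const [count] = useState(...)
  handlers.push(() => count); // event handler closing over the snapshot
}

render();  // first render sees count = 0
state = 1; // a state update occurs
render();  // second render sees count = 1

console.log(handlers[0]()); // 0: the old handler still sees the stale value
console.log(handlers[1]()); // 1: only the latest render's closure is fresh
```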
That's a fair point. Adding that to the original post would help provide context about why some of these quirks are still relevant to consider even when using a framework. I believe the assumption is often that frameworks abstract the Javascript "weirdness" away.
Quirks in this article and others like it are not something that would be encountered under normal circumstances unless the programmer is doing something silly (like skipping semicolons, ever touching eval, or relying on type conversions). The second point about loops (actually about scopes and variable declaration), though, is a core part of the language and needs to be learned, framework or no framework.
You are right, these quirks are not something you struggle with very often. The only one that has been troublesome at some point during my now 8 years as a professional, mainly Javascript with Vue, web developer is the automatic semicolon insertion example.
The simplest fixes for it are to insert semicolons yourself, always use const, and never start a line with [ or (.
It's worse in a framework, in the framework you need to know the oddities of the language as well as how the framework manages them.
What I don't understand is why, after twenty years, we still haven't versioned Javascript. A simple:
'v2';
At the top of every file could let us eliminate all this 20-year old cruft (like document.all hacks to support Internet Explorer).
Yet, despite the precedent of the already established 'use strict' (which is basically 'v1.5'), the community seems completely against modernizing the language.
Variable declared outside the loop construct lives outside the loop.
Variable declared inside the loop construct lives inside the loop.
Seems intuitive to me.
The complex thing here (and what seems to have confused the author) is the distinction between when the reference to the variable is captured vs. when the setTimeout() callback occurs.
Actually, this article shows what good shape Javascript is in. As it says, commonly deployed linters catch the really bad stuff, effectively deprecating those things.
Pretty much the rest of it has to do with specific domains outside of JavaScript, like what a character actually is, or IEEE floating point, or with rather out-of-the-way things like document.all and sparse arrays (not that people don't use sparse arrays, but they're entirely optional, and if you're going to voluntarily go into that cave, I guess you must be happy to tangle with the bears living there).
I'd forgive a few of those. Unicode is Unicode. The for loop capture behaviour makes sense to me. Missing semis should also be in your linter. Sparse arrays is the sort of feature you'd read up on if you use and not rely on intuition. It makes sense that if you loop over a sparse thing the looping is sparse too.
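For instance, array iteration methods skip holes in sparse arrays entirely, rather than visiting them as undefined (a small sketch):

```javascript
const sparse = [1, , 3];  // a hole at index 1, not an undefined element
console.log(1 in sparse); // false: index 1 has no property at all

const visited = [];
sparse.forEach((x, i) => visited.push(i));
console.log(visited);     // [ 0, 2 ]: forEach skipped the hole
```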
Since I started using Prettier, I've moved permanently into the no-semicolons camp. Prettier catches ASI hazards and inserts semicolons where needed, and I've never seen it fail. Whichever camp you're in, though, put it in your linter; don't leave it to chance. React code is full of array destructuring, so that particular hazard is prone to bite you if you ignore it (though it's still a little contrived if your usual style avoids mutable variables).
I can't think of any typical case where you're destructuring arrays in React without const/let.
The only time you start a line with a delimiter in JS that I can think of is a rare case like `;[1,2,3].forEach(...)` which also isn't something you do in React.
While I still hate semis, these days my approach to formatting is just `echo {} > .prettierrc` and using the defaults. It's a nice balance where I never write a semicolon myself, and I never have to dick around with config.
The rules are not that many; you can omit semicolons everywhere except 1. before a line starting with an open square bracket, and 2. before a line starting with an open parenthesis.
That's it, those are the only 2 edge cases.
No, there are quite a lot of other edge cases. E.g. you also need them before backticks and in many places in class bodies.
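The backtick case bites with tagged templates, for example (a contrived sketch, assuming a no-semicolon style):

```javascript
const shout = (strings) => strings[0].toUpperCase()
const result = shout
`hello`
// Without a semicolon after `shout`, ASI does not kick in and this
// parses as the tagged template call shout`hello`, not two statements.
console.log(result) // "HELLO"
```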
Instead of learning a rule and then memorizing exceptions to it, you could just learn a rule with no exceptions.
I don't need to memorize anything; the 2 rules are inferable from how JS works, and you will get TypeScript (not even a linter) flagging lines starting with those brackets anyway.
And yet, per the spec, new syntax features are allowed to break ASI:
> As new syntactic features are added to ECMAScript, additional grammar productions could be added that cause lines relying on automatic semicolon insertion preceding them to change grammar productions when parsed.
So really, the rules are “there are currently 2 exceptions and an infinite number allowed to be added at any time”. To me, that’s worth letting prettier auto-insert semicolons when I hit save.
I'd like to understand why `document.all` is slower than `getElementById`. Couldn't any even somewhat decent optimizing compiler trivially compile the former to the latter? I don't mean in weird cases like `const all = document.all; return all[v]`, or iterating over it, just the general case where someone directly does `document.all.foo` or `document.all[v]`, i.e. the 99.99% case. When faced with the choice of compiling those accesses to getElementById calls, or patching the ECMAScript standard to put IE-compat workarounds in there, it seems kind of nuts to me that they would choose the latter, so I bet there's a good reason that I'm missing.
There was a time where there weren't optimizing compilers in JS engines, at least not anywhere near the level of sophistication they are at today.
In V8, not too long ago, any function over 400 characters, including comments, would bail out of optimization. We had lint rules to disallow these "long" functions.
Regarding that choice: Given that this is really a different library (the DOM and its individual browser implementation), it's probably quite sane to just define a certain object to evaluate as falsy, as compared to any attempts to check for a certain implementation in this external library for any call.
(Even more so, since any access using `document.all` retrieves an object from a live collection, while the other access method is a function call, which is a totally different thing.)
This was like 2004. Chrome, Safari, and Firefox all had getElementById in their first versions in like 2003, Opera had it in version 7, Internet Explorer was the odd one out.
This was IE6 days, the real bad old days. Remember that we were still mostly constrained to XMLHTTPRequests for calls to APIs
Anything actually important to be done in a web browser didn't use javascript, it used an ActiveX Component/extension, a java applet, or Flash or Shockwave (by Macromedia at the time!)
I don't mind '0' == 0 when it's used for scripts and dumb stuff. That's literally how shellscript works, and I love shellscript, so I can't complain about that.
But I would never use shellscript to build an entire business's user interface with a giant shellscript framework. That would be insane. A language that was designed as a throwaway scripting thing for doing some miscellaneous tasks, and never designed for full application purposes? No sane person would use that for a business's core product.
Right?
These are the well-known bad parts every JS learner is taught about from numerous sources. I guess it's OK to repeat the basics, but the linked article looks more like a blog-spam rehash.
I've a few opinions on the content, but I'm most interested in which unicode analyzer tool generated that nifty text tree diagram.
I can deal with JS's warts because the tooling is so much better. The other language I make my living with is PHP, and while it's much improved from before, the global function namespace is still a garbage fire, and you need extra linters like phpstan-safe-rule to force you to use sane wrappers.
For backend work, I'd recommend giving C# a look. Syntactically similar to TypeScript[0] but for any serious work, having runtime types and some of the facilities of .NET (e.g. Roslyn, expression trees, Entity Framework Core, threads, etc.) is really nice.
I recently moved from a C# backend startup to a TS backend startup and the lack of runtime types is a real friction point I feel every day with how the code has to be written (e.g. end up writing a lot of Zod; other projects have different and varying types of drudgery with regards to meta-typing for runtime)
[0] https://typescript-is-like-csharp.chrlschn.dev/
I'm thinking Kotlin for my next backend project. Or maybe Unison ;)
I prefer Java because I know it better, but Kotlin is nice. I'd prefer it over .NET, which is still kind of messy unless you're doing some quite specific things where the multi platform efforts have actually succeeded. F# is fine for small tools and some CLI stuff, but big frameworky things tend to be a mess or MICROS~1 specific.
Some people are likely to claim it's not the case anymore and so on, but it's my recent experience on Debian. The JVM environment has its warts and Maven sometimes feels like a huge dumb golem but at least it doesn't come across as whiny or make you feel like a second class citizen.
What are the things where multi-platform efforts did not succeed?
Both JS and PHP are rather footgun-rich languages; have you tried Python, Java, Kotlin, or C#?
> have you tried Python, Java, Kotlin, or C#?
Yes, plenty of experience in the first two, but the last two are better contenders. At any rate I work primarily with TS when I do front-end code, it's very rare for me to write raw JS, so a lot of footguns automatically go away.
These weird cases are not because JavaScript is bad, but because it has to be backwards compatible. So where other languages can just delete old quirks from the language, JavaScript has to keep them in.
Moreover, it was kind of a standard for any scripting language at that time. In other words, this was generally expected behavior. (E.g., compare AWK, Perl, etc.)
The article didn't even mention it. If you actually read it, it's more about things that are not necessarily too annoying from a programmer's perspective, but very much are for anyone working on the platform.
> The article didn't even mention it.
Literal first thing in the article - it's a block quote right under the title.
> In any programming language, when you capture values with a lambda/arrow function
It seems like just a few years ago that few programmers knew what these concepts are: mostly just the few that were exposed to Lisp or Scheme in college.
Now it's in "any language" and we have to be exposed to incorrect mansplaining about it from a C++ point of view.
> there are two ways to pass variables: By value (copy) or by reference (passing a pointer). Some languages, like C++, let you pick:
Lexical capture isn't "pass".
What this person doesn't understand is that C++ lambdas are fake lambdas, which do not implement environment capture. C++ lambdas will not furnish you with a correct understanding of lambdas.
(Furthermore, no language should ever imitate what C++ has done to lambdas.)
Capture isn't the passage of arguments to parameters, which can be call by value or reference, etc. Capture is a retention of the environment of bindings, itself.
The issue here is simply that
1. The Javascript lambda is correctly implementing lexical capture.
2. The Javascript loop is not creating a fresh binding for the loop variable in each iteration. It binds one variable, and mutates its value.
Mutating the value is the correct thing to do for a loop construct which lets the program separately express initialization, guard testing and increment. The semantics of such a loop requires that the next iteration's guard have access to the previous iteration's value. We can still have hacks under the hood so that a lexical closure will capture a different variable in each iteration, but it's not worth it, and the program can do that itself. Javascript is doing the right thing here, and it cannot just be fixed. In any case, vast numbers of programs depend on the variable i being a single instance that is created and initialized once and then survives from one iteration to the next.
Now lambdas can in fact be implemented by copy. Compilers for languages with lambda can take various strategies for representing the environment and how it is handled under capture. One possible mechanism is conversion to a flattened environment vector, whereby every new lambda gets a new copy of such a vector.
The entire nested lexical scope group becomes one object in which every variable has a fixed offset that the compiled code can refer to. You then have to treat individual variables in that flat environment according to whether any given variable is shared, mutated or both.
The worst case is when multiple closures capture the same variable (it is shared) and the variable is mutated such that one closure changes it and another one must see the change. This is the situation with the loop index i variable in the JS loop. This means that under a flat, copied environment strategy, the variable will have to be implemented as a reference cell in the environment vector. Variables which are not mutated can just be values in the vector. Variables which are mutated, but not shared among closures, likewise.
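The reference-cell strategy can be sketched directly in JavaScript; the cell object here is purely an illustration, not anything an engine literally exposes:

```javascript
// A shared, mutated variable gets boxed into a cell so that every
// closure observes writes made through any other closure:
const cell = { value: 0 };            // boxed "variable"
const read = () => cell.value;        // closure A reads through the cell
const bump = () => { cell.value++; }; // closure B writes through it
bump();
bump();
console.log(read()); // 2: A sees B's mutations
```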
This is all under the hood though; there are no programmer-visible annotations for indicating how to treat each captured variable. It always looks as if the entire environment at the point of capture is being taken by reference. The compiler generates reference semantics for those variables which need it.
At the implementation level, with particular strategies for handling environments under lambda, we can think about capturing references or value copies. C++ lambdas imitate this sort of implementation-level thinking and define the language construct around it, in a way that avoids the master concept of capturing the environment.
Js trivia, no need to know this stuff.
My favorite one ever:
017 == '17' // false
018 == '18' // true
(017 is a legacy octal literal, i.e. decimal 15, so it isn't equal to '17'; 8 isn't a valid octal digit, so 018 falls back to decimal 18.)
> but every JS setup these days contains a linter that yells at you for code like that.
Yes, it can find some cases, but in general would require solving the halting problem.
I can pretty confidently say that every article I have seen in the past 5 years that complains about the weirdness of JavaScript is about things that you would never do in a modern, production-level codebase. Many of these issues are about pre-ES6, outdated practice (e.g. eval, or using the == operator) that is almost certainly going to be flagged by a linter. Very occasionally you do get hit by some weirdness, but it likely doesn't take more than a few minutes to figure out. And if you write in TypeScript, like almost every serious new project created these days, most of these questions don't exist at all.
Which is why I don't bother reading these posts any more.
If you actually opened the article, I think you would find it interesting. I agree that a well-configured eslint catches 99% of the weirdness, but this article tries to be about the remaining 1%.
Yes, and the first one is eval, something that one should not use outside extraordinary situations.
If the language needs a linter to keep otherwise-competent developers from introducing potentially maddening bugs into the codebase, it's weird.
JS was developed in a hurry and has been extended into doing things it was never meant to do, and it shows.
> If the language needs a linter to keep otherwise-competent developers from introducing potentially maddening bugs into the codebase, it's weird.
This is an odd way to phrase things. A better way to look at this is that there are programming tasks which can be easily automated and tasks which cannot be easily automated. The ones which can be easily automated should be, so that humans can focus on the ones which cannot be easily automated. Do you not run automated tests, since those same tests could be done over and over by hand?
That's not really a useful way to think of it.
Javascript could not have been released perfect for all current and future uses. Thus it either needed to change or fall out of use.
It changed.
The linting is simply part of a migration mechanism that allows for the change to happen. Compatibility is maintained for existing code, while on-going and future development can avoid the bad/obsolete parts. It's not random that eslint has such fine-grained configurability -- it allows each code base to migrate at whatever rate makes sense for it.
So all this is simply a product of javascript's longevity, which itself is a testament to its utility and flexibility. You call it "weird" but it would be a damn shame if it wasn't as useful and flexible as it is.
If weird means out of norm, that's not the case. Every language either has a linter or would benefit from a linter because all languages have warts or idiosyncratic behavior.
Some languages are so error-prone and hard to use they even ship with static analysis and type-checking built in, like Rust and C! (And they still have linters on top of that!)
Perhaps post a language you think is exceptional to this?
Every language could benefit from a linter.
Given how accessible and widespread it is, it's hard not to make mistakes in JS without a linter.
Can you shoot your foot off with C or Rust? Sure. But they're systems programming languages for the most part, and aren't realistically proposed as tools in the "move fast and break things" world of web development where JS rules supreme. Ruby and Python are also used in web dev, albeit on the back-end, and they're not as idiosyncratic as JS.
You should have probably read the post.
I did.
Maybe it is outdated practice, but I still use the == operator for null-or-undefined checks.
yes `== null` is quite convenient for the null or undefined check.
In general, the whole `==` versus `===` is a silly argument in a typescript codebase, because if you know the types of the arguments, `==` behaves predictably.
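Concretely, `x == null` is true exactly when x is null or undefined, and for nothing else; no other falsy value sneaks through:

```javascript
console.log(null == null);      // true
console.log(undefined == null); // true
console.log(0 == null);         // false
console.log('' == null);        // false
console.log(false == null);     // false
console.log(NaN == null);       // false
```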