Mastering Node.js(Second Edition)
上QQ阅读APP看书,第一时间看更新

Exploring streams

According to Bjarne Stoustrup in his book, The C++ Programming Language, (third edition):

"Designing and implementing a general input/output facility for a programming language is notoriously difficult... An I/O facility should be easy, convenient, and safe to use; efficient and flexible; and, above all, complete."

It shouldn't surprise anyone that a design team, focused on providing efficient and easy I/O, has delivered such a facility through Node. Through a symmetrical and simple interface, which handles data buffers and stream events so that the implementer does not have to, Node's Stream module is the preferred way to manage asynchronous data streams for both internal modules, and the module's developers will create.

A stream in Node is simply a sequence of bytes. At any time, a stream contains a buffer of bytes, and this buffer has a zero or greater length:

As each character in a stream is well-defined, and because every type of digital data can be expressed in bytes, any part of a stream can be redirected, or piped, to any other stream, different chunks of the stream can be sent to different handlers, and so on. In this way, stream input and output interfaces are both flexible and predictable, and can be easily coupled.

Node also offers a second type of streams: object streams. Instead of chunks of memory flowing through the stream, an object stream shuttles JavaScript objects. Byte streams pass around serialized data like streaming media, while object streams are the right choice for parsed, structured data like JSON records.

Digital streams are well described using the analogy of fluids, where inpidual bytes (drops of water) are being pushed through a pipe. In Node, streams are objects representing data flows that can be written to and read from asynchronously.

The Node philosophy is a non-blocking flow, I/O is handled via streams, and so the design of the Stream API naturally duplicates this general philosophy. In fact, there is no other way of interacting with streams except in an asynchronous, evented manner—Node prevents developers, by design, from blocking I/O.

Five distinct base classes are exposed via the abstract Stream interface: Readable, Writable, Duplex, Transform, and PassThrough. Each base class inherits from EventEmitter, which we know of as an interface to which event listeners and emitters can be bound.

As we will learn, and here will emphasize, the Stream interface is an abstract interface. An abstract interface functions as a kind of blueprint or definition, describing the features that must be built into each constructed instance of a Stream object. For example, a Readable stream implementation is required to implement a public read method which delegates to the interface's internal _read method.

In general, all stream implementations should follow these guidelines:

  • As long as data exists to send, write to a stream until that operation returns false, at which point the implementation should wait for a drain event, indicating that the buffered stream data has emptied.
  • Continue to call read until a null value is received, at which point wait for a readable event prior to resuming reads.
  • Several Node I/O modules are implemented as streams. Network sockets, file readers and writers, stdin and stdout, zlib, and so on are all streams. Similarly, when implementing a readable data source, or data reader, one should implement that interface as a Stream interface.

It is important to note that over the history of Node, the Stream interface changed in some fundamental ways. The Node team has done its best to implement compatible interfaces, so that (most) older programs will continue to function without modification. In this chapter, we will not spend any time discussing the specific features of this older API, focusing on the current design. The reader is encouraged to consult Node's online documentation for information on migrating older programs. As often happens, there are modules that wrap streams with convenient, reliable interfaces. A good one is: https://github.com/rvagg/through2.