I/O Stream Memory Overhead

Each PrintStream uses about 25 KB of memory. That might seem reasonable if we only have System.out and System.err. But what happens if we try to create millions of them? And why do they use so much memory?

A couple of weeks ago, my colleague John Green and I were experimenting with virtual threads (Project Loom). Our server would receive text messages, change their case, and echo them back. Our client simulated a large number of users. We had spun the experiment up to 100k sockets per JVM, which came to 200k virtual threads in total across the two JVMs. Both server and client components were humming along fine, but we did notice that the memory usage on the client was orders of magnitude higher. But why? The server task looked like this:

import java.io.*;
import java.net.*;

class TransmogrifyTask implements Runnable {
  private final Socket socket;

  public TransmogrifyTask(Socket socket) throws IOException {
    this.socket = socket;
  }

  public void run() {
    try (socket;
         InputStream in = socket.getInputStream();
         OutputStream out = socket.getOutputStream()
    ) {
      int val;
      while ((val = in.read()) != -1) { // -1 signals end of stream
        if (Character.isLetter(val))
          val ^= ' '; // flip the case bit of every letter
        out.write(val);
      }
    } catch (IOException e) {
      // connection closed
    }
  }
}
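
The case change in the server relies on an ASCII property: upper- and lower-case letters differ only in the bit 0x20, which happens to be the code of the space character, so XOR with ' ' toggles the case. Here is a minimal sketch of that transformation applied to the client's message. The class name CaseFlipDemo and the helper flipCase are made up for illustration and are not part of our experiment.

```java
import java.nio.charset.StandardCharsets;

public class CaseFlipDemo {
  // Applies the same per-byte transformation as TransmogrifyTask.run()
  static String flipCase(String text) {
    byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
    for (int i = 0; i < bytes.length; i++) {
      int val = bytes[i];
      if (Character.isLetter(val))
        val ^= ' '; // toggle bit 0x20
      bytes[i] = (byte) val;
    }
    return new String(bytes, StandardCharsets.US_ASCII);
  }

  public static void main(String[] args) {
    System.out.println(flipCase("John 3:16")); // prints jOHN 3:16
  }
}
```

Digits, spaces and punctuation pass through unchanged because Character.isLetter() rejects them. The trick should only be trusted for ASCII letters.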

The client-side task used PrintStream and BufferedReader for convenience when communicating with the server:

import java.io.*;
import java.net.*;
import java.util.concurrent.*;

class ClientTaskWithIOStreams implements Runnable {
  private final Socket socket;
  private final boolean verbose;

  public ClientTaskWithIOStreams(Socket socket, boolean verbose) {
    this.socket = socket;
    this.verbose = verbose;
  }

  private static final String message = "John 3:16";

  public void run() {
    try (socket;
         BufferedReader in = new BufferedReader(
             new InputStreamReader(
                 socket.getInputStream()));
         PrintStream out = new PrintStream(
             socket.getOutputStream(), true)
    ) {
      while (true) {
        out.println(message);
        TimeUnit.SECONDS.sleep(2);
        String reply = in.readLine();
        if (verbose) System.out.println(reply);
        TimeUnit.SECONDS.sleep(2);
      }
    } catch (Exception consumeAndExit) {}
  }
}

After running jmap’s histogram on both JVMs, we noticed that the biggest memory hog was the PrintStream, followed by the BufferedReader. We therefore changed the client task to send and receive individual bytes instead. Not all the clients are verbose, so we only create a StringBuilder when it is needed. Furthermore, every ClientTask starts out with the same shared static Appendable, whose append() methods return a new StringBuilder. Only verbose clients ever call it.

import java.io.*;
import java.net.*;
import java.util.concurrent.*;

class ClientTask implements Runnable {
  private final Socket socket;
  private final boolean verbose;

  public ClientTask(Socket socket, boolean verbose) {
    this.socket = socket;
    this.verbose = verbose;
  }

  private static final byte[] message = "John 3:16\n".getBytes();

  private final static Appendable INITIAL = new Appendable() {
    public Appendable append(CharSequence csq) {
      return new StringBuilder().append(csq);
    }

    public Appendable append(CharSequence csq, int start, int end) {
      return new StringBuilder().append(csq, start, end);
    }

    public Appendable append(char c) {
      return new StringBuilder().append(c);
    }
  };

  public void run() {
    Appendable appendable = INITIAL;
    try (socket;
         InputStream in = socket.getInputStream();
         OutputStream out = socket.getOutputStream()
    ) {
      while (true) {
        for (byte b : message) {
          out.write(b);
        }
        out.flush();
        TimeUnit.SECONDS.sleep(2);

        for (int i = 0; i < message.length; i++) {
          int b = in.read();
          if (verbose) {
            appendable = appendable.append((char) b);
          }
        }
        if (verbose) {
          System.out.print(appendable);
          appendable = INITIAL;
        }
        TimeUnit.SECONDS.sleep(2);
      }
    } catch (Exception consumeAndExit) {}
  }
}

This worked much better, and the memory usage on the server and the client was roughly the same. We ran our experiment a bit longer and eventually had 2 million sockets open on the server JVM, handled by 2 million virtual threads, which in turn ran on just 12 carrier threads. Our client simulation had the same number of sockets and virtual threads, for a combined total of 4 million sockets and 4 million virtual threads. The memory usage of all that came to under 3 GB per JVM. It is incredible technology, and I cannot wait until it becomes mainstream in Java.

We performed another experiment to determine how much memory each of the Input- and OutputStreams, as well as the Readers and Writers, used. The sizes below are in bytes per instance, measured on our machine. Your mileage may vary.

  • OutputStream
    • PrintStream 25064
    • BufferedOutputStream 8312
    • DataOutputStream 80
    • FileOutputStream 176
    • GZIPOutputStream 768
    • ObjectOutputStream 2264
  • InputStream
    • BufferedInputStream 8296
    • DataInputStream 328
    • FileInputStream 176
    • GZIPInputStream 1456
    • ObjectInputStream 2256
  • Writer
    • PrintWriter 80
    • BufferedWriter 16480
    • FileWriter 8608
    • OutputStreamWriter 8480
  • Reader
    • BufferedReader 16496
    • FileReader 8552
    • InputStreamReader 8424
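
One way to collect numbers like these is to ask the JVM how many bytes the current thread has allocated before and after constructing an object. The sketch below assumes a HotSpot JVM, where the platform ThreadMXBean can be cast to com.sun.management.ThreadMXBean. It is not necessarily how the figures above were obtained, and the class name StreamFootprint and the helper allocatedBytes are made up for illustration. Results will differ between JDK versions and JVM settings.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.Reader;
import java.io.StringReader;
import java.lang.management.ManagementFactory;
import java.util.function.Supplier;

public class StreamFootprint {
  // Keeps the measured object reachable so the allocation is not optimized away
  static volatile Object sink;

  // Returns the bytes allocated by the current thread while running factory.get()
  static long allocatedBytes(Supplier<?> factory) {
    com.sun.management.ThreadMXBean bean =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
    long id = Thread.currentThread().getId();
    sink = factory.get(); // warm-up, so class loading is not counted
    long before = bean.getThreadAllocatedBytes(id);
    sink = factory.get();
    return bean.getThreadAllocatedBytes(id) - before;
  }

  public static void main(String[] args) {
    // Created up front so that only the wrapper itself is measured
    Reader reader = new StringReader("");
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    System.out.println("char[8192]     "
        + allocatedBytes(() -> new char[8192]));
    System.out.println("BufferedReader "
        + allocatedBytes(() -> new BufferedReader(reader)));
    System.out.println("PrintStream    "
        + allocatedBytes(() -> new PrintStream(bytes)));
  }
}
```

The count includes everything the constructor allocates, such as internal buffers and nested writers, which is exactly what matters when we multiply by millions of instances.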

As convenient as virtual threads are, we will need to change our coding practices. Who would have imagined that one day we would be able to create millions of threads in our JVMs? Even the Phaser has a maximum limit of 65535 parties. It is possible to compose Phasers, but I can imagine the inventors thinking that no one would ever have more than 64k threads. The ForkJoinPool has a similar limitation on the maximum length of its work queues. These numbers are reasonable when we have thousands of threads, but not so much when we have millions.
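
The Phaser limit is easy to see for ourselves. In the following sketch, a single Phaser accepts 65535 parties but rejects 65536 with an IllegalArgumentException, whereas composing several child Phasers under one root lets the total go beyond that. The class name PhaserLimitDemo and its two helpers are made up for illustration.

```java
import java.util.concurrent.Phaser;

public class PhaserLimitDemo {
  static final int MAX_PARTIES = 65535;

  // Returns true if a single Phaser can be created with this many parties
  static boolean accepts(int parties) {
    try {
      new Phaser(parties);
      return true;
    } catch (IllegalArgumentException tooManyParties) {
      return false;
    }
  }

  // Composes child Phasers under one root and returns the
  // total number of parties registered on the children
  static int tieredParties(int children, int partiesPerChild) {
    Phaser root = new Phaser();
    int total = 0;
    for (int i = 0; i < children; i++) {
      Phaser child = new Phaser(root, partiesPerChild);
      total += child.getRegisteredParties();
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(accepts(MAX_PARTIES));           // true
    System.out.println(accepts(MAX_PARTIES + 1));       // false
    System.out.println(tieredParties(3, MAX_PARTIES));  // 196605
  }
}
```

Each child counts as only one party on the root, so a tree of Phasers can coordinate far more threads than a single Phaser can. For millions of virtual threads we would need such a tree.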

Kind regards from a wobbly Crete

Heinz

P.S. I have not answered the obvious question of why these objects use so much memory. It is mostly empty space in the form of buffers. For example, the BufferedReader has a char[] with 8192 elements. Since each char is two bytes, this comes to 16 KB. The PrintStream contains an OutputStreamWriter (8 KB) and a BufferedWriter (16 KB), resulting in its roughly 25 KB. Just lots and lots of empty nothingness.
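
The arithmetic can be checked against the measured sizes in the list above. This is only a back-of-the-envelope sketch: the class name BufferMath and its helpers are made up for illustration, and the measured figures are simply copied from our results.

```java
public class BufferMath {
  // Payload of a char[] in bytes, ignoring the array header
  static long charBufferBytes(int length) {
    return (long) length * Character.BYTES;
  }

  // Sum of the measured OutputStreamWriter and BufferedWriter sizes
  static long printStreamParts() {
    return 8480 + 16480;
  }

  public static void main(String[] args) {
    long readerBuffer = charBufferBytes(8192);
    System.out.println(readerBuffer);              // 16384
    System.out.println(16496 - readerBuffer);      // 112 bytes for everything else
    System.out.println(printStreamParts());        // 24960
    System.out.println(25064 - printStreamParts()); // 104 bytes for everything else
  }
}
```

In both cases the buffers account for all but about a hundred bytes of the measured size.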
