Draft

Printing Objects and Protocols in Clojure

Clojure’s default printing of objects is noisy. The print-method for Object delegates to clojure.core/print-object:

(defmethod print-method Object [x ^java.io.Writer w]
  (#'clojure.core/print-object x w))
#object [MultiFn]
(Object.)
#object [Object]

The syntax is #object[CLASS-NAME HASH "toString()"], and the toString of a plain Object is CLASS-NAME@HASH, so the class name and hash appear twice. This can get pretty ugly:

(async/chan)
#object [ManyToManyChannel]
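
For reference, the three parts of the default format can be rebuilt by hand. This is a minimal sketch, not Clojure’s actual implementation; `default-object-parts` and `parts` are hypothetical names introduced only for illustration.

```clojure
;; Hypothetical helper: collects the three pieces that appear in the
;; default #object[CLASS-NAME HASH "toString()"] form.
(defn default-object-parts [x]
  {:class-name (.getName (class x))
   ;; identity hash rendered as hex with a 0x prefix
   :hash       (format "0x%x" (System/identityHashCode x))
   :to-string  (str x)})

(def parts (default-object-parts (Object.)))

;; For a plain Object, toString is CLASS-NAME@HASH
;; (the same hash, without the 0x prefix).
(= (:to-string parts)
   (str (:class-name parts) "@" (subs (:hash parts) 2)))
```

Classes that override toString put whatever they like in the third slot, which is why the default output varies so much from type to type.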

clojure-plus provides print-methods that improve the printing of many things.

(comment
  (require 'clojure+.print)
  (clojure+.print/install-printers!))

Once activated, functions, atoms, namespaces, and more print sensibly. clojure-plus adds printers for many types, but none for Object, which keeps Clojure’s default print-method. That leaves plenty of objects that still print messily.

It’s not hard to provide an Object print-method:

(defmethod print-method Object [x ^java.io.Writer w]
  (.write w "#object [")
  (.write w (.getName (class x)))
  (.write w "]"))
#object [MultiFn]
(async/chan)
#object [ManyToManyChannel]

Much nicer! In my opinion this is a big improvement, especially in the world of notebooks where we like to show things as we go, but also for keeping a tidy REPL or looking into data that contains objects.

asynctopolis/flow
#object [create-flow$reify]

Hmmmm, not so nice. We’ll dig into this further below. But we also need to be aware that Clojure munges its names to make them valid Java names. This matters for some things:

(-> ((fn %% [] (fn %%% [])))
    (class)
    (.getName))
"clojure_PLUS_.print.objects_and_protocols$eval112755$_PERCENT__PERCENT___112756$_PERCENT__PERCENT__PERCENT___112757"

Whoa, that’s pretty gross. We’d prefer to demunge the names at least.
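
Munging and demunging can also be seen in isolation, using only clojure.core/munge and clojure.lang.Compiler/demunge. The sample names below are made up for illustration.

```clojure
;; munge rewrites characters that are not legal in JVM names...
(munge "%%")           ;=> "_PERCENT__PERCENT_"
(munge "my-ns.core")   ;=> "my_ns.core"

;; ...and demunge reverses the mapping, also turning `$` into `/`.
(clojure.lang.Compiler/demunge "my_ns.core$is_ok_QMARK_")
;=> "my-ns.core/is-ok?"

;; Caveat: every underscore comes back as a hyphen, so a name that really
;; contains an underscore does not survive the round trip.
(clojure.lang.Compiler/demunge (munge "snake_case"))
;=> "snake-case"
```

So a demunged name is a readable approximation of the original symbol rather than a guaranteed exact copy.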

(defn class-name
  [x]
  (-> x class .getName Compiler/demunge))
(-> ((fn %% [] (fn %%% [])))
    (class-name))
"clojure+.print.objects-and-protocols/eval112762/%%--112763/%%%--112764"

Notice the /evalNNNNN/ part? To create a function, Clojure creates a new class, and the NNNNN comes from a counter that advances with every evaluation. This is useful in the sense that it identifies the class for that evaluation, but we almost never care about that detail (more on that later). Our strangely named functions have --NNNNN appended for the same reason: they are sub-evaluations of the top-level evaluation.
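
A small sketch of that behaviour, assuming nothing beyond clojure.core: evaluating the same form twice produces two differently named classes. `fresh-class-name`, `first-name`, `second-name`, and the function name `probe` are hypothetical names for this illustration.

```clojure
;; Hypothetical helper: evaluates the same form afresh and returns the
;; (munged) name of the class generated for the resulting function.
(defn fresh-class-name []
  (.getName (class (eval '(fn probe [])))))

(def first-name (fresh-class-name))
(def second-name (fresh-class-name))

;; Same source form, two evaluations, two distinct class names.
(not= first-name second-name)

;; Each name carries an eval counter and the function's own name with a
;; numeric suffix (the suffix shows up as --NNNNN once demunged).
(re-find #"eval\d+" first-name)
(re-find #"probe__\d+" first-name)
```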

Let’s do away with that noise for the moment:

(defn remove-extraneous
  "Clojure compiles with unique names that include things like `/eval32352/` and `--4321`.
  These are rarely useful when printing a function.
  They can still be accessed via (class x) or similar."
  [s]
  (-> s
      (str/replace #"/eval\d+/" "/")
      (str/replace #"--\d+(/|$)" "$1")))
(-> ((fn %% [] (fn %%% [])))
    (class-name)
    (remove-extraneous))
"clojure+.print.objects-and-protocols/%%/%%%"

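Looking better. I can actually see the (strange) names of the functions. One more step drops the namespace and joins the nested function names with $; a plain Java class name, which contains no /, is shortened to its last segment: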

(defn format-class-name ^String [s]
  (let [[ns-str & names] (-> (remove-extraneous s)
                             (str/split #"/"))]
    (if (and ns-str names)
      (str/join "$" names)
      (-> s (str/split #"\.") (last)))))
(-> (((fn aaa [] (fn bbb [] (fn ccc [])))))
    (class-name)
    (format-class-name))
"aaa$bbb$ccc"

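The two branches are easier to see on literal strings. The following is a simplified restatement for illustration only: `short-name` is a hypothetical name, it skips the eval and counter stripping, and the input strings are made up.

```clojure
(require '[clojure.string :as str])

;; Hypothetical, simplified restatement of the formatting step.
(defn short-name [demunged]
  (let [[ns-str & names] (str/split demunged #"/")]
    (if (and ns-str names)
      ;; Clojure function: drop the namespace, join nested names with $
      (str/join "$" names)
      ;; plain Java class: keep only the last dotted segment
      (last (str/split demunged #"\.")))))

(short-name "my.ns/aaa/bbb")        ;=> "aaa$bbb"
(short-name "clojure.lang.MultiFn") ;=> "MultiFn"
```
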
Let’s hook this up to the print-method for Object:

(defmethod print-method Object [x ^java.io.Writer w]
  (.write w "#object [")
  (.write w (-> (class-name x) (format-class-name)))
  (.write w "]"))
#object [MultiFn]
*ns*
#object [Namespace]
(((fn aaa [] (fn bbb [] (fn ccc [])))))
#object [aaa$bbb$ccc]
asynctopolis/flow
#object [create-flow$reify]

What is this? It’s a reified object that implements protocols, as the $reify part at the end shows. The description is not terrible: at least we know where it was made, which hints that it must be a flow. Can we do better?

AFAIK the only way to check what protocols an object satisfies is to call satisfies? for every possible protocol:

(defn all-protocol-vars [x]
  (->> (all-ns)
       (mapcat ns-publics)
       (vals)
       (keep #(-> % meta :protocol))
       (distinct)
       (filter #(satisfies? @% x))))
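
The `:protocol` lookup works because defprotocol attaches the protocol’s var to the metadata of each method var. A self-contained sketch of that mechanism; the protocol `Greeter`, its method `greet`, and the value `greeter` are hypothetical and exist only for this example.

```clojure
;; Hypothetical protocol used only for this illustration.
(defprotocol Greeter
  (greet [this]))

(def greeter
  (reify Greeter
    (greet [_] "hello")))

;; Each protocol method var points back at its protocol var...
(-> #'greet meta :protocol)     ; the var #'Greeter

;; ...and dereferencing that var gives the protocol map satisfies? expects.
(satisfies? Greeter greeter)    ;=> true
(satisfies? Greeter (Object.))  ;=> false
```

One consequence of discovering protocols through their method vars is that a protocol declaring no methods would not be found this way.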

On the one hand, calling satisfies? against every loaded protocol is concerning for performance. On the other hand, at my REPL I don’t care about that; it’s faster than I can notice. Leaving aside those concerns, it returns quite a long list…

(all-protocol-vars asynctopolis/flow)
(#'charred.api/PToJSON
 #'emmy.function/IArity
 #'emmy.value/Numerical
 #'emmy.value/Value
 #'emmy.ratio/IRational
 #'clojure.core.match.protocols/ISyntaxTag
 #'emmy.differential/IPerturbed
 #'taoensso.nippy/IFreezable
 #'taoensso.nippy/IFreezableWithMeta
 #'clojure.core.reducers/CollFold
 #'clojure.core.async.flow.impl.graph/Graph
 #'hiccup.compiler/HtmlRenderer
 #'sci.impl.vars/DynVar
 #'clojure.test.check.results/Result
 #'clj-yaml.core/YAMLCodec
 #'fastmath.protocols/DistributionIdProto
 #'malli.core/RegexSchema
 #'malli.core/DistributiveSchema
 #'malli.core/FunctionSchema
 #'clojure.tools.reader.reader-types/PushbackReaderCoercer
 #'clojure.tools.reader.reader-types/ReaderCoercer
 #'clojure.java.io/IOFactory
 #'tech.v3.io.protocols/IOProvider
 #'tech.v3.io.protocols/ICopyObject
 #'clojure.data.json/JSONWriter
 #'hiccup.util/ToString
 #'hiccup.util/URLEncode
 #'com.rpl.specter.navs/FastEmpty
 #'com.rpl.specter.navs/AddExtremes
 #'com.rpl.specter.navs/MapTransformProtocol
 #'com.rpl.specter.navs/UpdateExtremes
 #'com.rpl.specter.navs/GetExtremes
 #'com.rpl.specter.navs/AllTransformProtocol
 #'tech.v3.dataset.protocols/PMissing
 #'tech.v3.dataset.protocols/PReducerCombiner
 #'tech.v3.dataset.protocols/PDataset
 #'tech.v3.dataset.protocols/PColumnName
 #'tech.v3.dataset.protocols/PColumn
 #'tech.v3.dataset.protocols/PColumnCount
 #'tech.v3.dataset.protocols/PDatasetReducer
 #'tech.v3.dataset.protocols/PRowCount
 #'clojure.spec.alpha/Specize
 #'ham-fisted.protocols/Reduction
 #'ham-fisted.protocols/BulkSetOps
 #'ham-fisted.protocols/SetOps
 #'ham-fisted.protocols/BitSet
 #'ham-fisted.protocols/ToIterable
 #'ham-fisted.protocols/Finalize
 #'ham-fisted.protocols/ParallelReduction
 #'ham-fisted.protocols/ToCollection
 #'ham-fisted.protocols/ParallelReducer
 #'tech.v3.datatype.index-algebra/PIndexAlgebra
 #'sci.impl.types/Eval
 #'sci.impl.types/Stack
 #'clojure.core.protocols/Navigable
 #'clojure.core.protocols/Datafiable
 #'clojure.core.protocols/CollReduce
 #'clojure.core.protocols/InternalReduce
 #'clojure.core.protocols/IKVReduce
 #'nrepl.bencode/BencodeSerializable
 #'tech.v3.datatype.protocols/PToReader
 #'tech.v3.datatype.protocols/POperator
 #'tech.v3.datatype.protocols/PCopyRawData
 #'tech.v3.datatype.protocols/PToNativeBuffer
 #'tech.v3.datatype.protocols/PSetConstant
 #'tech.v3.datatype.protocols/PToBuffer
 #'tech.v3.datatype.protocols/PTensor
 #'tech.v3.datatype.protocols/PSubBuffer
 #'tech.v3.datatype.protocols/POperationalElemwiseDatatype
 #'tech.v3.datatype.protocols/PConstantTimeMinMax
 #'tech.v3.datatype.protocols/PToWriter
 #'tech.v3.datatype.protocols/PElemwiseReaderCast
 #'tech.v3.datatype.protocols/PToArrayBuffer
 #'tech.v3.datatype.protocols/PDatatype
 #'tech.v3.datatype.protocols/PECount
 #'tech.v3.datatype.protocols/PToBitmap
 #'tech.v3.datatype.protocols/PRangeConvertible
 #'tech.v3.datatype.protocols/PElemwiseCast
 #'tech.v3.datatype.protocols/PClone
 #'tech.v3.datatype.protocols/PApplyUnary
 #'tech.v3.datatype.protocols/PElemwiseDatatype
 #'tech.v3.datatype.protocols/PShape
 #'tech.v3.datatype.protocols/PToTensor
 #'malli.generator/Generator
 #'com.rpl.specter.impl/PathComposer
 #'com.rpl.specter.impl/CoercePath
 #'fipp.ednize/IEdn)

But notice that one of them, #'clojure.core.async.flow.impl.graph/Graph, feels like the one we care about most. Furthermore, its namespace shares a prefix with the class name. Let’s try matching by the namespace…

(defn var-ns-name [v]
  (-> (meta v) (:ns) (ns-name)))
(defn ns-match? [p x]
  ;; ns-name returns a symbol, so coerce it to a string before comparing
  (-> (str (var-ns-name p))
      (str/starts-with? (.getPackageName (class x)))))
(defn protocol-ns-matches [x]
  (filter #(ns-match? % x) (all-protocol-vars x)))
(protocol-ns-matches asynctopolis/flow)
(#'clojure.core.async.flow.impl.graph/Graph)
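
Two details of this match are worth keeping in mind. `.getPackageName` (available from Java 9) returns the class’s package, which for a Clojure-generated class is the munged namespace minus its last segment, and package names are munged while namespace names are not. A hedged sketch follows; `ns-prefix-match?` is a hypothetical variant that munges the namespace name before comparing, and `my-app.core` is a made-up namespace.

```clojure
(require '[clojure.string :as str])

;; The class behind clojure.core/inc lives in package "clojure",
;; one segment shorter than its namespace.
(.getName (class inc))         ;=> "clojure.core$inc"
(.getPackageName (class inc))  ;=> "clojure"

;; Hypothetical variant: munge the namespace name so hyphens become
;; underscores, matching how the package name is spelled.
(defn ns-prefix-match? [ns-sym package-name]
  (str/starts-with? (munge (str ns-sym)) package-name))

(ns-prefix-match? 'clojure.core (.getPackageName (class inc))) ;=> true
(ns-prefix-match? 'my-app.core "my_app")                       ;=> true
(str/starts-with? "my-app.core" "my_app")                      ;=> false
```

The prefix test is deliberately loose: it is a heuristic for "defined near this class", not proof that the protocol is the one the object was reified for.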

Nice. In my opinion this is more representative of the object. The #' out front is unnecessary and can be removed…

(defn var-sym [v]
  (let [m (meta v)]
    (symbol (str (ns-name (:ns m))) (str (:name m)))))
(defn protocol-ns-match-names [x]
  (->> (protocol-ns-matches x)
       (map var-sym)))
(protocol-ns-match-names asynctopolis/flow)
(clojure.core.async.flow.impl.graph/Graph)

The other protocol of interest is Datafiable, because it indicates I can get a data representation if I would like to.

(datafy/datafy asynctopolis/flow)
{:procs
 {:Randomius
  {:args {:min 0, :max 12, :wait 500},
   :proc
   {:step core.async.flow.example.asynctopolis/Randomius,
    :desc
    {:params
     {:min "Min value to generate",
      :max "Max value to generate",
      :wait "Time in ms to wait between generating"},
     :outs {:out "Output channel for stats"}}}},
  :Tallystrix
  {:args {:min 1, :max 10},
   :proc
   {:step core.async.flow.example.asynctopolis/Tallystrix,
    :desc
    {:params
     {:min "Min value, alert if lower",
      :max "Max value, alert if higher"},
     :ins
     {:stat "Channel to receive stat values",
      :poke
      "Channel to poke when it is time to report a window of data to the log"},
     :outs
     {:alert
      "Notify of value out of range {:val value, :error :high|:low"},
     :workload :compute}}},
  :Chronon
  {:args {:wait 3000},
   :proc
   {:step core.async.flow.example.asynctopolis/Chronon,
    :desc
    {:params {:wait "Time to wait between pokes"},
     :outs
     {:out "Poke channel, will send true when the alarm goes off"}}}},
  :Claxxus
  {:args {:prefix "Alert: "},
   :proc
   {:step core.async.flow.example.asynctopolis/Claxxus,
    :desc
    {:params {:prefix "Log message prefix"},
     :ins {:in "Channel to receive messages"}}},
   :chan-opts
   {:in {:buf-or-n {:type SlidingBuffer, :count 0, :capacity 3}}}}},
 :conns
 [[[:Randomius :out] [:Tallystrix :stat]]
  [[:Chronon :out] [:Tallystrix :poke]]
  [[:Tallystrix :alert] [:Claxxus :in]]],
 :execs {:mixed nil, :io nil, :compute nil},
 :chans {}}

I think this one is so helpful that it should always be shown on objects, regardless of their type or other protocols, as a hint that it is possible to get more information. I wouldn’t want to print objects as data by default, because it would be too spammy. And checking a single protocol like Datafiable is much less of a performance concern.

(satisfies? clojure.core.protocols/Datafiable asynctopolis/flow)
true

But there is a big problem… everything is Datafiable…

(satisfies? clojure.core.protocols/Datafiable (Object.))
true
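
The reason is that clojure.core.protocols extends Datafiable to Object (and nil) with a default that returns its argument unchanged. A small sketch of the consequence, using only clojure.datafy; the helper name `datafy-changes?` is hypothetical, and because it has to call datafy it is not a cheap up-front check.

```clojure
(require '[clojure.datafy :as datafy])

;; Hypothetical helper: true when datafy returns something other than x itself.
(defn datafy-changes? [x]
  (not (identical? x (datafy/datafy x))))

;; A plain Object falls through to the default and comes back as-is.
(datafy-changes? (Object.)) ;=> false

;; clojure.datafy supplies a real implementation for reference types.
(datafy-changes? (atom 1))  ;=> true
```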

So satisfies? gives us no way to know whether datafy/datafy will do anything useful or not. Sad. But we can still improve the print-method to show protocols, bearing in mind that it is a performance concern.
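
One possible shape for that, as a sketch under stated assumptions rather than the printer used here: `object-label` is a hypothetical helper that builds the label as a string instead of installing a print-method, checks only the protocol vars it is handed, and uses a made-up protocol `Sketchable` for the demonstration.

```clojure
(require '[clojure.string :as str])

;; Hypothetical protocol used only for this illustration.
(defprotocol Sketchable
  (sketch [this]))

;; Hypothetical helper: the caller supplies the candidate protocol vars,
;; which keeps the cost proportional to that list.
(defn object-label [x protocol-vars]
  (let [simple-name (-> x class .getName (str/split #"\.") last)
        protocols   (->> protocol-vars
                         (filter #(satisfies? @% x))
                         (map #(-> % meta :name)))]
    (str "#object [" (str/join " " (cons simple-name protocols)) "]")))

(object-label (Object.) [#'Sketchable])
;=> "#object [Object]"

;; For a reified object the class part is a generated reify name,
;; followed by the protocol it satisfies.
(object-label (reify Sketchable (sketch [_] nil)) [#'Sketchable])
```

Passing the candidates in explicitly sidesteps the scan over every loaded namespace, at the price of having to decide up front which protocols are interesting.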

Showing the reified protocol isn’t a big improvement, and it is probably not worth the performance cost or including in clojure-plus.

Even if we don’t care to improve reify printing (due to performance), I think the Object printer should still be improved to align with the other printers.

Are we giving up anything? Remember we removed the unique identifiers like /evalNNNNN/. When would those be useful? Hold onto your hats! We are about to try to find a class by its name:

(defn find-class [class-name]
  (try
    (Class/forName class-name false (clojure.lang.RT/baseLoader))
    (catch ClassNotFoundException _ nil)))
(defn ddd [x] (inc x))
(type (find-class (-> ddd (class) (.getName))))
java.lang.Class
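
A slightly fuller round trip, restating find-class so the block stands alone; `sample-fn` and the class name "no.such.Klass" are hypothetical. The lookup goes through Clojure’s DynamicClassLoader, so it should hand back the class that was generated for the function.

```clojure
;; Restated from the text above so this block is self-contained.
(defn find-class [class-name]
  (try
    (Class/forName class-name false (clojure.lang.RT/baseLoader))
    (catch ClassNotFoundException _ nil)))

(defn sample-fn [x] (inc x))

;; Looking the generated class up by its (munged) name...
(def found (find-class (.getName (class sample-fn))))

;; ...yields a class with the same name as the function's own class.
(= (.getName (class sample-fn)) (.getName found))

;; An unknown name yields nil rather than an exception.
(find-class "no.such.Klass") ;=> nil
```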

Why would you want to do that? I don’t know, but it’s pretty cool, you have to admit. What’s also interesting is that we can get all Clojure classes (see https://danielsz.github.io/2021-05-12T13_24.html):

(defn class-cache []
  (some-> (.getDeclaredField clojure.lang.DynamicClassLoader "classCache")
          (doto (.setAccessible true))
          (.get nil)))
(key (first (class-cache)))
"emmy.numsymb$sym_COLON_one_QMARK_"

And we can find them in memory in a similar way:

(defn find-in-memory-class
  "Finds a class by name in the DynamicClassLoader's memory cache"
  [class-name]
  (let [method (.getDeclaredMethod clojure.lang.DynamicClassLoader
                                   "findInMemoryClass"
                                   (into-array Class [String]))
        _ (.setAccessible method true)]
    (.invoke method nil (into-array Object [class-name]))))

Right, but why would you want to do that? Honestly I can’t imagine a reason. All of that to say, do we really want those unique identifiers printed out? No! If we need to find them, we can always look them up another way. We don’t need them polluting our REPL output.

source: src/clojure+/print/objects_and_protocols.clj