Why map None to empty string in JSON codec?

During a discussion about alternative ways of representing empty results on single-valued endpoints, we realized that TapirJsonCirce (via Codec#json() and Codec#anyString()) explicitly overrides the Circe default of encoding Scala None as JSON null and produces the empty string in the response body instead.

I have found #3043, but it only covers decoding, and it will still accept null as an alternative to the empty string, so I’d count this as “being liberal in what you accept”. But why produce invalid JSON from a JSON codec? :thinking:

Of course we can change this behavior by tinkering with codec implicits, if need be. I’m just curious what the rationale behind this choice is…?

I think you’re right that this behavior is incorrect, if the response code is 200 (I just verified it). I would expect tapir to at least set code 204 in the response if it’s None. Should we create an issue, or maybe there are some additional considerations I didn’t take into account, @adamw?

This happens for optional bodies only - meaning that there might be no body. So the logic is that if the body is optional, and we get a None (we might of course have other types expressing optionality which aren’t covered by this check, but those would need special handling), we return an empty body.

So if you have a whateverBody[Option[X]] and the result is None, then there’s simply no body, regardless of the underlying logic of the whateverBody codec.
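To make the described logic concrete, here is a minimal sketch in plain Scala. It is only a model of the explanation above, not tapir's actual code: `encodeBody`, `schemaIsOptional` and `toyJson` are hypothetical names made up for illustration.

```scala
// Hypothetical stand-in for the encode side described above; not tapir's real implementation.
def encodeBody[T](value: T, schemaIsOptional: Boolean)(encode: T => String): String =
  if (schemaIsOptional && value == None) "" // optional body + None: no body at all
  else encode(value)                        // otherwise defer to the underlying codec

// A toy encoder for Option[Int] that, on its own, would produce "null" for None.
val toyJson: Option[Int] => String = {
  case Some(n) => n.toString
  case None    => "null"
}

val present = encodeBody(Option(42), schemaIsOptional = true)(toyJson)
val absent  = encodeBody(Option.empty[Int], schemaIsOptional = true)(toyJson)
```

In this model the underlying encoder never gets to see the None, which is why the result is the empty string rather than whatever the codec would have produced.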

I understand the problem with JSON bodies, but maybe you should then modify its schema, so that it isn’t optional? (if you always want to produce some kind of representation of the value, if it’s “empty”)

…but what (other) representation and value? :thinking: The natural encoding for “emptiness”/absence of a value on the Scala side is the Option type, None being exactly the “empty” value. The straightforward translation would be JSON null, and that’s what Circe #encodeOption() implements.
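For reference, this is what the Circe default looks like in isolation. A small sketch, assuming only circe-core is on the classpath; the value names are made up for illustration.

```scala
// Requires circe-core on the classpath.
import io.circe.Json
import io.circe.syntax._

// The built-in Option encoder turns None into JSON null and Some(x) into the encoding of x.
val noneJson: Json = Option.empty[Int].asJson
val someJson: Json = Option(1).asJson

val noneText: String = noneJson.noSpaces
val someText: String = someJson.noSpaces
```

So with Circe alone, the text produced for None is the four-character JSON literal null, not an empty string.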

Now I see, however, that there’s an ambiguity in the Option semantics here - it could mean “potentially no body at all” or it could mean “the body representing the Option value in the respective codec”. Naively, I’d lean toward the latter interpretation, though - a List output type doesn’t designate multiple bodies, either. :wink:

Another issue with the “no body” interpretation is that the response still declares content type application/json, whereas the empty body is not valid JSON.

So, in order to get the null JSON encoding for None, I see three options with the status quo:

  1. Implement a custom Tapir Codec (e.g. a facade around #circeCodec() output) and shadow the default JSON codec implicit.
  2. Use a custom output type that’s basically the structural equivalent of Option, with the equivalent Circe Codec.
  3. Use Option[Option[Foo]] instead of Option[Foo] as the output type, always wrapping the actual value into a Some.

Am I missing any nicer option(s)?
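A rough sketch of option 3 on the Circe side, assuming circe-core is on the classpath. `wrap` is a hypothetical helper, and whether the wrapped value really bypasses the empty-body behavior depends on the check being an equality test against None, as described earlier in this thread.

```scala
// Requires circe-core on the classpath.
import io.circe.syntax._

// Hypothetical helper: always wrap the real result in a Some, as in option 3.
def wrap[A](result: Option[A]): Option[Option[A]] = Some(result)

val wrappedNone = wrap(Option.empty[Int]) // Some(None), which is not equal to None
val wrappedSome = wrap(Option(7))         // Some(Some(7))

// The nested Option encoders unwrap layer by layer, so the inner None becomes JSON null.
val wrappedNoneText = wrappedNone.asJson.noSpaces
val wrappedSomeText = wrappedSome.asJson.noSpaces
```

One caveat with this approach: the encoding is not symmetric, since a JSON null would most likely decode back to the outer None rather than Some(None). For a response-only output type that probably doesn't matter.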

As you write, the current implementation goes along the None == “no body at all” route. And you’re right about the List analogy, although on the other hand, if None ended up being serialised as null, we would need some alternative way to represent an “empty body”.

One thing that we might indeed have to fix is omitting the content-type header if there’s no content - maybe you can create an issue on GH to fix that?

As for the work-arounds, I think both 1 and 3 are valid - they both use standard types which won’t surprise readers too much. 2 looks a bit like a hack, but would of course technically work as well :slight_smile:

I have created #3623, but kept it pretty open/generic for now.

Personally, I feel that some alternative way to represent the absence of a body would be preferable to omitting the Content-Type header. I’d argue that in the 200/OK case, there should always be a Content-Type header and a body that’s consistent with the content type. If the empty body is a valid representation in the declared content type (e.g. text/plain), that’s fine and the underlying codec will simply create it; otherwise (e.g. application/json), it’s an illegal state that ideally shouldn’t be representable. (The body “null” would be perfectly legal for the JSON content type, though.) The absence of a body, however, should ideally be restricted to 204/No Content and imply the absence of a codec as well. This case should have to be declared explicitly in the endpoint API, though.

…but this would probably lead down a rabbit hole and call for a major redesign, for an edge case that obviously isn’t triggered that often in practice - I’m merely being philosophical here. :slightly_smiling_face:

This does sound reasonable - however, I think that changing these defaults now could silently break some people’s code, so it might not be the best idea. So maybe this is something to shelve until tapir v2?

As for 204/No Content - again, I agree that this would be a good solution; however, we would still need some value (& codec) to represent “empty body” - for example, when you have alternatives, and an endpoint might return “oneOf”: either a JSON body, or an empty body. Both need to be somehow represented at the value level.
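Just to illustrate what “represented at the value level” could mean, here is a toy model in plain Scala. None of these names (`Output`, `JsonOutput`, `NoContent`, `RawResponse`, `render`) are tapir API; they are hypothetical and only sketch the 200-with-body vs. 204-without-body split discussed above.

```scala
// Hypothetical value-level model of "either a JSON body, or an empty body".
sealed trait Output
final case class JsonOutput(json: String) extends Output
case object NoContent extends Output

// Simplified response: status code, optional Content-Type header, raw body.
final case class RawResponse(status: Int, contentType: Option[String], body: String)

def render(output: Output): RawResponse = output match {
  // 200 always carries a Content-Type and a body consistent with it ("null" included).
  case JsonOutput(json) => RawResponse(200, Some("application/json"), json)
  // 204 carries neither a body nor a Content-Type, so no codec is involved.
  case NoContent        => RawResponse(204, None, "")
}

val withNull = render(JsonOutput("null"))
val empty    = render(NoContent)
```

In this model the “empty body” case is its own value rather than a None passed to a JSON codec, so the invalid combination of application/json with an empty body can't be constructed.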
