[OpenAFS] sprintf -> snprintf...

Sat, 28 Jun 2003 13:03:42 -0400 (EDT)

I'm joining this thread a bit late, since I don't get to read the list on
a regular basis.  I'll try to keep up with this, though.

On Thu, 26 Jun 2003, Russ Allbery wrote:

> I'm not entirely sure where you got your snprintf from; I have another
> version with slightly different floating point support (it doesn't use the
> native sprintf and is therefore safe, but it doesn't work for floating
> point numbers whose integer component won't fit into the largest available
> integer type) that's used for INN and wget, among other packages.  Let me
> know if you want a copy.

Actually, I wrote it from scratch a while back, for use in a couple of
projects I was working on, both personally and for work.  The goal was to
provide a well-behaved drop-in replacement for vsnprintf and its
derivatives that would support any formats that were thrown at it, and
which could be counted on to correctly limit its output to the requested
length.  It replaced the "compatibility" version we were previously using
here, which simply called sprintf and ignored the length argument.

Sadly, I've not been quite as good as I should at keeping the various
copies in sync.  I'll attempt to correct this by publishing a reference
version, containing all the bug fixes I know about to date, and whatever
additional features we end up deciding are needed.

The use of the system sprintf for floating-point formats was purely an
expediency; I needed something portable and didn't have the time or
inclination to roll my own floating-point formatter.  I'd be quite
interested in using yours, provided there are no licensing problems.
I think the limitation on the size of the integer component is probably
something we can live with, at least for a while.

I'd also be interested in your test suite; when I wrote this, I didn't
have any systematic way of testing it, and it showed -- somehow, it
managed to escape for quite some time emitting octal representations for
%x !   In fact, I see that you've already posted it.

As has been pointed out, we don't implement a number of newer features:

- a,A for hex-formatted floating point numbers
- C,S, and the use of 'l' on c,s to convert multibyte characters
- Length specifiers other than h,l.  There are many of these, not all
  of which are standard, but hh,ll,L,q are widely recognized, and I
  see no reason not to support them.  Also of interest is 'j', which is
  the "largest integer size we support"  specifier.  This addresses
  precisely the problem that Lindsay Todd described.
- The single-quote flag, which specifies the use of local-dependent
  thousands separators in decimal and floating-point conversions.
- The glibc extension flag 'I', which specifies the use of
  locale-dependent alternate digits for decimal integer conversions.
- p for formatting pointers, because though old, it's non-portable

However, we do support an additional format specifier of our own, also
'I', for formatting dotted-quad IP addresses.  I'm actually a little
disappointed at glibc's choice of this character for an extension _flag_,
but there's not much to do about that.  I'm open to suggestions for an
alternate specifier to be used for IPv4 addresses (and perhaps someday,,
with an 'l' length specifier, IPv6 addresses).

I chose not to support positional argument notation because it's somewhat
of a pain to get right, and because at the time, the only use I could see
for it was an in inherently-insecure situation where a user with no
control over the arguments nonetheless got to specify the format string.
I now see its applications in internationalization, but I'm not sure I
know how to go about properly handling these.  I suspect I'd have to do
two complete passes over the format string.

Finally, the issue of return values.  As the Linux man page points out,
C99 and SUSv2 differ on this point.  C99 specifies a return value
indicating the length of the formatted string, whether or not it fits in
the buffer provided, while SUSv2 specifies a return value less than 0 if
the string does not fit.  At the time, I chose the SUSv2 interpretation
because it was what most of the platforms (and all of the code) that I
cared about expected, and because it was slightly easier to implement. I
do see the advantage of the C99 approach, and I'd be happy to switch to
that model.

I'm going to propose the following changes:
- merge in any bug fixes from CMU and OpenAFS
- add russ's floating-point code, subject to licensing constraints
- add support for hh,ll,L,q,j
- add compile-time conditionals to control whether the IP address
  specifier is support, and whether it appears as 'I', some new
  character, or both
- add support for positional parameter notation
- add a compile-time option to select return value semantics
- fix any more bugs found by the test suite

Most of these should take relatively little time; I suspect I can do
everything except the positional-parameter implementation in a day or so.
Once I have, I'll publish a copy somewhere on grand.central.org, and
forward it to Derrick so he can update what's in openafs.

Commants?

-- Jeff