Monday 11th May 2015 6.12pm
Link shared: http://cppnow2015.sched.org/event/37beb4ec955c082f70729e4f6d1a1a05#.VUuMqvkUUuU
As part of publicising my C++ Now 2015 talk next week, here is part 8 of 20 from its accompanying Handbook of Examples of Best Practice for C++ 11/14 (Boost) libraries:
8. DESIGN: (Strongly) consider using constexpr semantic wrapper transport types to return states from functions
Thanks to constexpr and rvalue refs, C++ 11 codebases have far better ways of returning states from functions. Let us imagine this C++ 11 function:
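#include <cerrno>        // errno, EINTR
#include <cstring>       // std::strerror
#include <memory>        // std::shared_ptr, std::make_shared
#include <string>
#include <system_error>  // std::error_code, std::system_error
#include <utility>       // std::move
#include <fcntl.h>       // ::open, O_RDWR, O_EXCL (POSIX)
// handle_type is not defined in this excerpt; assume an RAII type which owns the fd
std::shared_ptr<handle_type> openfile(std::filesystem::path path)
{
  int fd;
  // Retry for as long as the call is interrupted by a signal
  while(-1==(fd=::open(path.c_str(), O_RDWR|O_EXCL)) && EINTR==errno);
  if(-1==fd)
  {
    int code=errno;
    std::error_code ec(code, std::generic_category());
    std::string errstr(std::strerror(code));
    throw std::system_error(ec, std::move(errstr));
  }
  return std::make_shared<handle_type>(fd);
}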
This is a highly simplified example, but an extremely common pattern in one form or another: when C++ code calls something which is not C++ and it returns an error, convert that error into an exception and throw it. Otherwise construct and return an RAII-holding smart pointer to manage the resource just acquired.
The really nice thing about this very simple design is that its API nicely matches its semantic meaning: if it succeeds you always get a shared_ptr. If it fails you always get an exception throw. Easy.
Unfortunately, throwing an exception has no bounded execution time due to RTTI lookups, so for any code which worries about complexity guarantees the above is unacceptable: throwing exceptions should be exceptional, as the purists would put it. So the traditional C++ 03 pattern is to provide an additional overload capable of writing into an error_code, this being the pattern used by ASIO and most Boost libraries. That way, if the error_code taking overload is chosen, you get an error code instead of an exception, but code is still free to use the always throwing overload above:
std::shared_ptr<handle_type> openfile(std::filesystem::path path, std::error_code &ec)
{
  int fd;
  // Retry for as long as the call is interrupted by a signal
  while(-1==(fd=::open(path.c_str(), O_RDWR|O_EXCL)) && EINTR==errno);
  if(-1==fd)
  {
    int code=errno;
    ec=std::error_code(code, std::generic_category());
    return std::shared_ptr<handle_type>(); // Return a null pointer on error
  }
  return std::make_shared<handle_type>(fd); // This function can't be noexcept as it can throw bad_alloc
}
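As a rough sketch of how the two overloads can coexist, the throwing one can be layered on top of the error_code one. Everything here is an assumption made for illustration: handle_type is a hypothetical minimal RAII wrapper, the path is taken as a std::string, and the file is opened read-only rather than with the flags used above.

```cpp
#include <cassert>
#include <cerrno>
#include <memory>
#include <string>
#include <system_error>
#include <fcntl.h>   // ::open (POSIX)
#include <unistd.h>  // ::close (POSIX)

// Hypothetical RAII handle, standing in for the handle_type used above.
struct handle_type
{
  int fd;
  explicit handle_type(int fd_) : fd(fd_) { assert(fd_ >= 0); }
  ~handle_type() { ::close(fd); }
  handle_type(const handle_type &) = delete;
  handle_type &operator=(const handle_type &) = delete;
};

// error_code overload: failure is reported through ec (bad_alloc can still be thrown).
std::shared_ptr<handle_type> openfile(const std::string &path, std::error_code &ec)
{
  int fd;
  do
  {
    fd = ::open(path.c_str(), O_RDONLY);
  } while(-1 == fd && EINTR == errno);
  if(-1 == fd)
  {
    ec = std::error_code(errno, std::generic_category());
    return std::shared_ptr<handle_type>();
  }
  ec.clear();  // This sketch also resets ec on success
  return std::make_shared<handle_type>(fd);
}

// Throwing overload, implemented in terms of the error_code overload.
std::shared_ptr<handle_type> openfile(const std::string &path)
{
  std::error_code ec;
  std::shared_ptr<handle_type> h = openfile(path, ec);
  if(ec)
    throw std::system_error(ec, "openfile(" + path + ")");
  return h;
}
```

A caller which cares about bounded failure cost passes an error_code and tests it; everyone else calls the one-argument form and lets the exception propagate.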
This pushes the job of checking for error conditions and interpreting error codes onto the caller, which is acceptable, if somewhat bug-prone when the caller fails to handle every outcome. Note that code calling this function must still be exception safe in case bad_alloc is thrown. One thing which is lost, however, is the semantic meaning of the result: above we are overloading a null shared_ptr to indicate that the function failed, which requires the caller to know that fact instead of being able to tell instantly from the API return type. Let's improve on that with a std::optional<T>:
namespace std { using namespace std::experimental; }  // Illustration only: pretend optional lives in std
std::optional<std::shared_ptr<handle_type>> openfile(std::filesystem::path path, std::error_code &ec)
{
  int fd;
  // Retry for as long as the call is interrupted by a signal
  while(-1==(fd=::open(path.c_str(), O_RDWR|O_EXCL)) && EINTR==errno);
  if(-1==fd)
  {
    int code=errno;
    ec=std::error_code(code, std::generic_category());
    return std::nullopt; // An empty optional now signals failure
  }
  return std::make_optional(std::make_shared<handle_type>(fd));
}
So far, so good, though note that we can still throw exceptions. All of the above also worked just fine in C++ 03, as Boost provided an optional<T> implementation for 03. However, the above is semantically suboptimal now that we have C++ 11, because C++ 11 lets us encapsulate far more semantic meaning, at no runtime cost, using a monadic transport like Boost.Expected:
namespace std { using namespace std::experimental; }  // Illustration only: pretend expected lives in std
std::expected<std::shared_ptr<handle_type>, std::error_code> openfile(std::filesystem::path path)
{
  int fd;
  // Retry for as long as the call is interrupted by a signal
  while(-1==(fd=::open(path.c_str(), O_RDWR|O_EXCL)) && EINTR==errno);
  if(-1==fd)
  {
    int code=errno;
    return std::make_unexpected(std::error_code(code, std::generic_category()));
  }
  return std::make_shared<handle_type>(fd);
}
The expected outcome is a shared_ptr to a handle_type, the unexpected outcome is a std::error_code, and the catastrophic outcome is the throwing of bad_alloc. Code using openfile() can either check the expected manually (its bool operator is true if it contains the expected value, false if it contains the unexpected value) or simply call expected<>.value(), which will throw if the value is unexpected, thus converting the error_code into an exception. As you will immediately note, this eliminates the need for two openfile() overloads, because the single monadic-return implementation serves both styles of caller with equal convenience. On the basis of halving the number of APIs a library must export, use of expected is a huge win.
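To make the two calling styles concrete, here is a minimal hand-rolled stand-in for an expected holding either a value or a std::error_code. It is not Boost.Expected's implementation or API: simple_expected and half() are hypothetical names invented for illustration, and the sketch requires T to be default constructible to stay short.

```cpp
#include <cassert>
#include <system_error>
#include <utility>

// Hypothetical stand-in: holds either a T or a std::error_code.
template <class T> class simple_expected
{
  T value_;
  std::error_code error_;
  bool has_value_;

public:
  simple_expected(T v) : value_(std::move(v)), error_(), has_value_(true) {}
  simple_expected(std::error_code ec) : value_(), error_(ec), has_value_(false) {}

  // True if the expected value is contained, false if the error is.
  explicit operator bool() const noexcept { return has_value_; }

  std::error_code error() const noexcept
  {
    assert(!has_value_);
    return error_;
  }

  // Converts a stored error_code into an exception on demand.
  T &value()
  {
    if(!has_value_)
      throw std::system_error(error_);
    return value_;
  }
};

// Toy function: halves even numbers, reports an error for odd ones.
simple_expected<int> half(int n)
{
  if(n % 2 != 0)
    return std::make_error_code(std::errc::invalid_argument);
  return n / 2;
}
```

A caller who wants error codes tests the result with if(r) and reads r.error(); a caller who prefers exceptions just calls r.value(). One function serves both.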
However I am still not happy with this semantic encapsulation because it is a poor fit to what opening files actually means. Experienced programmers will instantly spot the problem here: the open() call doesn't just return success vs failure, it actually has five outcome categories:
1. Success, returning a valid fd.
2. Temporary failure, please retry immediately: EINTR
3. Temporary failure, please retry later: EBUSY, EISDIR, ELOOP, ENOENT, ENOTDIR, EPERM, EACCES (depending on changes on the filing system, these could disappear or appear at any time)
4. Non-temporary failure due to bad or incorrect parameters: EINVAL, ENAMETOOLONG, EROFS
5. Catastrophic failure, something is very wrong: EMFILE, ENFILE, ENOSPC, EOVERFLOW, ENOMEM, EFAULT
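The five categories can be written down as a small classification function. This is only a sketch: open_outcome and classify_open_errno() are hypothetical names, and treating every code not named in the list as catastrophic is an assumption of this sketch.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical names for the five outcome categories listed above.
enum class open_outcome
{
  success,
  retry_now,
  retry_later,
  bad_parameters,
  catastrophic
};

// Maps an errno value from ::open() (0 meaning success) onto a category.
open_outcome classify_open_errno(int code) noexcept
{
  assert(code >= 0);
  switch(code)
  {
  case 0:
    return open_outcome::success;
  case EINTR:
    return open_outcome::retry_now;
  case EBUSY:
  case EISDIR:
  case ELOOP:
  case ENOENT:
  case ENOTDIR:
  case EPERM:
  case EACCES:
    return open_outcome::retry_later;
  case EINVAL:
  case ENAMETOOLONG:
  case EROFS:
    return open_outcome::bad_parameters;
  default:
    // EMFILE, ENFILE, ENOSPC, EOVERFLOW, ENOMEM, EFAULT and anything unlisted
    return open_outcome::catastrophic;
  }
}
```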
So you can see the problem now: what we really want is for category 3 errors to return only an error_code, whilst category 4 and 5 errors, plus bad_alloc, should probably emerge as exception throws (these aren't actually the ideal outcomes, but we'll assume this mapping for the purposes of brevity here). Category 2 is already absorbed by the EINTR retry loop. That way the C++ semantics of the function would closely match the semantics of opening files. So let's try again:
namespace std { using namespace std::experimental; }  // Illustration only: pretend expected lives in std
std::expected<
  std::expected<
    std::shared_ptr<handle_type>,   // Expected outcome
    std::error_code>,               // Expected unexpected outcome
  std::exception_ptr>               // Unexpected outcome
openfile(std::filesystem::path path) noexcept  // Note the noexcept guarantee!
{
  // Name the inner expected so its states are not mistaken for those of the outer one
  typedef std::expected<std::shared_ptr<handle_type>, std::error_code> inner_type;
  int fd;
  // Retry for as long as the call is interrupted by a signal
  while(-1==(fd=::open(path.c_str(), O_RDWR|O_EXCL)) && EINTR==errno);
  try
  {
    if(-1==fd)
    {
      int code=errno;
      // If a temporary failure, this is an expected unexpected outcome
      if(EBUSY==code || EISDIR==code || ELOOP==code || ENOENT==code || ENOTDIR==code || EPERM==code || EACCES==code)
        return inner_type(std::make_unexpected(std::error_code(code, std::generic_category())));
      // If a non-temporary failure, this is an unexpected outcome
      std::string errstr(std::strerror(code));
      return std::make_unexpected(std::make_exception_ptr(
        std::system_error(std::error_code(code, std::generic_category()), std::move(errstr))));
    }
    return inner_type(std::make_shared<handle_type>(fd));
  }
  catch(...)
  {
    // Any exception thrown is truly unexpected
    return std::make_unexpected(std::current_exception());
  }
}
There are some very major gains now in this design:
1. Code calling openfile() no longer needs to worry about exception safety - all exceptional outcomes are always transported by the monadic expected transport. This lets the compiler do better optimisation, eases use of the function, and leads to fewer code paths to test, which means more reliable, better quality code.
2. The semantic outcomes of this function in C++ map closely onto those of opening files. This means the code you write flows more naturally and fits what you are actually doing.
3. Returning a monadic transport means you can now program monadically against the result e.g. value_or(), then() and so on. Monadic programming - if and only if there is no possibility of exception throws - is also a formal specification, so you could in some future world use a future clang AST tool to formally verify the mathematical correctness of some monadic logic if and only if all the monadic functions you call are noexcept. That's enormous for C++.
You may have noticed that the (Strongly) in the title of this section is in brackets, and if you guessed that there are caveats to the above then you are right. The first big caveat is that the expected<T, E> implementation in Boost.Expected is very powerful and full featured, but unfortunately has a big negative effect on compile times, and that rather ruins it for the majority of people who only need about 10% of what it provides (and would rather like that to be quick to compile). The second caveat is that integration between Expected and Future-Promise, especially with resumable functions in the mix, is currently poorly defined, and using Expected now almost certainly introduces immediate technical debt into your code that you'll have to pay for later.
The third caveat is that I personally plan to write a much lighter weight monadic result transport which isn't as flexible as expected<T, E> (and is probably hard coded to T, error_code and exception_ptr outcomes) but which would have negligible effects on compile times, and very deep integration with a new lightweight, non-allocating, all-constexpr future-promise implementation. Once implemented, my monadic transport may be disregarded by the community, evolved more towards expected<T, E>, or something else entirely may turn up.
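Purely as a guess at the general shape of such a hard-coded three-state transport, and in no way the planned implementation, a sketch might look like this. The names result and parse_digit() are hypothetical, T must be default constructible here, and a real version would presumably use a union and constexpr rather than storing all three members.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <system_error>
#include <utility>

// Hypothetical transport holding a T, a std::error_code or a std::exception_ptr.
template <class T> class result
{
  enum class state { value, error, exception };
  state state_;
  T value_;
  std::error_code error_;
  std::exception_ptr exception_;

public:
  result(T v) : state_(state::value), value_(std::move(v)) {}
  result(std::error_code ec) : state_(state::error), value_(), error_(ec) {}
  result(std::exception_ptr e) : state_(state::exception), value_(), exception_(std::move(e)) {}

  bool has_value() const noexcept { return state_ == state::value; }
  bool has_error() const noexcept { return state_ == state::error; }
  bool has_exception() const noexcept { return state_ == state::exception; }

  std::error_code error() const noexcept
  {
    assert(has_error());
    return error_;
  }

  // Returns the value, or throws whichever failure is stored.
  T &value()
  {
    if(has_exception())
      std::rethrow_exception(exception_);
    if(has_error())
      throw std::system_error(error_);
    return value_;
  }

  // Returns the value if there is one, otherwise the supplied fallback.
  T value_or(T fallback) const
  {
    if(has_value())
      return value_;
    return fallback;
  }
};

// Toy producer covering all three outcomes.
result<int> parse_digit(char c)
{
  if(c >= '0' && c <= '9')
    return c - '0';  // Expected outcome
  if(c == ' ')
    return std::make_error_code(std::errc::resource_unavailable_try_again);  // Try again later
  return std::make_exception_ptr(std::invalid_argument("not a digit"));  // Truly unexpected
}
```

The caller chooses how much to handle: value_or() ignores both failure states, has_error() lets retryable failures be dealt with locally, and value() turns anything left over into an exception throw.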
In other words, I recommend you very strongly consider some mechanism for more closely and cleanly matching C++ semantics with what a function does now that C++ 11 makes it possible, but I unfortunately cannot categorically recommend one solution over another at the time of writing.